In particular, the class prediction problem attempts to establish the differences between the classes. Knowing these differences allows the system to predict the type of cancer a patient has before any other tests are made.

The class prediction problem is composed of two distinct phases. The first phase is supervised learning, in which patterns are established from previously known results and the algorithm is trained to match those results. Once the system has been trained, it can be used to evaluate any given input without knowing the expected result a priori. The goal of the class prediction problem is to achieve 100% accuracy in this evaluation.

The datasets available for the study contain DNA samples from AML and ALL cancer patients. Each sample is composed of a set of genes that have been extracted. The genes used in the datasets comprise only a small portion of the total genes in an individual. Mapping a complete set of genes is impractical both in laboratory testing (obtaining the expression of each gene) and in data analysis.

Because an individual strand of DNA cannot be isolated from a person, the data for each gene represent mean values over the samples taken from that person. The expression level is a quantitative measure of how strongly a specific trait manifests in a gene. To represent different genes on an equal footing, the expression level is normalized across the entire dataset. In the class prediction problem, each gene is sampled across the entire dataset, and its normalized expression level is compared between the classes to establish patterns.

In contrast to the class prediction problem, the class discovery problem is significantly more complicated. The basic premise of class discovery is that certain similarities may be present between samples.
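
For reference, the two phases of class prediction described above (training on labeled samples, then evaluating an unlabeled one) can be illustrated with a minimal sketch. It is not the method used in the study: the nearest-centroid rule, the z-score normalization, the function names, and the toy expression values are all assumptions chosen for illustration only.

```python
from statistics import mean, pstdev


def fit_normalizer(samples):
    """Compute per-gene (mean, std) over the whole dataset (assumed z-score scheme)."""
    columns = list(zip(*samples))
    # Fall back to 1.0 when a gene is constant, to avoid division by zero.
    return [(mean(col), pstdev(col) or 1.0) for col in columns]


def normalize(sample, stats):
    """Rescale one sample so every gene is on a comparable scale."""
    return [(x - m) / s for x, (m, s) in zip(sample, stats)]


def train(samples, labels):
    """Phase 1 (supervised): mean normalized expression profile of each class."""
    centroids = {}
    for cls in set(labels):
        rows = [s for s, lab in zip(samples, labels) if lab == cls]
        centroids[cls] = [mean(col) for col in zip(*rows)]
    return centroids


def predict(centroids, sample):
    """Phase 2: assign the class whose profile is closest (squared Euclidean distance)."""
    def distance(cls):
        return sum((a - b) ** 2 for a, b in zip(centroids[cls], sample))
    return min(centroids, key=distance)


# Hypothetical toy data: 4 patients x 3 genes (values are made up, not from the study).
raw_samples = [
    [10.0, 2.0, 5.0],
    [12.0, 1.0, 5.5],
    [3.0, 8.0, 5.2],
    [2.0, 9.0, 4.8],
]
labels = ["AML", "AML", "ALL", "ALL"]

stats = fit_normalizer(raw_samples)
train_samples = [normalize(s, stats) for s in raw_samples]
centroids = train(train_samples, labels)

# Evaluate an unseen sample without knowing its class in advance.
unseen = normalize([11.0, 1.5, 5.0], stats)
predicted = predict(centroids, unseen)

# Fraction of training samples that are reproduced correctly.
accuracy = sum(
    predict(centroids, s) == lab for s, lab in zip(train_samples, labels)
) / len(labels)
```

On this toy data the unseen sample is assigned to AML, because its normalized profile lies closer to the AML class mean. The normalization statistics are computed once from the training data and reused for new samples, so that every gene contributes on the same scale in both phases.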