Cancer research

Ross et al. (2000) introduced the NCI60, an important dataset for the molecular classification of different types of cancer. The data record gene expression in 64 cell lines, measured with DNA microarrays robotically spotted with 9,703 cDNAs. The cDNAs included approximately 8,000 different genes. When the dataset was presented, 3,700 of the genes represented previously characterized human proteins and 2,400 were identified only as ESTs. Our work is based on the supplement on the authors' website, which gives the expression of 6,831 genes.

There are several good reasons to use this instance for our studies. In their original paper, Ross et al. identified several groups of genes that correspond to some of the tissue characteristics of the cell lines. Of particular interest for our objectives are two groupings named "Leukaemia Cluster" and "Melanoma Cluster".

These were identified visually from a hierarchical clustering, each as a group of genes highly expressed in the leukaemia-derived and in most of the melanoma-derived cell lines, respectively. It is, however, very difficult to identify from a hierarchical clustering an analogous group of genes that is strongly under-expressed, that is a robust and significant marker of differential expression within the same cell line, and that at the same time discriminates well against all other types of cell lines. The approach used in this study was designed to uncover such groups if they exist. To our knowledge, no other method has identified some of the key genes that allow an interpretation linking both the highly expressed and the under-expressed groups of genes in this dataset.

In addition to finding large signatures for the Leukaemia and Melanoma cell lines, we also found signatures for three other types of cancer represented in the NCI60 dataset: Central Nervous System, Renal and Colon. They are shown in the figure below.



For more information about the results and the list of genes in each of the signatures, please refer to the publication:

Molecular Classification of Cancer using Integer Programming Models and Algorithms
R. Berretta, A. Mendes and P. Moscato, submitted to the Journal of Research and Practice in Information Technology

Or go directly to the supplementary material webpage.




















School of Electrical Engineering and Computer Science - Faculty of Engineering and Built Environment - The University of Newcastle, Callaghan, NSW, 2308, Australia
Phone: +61 2 4921-7758 / +61 2 4921-6056       FAX: +61 2 4921-6929       Webmaster: Alexandre Mendes