Microarray analysis

From Bioinformatics.Org Wiki


Assume that a researcher has microarray expression values for samples of two disease subtypes, A and B. Some of these samples are known to respond to drug X, while the rest do not respond to any known drug. The names and functions of the gene products are known and stored in a database. The expression data do not yet include any measurements taken under drug treatment.

First, knowing the names and functions of the gene products can help validate the predictive capabilities described below.

The approach taken by Eisen et al. (1998)[1] and Perou et al. (2000)[2] can be followed. The program Cluster can be used to obtain a hierarchical clustering of both the genes and the samples, using average linkage clustering to get coarser clusters. The resulting trees can then be examined with the companion program TreeView.
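The clustering step above can be illustrated with a minimal sketch. This is not the Cluster program itself: it is a small pure-Python average linkage routine that assumes a Pearson-correlation distance and stops at a requested number of clusters instead of building a full tree. The sample names and expression values are hypothetical.

```python
from math import sqrt


def pearson_distance(x, y):
    """Return 1 - Pearson correlation between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 1.0  # flat profile: treat as uncorrelated
    return 1.0 - cov / (sx * sy)


def average_linkage(profiles, n_clusters):
    """Merge the closest clusters until n_clusters remain.

    Returns a list of clusters, each a sorted list of row indices.
    """
    clusters = [[i] for i in range(len(profiles))]
    dist = [[pearson_distance(p, q) for q in profiles] for p in profiles]

    def linkage(a, b):
        # Average linkage: mean of all pairwise distances between a and b.
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest average distance.
        pairs = (
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        _, i, j = min(pairs)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical log-ratio profiles (rows = samples, columns = genes).
sample_names = ["A1", "A2", "B1", "B2"]
profiles = [
    [2.0, 1.8, -1.0, -1.2],
    [2.2, 1.7, -0.9, -1.1],
    [-1.0, -1.3, 2.1, 1.9],
    [-1.2, -1.0, 1.8, 2.0],
]
clusters = average_linkage(profiles, n_clusters=2)
named = [[sample_names[i] for i in c] for c in clusters]
print(named)
```

Passing the transposed matrix (rows = genes, columns = samples) would cluster the genes in the same way.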

Samples of the same type (a disease subtype or normal) would have expression profiles that are characteristic of their type and would thus be grouped together by the clustering algorithm. As described in the cited articles, this makes it possible to classify the samples and to predict the classification of new samples.
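One simple way to predict the type of a new sample, once groups have been identified, is to assign it to the group whose mean profile it correlates with best. This nearest-centroid rule is an illustrative stand-in, not the procedure used in the cited articles, and all labels and values below are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def centroid(profiles):
    """Gene-wise mean of a list of sample profiles."""
    return [sum(col) / len(col) for col in zip(*profiles)]


def classify(new_profile, labelled):
    """Return the label whose centroid correlates best with new_profile."""
    centroids = {label: centroid(ps) for label, ps in labelled.items()}
    return max(centroids, key=lambda lab: pearson(new_profile, centroids[lab]))


# Hypothetical reference samples, grouped by known type.
labelled = {
    "subtype A": [[2.0, 1.8, -1.0, -1.2], [2.2, 1.7, -0.9, -1.1]],
    "subtype B": [[-1.0, -1.3, 2.1, 1.9], [-1.2, -1.0, 1.8, 2.0]],
    "normal": [[0.1, -0.1, 0.0, 0.1], [0.0, 0.1, -0.1, 0.0]],
}
predicted = classify([1.9, 1.6, -0.8, -1.0], labelled)
print(predicted)
```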

That diseased samples have characteristic expression profiles means that particular genes in these samples are expressed at particular levels. Cluster also groups genes with similar expression profiles, thus revealing which clusters of genes, at which expression levels, are associated with which samples: diseased samples in general, and samples of a particular disease subtype.

Another microarray experiment can then be performed in which the diseased samples are treated with drug X. Reclustering these new data together with those from the untreated samples will then reveal which genes change their expression levels under treatment. Those genes are candidate targets of drug X (or respond downstream of its targets). For an individual patient, they can be examined on their own, without any of the other genes, to see whether their expression levels shift under the drug toward those of the normal samples.
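As a complement to reclustering, the genes that respond to treatment can also be listed by comparing each gene directly between the two conditions. The sketch below assumes log2 expression values and an arbitrary threshold of 1.0 (a two-fold change); the gene names, values, and threshold are hypothetical, and no statistical test is applied.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def changed_genes(untreated, treated, threshold=1.0):
    """Return {gene: shift} for genes whose mean log2 expression
    differs between conditions by at least `threshold`."""
    shifts = {}
    for gene in untreated:
        shift = mean(treated[gene]) - mean(untreated[gene])
        if abs(shift) >= threshold:
            shifts[gene] = shift
    return shifts


# Hypothetical log2 expression values, one list entry per sample.
untreated = {
    "gene1": [2.0, 2.2, 1.8],
    "gene2": [0.5, 0.4, 0.6],
    "gene3": [-1.5, -1.7, -1.3],
}
treated = {
    "gene1": [0.1, 0.3, -0.1],
    "gene2": [0.6, 0.5, 0.4],
    "gene3": [0.0, -0.2, 0.2],
}
responders = changed_genes(untreated, treated)
print(sorted(responders))
```

Here gene1 falls and gene3 rises under treatment, while gene2 is essentially unchanged and is left out.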

As mentioned above, knowing the identities of the samples and genes can help validate predictive capabilities. If sample and gene identities are not known, validation can be done by performing a microarray experiment with one or more known samples or genes, and seeing how they cluster.

It might be desirable to know whether drug X is more effective on disease subtype A or B. This could be determined from the drug-treatment experiment described above, by comparing how far expression in each subtype moves toward that of the normal samples.
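One possible way to quantify such a comparison is the fraction of the gap between diseased and normal expression that is closed by treatment. This "recovery" score is an assumption made for illustration, not a measure taken from the cited articles, and the profiles are hypothetical.

```python
def distance_to_normal(profile, normal):
    """Mean absolute difference between a profile and the normal profile."""
    return sum(abs(p - n) for p, n in zip(profile, normal)) / len(profile)


def recovery(untreated, treated, normal):
    """Fraction of the untreated-to-normal gap closed by treatment.

    1.0 means the treated profile matches normal; 0.0 means no change.
    """
    before = distance_to_normal(untreated, normal)
    after = distance_to_normal(treated, normal)
    return (before - after) / before if before else 0.0


# Hypothetical mean log2 profiles over the drug-responsive genes.
normal = [0.0, 0.0, 0.0]
recovery_a = recovery([2.0, -2.0, 1.0], [0.5, -0.5, 0.25], normal)
recovery_b = recovery([2.0, -1.0, 1.0], [1.5, -1.0, 0.5], normal)
print(recovery_a, recovery_b)
```

With these made-up values the score is higher for subtype A than for subtype B, which would suggest a stronger response in subtype A.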



  1. Eisen, M.B., Spellman, P.T., Brown, P.O., Botstein, D. 1998. Cluster analysis and display of genome-wide expression patterns. Proc. Natl. Acad. Sci. USA 95:14863-14868.
  2. Perou, C.M., Sorlie, T., Eisen, M.B., van de Rijn, M., Jeffrey, S.S., Rees, C.A., Pollack, J.R., Ross, D.T., Johnsen, H., Akslen, L.A., Fluge, O., Pergamenschikov, A., Williams, C., Zhu, S.X., Lonning, P.E., Borresen-Dale, A.L., Brown, P.O., Botstein, D. 2000. Molecular portraits of human breast tumours. Nature 406(6797):747-752.