This project illustrates how we can use genome expression profiles and hierarchical clustering techniques to better understand the similarities and differences between drugs (see the BB material on this website associated with Example 8.1.2 for further details on how genome expression profiles are measured). The data in this project are fictitious and designed to make the concepts as transparent as possible. Nevertheless, this very same procedure has been used in many real-world contexts, including the diagnosis of cancers. 


For example, diffuse large B-cell lymphoma is a type of cancer of the blood. Patients diagnosed with this form of cancer vary widely in their outcomes, with 40% responding to treatment and 60% succumbing to the disease. It was suspected that this difference in prognosis is due to differences among patients in the nature of the cancer. Unfortunately, most existing diagnostic techniques don’t allow doctors to see any such differences among the cancers of patients with these different outcomes. 


To explore this idea further, researchers measured the genome expression profile of the cancer in several patients (the data set involved more than 1.8 million measured expression levels!). The expression profiles (one from each patient) were then clustered hierarchically by similarity, just as we did for the drugs in this project. The researchers found that patients who clustered together also tended to have a similar disease outcome. This means that there are distinct patterns of genome expression in B-cell lymphoma associated with different clinical outcomes, and that these expression profiles might even be used to refine the diagnosis of cancer patients and predict their likelihood of responding to treatment.
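
To make the idea of clustering expression profiles "hierarchically by similarity" concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the patient labels and expression levels are fictitious, similarity is measured by Euclidean distance, and clusters are compared by the average distance between their members (average linkage). The study described above may well have used different choices.

```python
import math

# Fictitious expression profiles: one list of expression levels per patient.
# All labels and numbers are made up for illustration only.
profiles = {
    "patient_A": [2.0, 0.1, 1.9, 0.2],
    "patient_B": [1.8, 0.3, 2.1, 0.1],
    "patient_C": [0.2, 1.9, 0.1, 2.2],
    "patient_D": [0.1, 2.1, 0.3, 1.8],
    "patient_E": [0.3, 1.7, 0.2, 2.0],
}

def distance(p, q):
    """Euclidean distance between two expression profiles (an assumed measure)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def cluster_distance(c1, c2):
    """Average-linkage distance: mean distance over all cross-cluster pairs."""
    total = sum(distance(profiles[a], profiles[b]) for a in c1 for b in c2)
    return total / (len(c1) * len(c2))

def hierarchical_clustering(names, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    clusters = [[name] for name in names]  # start with one cluster per patient
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = cluster_distance(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return [sorted(c) for c in clusters]

groups = sorted(hierarchical_clustering(list(profiles), 2))
print(groups)
```

With these made-up numbers, patients A and B end up in one group and patients C, D, and E in the other. In a real analysis one would then ask whether patients in the same group also share a similar disease outcome.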


References

Alizadeh, A.A. et al. 2000. Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature 403:503-511.

© James Stewart and Troy Day, 2014