CMMB 461 Lecture Notes - Lecture 9: DNA Microarray, Hierarchical Clustering, Dendrogram


Document Summary

Microarrays generate thousands of data points for each experiment (high-dimensional data). Viewing and analyzing such volumes of data in spreadsheets and graphs is overwhelming. One approach to reducing the number of data points is to cluster objects (e.g. genes) into groups based on their similarity to each other. Genes that are in the same pathway or have the same function should tend to show the same properties.

Euclidean distance: the absolute (straight-line) distance between two points in space. Correlation distance: the similarity of the directions in which two vectors point, i.e. how closely the two vectors move in the same direction. For microarrays, we are clustering genes with similar expression.

Hierarchical clustering: do not have to specify the number of clusters, "k". Deterministic: cluster together the two closest genes, then the next closest, and so on. If you take the same data and use the same clustering metric, you'll always get the same results. By contrast, other clustering methods require the number of clusters, "k", to be specified.
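The difference between the two distance measures, and the deterministic "merge the two closest first" behaviour, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not from the lecture: the three gene names and their expression values are invented, correlation distance is taken to be 1 minus the Pearson correlation, and average linkage is used to measure the distance between clusters (the notes do not say which linkage was taught).

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def correlation_distance(a, b):
    """1 - Pearson correlation (assumed definition): 0 = same direction, 2 = opposite.
    Undefined for a constant profile (division by zero)."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    numerator = sum(x * y for x, y in zip(da, db))
    denominator = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return 1.0 - numerator / denominator

def hierarchical_merge_order(profiles, distance):
    """Agglomerative clustering with average linkage (an assumption).
    Returns the merges in order as (cluster_1, cluster_2, distance)."""
    clusters = [(name,) for name in sorted(profiles)]
    merges = []

    def linkage(c1, c2):
        # Average of all pairwise distances between members of the two clusters.
        pairs = [(g, h) for g in c1 for h in c2]
        return sum(distance(profiles[g], profiles[h]) for g, h in pairs) / len(pairs)

    while len(clusters) > 1:
        # Find the closest pair of clusters; ties resolve by index, so the
        # result is the same every time for the same data and metric.
        d, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        merges.append((clusters[i], clusters[j], d))
        merged = tuple(sorted(clusters[i] + clusters[j]))
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges

# Hypothetical expression values over four conditions (invented for illustration).
profiles = {
    "geneA": [1, 2, 3, 4],
    "geneB": [10, 20, 30, 40],  # same trend as geneA, much larger magnitude
    "geneC": [4, 3, 2, 1],      # opposite trend to geneA, similar magnitude
}

for metric in (euclidean_distance, correlation_distance):
    first, second, _ = hierarchical_merge_order(profiles, metric)[0]
    print(metric.__name__, "merges first:", first, second)
```

With these invented values, Euclidean distance groups geneA with geneC first (similar magnitude), while correlation distance groups geneA with geneB first (same direction of change). No value of "k" is supplied at any point: the full merge order is what a dendrogram would display, and clusters can be read off afterwards by cutting it at a chosen height.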
