BCHM 4400 Lecture Notes - Lecture 10: Information Retrieval, Non-Coding RNA, Sequence Database


Document Summary

GenBank: annotated collection of publicly available DNA sequences. Contains single-pass cDNA sequences, or expressed sequence tags (ESTs), from many organisms. Provides a nonredundant set of gene transcripts for organisms: all ESTs and other expressed sequences from an organism are used to create clusters of sequences.

Reference Sequence database (RefSeq): aims to provide a high-quality, comprehensive, nonredundant set of sequences. RefSeq datasets: used for functional annotation of genome sequencing projects.

UniProt: resource for protein sequence and functional information. Swiss-Prot: protein records with manual annotation based on the literature and computational results. PIR (Protein Information Resource): produced the Protein Sequence Database. Proteomes: a proteome is the set of proteins thought to be expressed by an organism. UniProt provides proteomes for species with completely sequenced genomes. UniProt supports text search, BLAST search, and FTP download.

UniRef clusters and proteomes. UniProt Reference Clusters (UniRef):
UniRef100: derived by combining identical sequences and subfragments into a single entry.
UniRef90: built by clustering sequences with at least 90% identity and 80% overlap.
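The UniRef90 thresholds can be made concrete with a small sketch. This is only an illustration, not how UniRef is actually built: it assumes the two sequences are already aligned (gaps written as "-"), that identity is counted over columns where both sequences have a residue, and that overlap is measured against the longer of the two sequences. The function names are hypothetical.

```python
from typing import Tuple


def identity_and_overlap(aligned_a: str, aligned_b: str) -> Tuple[float, float]:
    """Return (identity, overlap) for two pre-aligned sequences ('-' marks a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only alignment columns where both sequences have a residue.
    paired = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not paired:
        return 0.0, 0.0
    matches = sum(1 for a, b in paired if a == b)
    identity = matches / len(paired)
    # Assumption: overlap is the aligned span relative to the longer sequence.
    longer = max(len(aligned_a.replace("-", "")), len(aligned_b.replace("-", "")))
    overlap = len(paired) / longer
    return identity, overlap


def meets_uniref90_thresholds(aligned_a: str, aligned_b: str,
                              min_identity: float = 0.90,
                              min_overlap: float = 0.80) -> bool:
    """Check the 90% identity / 80% overlap criterion from the notes."""
    identity, overlap = identity_and_overlap(aligned_a, aligned_b)
    return identity >= min_identity and overlap >= min_overlap
```

For example, `meets_uniref90_thresholds("MKTAYIAKQR", "MKTAYIAKQK")` passes both thresholds (9 of 10 residues identical, full overlap), while `meets_uniref90_thresholds("MKTAYIAKQR", "MKTAY-----")` fails because the short fragment covers only half of the longer sequence.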
