BSC 4434 Lecture Notes - Lecture 3: FASTA Format, GenBank, National Center for Biotechnology Information

Document Summary

Describe how DNA and protein sequences are identified in databases. Discuss the types of formats used for DNA and protein sequences. Explain how to read a GenBank entry. Show different ways of extracting DNA and protein sequences from NCBI. Typical workflows: find the data, download the data, reformat the data; collect the samples, run molecular analysis, filter the data; run analysis software, collect and sort results, publish and share the data. A sequence can be stored as a string or coded as binary numbers. In FASTA format, the header line starts with > and has a [return] at the end; all other characters are part of the sequence. Other types of important medical and genetic data may not have universal standards. Much of the routine work of bioinformatics involves reworking data files to get them into formats that will work with various software. Explicitly linked nucleotide and protein sequences; updates to reflect current knowledge of sequence data and biology; data validation and format consistency; distinct accession series.
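
The FASTA rule in the summary (a line starting with > is a header, every other character belongs to the sequence) and the idea of coding a sequence as binary numbers can be sketched in Python. This is a minimal illustration, not the lecture's own code: the function names `parse_fasta` and `encode_2bit`, the 2-bit mapping for A, C, G, T, and the example records are all assumptions made for this sketch.

```python
from typing import Dict, Iterable, Optional


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Map each header (the text after '>') to its concatenated sequence."""
    records: Dict[str, str] = {}
    header: Optional[str] = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A header line starts a new record.
            header = line[1:].strip()
            records[header] = ""
        elif header is not None:
            # Every non-header line is part of the current sequence.
            records[header] += line
    return records


# Assumed 2-bit code for illustration only; other mappings are possible.
TWO_BIT = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}


def encode_2bit(seq: str) -> int:
    """Pack a DNA string into an integer, two bits per base."""
    value = 0
    for base in seq.upper():
        value = (value << 2) | TWO_BIT[base]  # KeyError on non-ACGT symbols
    return value


# Hypothetical records, made up for this example.
example = """>seq1 hypothetical example record
ACGT
TTGA
>seq2
GGCC
""".splitlines()

records = parse_fasta(example)
for name, seq in records.items():
    print(name, seq, bin(encode_2bit(seq)))
```

Because leading A bases encode as zero bits, the packed integer alone does not record the sequence length; a real binary format would need to store the length separately. Ambiguity codes such as N are not handled here.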
