BINF 511 Lecture Notes - Lecture 4: Raw Score, Substitution Matrix, National Center For Biotechnology Information

Lecture 4C: Sequence processing and analysis
January 31, 2018
Sanger sequencing recap
Aka "chain termination sequencing" or "dideoxy sequencing"
This is the gold standard for sequencing
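A free 3' hydroxyl group (present on a deoxynucleotide, missing on a dideoxynucleotide) is needed for the chain to keep extending; incorporating a dideoxynucleotide terminates it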
When you run the gel:
o The larger fragments migrate more slowly
o The smaller fragments migrate faster
o Then you call the bases of the sequence by reading the fragments in order of size
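The gel-reading steps above can be sketched in Python. This is a toy model only: it assumes each terminated fragment has already been reduced to a (length, terminal base) pair, and the function name call_bases and the sample data are hypothetical, not part of any sequencing software.

```python
def call_bases(fragments):
    """Read a sequence from chain-terminated fragments.

    fragments: iterable of (length, terminal_base) pairs (hypothetical format).
    The shortest fragment migrates fastest, so sorting by length reproduces
    the order in which bases are read off the gel.
    """
    ordered = sorted(fragments, key=lambda frag: frag[0])
    return "".join(base for _, base in ordered)


# Made-up example: four fragments in arbitrary order.
fragments = [(3, "G"), (1, "A"), (4, "T"), (2, "C")]
print(call_bases(fragments))
```

The sequence read this way is that of the newly synthesized strand, which is complementary to the template.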
Dye chemistry sequencing
The 4 termination reactions are run in the same lane (one fluorescent dye per base)
96 samples can be run on one gel
The trace or chromatogram is automatically read to a file
Why not just genome sequences?
Need to predict open reading frames
Introns?
Is it a random open reading frame?
Alternative splicing?
Non-functional pseudogene? (definition: a section of a chromosome that is an imperfect copy of a functional gene)
We need expression data to know that a gene is functional
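To make the open reading frame question concrete, here is a minimal Python sketch of a naive ORF scan. It is an illustration under stated assumptions, not a gene predictor: it only looks at the forward strand, treats ATG as the only start codon, and knows nothing about introns or splicing. The function name find_orfs and the min_codons parameter are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=2):
    """Return (start, end, frame) for ATG...stop stretches on the forward strand.

    start/end are 0-based, end-exclusive, and include the stop codon.
    min_codons counts the codons before the stop, including the ATG.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a candidate ORF at the first ATG
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # close it and keep scanning this frame
    return orfs


# Made-up sequence with a single short ORF in frame 2.
print(find_orfs("CCATGAAATAGCC"))
```

A scan like this reports every ATG-to-stop stretch, including ones that arise by chance or sit in pseudogenes, which is why expression data are needed to confirm that a predicted gene is real.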
ESTs (Expressed Sequence Tags)
o EST sequencing is a quick, high-throughput method, so the data are noisy ("dirty")
Microarrays built from EST sequences
SAGE data are similar to ESTs
o But we still need ESTs to know which gene each SAGE tag belongs to
Example of Sanger sequencing
Phred = the base caller; it gives the quality score (need to remember the name of this tool for the midterm)
o Phred score < 20 is poor (error probability above 1 in 100)
o Phred score > 30 is good (error probability below 1 in 1000)
o Phred error probabilities (quality values/Phred scores)
"On the basis of the trace characteristics, phred computes a probability of an error in the base call at each position, and converts this to a quality value q using the transformation" q = -10 log10(p), where p is the estimated error probability
A quality of 30 corresponds to an error probability of 1/1000; 40 corresponds to an error probability of 1/10000
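The score-to-probability figures above (30 for 1/1000, 40 for 1/10000) can be checked numerically. The following is a minimal Python sketch assuming the standard Phred definition q = -10 log10(p); the function names are hypothetical and are not part of the phred program itself.

```python
import math


def phred_to_error_prob(q):
    """Error probability p for a Phred quality value q (p = 10^(-q/10))."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Phred quality value q for an error probability p (q = -10 log10 p)."""
    return -10 * math.log10(p)


# Print the error probability for the thresholds mentioned in the notes.
for q in (20, 30, 40):
    print(q, phred_to_error_prob(q))
```

Each increase of 10 in the score corresponds to a tenfold drop in the error probability, so a score of 20 means roughly 1 error in 100 base calls.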