
BIL 250 Lecture Notes - Lecture 10: Celera Corporation, Human Genome Project, Fastq Format


Department
Biology
Course Code
BIL 250
Professor
Kevin Mc Cracken
Lecture
10

Lecture 10b: Shotgun DNA Sequencing
The point is to sequence quickly
No high-resolution linkage or physical map is required
Collect the data and deal with it later
This reverses the way genetic studies usually proceed
Break the genome into many pieces before sequencing
1. Extract DNA
2. Get different sizes
3. Electrophoresis
4. Purify DNA from gel
5. Prepare clone library
6. Get end sequences of DNA inserts
7. Put into computer
8. Short decoded segments
9. Find overlaps and assemble into contigs
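Step 9 can be sketched with a toy greedy assembler. This is only an illustration of the idea of finding overlaps and merging reads into contigs. The function names, the `min_len` threshold, and the example reads are all assumptions made for the sketch, and real assemblers also have to handle sequencing errors, reverse complements, and repeats.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` bases exists.
    """
    max_len = min(len(a), len(b))
    for n in range(max_len, min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of sequences with the largest overlap.

    Stops when no remaining pair overlaps by at least `min_len` bases;
    whatever is left is the list of contigs.
    """
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_len, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            return contigs
        # Join the two sequences, writing the shared bases only once.
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


# Hypothetical reads taken from the made-up sequence ATGGCGTGCAAT.
print(greedy_assemble(["ATGGCG", "GCGTGC", "TGCAAT"]))
```

When every read overlaps a neighbor, the reads collapse into a single contig. A read with no overlap stays behind as its own contig, which is how gaps show up in an assembly.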
Dideoxy method:
Method used the first time a genome was sequenced
Begin with DNA
Shear the DNA into overlapping 2 kb fragments
Isolate the fragments on agarose, purify them, and clone them into vectors
Sequence 500 bp from each end of each insert
The middle 1,000 bp of each insert is obtained from overlapping inserts
Sequence the genome 5 times over (5x coverage)
This gives 97% coverage of the genome; what is missing is repeated DNA
What about the other 3% of DNA?
Repeated DNA is a problem because it creates ambiguity in assembly
The solution is to create a second library of 10,000 bp clones and find unique
DNA to aid in assembly
This gives a 2 kb library and a 10 kb library
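The numbers in this section (500 bp reads, 5x coverage) can be turned into rough arithmetic. The sketch below assumes a genome of about 3.2 billion bp, which is not stated in these notes, and it uses the idealized Poisson (Lander-Waterman) model, in which reads land uniformly at random and every read can be placed uniquely. Under those assumptions the model predicts more than the 97% quoted above. Repeated DNA violates the unique-placement assumption, which fits the point that the missing portion is repeated DNA.

```python
import math

GENOME_BP = 3_200_000_000  # assumption: approximate human genome size
READ_BP = 500              # from the notes: 500 bp sequenced per insert end
COVERAGE = 5               # from the notes: genome sequenced 5 times over


def reads_needed(genome_bp: int, read_bp: int, coverage: float) -> int:
    """Number of reads N such that N * read_bp / genome_bp equals the coverage."""
    return math.ceil(coverage * genome_bp / read_bp)


def expected_fraction_covered(coverage: float) -> float:
    """Idealized Poisson estimate: a base is missed with probability e^(-coverage)."""
    return 1.0 - math.exp(-coverage)


n_reads = reads_needed(GENOME_BP, READ_BP, COVERAGE)
n_inserts = n_reads // 2  # assumption: two end reads per 2 kb insert

print(f"reads needed:   {n_reads:,}")
print(f"inserts needed: {n_inserts:,}")
print(f"idealized fraction covered at {COVERAGE}x: {expected_fraction_covered(COVERAGE):.4f}")
```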
Human Genome
1. HGP (Human Genome Project)
Dideoxy sequencing and mapping approach
2. CRA (Celera Genomics Corporation)
Direct shotgun approach and dideoxy sequencing
Used Method A and B
We have:
32,000 genes
50% repeated DNA
1-1.5% codes for proteins
223 genes shared with bacteria
Lookup:
1. Subclone
2. Cytogenetics
Read over SLIDE 17
Consequence: sequence data is easy to generate; handling the data becomes the big challenge
How much data is the human genome? About 2 CDs
That figure does not include any information on base quality or the accuracy of the genome
When it comes off a sequencer, it is about 2-30 terabytes for 1x coverage of the genome
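The "2 CDs" figure can be checked with a back-of-the-envelope calculation. This sketch assumes a genome of about 3.2 billion bp, 2 bits per base (enough to tell A, C, G, and T apart), and a 700 MB CD. None of these values come from the notes. It covers only the bare sequence, with no base-quality information, so it does not model the terabyte-scale output that comes off a sequencer.

```python
import math

GENOME_BP = 3_200_000_000   # assumption: approximate human genome size
BITS_PER_BASE = 2           # assumption: A, C, G, T packed into 2 bits each
CD_BYTES = 700 * 10**6      # assumption: one CD holds about 700 MB


def packed_genome_bytes(genome_bp: int, bits_per_base: int = BITS_PER_BASE) -> int:
    """Bytes needed to store the bare sequence, with no quality scores."""
    return genome_bp * bits_per_base // 8


def cds_needed(n_bytes: int, cd_bytes: int = CD_BYTES) -> int:
    """Whole CDs needed to hold n_bytes."""
    return math.ceil(n_bytes / cd_bytes)


size = packed_genome_bytes(GENOME_BP)
print(f"packed genome: {size / 10**6:.0f} MB, fits on {cds_needed(size)} CDs")
```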