BCH 4101 Lecture Notes - Lecture 4: Sanger Sequencing, Oligonucleotide, Parallel Array
September 25, 2017
Next Generation Sequencing Methods
Genome Sequencing
Sequencing the human genome for the first time took 13 years and cost $3 billion
There were many technical improvements, but techniques still weren’t adequate for sequencing large numbers of
human genomes
There was a need to develop new/improved methods (faster, less expensive)
Led to genome “re-sequencing” projects
Old vs. New techniques:
-Automated Sanger sequencing = first generation technology
-Newer methods = next-generation sequencing (NGS - post 2008)
Sequencing is now much more cost-effective:
-Moore’s law: empirical observation that computing capacity doubles roughly every 24 months
•2001-2008: sequencing costs fell at roughly the pace of Moore’s law
•After 2008: costs fell much faster than Moore’s law would predict
-Due to the development of next-generation sequencing (NGS)
•2016: roughly $1000 and 3 days to sequence a genome
-Due to:
•Integration: having something that is a continual workflow
-Possible due to increased automation
-Integrates all of the steps - minimized human interaction between steps
•Parallelization: being able to sequence more than one molecule at a time and to scan the fluorescence of
many molecules at once
•Miniaturization: decreased size of technology
-Scaling everything down in order to be able to sequence many molecules at once
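A rough worked calculation for the cost comparison above, using only the two figures in these notes ($3 billion for the first genome, roughly $1000 in 2016). It assumes, purely for illustration, that cost halves once every 24 months (a Moore's-law pace) and asks how long that pace alone would take to get from one figure to the other. The function name is made up for this sketch.

```python
import math

def years_at_moore_pace(start_cost, end_cost, months_per_halving=24):
    """Years needed to fall from start_cost to end_cost if cost halves
    once every `months_per_halving` months (assumed Moore's-law pace)."""
    halvings = math.log2(start_cost / end_cost)
    return halvings * months_per_halving / 12

# Figures taken from the notes: $3 billion (first genome), ~$1000 (2016).
years = years_at_moore_pace(3e9, 1e3)
print(f"{years:.0f} years at a Moore's-law pace")
```

Under that assumption the drop would take about 43 years, far longer than it actually took, which is the sense in which sequencing costs outpaced Moore's law.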
Shared Common Workflow between NGS Methods
NGS is also known as massively parallel sequencing
1. Library preparation (massively parallel arrays): prepare library of DNA to be sequenced
-Method:
•Genomic DNA is fragmented using sonication: use sound waves to shear and break up the DNA
-NB: DNA must be broken up randomly because we’re using paired end reads
-Need to create every end possible
•Creation of a massively parallel array
-Need some kind of solid support to attach your DNA molecules
-Can be as small as a microchip
2. Sequencing: obtaining order of nucleotides in the piece of DNA
3. Imaging
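The random fragmentation in step 1 can be sketched as a toy simulation. This is not a model of real sonication: the uniform fragment-length range, the made-up 200-base sequence, and the function name `shear` are all assumptions for illustration. The point is only that independently sheared copies of the same DNA break at different places.

```python
import random

def shear(sequence, min_len, max_len, rng):
    """Cut `sequence` into consecutive fragments whose lengths are drawn
    uniformly from [min_len, max_len]; the last piece may be shorter."""
    fragments = []
    pos = 0
    while pos < len(sequence):
        length = rng.randint(min_len, max_len)
        fragments.append(sequence[pos:pos + length])
        pos += length
    return fragments

rng = random.Random(0)  # fixed seed so the sketch is repeatable
genome = "".join(rng.choice("ACGT") for _ in range(200))  # toy sequence

# Shear two copies independently: break points are drawn separately for
# each copy, so the two fragment sets generally end at different positions.
copy_a = shear(genome, 20, 40, rng)
copy_b = shear(genome, 20, 40, rng)
print(len(copy_a), len(copy_b))
```

Because the breaks fall in different places in each copy, fragments from one copy tend to span the break points of another, which is what lets overlapping reads be pieced back together.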
Library Preparation
Method 1: base-pairing to oligonucleotides on a glass slide
-Assemble genome using a whole genome shotgun approach
-Method:
•Double stranded oligo adaptors are ligated to the ends of the DNA fragments
•Denature the fragments of DNA and hybridize them to a glass slide that is coated with oligonucleotides
that are complementary to the adaptors
•Bridge amplification of ssDNA fragments: DNA attached to the glass slide will bend over and base pair
with the oligonucleotides on the surface (both ends of the DNA hybridize to the slide)
•Oligonucleotides attached to the surface are used as a primer
•DNAP extends from the primer (resulting DNA strand is immobilized to the array)
-Original strands can be washed away and the new copy will be attached to the array that can then be
copied again
•End up with immobilized clusters of identical fragments (thousands of copies per individual fragment of gDNA)
-Increases the signal when sequencing - now you are detecting thousands of the same fluorophore instead of
only one
-Gives a clear signal that can be measured
-NB: need to dilute the genomic DNA enough so that every strand that hybridizes to the glass slide is relatively far
away from its neighbour
•Molecules need to be separate from each other
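To see how repeated copying gets from one fragment to the thousands of copies per cluster mentioned above, here is a minimal sketch. It assumes idealized doubling (every strand is copied once per round), which real bridge amplification will not achieve exactly; the function names and the optional cap are hypothetical.

```python
def cluster_size(cycles, max_copies=None):
    """Copies in one cluster after `cycles` rounds, assuming every strand
    is copied once per round (idealized doubling), optionally capped."""
    copies = 2 ** cycles
    if max_copies is not None:
        copies = min(copies, max_copies)
    return copies

def cycles_needed(target_copies):
    """Smallest number of doubling rounds that reaches `target_copies`."""
    cycles = 0
    while cluster_size(cycles) < target_copies:
        cycles += 1
    return cycles

print(cycles_needed(1000))  # rounds needed to reach about a thousand copies
```

With perfect doubling, ten rounds are enough to pass a thousand copies (2^10 = 1024), so a modest number of rounds is enough to multiply the fluorescent signal from each cluster.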
Method 2: base-pairing to metallic beads
-Solid support = metallic bead
-Ligate oligonucleotide adaptors to the end of the genomic DNA
-Dilute the DNA so that the ratio is 1 fragment/bead using emulsion PCR
•Take a mixture of oil and water, mix in DNA, and vortex it all
•Bubbles will form and each bubble “traps” a bead along with either zero or one DNA strand from the library
•Will end up with bubbles that each contain a bead and a different fragment of DNA
•Allows you to make many “micro-PCR” reactions
-Oligonucleotide adaptor sequence is complementary to the bead, causing the genomic DNA to associate with the
bead
-Bridge PCR will then allow the sequence of the genomic DNA to be amplified - causes the copy to remain attached
to the bead
•DNA template will dissociate, hybridize to another oligonucleotide on the same bead, and be copied again
•Will continue until the surface of the bead is covered with identical genomic DNA sequences
-Beads can be taken and deposited on a chip
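The dilution step (aiming for zero or one fragment per droplet) can be illustrated with a small probability sketch. It assumes fragments land in droplets independently, so the number per droplet follows a Poisson distribution; that modelling choice, the example dilution levels, and the function name are assumptions, not something stated in the lecture.

```python
import math

def droplet_occupancy(mean_fragments):
    """Poisson probabilities that a droplet holds 0, exactly 1,
    or 2 or more fragments, given the mean number per droplet."""
    p_zero = math.exp(-mean_fragments)
    p_one = mean_fragments * p_zero
    p_multi = 1.0 - p_zero - p_one
    return p_zero, p_one, p_multi

for lam in (0.1, 0.5, 1.0):  # hypothetical dilution levels
    p0, p1, p2 = droplet_occupancy(lam)
    print(f"mean={lam}: empty={p0:.3f} single={p1:.3f} multiple={p2:.3f}")
```

Under this model, stronger dilution makes droplets with more than one fragment rare, at the price of leaving more droplets empty.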
Next Generation Sequencing
Various methods have been designed, aiming for the greatest speed in generating accurate data at the lowest cost
Short-read sequencing: reads are at most about 1000 bp long
Two kinds:
-Sequencing by synthesis (SBS): dependent on DNAP
-Sequencing by ligation (SBL): dependent on DNA ligase
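For a sense of scale on the 1000 bp read limit, a quick arithmetic sketch of the fewest reads that could span a genome once, end to end with no overlap. The human genome size used here (about 3.2 billion bp) is an approximate figure assumed for the example, not a number from these notes, and the function name is made up.

```python
import math

def min_reads(genome_length_bp, read_length_bp):
    """Fewest reads that could span the genome once with no overlap."""
    return math.ceil(genome_length_bp / read_length_bp)

HUMAN_GENOME_BP = 3_200_000_000  # assumed approximate size, not from the notes
print(min_reads(HUMAN_GENOME_BP, 1000))
```

Even at the maximum read length that is millions of reads, which is why the parallel arrays described earlier matter.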
Sequencing by Synthesis
454 Pyrosequencing (454 Roche): reads up to 1000 bp in length (700 Mb per run)
-Template is attached to metallic bead
-Method:
•1st cycle: