BIOL 200 Lecture Notes - Genetic Algorithm, Genetic Code, Dna Replication

6 Apr 2012
Naveen Sooknanan McGill Fall 2011
Genomics and Bioinformatics:
Genomics involves sequencing pieces of DNA within a genome (particularly the human
genome) and mapping the location of genes and other genomic entities
This mapping is called annotation
To this day, we are still trying to figure out the exact number of genes within the human genome
Many different branches of genomics have arisen during the “-omics” era:
Functional genomics is the study of annotations
Proteomics is the study of gene products (such as proteins)
Evolutionary genomics has developed over the past 3-4 years and studies
evolutionary trends and patterns with respect to the genome
Transcriptomics looks at which genes are transcribed at various points in development
Phenomics looks at changes to the genes with given phenotypes
Other fields exist, such as: spliceomics, glycomics, metabolomics, lipidomics, kinomics,
neuromics, predictomics
o Getting to the point where every field has its own “-omics”
One of the first genomes sequenced was that of the Epstein-Barr virus in 1984, chosen because it is
very small, and since then DNA has been sequenced from 1000s of different, more complex organisms
Mammalian sequencing is much more complex because of a much larger genome
Over 1700 genomes sequenced by 2011
Various viruses and bacteria have been sequenced for obvious purposes, such as the influenza
virus and Yersinia pestis (bubonic plague).
Sequences have also been determined for organisms which we now refer to as model organisms
Yeast is very easy to sequence because it has one of the smallest eukaryotic genomes
Zebrafish are important in research into the nervous system
Roundworm, mice, fruit fly and plants are also model organisms
There are over 600 completed and ongoing projects
Chimpanzees have not been fully sequenced, as with humans
Sea squirts have been completely sequenced, and are useful for studying muscle development
Barley and grape genomes have been sequenced for beer and wine production
The nine-banded armadillo, a carrier of leprosy, has been sequenced
The elephant’s genome has been sequenced
o Note the size of the organism is not proportional to the size of its genome
An insect that carries Chagas disease has been sequenced
Genomics can be seen as the piecing together of the world’s largest jigsaw puzzle. However,
there are no visual patterns to guide the assembly of a genome; everything must be
deciphered from a seemingly random string of A, C, T and G.
A classic method of DNA sequencing is called shotgun sequencing, where the genome is blasted
into 1000s of overlapping fragments by a process called mechanical shearing.
When these fragments are placed in the presence of a template strand, complementary
base pairing occurs between the fragments and the template, but also between overlapping fragments
This binding of fragments one on top of the other creates a staircase-like structure known as a tiling path
o This creates continuous sequences known as contigs with gaps in between. These gaps
cannot be bridged for many technical reasons
With many sequences, it is possible to piece together various contigs with perfect matches in
overlapping fragments to produce the entire genome sequence.
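The idea of piecing fragments together through perfect matches in their overlapping ends can be sketched in a few lines of Python. This is only a toy illustration, not how real assemblers work: the function names, the `min_overlap` parameter and the example fragments are all made up for the sketch, and it assumes error-free reads from a single strand.

```python
def overlap_length(left: str, right: str, min_overlap: int = 3) -> int:
    """Length of the longest suffix of `left` that exactly matches a prefix of `right`."""
    longest = min(len(left), len(right))
    for size in range(longest, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def greedy_assemble(fragments: list[str], min_overlap: int = 3) -> list[str]:
    """Repeatedly merge the two fragments that share the largest exact overlap.

    Fragments that cannot be merged any further are returned as separate
    contigs, which mirrors the gaps left between contigs.
    """
    contigs = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while True:
        best_size, best_i, best_j = 0, -1, -1
        for i, left in enumerate(contigs):
            for j, right in enumerate(contigs):
                if i == j:
                    continue
                size = overlap_length(left, right, min_overlap)
                if size > best_size:
                    best_size, best_i, best_j = size, i, j
        if best_size == 0:
            return contigs  # no overlaps left: what remains are the contigs
        # Join the pair, writing the shared overlap only once.
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


# Hypothetical fragments: three overlap, one does not.
reads = ["ATGCGT", "CGTACC", "ACCTGA", "GGGGTTTT"]
print(greedy_assemble(reads))
```

With these made-up reads, the three overlapping fragments merge into one contig and the fourth stays on its own, which is the toy equivalent of two contigs separated by a gap.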
Nowadays, this technology (normal dideoxy DNA sequencing) is considered slow and outdated,
as many new technologies have been invented by various bioinformatics companies.
Roche 454 pyrosequencing and Illumina HiSeq sequencing are two examples of
next-generation sequencing methods from two different companies
o These are used in current genome projects and speed up the process by many
orders of magnitude
These technologies are also getting cheaper and cheaper to use making them more
Roche 454 pyrosequencing has a very similar procedure to that of oligonucleotide synthesis. It
involves immobilizing the substrate on beads, which are themselves immobilized in larger wells
Starting with the entire genome, mechanical shearing tears the genome into more
manageable fragments
These fragments are then denatured (forming ssDNA) and attached to adaptors
These segments then interact with a molecule (bead) within a well
The beads quickly accumulate many more DNA fragments and are immobilized in wells
within a test plate
Reagents are then added in a stepwise manner
Pyrosequencing requires many typical reagents we have seen before, such as a template, primers,
dNTPs and DNA polymerase, but also requires new reagents:
ATP sulfurylase
Adenosine 5’ phosphosulfate (APS)
Luciferase, which is responsible for the glowing of fireflies, and luciferin, which is
required in this glowing reaction
Normal DNA polymerization releases a molecule of pyrophosphate (PPi) from each dNTP that is
added to the chain.
The Roche 454 pyrosequencing method uses this PPi to react with APS, catalyzed by ATP
sulfurylase, to produce sulfate and ATP
This ATP can then be used by luciferase (with luciferin) to produce oxyluciferin and light
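The chain of reactions that turns one incorporated nucleotide into light can be written out as a scheme. The notes name only PPi, APS, sulfate, ATP, oxyluciferin and light; the remaining products (AMP, CO2 and the second PPi) are filled in from the standard textbook stoichiometry of the firefly luciferase reaction and are not taken from the lecture.

```latex
\begin{align*}
(\text{DNA})_n + \text{dNTP}
  &\xrightarrow{\text{DNA polymerase}} (\text{DNA})_{n+1} + \text{PP}_i \\
\text{PP}_i + \text{APS}
  &\xrightarrow{\text{ATP sulfurylase}} \text{ATP} + \text{SO}_4^{2-} \\
\text{ATP} + \text{luciferin} + \text{O}_2
  &\xrightarrow{\text{luciferase}} \text{AMP} + \text{PP}_i + \text{oxyluciferin} + \text{CO}_2 + \text{light}
\end{align*}
```

Each nucleotide added to the chain gives one PPi, hence one ATP, hence a fixed amount of light, which is why the signal scales with the number of nucleotides incorporated.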
The addition of one dNTP will therefore produce one pulse of light, so by controlling
which dNTP is added, we can determine the sequence of the DNA fragment
If we add dATP, the reaction will only light up where an A was added to the chain
Apyrase is used to break down unused dNTPs into dNDPs, dNMPs and phosphate, as
well as ATP into ADP, AMP and phosphate, to ensure accurate readings
This light emission is detected by a sensitive machine and its intensity is recorded. For
example, if three identical nucleotides are added (AAA) then the light emitted is
three times as intense.
The wells all contain these immobilized beads and
many reactions take place simultaneously
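How a series of light readings becomes a sequence can be sketched as follows. This is a minimal illustration rather than the instrument's actual software: the function name, the flow order and the intensity values are hypothetical, and the signal is assumed to be normalised so that one incorporated nucleotide gives a reading of about 1.0.

```python
def decode_flowgram(flow_order: str, intensities: list[float]) -> str:
    """Turn per-flow light intensities into the bases added to the growing strand.

    flow_order  -- the dNTP added at each step, e.g. "TACG" repeated
    intensities -- the light recorded after each step (about 1.0 per base)
    """
    if len(flow_order) != len(intensities):
        raise ValueError("one intensity is needed per nucleotide flow")
    bases = []
    for base, signal in zip(flow_order, intensities):
        # Round to the nearest whole number of incorporations:
        # about 3.0 means three identical bases, about 0.0 means none.
        count = max(0, round(signal))
        bases.append(base * count)
    return "".join(bases)


# Hypothetical readings for two rounds of flowing T, A, C, G in turn.
flows = "TACGTACG"
light = [1.02, 2.9, 0.05, 1.1, 0.0, 0.1, 2.1, 0.97]
print(decode_flowgram(flows, light))
```

The result is the newly synthesized strand; the template read in the well is its complement.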
Illumina HiSeq sequencing also resembles oligonucleotide synthesis because it involves the
use of a blocking agent to stop the DNA polymerization activity.
Removable blocking agents are used (similar to ddNTPs) which stop the polymerization
Each blocking agent is labelled with a specific fluorescent dye
Similar to normal DNA sequencing, blocking agents are in much lower concentration than
regular dNTPs and are incorporated randomly into the sequence, stopping the polymerization reaction
A machine is able to record these fluorescent labels to determine the sequence of the gene
and the blocking agent can be removed, continuing with the polymerase reaction until
another blocking agent is added
This efficiently determines all of the nucleotides in a given template, provided you know a
small part of the sequence to which a primer can be attached
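The block, read, unblock cycle can be sketched in Python. This is a deliberately simplified model: it assumes that every cycle adds exactly one blocked, labelled nucleotide, it ignores strand direction, and the dye colours and function name are hypothetical placeholders rather than the real instrument chemistry.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Hypothetical dye colours, one per blocked nucleotide (assumption for illustration).
DYE_FOR_BASE = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}
BASE_FOR_DYE = {dye: base for base, dye in DYE_FOR_BASE.items()}


def sequence_by_synthesis(template: str) -> tuple[list[str], str]:
    """Simulate cycles of: add a blocked labelled nucleotide, read its dye, unblock.

    Returns the dyes recorded in each cycle and the strand they decode to.
    """
    recorded_dyes = []
    for template_base in template:
        incorporated = COMPLEMENT[template_base]          # base pairing picks the nucleotide
        recorded_dyes.append(DYE_FOR_BASE[incorporated])  # the machine records the label
        # In the real reaction the block (and dye) is removed at this point,
        # which lets the polymerase carry on with the next cycle.
    new_strand = "".join(BASE_FOR_DYE[dye] for dye in recorded_dyes)
    return recorded_dyes, new_strand


dyes, strand = sequence_by_synthesis("TACG")
print(dyes)    # the colour read in each cycle
print(strand)  # the synthesized strand, complementary to the template
```

Reading one colour per cycle is what lets the machine call one base at a time; the template sequence follows by taking the complement of the decoded strand.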
As can be seen from this table, regular sequencing is very slow and takes much, much longer
than next generation methods. It also costs more to sequence a genome using the classical