Biology 1200b – Test Three
Lecture 18 – Cancer
In Canada, cancer is the leading cause of non-accidental death. Men are at a higher risk than women.
The top four most common are prostate, breast, lung and colon cancer. Heritability estimates from twin
studies show that there is a rather low correlation (about 0.27-0.42).
Embryogenesis involves rapidly dividing cells. The cell cycle involves a complex called CDK (cyclin-
dependent kinase), which acts at a checkpoint that ensures damaged cells do not replicate their DNA.
Cyclins are produced early in the cell cycle, where they bind to CDK, which then phosphorylates targets
downstream and releases the G1 checkpoint. Different sets of cyclins are used for each checkpoint.
Expression of proto-oncogenes like EGF promotes cell cycling. Proto-oncogenes are sometimes treated
as the cause of cancer. They are, however, normal genes that are required for cell division, though they
may play a role in the rapid cell division involved in cancer. Cell division is regulated by many different
genes and proteins that are involved in relaying the signal from EGF.
Expression of tumor suppressor genes slows cell cycling. TP53 is a master tumor suppressor gene that
codes for a transcription factor whose activity can result in increased DNA repair, cell cycle arrest by
blocking cyclin/CDK, and apoptosis (cell death).
Sporadic cancer requires loss-of-function mutations in both alleles of a gene that codes for regular cell
development. This is very rare because both alleles must mutate. Cancer is more common through
inheritance, since one damaged allele may already come from either parent.
Inappropriate expression of miRNA can promote cycling, for example oncomirs. Different kinds of
tumors have different kinds of miRNA expression and this is diagnostic.
Cancer is deregulation. Uncontrolled growth can arise from upsetting the balance between the activities
of gene products that promote cell cycling versus those products that suppress cell cycling. Cancer is
also progressive. Various steps have to happen for cancer to be expressed.
Cancer may begin as alterations to gene expression in stem cells. Most tissues contain stem cells, which
are called pluripotent. These cells can differentiate into many different types. When a stem cell divides it
creates one differentiated cell and one stem cell. Cells are not the same in tumors. Some are
proliferative and some are not.
A mouse family with high risk for brain tumor has one defective tumor suppressor. A nucleus is removed
from a mouse embryo and replaced with a tumor nucleus and then given an electric shock. The cell then
divides under the control of a tumor nucleus and creates a mouse. Thus the maternal egg reprograms
the tumor nucleus. This means cancer is perhaps epigenetic.
Is cancer contagious? Feline leukemia virus, mouse mammary tumor virus and HPV (human
papillomavirus) are contagious. HPV leads to cervical cancer. It is a DNA virus and many different strains
cause increased growth in different tissues in men and women. The number of new partners greatly
increases the chance of acquiring HPV.
Lecture 19 – Molecular Homology
Molecular evolution is the study of evolution at the level of nucleic acid and amino acid. Gene evolution
is the study of how genes change over time. Changes in genes that lead to evolutionary change can be
mutation (insertion, deletion, frameshift), duplication, rearrangement and loss. All of these must change
phenotype if they are going to have an evolutionary effect. Some mutations do not change phenotype.
We will ask how a mutation to a gene causes an enzyme to change substrate specificity.
Change in phenotype leads to selection.
Homology has many definitions. In this course it means common ancestry. For example, the structure of
a flipper and wing come from a common ancestor and are thus homologous, even though they are not
completely similar. How do we know that they share a common ancestor? The gene GlsA in Volvox and
Chlamy is homologous, but they do not have the same nucleotide sequence, amino acid sequence,
length or function.
Genome annotation involves attaching biological meaning to a sequence of DNA. This includes gene
prediction, detection of regulatory elements and finding biological functions through similarity searches,
and it can be done automatically using algorithms.
Protein-coding gene prediction is used to detect what protein a gene codes for. A computer
algorithm can be used to detect promoter elements, intron/exon boundaries and other conserved
DNA motifs. The program can then splice out the introns to create a deduced protein-coding sequence.
Protein prediction involves translating all possible reading frames of the gene and detecting which one
gives the longest stretch from a start codon without being interrupted by a stop codon.
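The reading-frame step can be sketched in a few lines of Python. This is only an illustration, not the algorithm used by any real annotation pipeline. It assumes an uppercase sequence containing only A, C, G and T, uses the standard genetic code, and treats an open reading frame as the stretch from the first M in a segment to the next stop codon. The function names are made up for this example.

```python
from itertools import product

# Standard genetic code, with codons listed in TCAG order ("*" marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Return the opposite strand, read 5' to 3'."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate a DNA string codon by codon, ignoring any leftover bases."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


def longest_orf(dna):
    """Translate all six reading frames and return the longest M...stop protein."""
    best = ""
    for strand in (dna, reverse_complement(dna)):
        for frame in range(3):
            protein = translate(strand[frame:])
            # Every piece except the last one was terminated by a stop codon.
            for segment in protein.split("*")[:-1]:
                start = segment.find("M")
                if start != -1 and len(segment) - start > len(best):
                    best = segment[start:]
    return best


print(longest_orf("CCATGGCTTGGTAACC"))  # MAW
```

In the toy sequence only one of the six frames contains a start codon followed by a stop codon, so that frame gives the deduced protein.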
Chlamydomonas has 15,000 predicted proteins. These are not definite but are likely. More work is then
done on the predicted proteins, and similar sequences are researched. The National Center for
Biotechnology Information hosts GenBank, a sequence database. About 23,500 total genomes
have been fully mapped out.
Sequences can be arranged to show regions of similarity and thus used to detect functional and
structural similarity as well as evolutionary relatedness. There are different kinds of alignments. Global
alignment tries to align two sequences along their entire length. Local alignment doesn't force two
sequences to align from end to end but rather looks for regions of high similarity.
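The difference between the two kinds of alignment can be seen in a small dynamic-programming sketch. The global version follows the Needleman-Wunsch idea and the local version follows the Smith-Waterman idea. The scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for illustration. Real tools such as BLAST use substitution matrices and heuristics that are not shown here.

```python
def alignment_score(a, b, local=False, match=1, mismatch=-1, gap=-1):
    """Return the best global (default) or local alignment score of a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    if not local:
        # Global alignment charges for gaps at the ends of the sequences.
        for i in range(rows):
            score[i][0] = i * gap
        for j in range(cols):
            score[0][j] = j * gap
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cell = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if local:
                # Local alignment may restart anywhere, so scores never go below 0.
                cell = max(cell, 0)
            score[i][j] = cell
            best = max(best, cell)
    # Global: score of aligning everything. Local: best-scoring region anywhere.
    return best if local else score[-1][-1]


seq1 = "AAAAGATTACATTTT"
seq2 = "CCGATTACAGG"
print(alignment_score(seq1, seq2, local=True))  # 7, from the shared GATTACA region
print(alignment_score(seq1, seq2))              # lower, because the ends do not match
```

The two toy sequences share one conserved region and differ elsewhere, which is the situation where a local alignment is more useful than a global one.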
There are 155 gene sequences at GlsA that are very similar to the Chlamy gene. The Volvox gene is
extremely similar, but there are others that have some similarity to certain parts of the sequence. BLAST
analysis shows that there are differences in the nucleotide alignments and amino acid alignments
between Chlamy and Volvox, and thus neither of these is the definition of homology. The genes also do
not have the same length or even function.
Amino acid sequence comparisons are more informative than nucleotide sequence comparisons.
1. Nucleotides are a four-letter alphabet and when converted into bits each base has two bits of
information (A = 00, G = 01, C = 10, T = 11). If I is the total information in a message with G symbols
written in an alphabet of n letters, I = G ln(n) / ln(2). There are 20 amino acids and thus 20 characters in
the amino acid alphabet, so each amino acid carries ln(20) / ln(2), or about 4.32 bits.
2. The genetic code is redundant. There are 64 possible triplets but only 20 amino acids as some codons
code for the same amino acid. Amino acid sequence is more highly conserved. Different nucleotide
sequences can be translated into the exact same amino acid sequence.
3. DNA databases are much larger. This is a bad thing because there is lots of junk, while the amino acid
database is more refined.
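The information formula from point 1 can be checked with a short calculation. This is a minimal sketch, and the function name is made up for the example. It simply evaluates I = G ln(n) / ln(2) for the two alphabets.

```python
import math


def information_bits(num_symbols, alphabet_size):
    """I = G ln(n) / ln(2): bits carried by G symbols from an n-letter alphabet."""
    return num_symbols * math.log(alphabet_size) / math.log(2)


print(round(information_bits(1, 4), 2))   # 2.0 bits per nucleotide
print(round(information_bits(1, 20), 2))  # 4.32 bits per amino acid
print(round(information_bits(3, 4), 2))   # 6.0 bits per codon (three nucleotides)
```

A codon can carry 6 bits but specifies an amino acid worth only about 4.32 bits, which is another way of seeing the redundancy described in point 2.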
Homology determination is based on probability. There is no way to say with absolute certainty whether
two genes are homologous; it is simply a conjecture. There is no way to examine the common
ancestor of Volvox and Chlamy to test its genetic sequence. Thus decisions are based on numerical
measures of similarity, which are correlated with probability. The higher the similarity between two
sequences, the lower the probability that they originated independently and became similar by chance.
E-Value – The lower the E-value, the greater the likelihood of real homology. The Chlamy sequence of
GlsA and the Volvox sequence of the same gene have an E-value of 0.0, meaning it is almost impossible
that these sequences developed separately.
Lecture 21 – Experimental Evolution
Charles Darwin, 1859 – “In looking for the gradations by which an organ in any species has been
perfected, we ought to look exclusively to its lineal ancestors; but this is scarcely ever possible, and we
are forced in each case to look to species of the same group, that is to the collateral descendants from
the same original parent form”.
Volvox and Chlamy are collateral descendants. We can try to figure out what happened in the past from
how they look today, which involves a lot of inference. Instead we can use experimental evolution, which
is testing hypotheses about evolution using controlled experiments. Species in the lab are subjected to
different conditions and observed to see how they adapt over time.
Model systems for EE are viruses, bacteria, Chlamy, Drosophila and yeast. These species work because
they reproduce very quickly (short generation time). Chlamy have about a nine-hour generation time.
Thus these species can be used to test evolution over a relatively short period of time.
Genetic novelty can appear due to spontaneous mutation, which is a mutation that happens randomly.
Many of these are deleterious but occasionally one that is advantageous can occur.
Another way is through gene duplication which occurs when a region containing a gene is duplicated.
Most of the time one of these copies is destroyed and thus has no effect, but on occasion the
second copy is retained. When this occurs there is more freedom for the second copy of the gene to
mutate and change because it is not absolutely essential to the cell. Neo-functionalization means one
copy of the gene is able to do something advantageous. Sub-functionalization means the gene is not
changed but the promoter region is, changing the conditions in which the gene is expressed.
Gene rearrangement involves genetic processes that rearrange the genome. A promoter that normally
drives one gene can end up driving a different gene than it originally did, which creates genetic novelty.
The Long Term Evolution Experiment asked “Can evolution produce adaptation if it depends on random
mutations, most of which are harmful?” They used E. coli to reduce complexity (asexual reproduction,
no recombination). Any change over time is due to spontaneous mutation, duplication or rearrangement.
The population size for the experiment was huge.
The experiment started with 12 populations, all descended from one original bacterium and thus
genetically identical. Every day 0.1 mL of each culture is put into fresh media so the cells can continue
to grow. Every 500 generations (75 days) a sample can be removed, frozen and later compared to
another generation.
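The figure of 500 generations in 75 days can be turned into a rough daily rate. The sketch below assumes the cells divide by binary fission and that each culture regrows to about the same density every day, so the number of doublings per day equals log2 of the daily dilution. The function names are made up for this example, and the dilution factor is derived from the numbers above rather than taken from the notes.

```python
import math


def generations_per_day(total_generations, total_days):
    """Average number of generations completed in one daily growth cycle."""
    return total_generations / total_days


def implied_dilution(gens_per_day):
    """Fold dilution that would need that many doublings to regrow."""
    return 2 ** gens_per_day


def generations_from_dilution(fold_dilution):
    """Doublings needed to recover from a given fold dilution."""
    return math.log2(fold_dilution)


rate = generations_per_day(500, 75)
print(round(rate, 2))                            # 6.67 generations per day
print(round(implied_dilution(rate)))             # 102, i.e. roughly a 100-fold dilution
print(round(generations_from_dilution(100), 2))  # 6.64
```

Under these assumptions a daily transfer of 0.1 mL corresponds to roughly a 100-fold dilution, which matches about 6.6 generations per day.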
After about 30,000 generations one of the populations became more turbid (more cells per mL) than the
others. The cells in that culture had evolved the ability to use citrate as a carbon source. The genome of
E. coli is about 4.6 million bp, and pretty much every mutation was tried many times. The one that
allowed the culture to utilize citrate was very rare.
Citrate was only in the solution to keep iron available for the cells. The cells have no ability to
bring in citrate under normal conditions. The glucose available runs out after about 8 hours and the
remainder of their time is spent in stationary phase. This represents an ecological opportunity for the
species to use citrate instead for growth.
Since samples of the earlier generations were kept frozen, the ancestors can be thawed out and used to find
when the mutation occurred. One question is whether the mutation was contingent on another
mutation that occurred prior. This was tested by taking prior generations and having them grow on a
citrate agar. It was found that some generations prior to 30,000 were able to grow. This shows that
there was more than one mutation – one that allowed the cells to utilize the citrate, and another that
allowed them to utilize it at extremely fast rates.
The evolution in the line can be replayed. Cells can be taken from different generations and grown for
many more generations to see if the same mutations occur. When this was tested it was found that
before 20,000 generations there was no way for this capability to arise. This is evidence that another
mutation occurred prior to 20,000 generations that the later mutations were contingent upon.
The results of this experiment showed that there are three stages of a mutation – potentiation (an
unknown mutation that the final result was contingent on), actualization (the adaptation is observable),
refinement (species is able to fully utilize change).
The actual mutation that occurred in the E. coli experiment occurred on the citT gene. In the
actualization step of the experiment gene duplication duplicated citT and placed it downstream of the
rnk promoter which is a very strong promoter that is always on, especially in the presence of oxygen.
This means the citT gene is expressed.
The refinement of the Cit mutation occurs when there is an increase in the number of rnk-citT modules.
This creates a very strong Cit phenotype due to duplication. The more modules there are the greater
the culture density.
Mutations in each of the replay experiments showed that the amplification length was slightly different
but still produced the same Cit phenotype.
The new line of E. coli can grow on citrate. The defining characteristic of E. coli is that it cannot grow on
citrate. Does this new line warrant being considered a new species? Cit was not driven to extinction and
it was found that they are more efficient at utilizing glucose, meaning they have their own ecological
niche.
Lecture 20 – Molecular Convergence
There can be differences in the amino acid sequence between two homologous genes due to the huge
amount of time since the species diverged. These single amino acid changes are normally neutral and do
not change the overall phenotype.
The neutral theory of molecular evolution notes that lots of mutations have no effect on the protein at
all. They are random and are inherited but are neither deleterious nor advantageous.
The number of differences in protein sequence between different species is roughly proportional to the
time since the species diverged. To be more specific, as the number of millions of years since divergence
increases, the number of amino acid substitutions per 100 residues increases linearly. Some genes
diverge more quickly than others, but for any one gene the rate of change is relatively constant.
The neutral theory explains this linear rate of change, which would not be expected under natural
selection alone, by calling most mutations neutral and thus not selected for or against. The linear graph
can be used to calculate the millions of years since divergence from the number of amino acid
substitutions per 100 residues. This creates a molecular clock.
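The molecular clock calculation can be sketched as reading a value off a straight line through the origin. The calibration point and the observed value below are hypothetical numbers chosen only to show the arithmetic, and the function names are made up for this example.

```python
def calibrate_rate(subs_per_100, million_years):
    """Slope of the line: substitutions per 100 residues per million years."""
    return subs_per_100 / million_years


def divergence_time(subs_per_100, rate):
    """Millions of years since divergence, assuming a constant (linear) rate."""
    return subs_per_100 / rate


# Hypothetical calibration: two species known to have diverged 100 million years
# ago differ by 20 substitutions per 100 residues in some protein.
rate = calibrate_rate(20, 100)

# Hypothetical observation: another pair differs by 50 substitutions per 100 residues.
print(round(divergence_time(50, rate)))  # 250 million years
```

Because some genes diverge more quickly than others, the rate has to be calibrated separately for each gene before it is used as a clock.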
Testing the numb