LIFESCI 7C Lecture Notes - Lecture 9: Arabidopsis Thaliana, Post-Translational Modification, Genome Size
13.3: Gene number, genome size, and organismal complexity
Gene number is not a good predictor of biological complexity
● Genomes can differ in size and composition (e.g. fraction of genes that are protein-coding)
○ E.g. the flowering plant Arabidopsis thaliana has a genome about 5% as large as that of humans, but has
about as many protein-coding genes
○ E.g. humans are far more complex than worms but have about the same number of protein-
coding genes
● DISCONNECT between level of complexity, genome size, and gene number
● And some organisms are able to do more with the genes they have than other organisms
○ E.g. expression of protein-coding genes can be regulated in many subtle ways → different
gene products made in different amounts in different cells at different times
■ Differential gene expression → same protein-coding genes can be deployed in different
combos to yield a variety of distinct cell types
■ Proteins can also interact w/ one another, combining in different ways to perform
different functions
■ Single gene can yield multiple proteins
● Alternative splicing (different exons spliced together to make different proteins)
● Posttranslational modification (proteins undergo biochemical changes after they
have been translated)
● Perhaps major evolutionary changes can be accomplished not only by the acquisition of whole new
genes, but also by modifying existing genes and their regulation in subtle ways.
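The "single gene can yield multiple proteins" point can be illustrated with a toy count of splice variants. This is only a sketch under a simplifying assumption that is not in the notes: the first and last exon are always kept, and each internal exon is either included or skipped, with exon order preserved. The function name `splice_isoforms` and the exon labels are hypothetical.

```python
from itertools import combinations

def splice_isoforms(exons):
    """List toy isoforms: keep the first and last exon, and include
    any subset of the internal exons in their original order."""
    first, *internal, last = exons
    isoforms = []
    for k in range(len(internal) + 1):
        # combinations() preserves the input order of the internal exons
        for kept in combinations(internal, k):
            isoforms.append((first, *kept, last))
    return isoforms

exons = ["E1", "E2", "E3", "E4"]  # hypothetical gene with 4 exons
isoforms = splice_isoforms(exons)
for iso in isoforms:
    print("-".join(iso))
print(len(isoforms))  # 4 isoforms from one gene in this toy model
```

Under this assumption, a gene with n exons gives 2^(n-2) possible mRNAs, so even a modest number of exons lets one gene encode many distinct products. Real splicing is more constrained than this model.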
Viruses, bacteria, and archaeons have small, compact genomes
● Genome size is measured in base pairs (bp)
○ 1,000 bp = 1 kilobase (kb)
○ 1,000,000 bp = 1 megabase (Mb)
○ 1,000,000,000 bp = 1 gigabase (Gb)
● Most viral genomes range from 3-300 kb; the largest is 1.2 Mb and contains almost 1000
protein-coding genes
○ Complete sequencing of bacterial genomes has allowed researchers to define the
smallest/minimal genome (and protein set) necessary to sustain life
● Genomes of archaeons and bacteria are information dense, w/ most of the genome having a defined
function
○ 90% or more of genome = protein-coding genes (even if the protein has an unknown function)
○ Bacterial genomes range 0.5-10 Mb
■ Bigger genomes have more genes, allowing these bacteria to make molecules that others
have to scrounge for, or to use chemical energy in the covalent bonds of substances that other
bacteria cannot
■ Archaeons’ genomes range 0.5-5.7Mb, and have similar capabilities
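The bp/kb/Mb/Gb units above can be sketched as a small conversion helper. This is a minimal illustration, not part of the lecture; the function name `format_genome_size` is hypothetical, and the example sizes are the ones quoted in the notes.

```python
# Unit sizes in base pairs, largest first
UNITS = [("Gb", 1_000_000_000), ("Mb", 1_000_000), ("kb", 1_000)]

def format_genome_size(bp):
    """Express a size in base pairs using the largest unit that fits."""
    for unit, size in UNITS:
        if bp >= size:
            return f"{bp / size:g} {unit}"
    return f"{bp} bp"

print(format_genome_size(1_200_000))   # largest viral genome: 1.2 Mb
print(format_genome_size(300_000))     # upper end of typical viral range: 300 kb
print(format_genome_size(10_000_000))  # upper end of bacterial range: 10 Mb
```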
Among eukaryotes, there is no relationship between genome size and organismal complexity
● In eukaryotes, there is no correlation between genome size and metabolic/developmental/behavioral
complexity of an organism, just as the number of genes doesn’t correlate to complexity either
○ Largest genome is 500,000x bigger than the smallest, both genomes belonging to protozoa
■ Range of flowering plants (angiosperm): 3 orders of magnitude
■ Range of animals: 7 orders of magnitude
find more resources at oneclass.com
● One species of lungfish has a genome size 45x larger than human genome!
● C-value paradox The disconnect between genome size and organismal complexity (the C-value is
the amount of DNA in a reproductive cell) → difficulty in predicting one from the other
● Why are eukaryotic genomes so large?
○ polyploidy The condition of having more than two complete sets of chromosomes in the
genome.
■ Esp. prevalent in plants, due to deletion, duplication, or hybridization/crossing of entire
genomes
○ Another reason: genomes contain large amount of noncoding DNA such as introns & repetitive
DNA
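An "order of magnitude" is a factor of 10, so a fold difference between two genome sizes can be converted with a base-10 logarithm. The sketch below applies this to the fold differences quoted above; the function name `orders_of_magnitude` is hypothetical.

```python
import math

def orders_of_magnitude(fold_difference):
    """Convert a fold difference (e.g. 1000x) to orders of magnitude (3)."""
    return math.log10(fold_difference)

# Largest vs. smallest eukaryotic genome (500,000x, from the notes)
print(f"{orders_of_magnitude(500_000):.2f}")  # 5.70

# Lungfish vs. human genome (45x, from the notes)
print(f"{orders_of_magnitude(45):.2f}")  # 1.65
```

Read the other way, a range of 3 orders of magnitude corresponds to a 1,000-fold difference between the smallest and largest genome in the group.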
About half of the human genome consists of transposable elements and other types of repetitive DNA
● Complete genome sequencing has allowed different types of noncoding
DNA to be specified more precisely in a variety of organisms
○ Only about 2.5% of human genome codes for protein!
○ The rest includes introns, noncoding DNA, and other types of
repetitive sequences
● Repeated sequences can be classified by organization (dispersed or
tandem) and by function
○ Alpha satellite (α) DNA consists of tandem copies of a 171-bp
sequence repeated near each centromere an average of
18,000 times → essential for attachment of spindle fibers to
centromeres during cell division
● transposable element (TE) (aka transposon) A DNA sequence that
can replicate and move from one location to another in a DNA molecule
○ Have potential to increase their copy number in the genome over time
○ Selfish DNA → seems like their only function is to duplicate and proliferate in the genome,
like a parasite
○ Make up about 45% of DNA in human genome; can be grouped into two major classes based
on replication method
■ DNA transposons Repeated DNA sequences that replicate and can move from one
location to another in the genome by DNA replication and repair.
● Make up about 3% of human genome
■ retrotransposons Transposable elements in DNA sequences in which RNA is used
as a template to synthesize complementary strands of DNA (cDNA), a reversal of the
usual flow of genetic information from DNA into RNA. (retro=backwards)
● Make up about 40% of human genome
● Over evolutionary time, the amount of repetitive DNA in a genome can change drastically, in large
part bc of accumulation of TE’s
○ In some genomes, repetitive DNA is maintained over long periods of time even as new
species evolve and diversify
○ In other genomes repetitive DNA is held in check because of deletion and other processes
○ → genomes of different species can contain vastly differing amounts of repetitive DNA,
which accounts in large part for the C-value paradox!
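The percentages in this section can be turned into rough base-pair counts. The sketch below assumes a human genome of about 3.1 Gb, a figure that is not stated in these notes, so treat the absolute numbers as approximate. The names `HUMAN_GENOME_BP` and `fraction_to_bp` are hypothetical.

```python
# Assumption: approximate human genome size (not given in the notes)
HUMAN_GENOME_BP = 3_100_000_000

# Approximate fractions of the human genome, as quoted in the notes
fractions = {
    "protein-coding": 0.025,
    "transposable elements (all)": 0.45,
    "DNA transposons": 0.03,
    "retrotransposons": 0.40,
}

def fraction_to_bp(fraction, genome_bp=HUMAN_GENOME_BP):
    """Approximate number of base pairs covered by a genome fraction."""
    return round(fraction * genome_bp)

for name, fraction in fractions.items():
    print(f"{name}: ~{fraction_to_bp(fraction) / 1e6:.0f} Mb")

# Alpha satellite DNA: 171-bp repeat, ~18,000 tandem copies per centromere
alpha_satellite_bp = 171 * 18_000
print(alpha_satellite_bp)  # 3078000 bp, i.e. about 3.1 Mb per centromere
```

Under this assumption, transposable elements cover roughly 18 times as much of the genome as protein-coding sequence (45% vs. 2.5%), a ratio that does not depend on the assumed genome size.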