Lecture 9: Transcription in eukaryotes
1) Structure of a gene
2) RNA polymerases
3) Transcription of protein-coding genes
4) RNA processing
Readings: Alberts, Ch 6, pp.339-353
Slide 1 DNA
Primary RNA transcript
- Gene structure is more complex in eucaryotes.
- In eucaryotes, the coding sequences in genomic DNA are
interrupted by introns. Introns are transcribed along with the
exons into the primary RNA transcript, so the primary transcript
has to be processed further to remove the intron sequences and
make the mature mRNA transcript.
- 1 of the processing steps is removal of the intron sequences
(splicing), which puts together the entire coding sequence of the
protein. The other steps are adding an RNA cap at the 5’ end
(5’ capping) and polyadenylation at the 3’ end.
- The mature mRNA transcript can then be exported out of the
nucleus & into the cytoplasm, where it is translated. Export does
not happen in prokaryotes.
This is very different from prokaryotes, where transcripts are
translated directly, without this processing and export.
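The three processing steps above can be sketched as string operations. This is only a toy illustration: the example sequence, the exon coordinates, the "m7G-" label standing in for the 5’ cap, and the tail length are all made-up assumptions, not real gene data.

```python
# Toy model of eucaryotic mRNA processing: splicing, 5' capping, polyadenylation.
# The sequence, exon coordinates, cap label and tail length are illustrative only.

def splice(pre_mrna, exons):
    """Join the exon segments, given as (start, end) half-open index pairs,
    which drops the intron sequences lying between them."""
    return "".join(pre_mrna[start:end] for start, end in exons)

def process_transcript(pre_mrna, exons, poly_a_length=20):
    """Return a string standing for the mature mRNA:
    cap label + spliced exons + poly-A tail."""
    spliced = splice(pre_mrna, exons)
    return "m7G-" + spliced + "A" * poly_a_length

# Primary transcript: exon 1, one intron, exon 2 (hypothetical sequence).
pre_mrna = "AUGGCC" + "GUAAGUUUUCAG" + "GAAUAA"
exons = [(0, 6), (18, 24)]

mature = process_transcript(pre_mrna, exons, poly_a_length=10)
print(mature)  # m7G-AUGGCCGAAUAAAAAAAAAAAA
```

In the cell these steps are carried out by protein and RNA machinery acting on the growing transcript, not as three separate passes over a finished sequence.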
Slide 2 - Eucaryotes have 3 different kinds of RNA polymerases, not just
1 as in procaryotes. The 3 tend to transcribe different sorts of genes.
- RNA polymerase II transcribes protein-coding genes. It is
the most similar to the polymerase in procaryotes.
5.8S and 18S are the sizes of different rRNAs (not mRNAs)
The Svedberg unit (S) measures sedimentation rate and is used as
an indication of RNA size
The work is divided among the three RNA polymerases, each of
which specializes in synthesizing its own classes of RNA.
Slide 3 Bacterial RNAP’s
- Eucaryotic RNA polymerases: many of the subunits are
homologous to each other and are thought to be derived from
similar ancestral sequences. The 3 polymerases share some
subunits in common, and have other subunits that are
homologous to each other but not identical, because they are not
encoded by exactly the same gene.
- RNA polymerase II most closely resembles the E. coli RNA
polymerase and is thought to be homologous to it.
There are more subunits in the eucaryotic polymerases
All three have beta-like, alpha-like and omega-like subunits
Some common subunits are shared by all three
There are also additional enzyme-specific subunits
Since they’re similar, they most likely diverged from the same
ancestor at some time during evolution
Slide 4 - The eucaryotic polymerase has additional domains (subunits that
are part of the holoenzyme and perform supporting roles).
- The overall 3D structure is very similar. Even the very simple
polymerases in prokaryotes (e.g. E. coli) have 5 different subunits.
Substructures that are the same (or at least similar) are colour coded
The coloured regions are fairly homologous
All eucaryotic polymerases look similar to the prokaryotic
polymerase but polymerase II is most similar, eucaryotic one has
Polymerases are highly conserved between species
Slide 5 1. Transcription factors
- A promoter is a sequence in genomic DNA that helps position
the polymerase & initiate transcription. There can be many.
2. Sigma subunit
- Instead of being a subunit of the polymerase (like sigma), the
eucaryotic transcription factors are completely separate proteins,
and often more than one transcription factor is required to initiate
transcription. The reason a whole host of factors is needed for
initiation: eucaryotes need more complex gene regulation than
procaryotes.
3. Chromosomal structure
Refers to the fact that DNA in eucaryotes is wrapped around
histones and packaged into chromosomes. The histone proteins
make the DNA that needs transcribing a bit trickier to get at.
It is harder to access than bacterial DNA.
Slide 6 Multiple Proteins
Poly A Tail
- In procaryotes, a single primary transcript (mRNA transcript)
can encode several different genes and so code for multiple
proteins. These are separate proteins that can be involved in
completely separate things. This is very rarely found in
eucaryotes, which usually have a single coding sequence that
codes for one protein.
- Exons & introns are not shown on this diagram. It shows the
mature mRNA transcript, with the intron sequences already
spliced out and the 5’ cap & 3’ poly-A tail put on.
- These modifications of the 5’ & 3’ ends are not found in
procaryotes.
- Many different factors are needed for splicing and for modifying
the 2 ends. These functions actually happen during transcription.
mRNAs in prokaryotes can code for more than one protein
Note the features at the 5’ and 3’ ends.
The 5’ end has an untranslated region that precedes the first start codon
A start codon is found at the beginning of each protein-coding
sequence and a stop codon at the end
The noncoding sequences are only buffers between the different
genes in the mRNA transcript. It is no longer a primary transcript
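To make the "more than one protein per mRNA" idea concrete, here is a minimal sketch that scans a transcript for coding sequences, each running from a start codon (AUG) to the next in-frame stop codon. The example transcripts are invented, and the scan ignores ribosome-binding sites and everything else a real cell uses to pick start codons.

```python
# Toy scan for coding sequences in an mRNA (illustrative assumption:
# every AUG found outside a previous coding sequence starts a new one).

STOP_CODONS = {"UAA", "UAG", "UGA"}

def find_coding_sequences(mrna):
    """Scan 5'->3'. At each AUG, read codon by codon until an in-frame
    stop codon, record that stretch, then resume scanning after the stop."""
    sequences = []
    i = 0
    while i <= len(mrna) - 3:
        if mrna[i:i + 3] == "AUG":
            j = i
            while j <= len(mrna) - 3 and mrna[j:j + 3] not in STOP_CODONS:
                j += 3
            if j <= len(mrna) - 3:  # an in-frame stop codon was found
                sequences.append(mrna[i:j + 3])
                i = j + 3
                continue
        i += 1
    return sequences

# Hypothetical procaryote-style transcript: two coding sequences
# separated by a noncoding "buffer".
polycistronic = "GGAGG" + "AUGGCCUAA" + "CCGGAGG" + "AUGAAAUGA" + "CC"
# Hypothetical eucaryote-style transcript: one coding sequence.
monocistronic = "GG" + "AUGGCCUAA" + "AAAA"

print(find_coding_sequences(polycistronic))  # ['AUGGCCUAA', 'AUGAAAUGA']
print(find_coding_sequences(monocistronic))  # ['AUGGCCUAA']
```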
Slide 7 RNA processing proteins (capping factors, splicing factors, 3’
end processing factors)
- The modifications happen while the RNA polymerase is
actually doing its job. As the nascent RNA transcript emerges
from the RNA polymerase, factors jump off the polymerase,
associate with the growing mRNA transcript & start modifying
it already. Very efficient.
- Not all of these factors are associated at 1 time. Some serve as
nucleation sites that attract other factors not associated with the
RNA polymerase itself.
- The factors associated with the RNA polymerase bind to the
C terminal tail/domain (CTD) of the polymerase.
Phosphorylation of the CTD of RNA polymerase II is what allows
these RNA processing proteins (splicing factors, capping factors,
polyadenylation factors) to bind. If the CTD were
unphosphorylated, the factors would not bind.
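The dependence of factor binding on CTD phosphorylation can be written as a small state model. This is a sketch only: the class and method names are invented for illustration, and the CTD is reduced to a single on/off flag rather than many individually phosphorylated repeats.

```python
from dataclasses import dataclass, field

@dataclass
class PolymeraseII:
    """Toy RNA polymerase II: processing factors can bind the CTD
    only after the CTD has been phosphorylated (hypothetical model)."""
    ctd_phosphorylated: bool = False
    bound_factors: list = field(default_factory=list)

    def phosphorylate_ctd(self):
        # Stands in for the kinase step described in the notes.
        self.ctd_phosphorylated = True

    def try_bind(self, factor):
        """Bind a processing factor; return False if the CTD is unphosphorylated."""
        if not self.ctd_phosphorylated:
            return False
        self.bound_factors.append(factor)
        return True

pol = PolymeraseII()
print(pol.try_bind("capping factor"))   # False: CTD not yet phosphorylated
pol.phosphorylate_ctd()
print(pol.try_bind("capping factor"))   # True
print(pol.try_bind("splicing factor"))  # True
```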
RNA polymerase is made up of many different subunits
"Carboxy" refers to the carboxyl end: the carboxyl terminal
domain (CTD) contains many repeats of an amino acid sequence
The CTD hangs off the carboxyl terminus as a tail; it is not a
domain of its own
This is like an RNA factory
As it transcribes, it brings with it whatever it needs to do the
Slide 8 - We start with Promoters
Slide 9 Positions RNAP II
Highly transcribed genes
- The idea of a promoter sequence in eucaryotes is similar to the
consensus binding sites for sigma factors in procaryotic promoters.
- There are sequences that tend to be preferentially recognized by
certain transcription factors. The consensus sequence for a given
transcription factor shows some variation, but even so there are
preferred nucleotides
- TATA box: 1 of the most common & highly used promoter
elements. It is generally found just upstream of the transcription
start site & is known to play a major role in helping to position
RNA polymerase II. It tends to be found in the most highly
transcribed genes and is a very strong promoter sequence. Highly
transcribed genes don’t just have TATA boxes; they tend to have
several transcription factors associated with the start site of
transcription, but TATA boxes are 1 of the most important.
- The transcription factor associated with the TATA box is the
TATA binding protein (TBP). When it binds, it introduces a kink
into the DNA, which loosens some of the base pairs around the
kink. This conformational change in the DNA helps the rest of
what needs to happen for transcription. TBP is essential for the
binding of other factors at the site of transcription initiation
(including RNA polymerase II), and it helps the helicase separate
the DNA strands.
You need a sequence at or near the 5’ end of the gene to signal
the start of transcription
The general transcription factors bind first and then RNA
polymerase binds
In eucaryotic genes, four promoter elements have been identified;
they are listed in the chart on the slide
Look at where transcription starts and the approximate positions
of the promoter sequences
INR overlaps the transcription start site and DPE is further downstream
These promoter elements attract the factors, which then attract RNA
polymerase
The TATA box is a good attractor for general transcription factors
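As a rough illustration of looking for a TATA box just upstream of the start site, the sketch below scans a fixed window for a simplified consensus. The pattern TATA(A/T)A(A/T), the 40-base window, and the example sequence are all assumptions chosen for illustration; real promoters vary around the consensus, as noted above.

```python
import re

# Simplified TATA box consensus, TATA(A/T)A(A/T). Illustrative assumption:
# real sites tolerate more variation than this pattern allows.
TATA_PATTERN = re.compile(r"TATA[AT]A[AT]")

def find_tata_boxes(dna, tss_index, max_upstream=40):
    """Return the offsets (negative = upstream) of TATA-like matches found
    within max_upstream bases before the transcription start site.
    The offset is the match start minus tss_index."""
    window_start = max(0, tss_index - max_upstream)
    upstream = dna[window_start:tss_index]
    return [window_start + m.start() - tss_index
            for m in TATA_PATTERN.finditer(upstream)]

# Hypothetical promoter: a TATA box 30 bases upstream of the start site.
dna = "GCGCGCGCGC" + "TATAAAA" + "GC" * 11 + "G" + "ATGGCC"
tss_index = 40  # index of the first transcribed base in this toy sequence

print(find_tata_boxes(dna, tss_index))  # [-30]
```

The same scan could be repeated with other patterns for the other promoter elements on the chart, each with its own expected position relative to the start site.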
Slide 11 - Each transcription factor has a different role.
- There are differences in how these transcription factors
assemble among different organisms. In most of the eucaryotic
systems we do know about, though, these transcription factors
are the key players in the initiation of transcription.
- The TATA binding protein (TBP) is actually part of a larger
transcription factor, TFIID. It is the binding of TFIID that enables
the binding of the other transcription factors & the loading of
RNA polymerase II onto the promoter DNA.
- It is the recognition of specific DNA sequences (most
importantly the TATA box) that determines the specificity of
the transcription initiation site.
- There are many other components that are involved in different
- TFIIB binds next and recognizes the BRE element in promoters. It
is 1 of the major players in helping position RNA polymerase II.
- TFIIH is both a helicase & a kinase (2 separate domains). The
helicase function helps separate the strands and uses ATP. The
kinase portion phosphorylates the RNA polymerase on its
C terminal domain, which allows entry into the elongation phase
& is an essential step for the binding of all the RNA processing
components.
- TFIIE helps load the helicase, etc.
TF stands for transcription factor and II refers to the fact that it is
a general transcription factor for RNA polymerase II.
The letters that follow (B, H, etc.) distinguish proteins with different functions
For the TATA box, the TBP protein binds first, which then attracts
the other factors
Once everything has assembled, RNA transcription finally starts
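The stepwise assembly described on this slide can be sketched as an ordered checklist. The order below uses only the components named in these notes (TFIID with TBP first, then TFIIB, then the polymerase, then TFIIE loading TFIIH); treating it as one strict linear order is a simplifying assumption, and other factors are left out.

```python
# Assumed linear assembly order, using only components named in the notes.
ASSEMBLY_ORDER = ["TFIID", "TFIIB", "RNAP II", "TFIIE", "TFIIH"]

def assemble(arrivals):
    """Process components in the order they arrive. A component binds only
    if it is the next one expected in ASSEMBLY_ORDER; otherwise it is ignored."""
    bound = []
    for component in arrivals:
        if len(bound) < len(ASSEMBLY_ORDER) and component == ASSEMBLY_ORDER[len(bound)]:
            bound.append(component)
    return bound

def can_initiate(bound):
    """Transcription can start only once the full complex has assembled."""
    return bound == ASSEMBLY_ORDER

print(assemble(["TFIIB", "TFIID"]))            # ['TFIID']: TFIIB arrived too early
print(can_initiate(assemble(ASSEMBLY_ORDER)))  # True
```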
Slide 12 1. TBP, TFIID
- TBP recognizes the specific structure of the minor groove,
which is present even when the DNA is wound up around
histones. TBP binds in the minor groove and bends the DNA,
allowing RNA polymerase to recognize the bend and initiates all
2. RNA polymerase II (RNAP II)
- RNA processing proteins can’t associate with the C terminal tail
until it has been phosphorylated. Phosphorylation is also
necessary for the elongation conformation/phase of the RNA
- The key ones to know are TFIID & TFIIH
Slide 13 2) 26
3) 52 - Mostly serines get phosphorylated: the phosphorylation of