Lecture 9: The Proteome & Protein Analysis
Differences in proteome
- We have seen that differences in the transcriptome define cells, tissues and organisms.
Even in two very similar organisms, changes in the epigenome (that is, epigenetic
differences between those individuals) can result in two strikingly different individuals
even if they are genetically identical.
- We talked about the involvement of transcription factors and how important they are
in determining what the transcriptome is.
- Today what he wants to do is to talk about the proteins derived from the transcriptome.
It's those differences in the proteome, the complement of proteins that are made, that are
going to determine cells, tissues and organisms.
- So what is the proteome?
The proteome is the complete set of proteins.
- Differences in the proteome are what define cells, tissues and organisms.
- Now for analyzing the proteome. You actually know one method for analyzing the
proteome, and that is 2D gel electrophoresis or 2D PAGE (should have read about it
already). We can look at these differences in the proteome, things such as the presence
of a protein on the left but not on the right or vice versa. We can find a bunch of
differences between one condition and the other as we look across the different gels.
- What is 2D gel electrophoresis? What we’re looking at is the separation of proteins in
2 different dimensions: one is by isoelectric point, running left to right from basic to
acidic, and the other is on the basis of molecular weight, from high to low, top to bottom.
- What are the different elements of gel electrophoresis? The elements are based on
polyacrylamide gel electrophoresis, which is just the separation of proteins in an electric
field in a solid, or semi-solid, matrix: polyacrylamide.
- In the first dimension, proteins are separated by their isoelectric point by isoelectric
focusing polyacrylamide gel electrophoresis, AKA isoelectric focusing PAGE.
Isoelectric focusing
- This is done in an isoelectric focusing or IEF tube gel, literally a tube of
polyacrylamide in a glass tube, and what is established in that glass tube is a stable pH
gradient running from acidic to basic conditions.
- We conduct electrophoresis of proteins in that gel where the stable pH gradient has been
established, running them from anode to cathode, top to bottom.
- What will happen if we have a collection of proteins from a given cell, tissue or
organism is that we should be able to separate them in this stable pH gradient on the
basis of their isoelectric point.
- The point he wants to make is that proteins aren’t separating on the basis of molecular
weight but rather on the basis of isoelectric point. On the slide we can see a small
one at the top (unlike the large ones we normally see there), a large one in the middle and an
intermediate-sized one at the bottom. We should all be familiar with the isoelectric point
of a protein, that is the pH at which the protein has a net neutral charge.
- Proteins have charge on the basis of their R groups. Each ionizable R group has its own
pKa associated with the ability to protonate or deprotonate that particular R group; for
histidine it is about pH 6 for the imidazole group.
- Some R groups are acidic, histidine is around neutral, and there are
basic groups like arginine and lysine, for example. The isoelectric point of a particular
protein depends on the quantity of these amino acids and their R groups, so different
proteins assume neutrality at different pHs.
- What happens is that when a protein reaches its isoelectric point in the gradient, it
doesn’t have a net charge and will no longer migrate. So if it has a positive charge, it
will migrate away from the positive electrode (a negatively charged protein moves away
from the negatively charged electrode) until it reaches the part of the gel where it has
a neutral charge, and there it stops.
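The idea that a protein stops migrating where its net charge is zero can be sketched numerically. The snippet below is only an illustration, not something from the lecture: it assumes rounded textbook pKa values (real values shift with the environment of each residue), treats every ionizable group independently with the Henderson-Hasselbalch relation, and uses hypothetical function names `net_charge` and `isoelectric_point`.

```python
# Approximate pKa values for ionizable groups (assumed, rounded textbook values).
PKA_POSITIVE = {"K": 10.5, "R": 12.5, "H": 6.0, "N_TERM": 9.0}
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1, "C_TERM": 2.0}


def net_charge(sequence, ph):
    """Estimate the net charge of a protein sequence at a given pH."""
    counts = {aa: sequence.count(aa) for aa in "KRHDECY"}
    counts["N_TERM"] = 1  # one free amino terminus
    counts["C_TERM"] = 1  # one free carboxyl terminus
    charge = 0.0
    for group, pka in PKA_POSITIVE.items():
        # Fraction of basic groups that are protonated (+1 each).
        charge += counts[group] / (1.0 + 10 ** (ph - pka))
    for group, pka in PKA_NEGATIVE.items():
        # Fraction of acidic groups that are deprotonated (-1 each).
        charge -= counts[group] / (1.0 + 10 ** (pka - ph))
    return charge


def isoelectric_point(sequence, tol=1e-4):
    """Find the pH where the net charge crosses zero, by bisection."""
    low, high = 0.0, 14.0
    while high - low > tol:
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid   # still positive, so the pI is at a higher pH
        else:
            high = mid  # zero or negative, so the pI is at a lower pH
    return (low + high) / 2.0
```

Bisection works here because the estimated net charge only ever decreases as the pH rises. A lysine-rich sequence comes out with a high pI and an aspartate-rich one with a low pI, which is why the two would focus at opposite ends of the pH gradient.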
- The second dimension separates proteins on the basis of their molecular weights, and that
is sodium dodecyl sulfate PAGE (SDS-PAGE).
- Lay that tube gel over an SDS-PAGE gel and run the proteins out of the IEF gel, where
they’ve been focused based on isoelectric point, and into the PAGE gel, where we will
have them separate on the basis of their molecular weight.
- Cathode to anode and away we go, the proteins migrate, large proteins at the top,
intermediate at the middle and the smallest at the bottom.
- This is the same as figure 8.16 in the textbook.
- What we’ve done in this instance is to give all the proteins the same charge-to-mass
ratio by denaturing them in the presence of a reagent that will break all the disulfide
bonds (beta-mercaptoethanol) and then treating them with sodium dodecyl sulfate.
- This gives them all a net negative charge, that is, it gives them all the same
charge-to-mass ratio, so now charge and mass are associated.
- Now the proteins will separate according to mass, and in the end this is what gives
us the ability to compare two proteomes on the basis of which sets of proteins have
been separated out under one set of conditions versus another.
- The question you might ask is what are the proteins? What's there?
- We can determine this using mass spectrometry and bioinformatics.
- We cut the different proteins out of the gel, then take each protein and digest it
with an enzyme that cuts the protein at defined sites, such as trypsin.
- You can do a trypsin digest then separate the fragments on the basis of mass.
Here we have the fragments released by the trypsin digest.
- We separate those fragments on the basis of mass, and since we have a complete genome
sequence for a large number of organisms, we can create a virtual proteome: we digest it
virtually with virtual trypsin and ask what fingerprint we observe experimentally
and how it matches the ones we made virtually.
- Which fingerprints match? We can thereby identify which protein gave rise to the
difference in the two proteomes.
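The virtual digest and fingerprint comparison can be sketched as follows. This is a simplified illustration under stated assumptions, not the software actually used: trypsin is modelled as cutting after K or R unless the next residue is P, with no missed cleavages and no modifications; the 0.5 Da tolerance, the function names and the protein names in the usage note are all hypothetical.

```python
# Monoisotopic residue masses in daltons (standard reference values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water per peptide for the free N- and C-termini


def trypsin_digest(protein):
    """Cut after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        next_aa = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # C-terminal peptide without K/R
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def fingerprint(protein):
    """Virtual fingerprint: masses of all virtual tryptic peptides."""
    return [peptide_mass(p) for p in trypsin_digest(protein)]


def count_matches(observed, theoretical, tol=0.5):
    """Number of observed masses within tol of any theoretical mass."""
    return sum(1 for o in observed
               if any(abs(o - t) <= tol for t in theoretical))


def identify(observed, proteome, tol=0.5):
    """Return the best-matching protein name and the score of every protein."""
    scores = {name: count_matches(observed, fingerprint(seq), tol)
              for name, seq in proteome.items()}
    return max(scores, key=scores.get), scores
```

For example, with a made-up virtual proteome `{"protA": "MKWVTFISLLR", "protB": "GASPKVTCR"}`, passing the masses from `fingerprint("MKWVTFISLLR")` as the observed spectrum makes `identify` pick `"protA"`. Real search tools score matches statistically rather than by a simple count.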
- Alternatively we can sequence each individual fragment by mass spectrometry, again
cutting off one amino acid at a time and calculating the mass of the
amino acid cut off. As we do that sequentially, we’ll have 1 amino acid, the next amino
acid, the next, etc. From the amino acids whose masses we’ve found through mass
spectrometry, we can determine what the protein is that’s there.
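Reading a sequence from successive mass losses can be sketched like this. It is an idealized illustration, assuming a clean ladder of fragment masses in which each step has lost exactly one unmodified residue; the function name `residues_from_ladder` and the 0.01 Da tolerance are assumptions for the example.

```python
# Monoisotopic residue masses in daltons (standard reference values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def residues_from_ladder(ladder, tol=0.01):
    """Name the residue lost at each step of a descending mass ladder.

    ladder: fragment masses, heaviest first, each one residue shorter
    than the one before it.
    """
    calls = []
    for heavier, lighter in zip(ladder, ladder[1:]):
        lost = heavier - lighter  # mass of the amino acid cut off
        hits = [aa for aa, mass in RESIDUE_MASS.items()
                if abs(mass - lost) <= tol]
        # Ambiguous masses are reported together; unknown ones as "?".
        calls.append("/".join(hits) if hits else "?")
    return calls
```

Leucine and isoleucine have identical masses, so a step of 113.08406 Da is reported as `L/I`; mass alone cannot tell them apart. Lysine and glutamine differ by only about 0.036 Da, so they are separated only when the tolerance is tight enough.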
- To finish the lecture, he asks so what? Who cares that you can identify those proteins?
What he wants to do in the lecture is talk about why they’re important and how we go
about identifying them.
**Lecture 9 stops here**
**Lecture 10 Starts**
- Today he is going to talk about searching and destroying proteins, about folding proteins and also about how
proteins are degraded. We will discuss what proteins do before destroying them. In previous lectures we
looked at the central dogma, from the organization of DNA all the way to the manufacture of proteins.
- In the last lecture what we started to do was to look at the way these steps influence the organisms around us. We
finished up the last lecture talking about how the proteome defines cells, tissues and organisms, and how the differences
in the proteome are what’s important. We talked about how one characterizes differences in protein complement,
focusing on the role that 2D gel electrophoresis plays in looking at differences in the proteome.
- What we did at the end of the last lecture was to take a look at a protein that is different between one organism and
another, or one tissue and another, or one cell and another. We figured out what that protein was using mass
spectrometry. He finished the lecture with this question: so what? Once we know what the protein is, what do we
do with it?
- Most of the time we know very little about what the protein really does. You can think of the characterization of
proteins like picking fruit: all the easy ones have been picked already, and it's the ones that define differences
between organisms whose function we know nothing about. That is the challenge we have
to figure out. How do we figure out what the protein that differentiates organisms does? How do you define the
function of a protein you identified that accounts for differences between cells, tissues and organisms?
Complementary DNA (cDNA)
- So how do we go from gene to function? How do we start? What we’re going to do is
start by simply cloning the gene that encodes our protein of interest.
- Where do we start with that? There are a number of places we can start. We can start
with the genomic version, that is the gene that encodes our protein of interest, but that
could have problems like the gene regulatory sequences that reside there and the introns
that would normally be spliced out from the coding sequence. We’d be fairly reliant on
some advanced cells and technology to take the gene and get the protein.
- What we want to do is make a lot of protein, and we want to do so taking advantage of
only the gene sequences that encode that protein. Only coding sequence, and what has
coding sequence? mRNA: it consists of coding sequence without
introns. But of course we have a problem. If we want to use this particular sequence to
express our protein in bacteria, taking advantage of their ability to churn out
large quantities of protein, we can’t use the messenger RNA
because we can’t really clone it; we need to have a DNA copy that corresponds to the
mRNA.
- What we need is complementary DNA or cDNA. This is a clonable DNA copy of the
mRNA. This is like taking the mRNA and converting it to a double stranded DNA molecule
that can be cut and pasted into a plasmid vector and replicated in bacteria, and that
replicating vector can be used to produce large quantities of the protein so we can characterize it.
- To clone it and get it into this cDNA form, we have to start with a tissue source that is
making the mRNA or the protein of interest. Once we’ve isolated all the mRNA, so the
whole transcriptome there, all the transcripts, we want to convert it to cDNA.
- Different tissues contain different mixtures of RNA, so we’re going to
start with a tissue that presumably contains the mRNA we’re interested in.
- Then we’re going to make use of an enzyme that comes from viruses that have RNA
genomes and convert that genome to DNA. The enzyme that is responsible for the
conversion of the viral RNA genome into a double stranded DNA copy is known as
reverse transcriptase; it does transcription in reverse, starting with an RNA copy and
converting it to a double stranded DNA copy.
- This is shown in the slide, starting with an mRNA copy. Like most polymerases,
reverse transcriptase requires a primer to start the whole process, and the one we use is
an oligo(dT) primer, a stretch of deoxythymidine. Why? This is because we know that
mRNA is polyadenylated, so poly-T will be perfectly complementary to that tail. We can
design a sequence that is composed entirely of thymine in order to prime
first-strand synthesis of the cDNA. This is what we see in the slide: we start with oligo(dT)
and then use reverse transcriptase with deoxynucleotide triphosphates to synthesize the
first strand of cDNA.
- Then what we do is use an enzyme, an RNase (it's an endonuclease), that will nibble
back bits of the RNA so that there’s enough left behind to function as a primer
for reverse transcriptase to synthesize the second strand. You get some strand
displacement when that takes place, so you get rid of the RNA.
- You end up with a double stranded cDNA molecule; that is the preparation of a cDNA
template. We now have a double stranded cDNA copy that corresponds to the coding
sequence of our gene of interest/protein.
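The sequence relationships in cDNA synthesis can be sketched in a few lines. This only models base pairing, not the enzymology: it assumes the oligo(dT) primer anneals at the very 3' end of the poly(A) tail and that both strands are copied in full. The function names and the `min_tail` threshold are hypothetical.

```python
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def first_strand(mrna, min_tail=10):
    """First-strand cDNA (written 5'->3') primed by oligo(dT)."""
    if not mrna.endswith("A" * min_tail):
        # Without a poly(A) tail the oligo(dT) primer has nowhere to anneal.
        raise ValueError("no poly(A) tail for the oligo(dT) primer")
    # Reverse transcription: the DNA reverse complement of the mRNA.
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(mrna))


def second_strand(first):
    """Second strand (written 5'->3'): reverse complement of the first."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(first))
```

The first strand begins with a run of T (the primer plus the copied tail), and the second strand has the same sequence as the mRNA with U replaced by T, which is why the cDNA carries the coding sequence without introns.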
Use in PCR reactions with cDNA template
- We want to specifically get at our gene of interest. We have a pool of cDNAs, and each
cDNA will represent a different messenger RNA that was present in the transcriptome.
- What we want to do now is specifically get our sequence of interest through the
polymerase chain reaction. What we do is, on the basis of the known protein sequence,
design primers, one at the 5’ end and one at the 3’ end, and these primers can be used in a
polymerase chain reaction. We’re going to use them in a PCR reaction with our cDNA
template and, as a consequence, amplify uniquely our gene of interest from that pool of
cDNAs.
- We take the amplicon (the amplification product from the PCR reaction) and clone it
into a plasmid vector. To make sure we have the right thing, we sequence the plasmid
and use BLAST to confirm that the sequence is what we think it is.
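Pulling one sequence out of a cDNA pool with a primer pair can be sketched as an in-silico PCR. This is a simplification under stated assumptions: primers must match exactly, only the sense strand of each cDNA is stored, and primer design from the protein sequence is not modelled. The function names and the sequences in the usage note are made up for illustration.

```python
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna):
    """Reverse complement of a DNA sequence."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(dna))


def amplify(template, forward, reverse):
    """Return the amplicon from one template, or None if a primer fails."""
    start = template.find(forward)
    if start == -1:
        return None  # forward primer does not anneal
    # The reverse primer anneals to the sense strand, so on the sense
    # strand we look for its reverse complement downstream of the forward.
    site = reverse_complement(reverse)
    end = template.find(site, start + len(forward))
    if end == -1:
        return None  # reverse primer does not anneal
    return template[start:end + len(site)]


def pcr(pool, forward, reverse):
    """Run the primer pair against every cDNA and keep what amplifies."""
    products = {}
    for name, cdna in pool.items():
        amplicon = amplify(cdna, forward, reverse)
        if amplicon is not None:
            products[name] = amplicon
    return products
```

With a hypothetical pool `{"target": "GGATGGCCAAATTTGGGCCCTAAGG", "other": "ATGCCCGGGTTTAAACCCTGA"}` and primers `"ATGGCC"` and `"TTAGGG"`, only `"target"` yields a product, which mirrors amplifying one gene of interest out of the whole pool.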
- Some of these concepts we’ve covered in lab and with Chang are now applied to m