4. From amino acids to proteins
Primary Source Material
• Chapters 4 and 12 of Introduction to Genetic Analysis: Anthony J.F. Griffiths, Jeffrey H.
Miller, David T. Suzuki, Richard C. Lewontin, William M. Gelbart (courtesy of the NCBI
bookshelf).
• Chapters 4, 4 and 6 of Biochemistry: Berg, Jeremy M.; Tymoczko, John L.; and Stryer,
Lubert (courtesy of the NCBI bookshelf).
• Chapters 3 and 7 of Molecular Cell Biology: Lodish, Harvey; Berk, Arnold; Zipursky, S.
Lawrence; Matsudaira, Paul; Baltimore, David; Darnell, James E. (courtesy of the NCBI
bookshelf).
• ExPASy: online course on Principles of Protein Structure
• Many ﬁgures and the descriptions for the ﬁgures are from the educational resources
provided at the Protein Data Bank (http://www.pdb.org/)
• Most of these ﬁgures and accompanying legends have been written by David S. Goodsell
of the Scripps Research Institute and are being used with permission. I highly recommend
browsing the Molecule of the Month series at the PDB (http://www.pdb.org/pdb/101/
• Suggested reading: Sections 14.5 to 14.8 and Sections 2.1 to 2.4 of Mikkelsen and Cortón,
Bioanalytical Chemistry
Where are we and how did we get here?
• We are done with the Central Dogma and now we move into the realms of protein structure
and function. The Central Dogma only relates to the ﬂow of genetic information, not to the
function of biological macromolecules.
Proteins come in all shapes and sizes
• Proteins are diverse and versatile ‘nano’ structures and machines
• Large number of potential combinations
• There is a relatively large number of amino acids (a.a.) which you can use to
construct a protein.
• Includes 20 common a.a.’s plus numerous post-translational modiﬁcations.
• A 200-amino-acid protein could have 20^200 (20 to the 200th power) possible sequences.
• Structurally versatile
• Polypeptide backbone can adopt a variety of conformations
• Many conformers of side chains
• Secondary structural elements can pack together in a wide variety of orientations
• Various states of homo- and hetero- oligomerization
• Proteins can bind prosthetic groups or cofactors (non-protein)
• Metal ions
• Structurally dynamic
• Allosteric activation
• Active and inactive forms
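To get a feel for the size of the sequence space mentioned above, here is a minimal sketch that evaluates 20 to the 200th power exactly and as an order of magnitude. The variable names are only illustrative.

```python
import math

N_AMINO_ACIDS = 20   # common amino acids (from the notes)
LENGTH = 200         # residues in the example protein

# Python integers have arbitrary precision, so this value is exact.
n_sequences = N_AMINO_ACIDS ** LENGTH

# Order of magnitude: log10(20**200) = 200 * log10(20)
log10_sequences = LENGTH * math.log10(N_AMINO_ACIDS)

print(f"20^200 has {len(str(n_sequences))} decimal digits")
print(f"20^200 is about 10^{log10_sequences:.1f}")
```

The result is roughly 10^260, and this count does not include post-translational modifications.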
The structure of a protein is determined by the
linear sequence of amino acids (1º structure)
An unfolded protein can be refolded in vitro. This demonstrates that the information
needed to specify the tertiary structure is fully contained in the primary sequence.
• The classic work of Christian Anﬁnsen in the 1950s on the enzyme ribonuclease revealed
the relation between the amino acid sequence of a protein and its conformation. For this
work he was awarded the Nobel Prize in Chemistry in 1972. Anﬁnsen discovered that:
• Ribonuclease is a single polypeptide chain consisting of 124 amino acid residues
cross-linked by four disulﬁde bonds.
• Agents such as urea or guanidinium chloride effectively disrupt the noncovalent
bonds.
• The disulfide bonds can be cleaved reversibly by reducing them with a reagent such
as β-mercaptoethanol.
• When ribonuclease was treated with β-mercaptoethanol in 8 M urea, the product was
a fully reduced, randomly coiled polypeptide chain devoid of enzymatic activity. In
other words, ribonuclease was denatured by this treatment.
• Anﬁnsen then made the critical observation that the denatured ribonuclease, freed of urea
and β-mercaptoethanol by dialysis, slowly regained enzymatic activity. All the measured
physical and chemical properties of the refolded enzyme were virtually identical with those
of the native enzyme.
• These experiments showed that the information needed to specify the catalytically active
structure of ribonuclease is contained in its amino acid sequence.
• For a good discussion of these classic experiments: http://sandwalk.blogspot.ca/2007/02/
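One way to appreciate why the recovery of activity is striking: four disulfide bonds means eight cysteines, and there are many ways to pair them up. The sketch below counts the possible pairings, assuming a fully reduced chain in which all eight cysteines re-pair. The function name is hypothetical and the counting argument is a standard combinatorial one, not something taken from the sources above.

```python
def disulfide_pairings(n_cysteines: int) -> int:
    """Number of ways to pair all cysteines into disulfides (n must be even)."""
    if n_cysteines % 2 != 0:
        raise ValueError("need an even number of cysteines")
    count = 1
    # The first free cysteine can pick any of the (n-1) others,
    # the next free one any of the remaining (n-3), and so on.
    for k in range(n_cysteines - 1, 0, -2):
        count *= k
    return count

# Ribonuclease: four disulfide bonds, so eight cysteines.
print(disulfide_pairings(8))  # 7 * 5 * 3 * 1
```

Only one of these pairings is the native one, so random re-pairing alone would rarely give the native set of disulfides.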
The 20 common amino acids
Ala showing L-
• 20 different common amino acids only differing in side chain
• Note that stereochemistry at Cα has not been indicated in this ﬁgure.
• All natural a.a.’s are in L-conﬁguration
• A more general system of stereochemical designation is the R/S system. The L-
conﬁguration nearly always corresponds to S in the R/S system. The exception is L-
cysteine which is R.
• You might want to keep this sheet handy as a reference.
• I will often use the one letter codes and you should learn these.
• Most are easy, but I ﬁnd D, E, N and Q the most tricky to remember
• Q: Do we need to memorize the structure and names of amino acids on the test?
• A: Yes. You should know the structure, name, 3 letter abbreviation, and 1 letter code of all
the common amino acids.
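As a study aid for the codes, here is a small lookup table from the standard 3 letter abbreviations to the 1 letter codes, with a helper that converts a hyphen-separated sequence written N to C. The helper name and the example peptide are only illustrative.

```python
# Standard 3 letter -> 1 letter codes for the 20 common amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

def to_one_letter(three_letter_seq: str) -> str:
    """Convert e.g. 'Asp-Glu-Asn-Gln' (N to C) into 'DENQ'."""
    residues = three_letter_seq.split("-")
    return "".join(THREE_TO_ONE[res.capitalize()] for res in residues)

# The four codes flagged above as the trickiest to remember.
print(to_one_letter("Asp-Glu-Asn-Gln"))
```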
Amino acid classiﬁcation by property
• Various simple textbook classiﬁcations for a.a.’s
• e.g. small, nucleophilic, hydrophobic, aromatic, acidic, amide, basic
• e.g. aliphatic, non-polar, aromatic, polar, charged -ve, charged +ve
• However, no simple classiﬁcation can properly capture the diversity of a.a. interactions and
• the same amino acid in different charge states can go from polar to nonpolar (H or K
• Different portions of the same amino acid can have different properties (aliphatic
chain vs. guanidinium of arginine)
• Generally ﬁnd aliphatic/hydrophobic residues inside proteins and polar/charged on the
surface of proteins
• Cysteine is special because it is the best nucleophile, is the most easily oxidized, and can form
disulfide bonds
• Proline has a tertiary as opposed to secondary amide nitrogen and induces bend in
• Threonine and isoleucine have chiral carbons in their side chains
Free amino acids are almost always zwitterions
Commentary on the topic of zwitterions: http://bip.cnrs-mrs.fr/bip10/zwitter.htm
• Amino acids in solution at neutral pH exist as dipolar ions (zwitterions).
• In the dipolar form, the amino group is protonated (-NH3+) and the carboxyl group is
deprotonated (-COO-).
• Under almost any conceivable physiologically relevant conditions, the amino and
carboxylate groups of a free amino acid will be in their charged states.
• This is also true of a polypeptide chain: the N-terminus and the C-terminus will be in the
charged state.
• Possible exceptions
• Groups buried in the interior of proteins or lipid bilayers
• Proteins in the stomach
pKa values of protein functional groups
• Seven of the 20 amino acids have readily ionizable side chains. These 7 amino acids are able to donate or accept
protons to facilitate reactions as well as to form ionic bonds.
• The above table gives equilibria and typical pKa values for ionization of the side chains of tyrosine, cysteine, arginine,
lysine, histidine, and aspartic and glutamic acids in proteins.
• Two other groups in proteins—the terminal α-amino group and the terminal α-carboxyl group—can be ionized.
• You should know the approximate values for all of these ionizable groups. It is safe to say that all carboxylic acids in
proteins have a pKa of about 3-4.
• Q: What is so special about Histidine? It has a pKa of ~6, but did you mention that it does not react with anything
• A: Histidine is very good at donating and accepting protons at physiological pH. This is a very important part of many
enzyme mechanisms. I may have mentioned that histidine is not such a good nucleophile. For enzyme mechanisms
that involve a nucleophilic attack on the substrate, cysteine would be the best amino acid, followed by lysine.
• Q. Proteins buried in lipid bilayers are charged on one terminal end or not at all? if its charged on part which one is it?
• A. The N-terminus is always positively charged and the C-terminus is always negatively charged under normal pH
conditions (near neutral). Under some circumstances, such as when the N- or C-terminus is buried in a very
hydrophobic environment, I suppose they could be uncharged. The pKa of an ionizable group is going to depend on its
• Q. Proteins in stomach are charged on their N terminals, am i right?
• A. I believe that the stomach is very low pH, like 2-3. At such low pH, practically every group in proteins will be
protonated. It is close to the pKa for the C-terminus, so it might be partially protonated.
• Q. Will the pKa values of the amino acids be given on the test or not?
• A. They won't be provided. You should know which residues are positively and negatively charged at neutral pH.
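The answers above about histidine and about proteins in the stomach can be made quantitative with the Henderson-Hasselbalch relation. This is a minimal sketch, using only the approximate pKa values quoted in the notes (~6 for histidine, about 3-4 for carboxylic acids). The value 3.5 is an assumed midpoint and the function name is hypothetical.

```python
def fraction_protonated(pH: float, pKa: float) -> float:
    """Henderson-Hasselbalch: fraction of a group in its protonated form."""
    return 1.0 / (1.0 + 10.0 ** (pH - pKa))

# Approximate values from the notes; 3.5 is an assumed midpoint of "about 3-4".
PKA_HISTIDINE = 6.0
PKA_CARBOXYL = 3.5

for pH in (2.0, 7.0):
    his = fraction_protonated(pH, PKA_HISTIDINE)
    cooh = fraction_protonated(pH, PKA_CARBOXYL)
    print(f"pH {pH}: His protonated {his:.3f}, carboxyl protonated {cooh:.3f}")
```

With these assumed values, about 9% of histidine side chains are protonated at pH 7, which is why histidine can both donate and accept protons near neutral pH. At pH 2 a carboxyl group is mostly protonated (uncharged), consistent with the stomach answer above.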
• Oligopeptide: A compound made up of the condensation of a small number (typically less
than 20) of amino acids
• Polypeptide: A compound made up of the condensation of more than ~20 amino acids
• Each type of protein differs in its sequence and number of amino acids. It is the particular
sequence of the various side chains that makes each protein distinct.
• The two ends of a polypeptide chain are chemically different: the end carrying the free
amino group (NH3+, sometimes incorrectly written as NH2) is the amino, or N-terminus,
and that carrying the free carboxyl group (CO2-, sometimes incorrectly written as CO2H) is
the carboxyl, or C-terminus.
• The amino acid sequence of a protein is always presented in the N to C direction, reading
from left to right. This corresponds to the 5’ to 3’ direction in which genes are read.
The peptide bond is planar and
almost always trans
All amino acids except proline
• Note that the C-N bond length of the peptide is 10% shorter than that found in usual C-N
amine bonds. This is because the peptide bond has some double bond character (40%) due
to resonance which occurs with amides. As a consequence of this resonance all peptide
bonds in protein structures are found to be almost planar. This rigidity of the peptide bond
reduces the degrees of freedom of the polypeptide during folding.
• The planarity of the peptide bond is described using the angle ‘omega’. This is the dihedral
angle between the Cα-carbonyl bond and the N-Cα bond. The omega (ω) angle is
almost always 180º (trans) though sometimes (extremely rarely) it is 0º (cis).
• Of the cis-peptide bonds found in proteins, almost all involve proline residues. The overall
atom geometry in cis proline is very similar to the trans-proline case. Energetically, the trans
proline structure is not markedly more favorable than its cis-proline counterpart since much
the same spatial conflicts are present in both cases. Approximately 1% of prolines in proteins
are cis.
• A cis-peptide bond induces a very sharp kink in the polypeptide chain.
• Q. It is stated that "Approximately 1% of prolines in proteins are cis." Does it mean 99% of
prolines in proteins are trans? So, trans-proline is still more favourable than cis-proline (Slide
87)? Also, do you mean that proline is the only amino acid that can exist in cis while 19 other
amino acids cannot.
• A. Correct. 99% of all prolines are trans and trans is more favourable than cis. The difference
in energy for cis vs. trans is smaller than it is for any of the other amino acids, and this is why
we occasionally see cis prolines. It is extremely rare to ﬁnd any of the other 19 amino acids
in a cis conformation.
Certain combinations of φ and ψ angles are
Scans downloaded from: http://www.nd.edu/~aseriann/cou.html
• Linus Pauling and Robert Corey analyzed the geometry and dimensions of the peptide
bonds in the crystal structures of molecules containing one or a few peptide bonds. This
analysis led Pauling to correctly predict the existence and structure of the alpha helix and
beta sheets (for which he was awarded the 1954 Nobel Prize in Chemistry)
• The take home message is that the secondary structure elements of proteins can be
predicted by looking at the structure of an individual amino acid. That is, an amino acid in
an alpha helical or beta sheet conformer is also in a minimal energy conformer because its
bonds are staggered and the peptide bond is planar.
• A polypeptide can be thought of as a series of planar units (peptide bonds) joined by
ﬂexible hinges (Cα-atoms).
• Each Cα-atom has two rotatable bonds, the N-Cα bond (φ, phi) and the Cα-C bond (ψ, psi)
• Only certain combinations of φ and ψ angles are allowed due to steric clashes between the
adjacent residues.
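The angles φ, ψ and ω are all dihedral (torsion) angles, each defined by four consecutive backbone atoms: C(i-1), N, Cα, C for φ; N, Cα, C, N(i+1) for ψ; and Cα, C, N(i+1), Cα(i+1) for ω. Here is a minimal sketch of how such an angle can be computed from four 3D points. The function names and the toy coordinates are hypothetical and are not taken from any real structure.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for rotation about the p1-p2 bond."""
    b0 = _sub(p1, p0)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    n1 = _cross(b0, b1)   # normal to the plane through p0, p1, p2
    n2 = _cross(b1, b2)   # normal to the plane through p1, p2, p3
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b1, b1)) * _dot(b0, n2)
    return math.degrees(math.atan2(y, x))

# Toy planar arrangements (hypothetical points, not real atoms).
trans = dihedral((1, 0, 0), (0, 0, 0), (0, 1, 0), (-1, 1, 0))
cis = dihedral((1, 0, 0), (0, 0, 0), (0, 1, 0), (1, 1, 0))
print(abs(trans), abs(cis))
```

In the trans arrangement the first and last points lie on opposite sides of the central bond (magnitude 180º), and in the cis arrangement they lie on the same side (0º), matching the description of ω given earlier.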
The Ramachandran Plot (φ vs. ψ)
• A graph of φ angle vs. ψ angle vs. occurrence in proteins is called a Ramachandran plot.
• There are actually only a few conformations that are strongly preferred and these give rise
to the common elements of secondary structure.
Plot of a typical
(as output by the
• The Ramachandran plot for a particular protein shows the phi-psi torsion angles for all
residues in the structure
• By looking at how well the angles match up with expected distribution, the quality of a
structure can be assessed.
• Glycine residues are separately identiﬁed by triangles as these are not restricted to the
regions of the plot appropriate to the other sidechain types.
• The coloring/shading on the plot represents the various levels of favorability: the darkest
areas (here shown in red) correspond to the "core" regions representing the most favorable
combinations of phi-psi values.
• A properly folded protein will have over 90% of the residues in these "core" regions.
• The percentage of residues in the "core" regions is one of the better guides to
stereochemical quality for assessing experimental protein structures.
• An ideal Ramachandran plot can be generated computationally using known atomic radii
and bond distances.
alpha-helices: an ‘island’ of preferred
• As mentioned earlier, Pauling and Corey twisted models of polypeptides around to ﬁnd
ways of getting the backbone into regular conformations which would agree with
experimental diffraction data (much like the way the structure of DNA was determined). The
simplest and most elegant arrangement is a right-handed spiral conformation known as the
alpha-helix.
• The structure repeats itself every 5.4 Angstroms along the helix axis, i.e. we say that the
alpha-helix has a pitch of 5.4 Angstroms. Alpha-helices have 3.6 amino acid residues per
turn, i.e. a helix 36 amino acids long would form 10 turns. The separation of residues along
the helix axis is 5.4/3.6 or 1.5 Angstroms, i.e. the alpha-helix has a rise per residue of 1.5
Angstroms.
• Every mainchain C=O and N-H group is hydrogen-bonded to a peptide bond 4 residues
away (O(i) to N(i+4)). This gives a very regular, stable arrangement.
• The peptide planes are roughly parallel with the helix axis and the dipoles within the helix
are aligned. That is, all C=O groups point in the same direction and all N-H groups point the
other way. This alignment of C=O and N-H bonds gives the alpha-helix a permanent dipole
with a partial positive charge at the amino-terminus and a partial negative charge at the
carboxyl-terminus.
• Side chains point outward from the helix axis and are generally oriented towards its amino-
terminal end.
• All the amino acids have negative phi and psi angles, typical values being -60 degrees and
-50 degrees, respectively
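The helix numbers above (pitch of 5.4 Angstroms, 3.6 residues per turn, hydrogen bonds from O(i) to N(i+4)) can be checked with a few lines of arithmetic. This is a small sketch in which the function names are hypothetical and residues are numbered from 1.

```python
PITCH_A = 5.4             # Angstroms per turn (from the notes)
RESIDUES_PER_TURN = 3.6   # from the notes

RISE_A = PITCH_A / RESIDUES_PER_TURN   # rise per residue along the axis

def helix_turns(n_residues: int) -> float:
    return n_residues / RESIDUES_PER_TURN

def helix_length_angstrom(n_residues: int) -> float:
    return n_residues * RISE_A

def hbond_pairs(n_residues: int):
    """Pairs (i, i+4): C=O of residue i bonded to N-H of residue i+4."""
    return [(i, i + 4) for i in range(1, n_residues - 3)]

print(f"rise per residue: {RISE_A:.2f} A")
print(f"36 residues: {helix_turns(36):.1f} turns, {helix_length_angstrom(36):.1f} A long")
print(hbond_pairs(6))
```

Note that the first four N-H groups and the last four C=O groups of an isolated helix have no partner within the helix.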
beta-strands: another ‘island’ of preferred
• In addition to the alpha helix, Pauling and Corey discovered another periodic structural
motif which they named the β-pleated sheet (β because it was the second structure that
they elucidated, the α helix being the ﬁrst).
• The β-sheet differs markedly from the rodlike α-helix. A polypeptide chain, called a β-
strand, in a β-sheet is almost fully extended rather than being tightly coiled as in the α-
helix. A range of extended structures are sterically allowed. The side chains of adjacent
amino acids point in opposite directions.
• A β-sheet is formed by linking two or more β-strands by hydrogen bonds. Adjacent chains
in a β-sheet can run in opposite directions (antiparallel β-sheet) or in the same direction
(parallel β-sheet).
• In the antiparallel arrangement, the NH group and the CO group of each amino acid are
respectively hydrogen bonded to the CO group and the NH group of a partner on the
adjacent chain.
• In the parallel arrangement, for each amino acid, the NH group is hydrogen bonded to the
CO group of one amino acid on the adjacent strand, whereas the CO group is hydrogen
bonded to the NH group on the amino acid two residues farther along the chain.
• Many strands, typically 4 or 5 but as many as 10 or more, can come together in β-sheets.
Such β-sheets can be purely antiparallel, purely parallel, or mixed.
• β-sheets can be relatively flat but most adopt a somewhat twisted shape.
Turns and loops connect strands and helices
• Most proteins have compact, globular shapes, requiring reversals in the direction of their polypeptide chains. Many of these
reversals are accomplished by reverse turns (not shown) and hairpins (shown).
• The residues forming these two-residue turns have torsion angles in characteristic regions of the Ramachandran plot.
• For type I' turns, residue 2 is always glycine whereas for type II' turns residue 1 is always Gly. This is because amino acids
other than glycine would cause steric hindrance involving the residue's side chain and the main chain.
• In other cases, more elaborate structures are responsible for chain reversals. These structures are called loops or sometimes Ω
loops (omega loops) to suggest their overall shape. Unlike α-helices and β-strands, loops do not have regular, periodic
structures. Nonetheless, loop structures are often rigid and well deﬁned. Turns and loops invariably lie on the surfaces of
proteins and thus often participate in interactions between proteins and other molecules.
• For example, a part of an antibody molecule has surface loops (shown in red) that mediate interactions with other molecules.
• Q. Do beta-hairpins only exists among beta-sheet?
• A. As the name implies, the beta hairpin is most commonly found as a connector between strands of an antiparallel beta sheet.
The reverse turn is a bit more general and can be found in loops that connect both helices and strands.
• Q. What are the differences/relationship between reverse turns, beta-hairpin turns and omega loops?
• A. Reverse turns and beta turns do look very similar when you look at the structures on the slide. However, there are key
differences in the conformations of amino acids that deﬁne each of these two types of turns. I don't expect you to know the
details of these differences. One thing you should remember is that beta turns are typically used to connect two strands of anti-
parallel beta sheet. An omega loop is a larger structure that is supposed to look something like the omega character (Ω). That
is, the ends are very close together but the loop itself is large and extends out into space. The variable regions of an antibody
can be described as omega loops.
Experimental measurement of
• Kim and Minor mutated position 53 of protein GB1 to all 20 possible amino acids, and then
measured the melting temperature for all of the resulting proteins
• This analysis revealed that certain amino acids are more favorable in β-sheets than others.
• Speciﬁcally, residues that are branched at the β-carbon of the amino acid tend to stabilize
a β-sheet structure.
• This observation is consistent with the concept of amino acid propensities. That is, every
amino acid has a certain preference for being in a β-sheet, α-helix, or turn region.
• β-sheet and α-helix propensities are opposed to each other. That is, the reason a residue
has a high β-sheet propensity is because it has a low α-helix propensity, and vice versa.
• image from: https://bioweb.uwlax.edu/Default.htm
Proteins are generally composed of α-helices
and/or β-sheets connected by turns and loops
• The α-helical content of proteins ranges widely, from nearly none to almost 100%. For
example, about 75% of the residues in ferritin, a protein that helps store iron, are in α-
helices. Single α-helices are usually less than 45 Å long. However, two or more α-helices
can entwine to form a very stable ‘coiled coil’ structure, which can have a length of 1000 Å
(100 nm, or 0.1 μm) or more. Such α-helical ‘coiled coils’ are found in myosin and
tropomyosin in muscle, in ﬁbrin in blood clots, and in keratin in hair. The helical cables in
these proteins serve a mechanical role in forming stiff bundles of ﬁbers, as in porcupine
quills. The cytoskeleton (internal scaffolding) of cells is rich in so-called intermediate
ﬁlaments, which also are two-stranded α-helical coiled coils. Many proteins that span
biological membranes also contain α-helices.
• The β-sheet is an important structural element in many proteins. For example, fatty acid-
binding proteins, important for lipid metabolism, are built almost entirely from β-sheets.
Protein folding is largely driven by
surface cross section
• Myoglobin, the oxygen carrier in muscle, is a single polypeptide chain of 153 amino acids.
The capacity of myoglobin to bind oxygen depends on the presence of heme, a prosthetic
(helper) group consisting of protoporphyrin IX and a central iron atom.
• The folding of the main chain of myoglobin, like that of most other proteins, is complex and
devoid of symmetry. A unifying principle emerges from the distribution of side chains. The
striking fact is that the interior consists almost entirely of nonpolar residues such as leucine,
valine, methionine, and phenylalanine. Charged residues such as aspartate, glutamate,
lysine, and arginine are absent from the inside of myoglobin. The only polar residues inside
are two histidine residues, which play critical roles in binding iron and oxygen.
• The outside of myoglobin, on the other hand, consists of both polar and nonpolar residues.
This contrasting distribution of polar and nonpolar residues reveals a key facet of protein
architecture. In an aqueous environment, protein folding is driven by the strong tendency of
hydrophobic residues to be excluded from water.
• The polypeptide chain therefore folds so that its hydrophobic side chains are buried and its
polar, charged chains are on the surface.
• The secret of burying a segment of main chain in a hydrophobic environment is pairing all
the NH and CO groups by hydrogen bonding. This pairing is neatly accomplished in an α-
helix or β-sheet.
• The ability to predict whether or not a given polypeptide sequence will fold into a given
tertiary structure remains one of the ‘grand challenges’ of science.
• In nature, proteins fold either independently or with the help of other proteins known as
chaperones.
Membrane proteins have grease on the outside
of the same
• Some proteins that span biological membranes are “the exceptions that prove the rule” regarding
the distribution of hydrophobic and hydrophilic amino acids throughout three-dimensional
structures. For example, ion channels are covered on the outside largely with hydrophobic
residues that interact with the neighbouring alkane chains. The inner channel is quite polar and
there are many speciﬁc interactions with the ion being transported.
• David S. Goodsell: The Molecule of the Month appearing at the PDB
• Potassium ions move through this channel from inside the cell to the outside. The driving force
for this movement is simply the concentration gradient. Cells concentrate potassium ions inside,
and then these ions are released when the membrane depolarizes (for example, during
transmission of signals through the nervous system). The selectivity ﬁlter is the part with the
backbone carbonyls oriented towards the ion in the centre of the channel. Only potassium (not
sodium) is perfectly coordinated by these carbonyl oxygen atoms, and so only it can pass
through the channel. It is my understanding that potassium ions are normally surrounded by 8
water molecules, whereas sodium is normally surrounded by 6.
• The 2003 Nobel Prize in Chemistry was awarded for work in the area of channels
• Roderick MacKinnon pioneered x-ray crystallography of ion channels.
• Peter Agre discovered water channels.
• Water channels facilitate the rapid transport of water across cell membranes in response to
osmotic gradients. These channels are believed to be involved in many physiological processes
that include renal water conservation, neuro-homeostasis, digestion, regulation of body
temperature and reproduction. Members of the water channel superfamily have been found in a
range of cell types from bacteria to human.
Chaperone assisted protein folding
• Folding of proteins in vitro tends to be an inefﬁcient process, with only a minority of
unfolded molecules undergoing complete folding within a few minutes.
• More than 95 percent of the proteins present in cells are in their native conformation.
• The explanation for the cell’s remarkable efﬁciency in promoting protein folding probably
lies in chaperones, a family of proteins found in all organisms from bacteria to humans.
• There are two general families of chaperones: molecular chaperones, which bind and
stabilize unfolded or partially folded proteins, thereby preventing these proteins from being
degraded; and chaperonins, which directly facilitate their folding.
• Chaperonins are probably used for a speciﬁc and relatively small selection of proteins,
whereas molecular chaperones are used for most, if not all, proteins.
• All chaperones have ATPase activity, and their ability to bind and stabilize their target
proteins is speciﬁc and dependent on ATP hydrolysis.
• Molecular chaperones include the Hsp70 family of proteins. When bound to ATP, Hsp70
assumes an open form in which an exposed hydrophobic pocket transiently binds to
exposed hydrophobic regions of the unfolded target protein. Hydrolysis of the bound ATP
causes Hsp70 to assume a closed form, releasing the target protein. Molecular chaperones
are thought to bind all nascent polypeptide chains as they are being synthesized on
ribosomes.
More on GroEL
David S. Goodsell: The Molecule of the Month appearing at the PDB
• Proper folding of a small proportion of proteins (e.g., the cytoskeletal proteins actin and
tubulin) requires additional assistance, which is provided by chaperonins.
• Shown on this slide is the bacterial chaperonin, GroEL, which contains 14 identical subunits
arranged in two stacked rings of seven (green). GroES is shown at the bottom in pink.
• The large GroEL-GroES complex is available in PDB entry 1aon. In this picture, three of the
subunits in each GroEL ring have been removed to show the interior, leaving four subunits
in each ring. On the two in back, the hydrophobic amino acids, LEU, ILE, VAL, MET, PHE,
TYR and TRP, are coloured blue.
• Notice the stripe of hydrophobic amino acids around the entry at the top. This will interact
strongly with unfolded proteins by coaxing them into the upper cavity. Once the unfolded
protein is bound, ATP and GroES bind to GroEL. This causes a conformational change that
forces the protein into the larger lower cavity that is much more hydrophilic than the upper
cavity.
• Now that the protein is in a hydrophilic environment, it will be forced to fold in order to
minimize the unfavourable interactions between its hydrophobic portions and its
• After the protein has folded, ATP is hydrolyzed and GroES (the lid on the cavity) is released
along with the newly folded protein.
• Q: When use chaperonin to help proteins to fold, the GroES will bind to GroEL to the large
cavity side or hydrophobic stripe side?
• A: I believe it can bind to both sides. Don't worry about the details.
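The colouring scheme used in the GroEL picture (LEU, ILE, VAL, MET, PHE, TYR and TRP treated as hydrophobic) is easy to apply to any sequence written in one letter codes. Here is a minimal sketch. The function names and the example sequence are hypothetical, and this seven-residue set is just the one used for the figure, not a definitive hydrophobicity scale.

```python
# One letter codes for LEU, ILE, VAL, MET, PHE, TYR, TRP (the set coloured in the figure).
HYDROPHOBIC = set("LIVMFYW")

def hydrophobic_positions(sequence: str):
    """1-based positions (N to C) of residues in the hydrophobic set."""
    return [i for i, res in enumerate(sequence.upper(), start=1) if res in HYDROPHOBIC]

def hydrophobic_fraction(sequence: str) -> float:
    if not sequence:
        raise ValueError("empty sequence")
    return len(hydrophobic_positions(sequence)) / len(sequence)

example = "LIVKDE"   # made-up peptide, for illustration only
print(hydrophobic_positions(example), hydrophobic_fraction(example))
```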
Proteins often consist of multiple independent
domains and have 4º structure
• Some polypeptide chains fold into two or more compact regions that may be connected by a ﬂexible segment of
polypeptide chain, rather like pearls on a string.
• These compact globular units, called domains, range in size from about 30 to 400 amino acid residues.
• For example, the extracellular part of CD4 (shown at top), the cell-surface protein on certain cells of the immune
system to which the human immunodeﬁciency virus (HIV) attaches itself, comprises four similar domains of
approximately 100 amino acids each. Often, proteins are found to have domains in common even if their overall
tertiary structures are different.
• Antibodies (immunoglobulins) have a distinct domain structure in addition to quaternary structure. We will be taking a
much closer look at antibody structure in the next section.
• Quaternary structure refers to the spatial arrangement of subunits and the nature of their interactions.
• The simplest sort of quaternary structure is a dimer, consisting of two identical subunits. This organization is present in
the DNA-binding protein Cro found in a bacterial virus called λ.
• More complicated quaternary structures also are common. More than one type of subunit can be present, often in
variable numbers. For example, human hemoglobin, the oxygen-carrying protein in blood, consists of two subunits of
one type (designated α) and two subunits of another type (designated β). Thus, the hemoglobin molecule exists as an
α2β2 tetramer.
• Viruses make the most of a limited amount of genetic information by forming coats that use the same kind of subunit
repetitively in a symmetric array. The coat of rhinovirus, the virus that causes the common cold, includes 60 copies
each of four subunits. The subunits come together to form a nearly spherical shell that encloses the viral genome.
• Q: It mentioned that the coat of rhinovirus includes 60 copies each of four subunits. But from the picture I only see
three coloured subunits. What's wrong in this.
• A: There is a 4th protein that is inside and not visible from the outside.
Post-translational modiﬁcations of proteins
[Figure: structures of common post-translational modifications, with the modified residues in parentheses]
N-link glycosylation (Asn); O-link glycosylation (Ser, Thr); S-link glycosylation (Cys); C-mannosylation (Trp)
N-methylation (Lys); N-acylation (Lys, N-term); phosphorylation (Ser, Thr, Tyr); sulfation (Tyr)
prenylation/farnesylation (Cys); S-acylation (Cys)
• Many proteins are covalently modified, through the attachment of groups other than amino acids,
to augment their functions. Many proteins, especially those that are present on the surfaces of cells
or are secreted, acquire carbohydrate units on specific asparagine residues. The addition of sugars
makes the proteins more hydrophilic and able to participate in interactions with other proteins.
Conversely, the addition of a fatty acid to an α-amino group or a cysteine sulfhydryl group produces
a more hydrophobic protein that will be tightly associated with the membrane.
• Glycosylation is the most common modification in mammalian cells.
• Proteins can also be reversibly modified to regulate their activity. Perhaps the most important
modification for signaling pathways is phosphorylation and dephosphorylation of serine, threonine,
and tyrosine residues. Regulation of protein activity by phosphorylation is the basis for intracellular
signalling. The enzymes that catalyze the addition of phosphate groups (from ATP donors) are called
kinases (why kinases?). Enzymes that remove phosphate groups are called phosphatases.
• Histones—proteins that assist in the packaging of DNA into chromosomes as well as in gene
regulation—are rapidly acetylated and deacetylated on specific lysine residues in vivo. More heavily
acetylated histones are associated with genes that are being actively transcribed. A more
permanent modification of lysines in histone proteins is methylation.
• The attachment of ubiquitin, a protein comprising 76 amino acids, is a signal that a protein is to be
destroyed, the ultimate means of regulation.
• This slide shows only a few of the common examples. A number of additional post-translational
modifications are known.
Post-translational modifications are catalyzed
[Figure labels: lysine, ε-N-monomethyllysine, HP1 chromodomain, acetylation (acetyl, CoA)]
[Figure text, fragmentary: ... recognize short peptide motifs that are embedded in target proteins, but do not bind stably until the peptide has acquired an appropriate PTM ... domains usually have a conserved binding pocket for the modified residue and a more variable surface that]