5 Chemoselective strategies
5.1 Site-selective modification of biopolymers
5.1.1 General considerations
We will be concerned with labeling biomolecules, and the issue of specificity will be a
recurring problem. We may want to label a specific functional group (e.g. an amine); however, in
a biopolymer there may be many instances of this group. Can we target only one functional
group among many in the same molecule? We will explore strategies for tackling this problem,
and they will break down into two general areas: (1) exploitation of microenvironments or (2)
introduction of, or use of, unique functional groups. One could imagine that strategy (2) has some
inherent advantages, but may face limitations in many situations. For example, Cys-based
reactivity is the most commonly used strategy, as the thiol group is easily differentiated
from other common protein nucleophiles. However, if our protein of interest contains multiple
Cys residues, we are less likely to have site-specific control of the reaction (only functional-
group control). In some situations this issue may not matter; however, in other instances it may be
5.1.1.1 Frequency of reactive sites
The reactivity of individual functional groups within biopolymers is discussed in later
sections on Proteins (Sec III), Nucleotides (Sec IV), and Carbohydrates (Sec V). However, a
primary consideration in site-selectivity is the number of a particular functional group that may
be present in a given biopolymer. We will be concerned with proteins for this discussion, as they
have the most chemical variety in functional groups. Common proteins are composed of ca. 20
different amino acid (AA) residues, versus 4 nucleotides in DNA or RNA. While carbohydrate polymers may consist of a large
number of monomers (especially when potential modifications are included), much of their
variability is due to differences in stereochemistry rather than functional groups.
Statistical analysis of protein structures has provided useful information regarding the
frequency of common amino acids, as well as their availability to solvent within folded proteins.
Not all residues are equally represented in protein structures, which may be an important
consideration in choosing a bioconjugate strategy.
179 BioconjugateChemistry Cairo
amino acid    % occurrence (ref. 1)    % buried (ref. 1)
Leu           9.5                      41
Ala           7.5                      38
Lys           5.8                      4.2
Ile           5.5                      65
Arg           5.2                      0.0
Tyr           3.3                      13
Met           2.8                      50
His           2.2                      19
Cys           1.8                      47
Trp           1.3                      23
Based on these data, we can see that one strategy is to choose reactivity for very low
abundance residues. However, the values above are average occurrences, and specific protein
sequences will vary. Also note that, on average, a 56 residue protein will have one
The solvent accessibility of specific residues also gives us an aggregate view of reactivity:
solvent inaccessible residues may be very difficult to react, while very accessible residues will be
the most reactive. In the case of Arg, this residue is always found to be solvent accessible, and is
also of relatively high abundance. Thus, Arg-reactive chemistry is likely to result in multiple
reaction sites in most proteins.
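The arithmetic behind these estimates can be sketched in a few lines. The snippet below is a minimal illustration, assuming that the average frequencies in the table apply to the protein of interest and that every residue not counted as buried is available for reaction (both are simplifications). The function names and the 300-residue example are hypothetical. Note that 100/1.8 is about 56, so the 56-residue figure quoted above is consistent with the tabulated Cys frequency.

```python
# Average statistics for selected residues, transcribed from the table above:
# residue -> (percent occurrence, percent buried).
RESIDUE_STATS = {
    "Lys": (5.8, 4.2),
    "Arg": (5.2, 0.0),
    "Met": (2.8, 50),
    "His": (2.2, 19),
    "Cys": (1.8, 47),
    "Trp": (1.3, 23),
}


def expected_sites(residue, n_residues):
    """Return (expected total copies, expected solvent-exposed copies).

    Assumption: average frequencies apply, and any residue that is not
    buried is treated as accessible to the reagent.
    """
    pct_occurrence, pct_buried = RESIDUE_STATS[residue]
    total = n_residues * pct_occurrence / 100.0
    exposed = total * (1.0 - pct_buried / 100.0)
    return total, exposed


def length_for_one_copy(residue):
    """Protein length at which one copy of the residue is expected on average."""
    return 100.0 / RESIDUE_STATS[residue][0]


if __name__ == "__main__":
    # Hypothetical 300-residue protein.
    for name in ("Cys", "Lys", "Arg"):
        total, exposed = expected_sites(name, 300)
        print(f"{name}: {total:.1f} expected, {exposed:.1f} solvent exposed")
    print(f"One Cys expected per {length_for_one_copy('Cys'):.0f} residues")
```

For an average protein this suggests many exposed Lys and Arg sites but only a few exposed Cys sites, which is the motivation for targeting low-abundance residues.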
5.1.1.2 Protein structure
Proteins and biopolymers may adopt stable conformations, or folds, that will alter the solvent
accessibility of specific parts of the molecule. Proteins often adopt a globular fold, which may
consist of several structural elements. A brief review of the organization of protein
Primary structure (1°). The chemical (covalent) structure of the biopolymer. This property
may also be referred to as the sequence or even primary sequence of a polypeptide.
Secondary structure (2°). Regular conformation found within a globular fold. Typical elements
of these local structures include: α-helices, β-strands or β-sheets, poly(Pro)-helices, turns or loops.
Approximately 30% of a.a. residues are found in α-helices, and about 28% are found in β-sheets.
Note that secondary structures have regular conformations, which are most clearly noted in
the relevant torsional angles of individual residues. The figure below illustrates the relevant
torsion angles and their typical designations.
These torsions fall into expected ranges for particular secondary structures:
Torsion    Atoms      Expected ranges (see footnote x)
φ          N-Cα       -120° to -60°
ψ          C(O)-Cα    -45°
ω          C(O)-N     0°, 180°
χ          Cα-Cβ      0°, 120°, -120°
Secondary structures are typically stabilized through non-covalent interactions, such as
hydrogen bonds, van der Waals forces, electrostatics, or dipole effects. These structures often
result in easily recognized repeating structures. Some examples:
x For an expanded discussion of protein secondary structure, see Proteins, 2nd Ed., Creighton, W.H. Freeman &
Co., Chapter 6.
5.1.1.3 Protein microenvironments
Protein microenvironments may affect the reactivity of reagents and of specific functional
groups within a protein or other biopolymer.2 This effect can take many forms; the simplest is the
exclusion of a functional group from the reactive environment in the solvent. But let us first look
at the null hypothesis – are there situations where protein microenvironments are not relevant?
In the case of denatured proteins, the answer is a qualified ‘yes’. In the reaction of amine
groups with a good electrophile, the quantitative conversion of all lysine side chains can be
accomplished in denaturing conditions, for example in high concentrations of guanidine (6 M).
Thus, the denatured (random coil) form of the protein may exhibit uniform reactivity.
In contrast, folded proteins can exhibit surprising differences in reactivity and stability of
individual functional groups. The factors that influence protein microenvironments, and reagents
that experience them, were summarized by Cohen in 1968. An abridged summary is given below.
Effects of the protein environment on functional groups
i. Protection/solvent accessibility. Sites may be masked by solvent exclusion, or made
more reactive by solvent accessibility.
ii. Hydrogen bonding. May increase the pKa of a site, or alter nucleophilicity. Note that a
lone pair (LP) engaged in an H-bond will be less available for nucleophilic attack. In contrast, an H-
bond donor heavy atom may have greater nucleophilicity by virtue of a more labile
proton (and more electron density).
iii. Field effects. The electronic environment of a reactive site may alter pKa or reactivity.
iv. Polarity of the local environment. Buried hydrophobic (non-polar) vs. solvent exposed
(polar) environments can alter reactivity. A basic residue, like Lys, when buried in a
non-polar environment can experience a large depression of its pKa value,4 to as low as
5.3 (-5 pKa units)! Conversely, an acidic residue, like Glu or Asp, when buried in a
non-polar environment often shows a positive shift in pKa (+2 to +3 pKa units).
v. Reversible covalent interactions and placement of reactive groups. Examples would
include acetal formation and reversible acylation.
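To see why a pKa shift of this size matters for reactivity, we can estimate the fraction of a group present in its deprotonated (nucleophilic) form using the Henderson-Hasselbalch relation. The sketch below is illustrative only: the unperturbed Lys side-chain pKa of 10.5 is an assumed textbook value, while 5.3 is the perturbed value quoted in the list above. The function name is hypothetical.

```python
def fraction_deprotonated(pka, ph):
    """Fraction of an ionizable group in its deprotonated form.

    Henderson-Hasselbalch: [A]/([A] + [HA]) = 1 / (1 + 10**(pKa - pH)).
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


if __name__ == "__main__":
    ph = 7.0
    # Assumed unperturbed Lys side-chain pKa (illustrative value).
    exposed_lys = fraction_deprotonated(10.5, ph)
    # Perturbed pKa quoted above for a Lys buried in a non-polar environment.
    buried_lys = fraction_deprotonated(5.3, ph)
    print(f"Exposed Lys, free amine at pH 7: {exposed_lys:.4%}")
    print(f"Buried Lys, free amine at pH 7:  {buried_lys:.1%}")
```

Under these assumptions only a very small fraction of a typical Lys is present as the free amine at neutral pH, whereas a Lys with a pKa of 5.3 is almost entirely deprotonated. Whether that amine is actually accessible to a reagent is a separate question, as item (i) of the list notes.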
Effects of the protein microenvironment on reagents:
i. Selective association. Binding sites may localize the reagent to a particular
ii. Electrostatic interactions. Charged reagents may be attracted or repelled.
iii. Sterics. Sites may be sterically inaccessible to large reagents, or conversely – small
reagents may access all sites.
iv. Polarity. May alter the reactivity of the reagent.
v. Other functional groups. May assist or interfere with reactivity.
vi. Conformation. A reactive conformation may be prevented or favored.
All of these factors may alter reactivity for a functional group with a reagent. Folded proteins
are more likely to have more factors operative. All factors are essential for understanding “site-
selective” reactions; i.e. any reaction which is selective for one (or a few) functional group site(s)
in the presence of others.
5.1.1.4 Examples of protein microenvironments
The conformation of the protein may alter the stability of bonds. For example, the energy of
individual peptide bonds can vary depending on their location within the structure.
There are many examples of solvent/surface exposure of a reactive group influencing its
reactivity. Staphylococcal nuclease contains no cysteine residues, and was found to be
unreactive with mustard alkylating agents. Mutation of surface-exposed residues to Cys resulted
in mutants that could react with alkylating agents; even so, the reactivity of individual Cys
residues was not identical.7 Alkylation of sulfhydryl groups with chloroacetic acid and
chloroacetamide shows differences based on microenvironment. A likely explanation in this case
is the electrostatic repulsion of the reagent at high pH (due to its negative charge) and attraction
of the reagent at lower pH than its usual range of activity. A similar effect is seen in other Cys
proteases.9 Alkylation of His residues with DEPC can also be affected by protein
microenvironment. Isom et al. have shown that the specific location of charged residues within
a protein can dramatically alter the pKa of ionizable groups.4 Lysine-reactive indole reagents
provide evidence of Lys microenvironments in HEWL.
All of these examples serve to illustrate that there are a number of factors to take into account
when considering protein microenvironments. Also see Sec 6.1.6 for a detailed example of site-
selective reactivity of His residues in a folded protein.
5.1.2 Affinity Labeling
5.1.2.1 Complex formation
One of the most successful strategies for site-selective modification of proteins and
biopolymers takes advantage of the presence of protein microenvironments that alter functional
group reactivities through specific non-covalent interactions. This strategy can be applied to any
ligand-receptor pair where stable, specific interactions are available. First, some definitions:
active site: a region of an enzyme which contains the functional groups responsible for catalysis.
combining site: a region of a receptor which contains the functional groups responsible for
Since we will be discussing receptor-ligand and enzyme-substrate interactions, we will need
some general review of terms and standard equations for these systems. Let's start with enzyme
catalysis. A standard equation for an enzyme-catalyzed reaction is shown below, where E is the
enzyme, P is the product, and S is the substrate:
$$ E + S \rightleftharpoons ES \rightleftharpoons EP \rightleftharpoons E + P \qquad (1) $$
Typically the rate-limiting step is the conversion of ES to EP, and the dissociation of EP is
favorable. Thus, we can simplify the previous equation to:
$$ E + S \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} ES \xrightarrow{k_{cat}} E + P \qquad (2) $$
In this case the formation of product can be written as:
$$ \frac{d[P]}{dt} = k_{cat}[ES] \qquad (3) $$
If we can assume that the reverse rate of ES formation ($k_{-1}$) is significantly slower than the
rate of product formation ($k_{-1} < k_{cat}$), then we can rewrite the equation, with the rate of product
formation, v, as:
$$ v = k_{cat}[ES] \qquad (4) $$
If we develop an equation for [ES], we can develop a more useful form of this equation. First
we can generate a statement of the equilibrium constant for the complex of [ES], which we will
define as the Michaelis constant:
$$ K_M = \frac{[E][S]}{[ES]} \qquad (5) $$
Acknowledging that the initial enzyme concentration, $[E]_0$, can also be written as:
$$ [E]_0 = [E] + [ES] \qquad (6) $$
Combining these two equations allows us to develop the following statement of [ES]:
$$ [ES] = \frac{[E]_0[S]}{K_M + [S]} \qquad (7) $$
Finally, combining Eqs. 4 and 7 gives:
$$ v = \frac{k_{cat}[E]_0[S]}{K_M + [S]} \qquad (8) $$
When we combine $k_{cat}[E]_0$ into a new constant ($V_{max}$), we get the final form:
$$ v = \frac{V_{max}[S]}{K_M + [S]} \qquad (9) $$
Eq. 9 is known as the Michaelis-Menten equation, and is a foundation of most enzyme
kinetic analysis. Note that the rate of the reaction when $[S] = K_M$ is
$$ v = \frac{V_{max}K_M}{K_M + K_M} = \frac{V_{max}}{2} \qquad (10) $$
Thus, $K_M$ can also be defined as the concentration at which the reaction proceeds at half-
maximal velocity. More importantly, $K_M$ describes the affinity of the substrate for the enzyme
active site, where $k_1$ and $k_{-1}$ are the forward and reverse rates of binding to form ES. Note that if
$k_{cat}$ is small, $K_M$ approaches $K_D$ for substrate binding.
$$ K_M = \frac{k_{-1} + k_{cat}}{k_1} \qquad (11) $$
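The behavior of the Michaelis-Menten equation (Eq. 9) is easy to explore numerically. The following is a minimal sketch; the function names and the numerical values are hypothetical, and the expression used for $K_M$ in terms of rate constants is the standard steady-state form, $(k_{-1} + k_{cat})/k_1$, which is assumed here.

```python
def michaelis_menten_rate(s, vmax, km):
    """Reaction rate v = Vmax*[S] / (K_M + [S])."""
    return vmax * s / (km + s)


def km_from_rate_constants(k1, k_minus1, kcat):
    """Steady-state Michaelis constant, K_M = (k_-1 + k_cat) / k_1 (assumed form)."""
    return (k_minus1 + kcat) / k1


if __name__ == "__main__":
    vmax, km = 10.0, 2.0  # hypothetical values, arbitrary units
    for s in (0.2, 2.0, 20.0, 200.0):
        v = michaelis_menten_rate(s, vmax, km)
        print(f"[S] = {s:6.1f}  v = {v:5.2f}  ({v / vmax:.0%} of Vmax)")

    # When k_cat is small relative to k_-1, K_M approaches K_D = k_-1 / k_1.
    k1, k_minus1 = 1.0e6, 100.0  # hypothetical rate constants
    print(km_from_rate_constants(k1, k_minus1, kcat=0.1))
    print(k_minus1 / k1)
```

The printed rates show half-maximal velocity at [S] = K_M and saturation at high [S], and the last two lines illustrate that K_M and K_D nearly coincide when k_cat is much smaller than k_-1.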
If we turn to receptor-ligand interactions, the situation is similar with the exception of no
chemical change taking place. However, the formation of the receptor-ligand complex can be
followed by a treatment similar to that of the [ES] complex. The equilibrium we are dealing with here is:
$$ R + L \rightleftharpoons RL \qquad (12) $$
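An equilibrium constant for the association ($K_A$) of the RL complex would be (units M$^{-1}$):
$$ K_A = \frac{[RL]}{[R][L]} \qquad (13) $$
or, as a dissociation constant ($K_D$) (units M):
$$ K_D = \frac{[R][L]}{[RL]} \qquad (14) $$
$$ K_D = \frac{1}{K_A} \qquad (15) $$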
Recall that we can relate free energy change to an equilibrium constant using the standard
relation $\Delta G = -RT \ln K_A$. Using a similar substitution strategy as above, we can write
an equation for the RL complex:
$$ [RL] = \frac{[R]_0[L]}{K_D + [L]} \qquad (16) $$
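As a worked illustration of these relationships, the sketch below computes the fraction of receptor occupied at a given free ligand concentration, and the standard free energy of association that corresponds to a given $K_D$. It assumes a 1 M standard state, a temperature of 298.15 K, and that the free ligand concentration is known (for example, ligand in large excess over receptor). The function names and the example $K_D$ are hypothetical.

```python
import math

R_GAS = 8.314  # gas constant, J mol^-1 K^-1


def fraction_bound(ligand_conc, kd):
    """Fraction of receptor in the RL complex: [RL]/[R]_0 = [L] / (K_D + [L])."""
    return ligand_conc / (kd + ligand_conc)


def binding_free_energy(kd, temperature=298.15):
    """Standard free energy of association in kJ/mol.

    dG = -RT ln(K_A) = RT ln(K_D), with K_D expressed in M (1 M standard state).
    """
    return R_GAS * temperature * math.log(kd) / 1000.0


if __name__ == "__main__":
    kd = 1.0e-6  # hypothetical 1 micromolar dissociation constant
    for ligand in (1.0e-7, 1.0e-6, 1.0e-5):
        print(f"[L] = {ligand:.0e} M  fraction bound = {fraction_bound(ligand, kd):.2f}")
    print(f"dG of association = {binding_free_energy(kd):.1f} kJ/mol")
```

Half of the receptor is occupied when [L] equals K_D, which mirrors the half-maximal velocity at [S] = K_M in the enzyme case.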
So, to summarize, we have two general situations for complex formation:
$$ [ES] = \frac{[E]_0[S]}{K_M + [S]} \qquad (17) $$
$$ [RL] = \frac{[R]_0[L]}{K_D + [L]} \qquad (18) $$
In both cases, complex formation will put functional groups of the protein in proximity to the
ligand or substrate. This increases the likelihood of a reaction taking place. As a result, the
binding energy of the complex provides the site-specificity in these cases. This is an example of
what Cohen described as “specific association.” 2
Note that compounds which form complexes resulting in displacement of the native substrate
or ligand are known as inhibitors. These can be classified in several ways; one is by how
reversible the inhibitor-protein complex formation is.
reversible inhibitor: often known as a competitive inhibitor, essentially sets up a second
competing equilibrium with the formation of the ES complex.
$$ E + I \rightleftharpoons EI \qquad (19) $$
irreversible inhibitor: inhibitors in this class can have different origins of their activity. In the
general case, they form an enzyme‐inhibitor complex that cannot be reverted back to the
uncomplexed protein (E), consuming the enzyme.
$$ E + I \rightleftharpoons EI \rightarrow E\text{-}I \qquad (20) $$
Note that there are a variety of mechanisms for enzyme inhibition, including a large class of slow-
binding inhibitors that can show unusual kinetics but are not irreversible.
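The effect of a reversible, competitive inhibitor on the Michaelis-Menten rate can be sketched numerically. In the standard treatment the competing equilibrium raises the apparent Michaelis constant to $K_M(1 + [I]/K_I)$ while leaving $V_{max}$ unchanged; the symbol $K_I$ (the dissociation constant of the EI complex), the function name, and the numbers below are assumptions introduced for illustration.

```python
def rate_with_competitive_inhibitor(s, i, vmax, km, ki):
    """Michaelis-Menten rate in the presence of a competitive inhibitor.

    The inhibitor competes for free enzyme, so the apparent K_M becomes
    K_M * (1 + [I]/K_I); Vmax is unchanged.
    """
    km_apparent = km * (1.0 + i / ki)
    return vmax * s / (km_apparent + s)


if __name__ == "__main__":
    vmax, km, ki = 10.0, 2.0, 1.0  # hypothetical values, arbitrary units
    for inhibitor in (0.0, 1.0, 10.0):
        low = rate_with_competitive_inhibitor(2.0, inhibitor, vmax, km, ki)
        high = rate_with_competitive_inhibitor(2000.0, inhibitor, vmax, km, ki)
        print(f"[I] = {inhibitor:4.1f}  v at [S]=2: {low:.2f}  v at [S]=2000: {high:.2f}")
```

The output illustrates the defining feature of competitive inhibition: the inhibitor slows the reaction at low substrate concentration, but a large excess of substrate restores a rate close to V_max. An irreversible inhibitor, by contrast, consumes enzyme and cannot be out-competed in this way once the covalent complex has formed.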
5.1.2.2 Organophosphates as affinity labels of proteases
An early example where this principle could be exploited to label an active site of an enzyme
is the use of diisopropyl phosphofluoridate (DFP). DFP acts as an inhibitor of esterases and
proteases. To consider this example in detail, we first need to summarize the general case of
enzyme inhibitors and the form of the reaction catalyzed by proteases. For this first example, we
will be concerned with the protease chymotrypsin. Like any protease, chymotrypsin cleaves a
protein amide linkage. The specific linkage is specified using the nomenclature shown below,
where the amino acid side chains of the peptide N-terminal to the scissile bond are designated as
P1, P2, P3…Pi and those C-terminal to the scissile bond are designated P1’, P2’, P3’…Pi’ et
cetera. Also note that the enzyme sites that recognize specific sidechains are designated as S1,
S2, S3…Si, if they recognize the sidechains of P1, P2, P3…Pi and so on.
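The nomenclature can be made concrete with a short sketch that labels each residue of a peptide relative to a chosen scissile bond. The function name, the one-letter sequence, and the choice of scissile bond are hypothetical; the enzyme subsite that recognizes each position carries the matching S label (S1 for P1, S1' for P1', and so on).

```python
def label_positions(peptide, scissile_index):
    """Label residues relative to the scissile bond.

    The scissile bond lies between peptide[scissile_index - 1] (P1) and
    peptide[scissile_index] (P1'). Residues N-terminal to the bond are
    P1, P2, ...; residues C-terminal to it are P1', P2', ...
    """
    labels = []
    for pos, residue in enumerate(peptide):
        if pos < scissile_index:
            labels.append((f"P{scissile_index - pos}", residue))
        else:
            labels.append((f"P{pos - scissile_index + 1}'", residue))
    return labels


if __name__ == "__main__":
    # Hypothetical hexapeptide, written N- to C-terminus, cleaved after Phe.
    for label, residue in label_positions("AAPFGS", 4):
        print(f"{label:>4}  {residue}")
```

For the hypothetical peptide AAPFGS cleaved between F and G, Phe occupies P1 (bound in S1) and Gly occupies P1'.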
In the case of chymotrypsin, the enzyme tends to cleave substrates that contain aromatic
residues at the P1 site (Phe, Tyr, Trp). The mechanism of cleavage is proposed to be catalyzed by
a triad of residues: Ser195, His57, and Asp102.
[Scheme: proposed mechanism of peptide bond cleavage by the chymotrypsin catalytic triad, showing Ser195 and the S1 site.]
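The P1 preference of chymotrypsin described above can be turned into a simple rule for predicting candidate cleavage sites in a sequence. This is a deliberately naive sketch: it considers only the identity of the P1 residue (Phe, Tyr, or Trp, as stated in the text) and ignores any influence of neighboring residues or of protein folding. The function name and the example sequence are hypothetical.

```python
# Aromatic residues preferred at P1 by chymotrypsin (one-letter codes).
AROMATIC_P1 = {"F", "Y", "W"}


def chymotrypsin_sites(sequence):
    """Return indices i where the bond after sequence[i] is a candidate site.

    A bond is reported when the residue on its N-terminal side (P1) is
    aromatic. The last residue has no following bond and is skipped.
    """
    return [i for i in range(len(sequence) - 1) if sequence[i] in AROMATIC_P1]


if __name__ == "__main__":
    seq = "GAFKLYWSA"  # hypothetical sequence
    for i in chymotrypsin_sites(seq):
        print(f"cleave between {seq[i]}{i + 1} (P1) and {seq[i + 1]}{i + 2} (P1')")
```

Even this crude rule makes the point that substrate selectivity arises from recognition at the S1 site, which is the same interaction that directs DFP to the active site.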
DFP binds to, and acts as a covalent (irreversible) inhibitor of chymotrypsin (a proposed
Of 20 Ser residues found in chymotrypsin, only Ser195 is labeled by DFP. The compound is
not a general “enzyme poison,” in that it does not react with all other enzymes. Instead, the
compound specifically targets esterases and proteases, supporting a specific association between
the compound and particular enzymes. In this case, the specific association of the reagent is
likely encouraged by the S1 site and the electrophilic character of DFP combined with the
exceptional nucleophilicity of Ser195 in the active site.
Organophosphate modification of the enzyme active site can be reversed by hydrolysis;
however, it is typically observed that the enzyme-inhibitor complex undergoes a process of
ageing which prevents reversibility. One might consider that an organophosphate could undergo
a hydrolysis reaction as outlined below.
[Scheme: reaction of the organophosphate with AChE (loss of HF), followed by loss of an isopropoxy group (-OiPr).]
In the active site of acetylcholinesterase (AChE), compound I is found to form II at short
times and III at long times. Product III is stabilized by non-covalent contacts in the active site,
further reducing the reversibility of the reaction. But what is the mechanism of the conversion
from II to III? Two potential mechanisms are shown below.
[Scheme: two potential mechanisms for the conversion of II to III: attack by H2O with loss of HOiPr, or C-O scission.]
The experimental data support the carbocation mechanism due to C-O bond scission. The
strongest evidence is the observation of carbocation rearrangement products from
Ageing of organophosphates is important for treatment of toxic exposure to these compounds.
Organophosphates, due to their activity against AChE, are potent nerve agents. If intermediate II
can be hydrolyzed before ageing, then the active site Ser can be regenerated, restoring enzyme
activity. This can be accomplished by employing a class of nucleophiles that drive the hydrolysis
of intermediates of the form of II.
Oxime reagents like pralidoxime (2-PAM) are used prophylactically when exposure to nerve
agents is a risk, or as a way to restore some AChE function after exposure.
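The competition between reactivation and ageing can be illustrated with a very simple kinetic model. The sketch below assumes that both processes act on intermediate II as parallel, irreversible, first-order (or pseudo-first-order) reactions; this is a simplification introduced here, not a model taken from the studies discussed. The rate constants and function names are hypothetical.

```python
import math


def fraction_reactivated(k_react, k_age):
    """Fraction of intermediate II that is eventually converted to free enzyme.

    Assumes reactivation (k_react) and ageing (k_age) are parallel,
    irreversible first-order processes competing for the same species.
    """
    return k_react / (k_react + k_age)


def intermediate_remaining(t, k_react, k_age):
    """Fraction of intermediate II still present at time t."""
    return math.exp(-(k_react + k_age) * t)


if __name__ == "__main__":
    k_age = 0.1  # hypothetical ageing rate constant, h^-1
    for k_react in (0.01, 0.1, 1.0, 10.0):  # hypothetical reactivation rates, h^-1
        rescued = fraction_reactivated(k_react, k_age)
        print(f"k_react = {k_react:5.2f} h^-1  enzyme rescued = {rescued:.0%}")
```

In this picture a nucleophile such as an oxime is useful only to the extent that it raises the reactivation rate above the ageing rate, and any intermediate that has already aged to III is no longer available for rescue.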
The heart of this strategy is the specific association of the reagent with a unique site on the
protein target, and this is often referred to as affinity labeling. When used to identify classes of
proteins through proteomic approaches, it is also referred to as activity based protein profiling
(ABPP). A variety of groups have exploited this strategy, far too many to allow a
comprehensive summary here, but the reader is referred to the many excellent reviews. We
will discuss some classic examples to illustrate the utility of the method.
While DFP binds to, and labels, chymotrypsin, one might consider using affinity labeling to
probe the enzyme active site in more detail. Before the elucidation of the structure of
chymotrypsin, the site of protein modification could be used to identify sites that were in
proximity of the active site. Lawson & Schramm designed an affinity labeling strategy to do
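[Scheme: affinity labeling of chymotrypsin by a bromo-substituted substrate analog: reaction at Ser195 in the S1 site and alkylation of a nearby Met residue to give a sulfonium.]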
First a substrate analog was designed containing two components: an electrophilic site along
with an easily hydrolyzed functional group that could react with Ser195. Compound 2-I was
found to label a Met residue which was then inferred to be close to the active site. Note that
treatment of the enzyme with compound 2-I resulted in reduced activity of the enzyme, but not
complete loss. Subsequent treatment with DFP could eliminate this remaining activity,
confirming that Ser195 was freed by the end of the sequence summarized in the scheme below.
This example illustrates two principles related to affinity labeling:
endoalkylation: alkylation of an enzyme by an active‐site directed inhibitor within the active site.
The reaction may result in inactivation or denaturation of the active site.
exoalkylation: alkylation of an enzyme by an active site directed inhibitor outside the active site.
Note that these definitions depend upon how one defines the active site, and here it is left
arbitrary. If the active site is defined as only residues responsible for catalysis, then the Met
alkylation above is exoalkylation. If it is defined as residues near the active site, then Met may be
part of the active site and this could be considered endoalkylation. Also note that alkylation
through affinity labeling is not restricted to active sites of enzymes – we can treat the same
processes with regard to combining sites of receptors.
5.1.2.3 Protease and peptidase inhibitors
A brief aside on the nature of protease/peptidase transition states is relevant as a classic
example of rational enzyme inhibitor design.
Protease enzymes have long been proposed to proceed through a tetrahedral intermediate
(species II), as the general scheme below illustrates. Note the active site nucleophile (shown for
the case of a serine protease) is activated by a general base, such as His. Attack on the scissile
peptide bond proceeds through formation of the tetrahedral intermediate, loss of the amino
terminus and formation of a covalent ester intermediate (III). The ester is then subject to
hydrolysis to release the C terminal peptide fragment.
The structure of the tetrahedral intermediate has been used to develop potent inhibitors of
protease enzymes. Non-hydrolyzable isosteres are used to mimic the transition state, and block
the active site.
isostere – molecules or ions with the same presentation of valence electrons. In drug design,
isosteres are intended to mimic the shape (sterics) of a transition state or substrate, sometimes
called a bioisostere when biologically active.
The archetypal examples of rationally designed inhibitors of protease enzymes are statin-
based inhibitors. The statins are non-hydrolyzable tetrahedral mimics of species II in the scheme
above. Statin-based drugs have been extremely successful for the inhibition of HMG-CoA
reductase, used to treat cardiovascular disease (CVD) by suppression of cholesterol
biosynthesis.20, 21 Transition-state mimics have also been used to target HIV protease.22
One strategy for generating a mimic of the tetrahedral intermediate is to manipulate the
electronics of a carbonyl group. Abeles demonstrated that a sufficiently electron deficient ketone,
which favors a hydrate form of the structure, can also act as an enzyme inhibitor and transition-
The tetrahedral intermediate has been directly observed by crystallography. After formation
of the covalent ester intermediate, the crystal was subjected to a pH jump (from pH 5 to 7) and
trapped at low temperature (LN2).
[Scheme: species observed by crystallography at pH 5 and after the jump to pH 7.]
5.1.2.4 Affinity labeling of receptors
Affinity labeling strategies can be applied to receptors, even though they lack catalytic
activity. An early example demonstrates this clearly, and all that is required is (1) a tight binding
receptor-ligand pair, and (2) a reactive group on the ligand to generate the labeled receptor.
Using an antibody raised against a small molecule hapten, such as an arsenate (I), the arsenate
can be modified to react with protein residues. Modification to include a diazo group allows for
reaction with Tyr residues in the antibody (III). Additionally, the modified arsenate gives an
increased absorbance at 475 nm.
Experiments demonstrated that II only resulted in modification of a single Tyr residue in the
antibody, and the reaction proceeded at 400-fold faster rates with the ligand-receptor pair (as
compared to non-specific antibody controls.) Diazo compounds that lack the arsenate moiety are
slow to react with the specific antibody. Thus, the labeling is driven by the affinity of the
complex, which accelerates the Tyr-labeling reaction. 24, 25
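A rate acceleration of this size can be translated into an approximate free energy. The sketch below assumes transition state theory with identical pre-exponential factors for the two reactions, so that a rate ratio corresponds to a difference in activation free energy of RT ln(ratio); the temperature of 298.15 K and the function name are assumptions made for illustration.

```python
import math

R_GAS = 8.314  # gas constant, J mol^-1 K^-1


def barrier_lowering(rate_ratio, temperature=298.15):
    """Apparent lowering of the activation free energy, in kJ/mol.

    Assumes k_fast / k_slow = exp(ddG / RT), i.e. both reactions share
    the same pre-exponential factor.
    """
    return R_GAS * temperature * math.log(rate_ratio) / 1000.0


if __name__ == "__main__":
    # 400-fold acceleration reported for the specific antibody-hapten pair.
    print(f"400-fold faster: {barrier_lowering(400.0):.1f} kJ/mol")
```

Under these assumptions a 400-fold acceleration corresponds to roughly 15 kJ/mol, a modest fraction of the binding energy available from a tight receptor-ligand complex.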
5.1.2.5 Acyl-transfer catalysts in affinity labeling
A more modern example of exo-alkylation exploits acyl transfer catalysts to perform affinity
labeling of a receptor. This strategy has been demonstrated for even low-affinity receptor-ligand
acyl transfer catalyst – a catalyst which accelerates acylation of a nucleophile. 4-Dimethylaminopyridine
(DMAP) is a prototypical example of an acyl transfer catalyst.
lectin – a carbohydrate‐binding protein (receptor).
Acyl transfer catalysts are useful reagents in organic synthesis. However, in the context
of a protein or biopolymer these reagents may not react selectively. A biopolymer featuring
many nucleophiles and electrophiles would likely result in a randomly cross-linked aggregate
when exposed to a “hot” acylation catalyst. So how can these reagents be used to selectively
target sites on a receptor?
If the catalyst could be localized to a specific site on the protein, it could be used to catalyze
the acylation of a particular nucleophile. One strategy would be to attach the acyl transfer
catalyst to a moiety that is recognized by the receptor. The recognition element would provide
specific association, and the proximity of the catalyst would allow for selective acylation. This
strategy was first described as “affinity-guided DMAP catalysis” (AGD-catalysis) by Koshi et
al. Experiments found that benzylthiols were much slower than thioesters when used as
acylation reagents. The uncatalyzed reaction (where the DMAP was not attached to the
recognition element) was approximately 20-fold slower than the catalyzed reaction. Labeling of
the receptors resulted in selective labeling of a single amino acid (Tyr51) in the protein,
confirming the selectivity of the experiment. This modification could be reversed with treatment
at high pH (pH 12, 24 h).
If the acylating reagent (a thioester in this case) contains a fluorescent label, then the progress of
the reaction can be observed using spectroscopy. Additionally, the location of the labeling
reaction can be probed if a corresponding ligand is made that contains the recognition element
and a fluorescence-quenching group. Binding of such a ligand would bring the quencher in close
proximity to the fluorophore. Additionally, this quenching should be reversible upon
introduction of a “cold” recognition element through competitive binding.
In this particular study, the receptor was a lectin, known as Congerin II, which binds to lactose.
Thus, the recognition element was a lactose group, and the reagents for this study included the
[Scheme: structures of the lactose-based reagents used in the study.]
Advantages of this strategy include:
1. Introduction of a single fluorophore label to the protein.
2. Avoids disruption of the active site. The DMAP reagent “protects” the combining site
residues from direct acylation.
There are disadvantages as well: background acylation is observed, although specific
acylation is much more rapid.
The versatility of the strategy is illustrated by improvements to the acylation catalyst, and
tests of selectivity in the presence of multiple receptors. General protocols are available for the
synthesis of reagents and their application.
Selectivity of the AGD-catalysis strategy can be seen by using the probes in the presence of
multiple lectin receptors. In this case, the receptor-ligand pairs were: Congerin II (lactose), WGA
(GlcNAc), and Concanavalin A (Man). In a mixture of lectins, the lactose reagents shown
above selectively targeted Congerin II, implicating the recognition element in the activity of the
The reactivity of the acylation catalyst can also be improved by altering the number and
display of the recognition elements or the catalytic group. A multivalent presentation was
generated containing four separate copies of the recognition element. This strategy is well known
to increase the apparent affinity (avidity) of ligands for their targets.29, 30 Display of four
recognition elements attached to one DMAP-catalyst group improved the efficiency of labeling.
Alternatively, adding additional DMAP groups to a single recognition element increased the rate
of reaction, but did not reduce selectivity. Rates were increased nearly four-fold for a “di-
DMAP” derivative, and nearly five-fold for a “tri-DMAP.”
AGD-catalysis has been demonstrated on other systems. Acylation of specific cell surface
receptors (GPCRs) can be performed in 30 min at 37 °C (concentrations of probe 10 – 20 mM).
Related strategies have targeted DHFR receptors on live cells. 32
Acylation of a recognition site has also been accomplished without the use of an acylation
catalyst, by exploiting a reactive group on the recognition element itself. Thioesters conjugated
to a carbohydrate recognition element have been used to label maltose-binding protein (MBP). 33
This strategy likely takes advantage of the expulsion of N2 after nucleophilic attack at the
thioester group. Glycosyl hydrazides are reactive enough for glycosylations, but are relatively
stable in water.34
This strategy has most recently been improved by eliminating the need for the addition of a
probe (thioester) reagent. By designing the reagent to include an electrophilic site, it acts as a
labeling agent when in complex with the protein target. This revised strategy exploits an aryl
sulfonate leaving group as the acylation reagent. This approach is notably similar to one of the
earliest examples of a reagent designed for affinity labeling that includes a recognition element
and an electrophilic trap.
5.1.2.6 Chemical mutagenesis
Perhaps the most venerable application of affinity labeling in bioconjugate chemistry is
known as chemical mutagenesis. This is not to be confused with the use of chemical agents
which may result in mutagenesis (damage of genetic information.) The process is distinct, and
involves only translated proteins. The earliest example is from the mid-1960s, but
modern examples abound.
chemical mutagenesis – a process which converts an amino acid residue on a translated protein to
another amino acid residue. (Also defined as “post‐expression mutagenesis” by some groups.)
An example of chemical mutagenesis would be the specific conversion of a Ser residue to a
Cys residue – using chemical reagents (rather than a molecular biology approach.) At first it may
seem confusing to even consider chemical strategies for this type of conversion. After all, are not
molecular biology methods ideal for switching an amino acid from one functional group to
another by genetic means? There are two key reasons why this approach is worthy of
consideration. First, one should consider that historically the tools of molecular biology have not
always been readily available to the bioconjugate chemist. Site-directed mutagenesis (SDM) was
not a commonly employed technique until the mid-to-late 1970s; therefore, any attempts at
conversion of amino acid residues before this time had no other outlets. Certainly, this problem
has since been resolved – but the second reason is perhaps more important. A versatile chemical
mutagenesis strategy should allow for not only the conversion between naturally occurring
amino acid functional groups, but also the introduction of new and unnatural amino acid
functionality. It is for this reason that we will examine the topic in detail: when properly
designed, chemical mutagenesis allows us to specifically alter the chemical structure of a protein
and introduce new chemical groups.
Let us examine the first example of chemical mutagenesis as presented by Koshland to
convert a Ser residue to Cys in the protease subtilisin.37 Subtilisin was selected as it does not
contain any naturally occurring Cys residues, simplifying analysis of the product (unlike other
proteases, such as Chymotrypsin.) Note that the topology of the Subtilisin active site is similar to
that of Chymotrypsin, and contains a catalytic triad: Ser221, His64, and Asp32.
Like other proteases, subtilisin reacts with reagents similar to DFP to form a covalent
complex. Other electrophilic reagents, such as phenylmethanesulfonyl fluoride (PMSF) also
react with the active site Ser residue (also see Sec 5.2.7). The resulting sulfonate ester can be
displaced by nucleophilic attack with a thiolate ester. The resulting enzyme conjugate bears a
thioacetate in plase of the active site Ser, which upon hydrolysis generates a Cys residue in its
place. The net result is a Ser->Cys mutation achieved by chemicalmeans.Theuseofa C-
labeled PMSF reagent was used to confirm the stoichiometry of this conversion was close to
complete (95 ± 5 %). Modification of PMSF to make it more bulky (with a toluene, rather than a
phenyl group) reduced its reactivity; likely indicating that the sterics of S1 limited accessibility
of the reagent. The conversion to Cys was confirmed by several methods, including: reaction
with iodoacetate, dithio-p-nitro-benzoic acid (DTNB), and amino acid analysis. Modification of
the enzyme could be inhibited with the use of a competitive inhibitor, N-acetyl-L-phenylalanine.
The resulting novoenzyme was found to be active on some substrates, for example esters
bearing good leaving groups. In the case of nitrophenylacetate (NPA), the enzyme was able to
cleave approximately 30% of the substrate (as monitored by A407). However, the novoenzyme
was not active on peptide substrates. Additionally, the enzyme could be inhibited by p-
chloromercuribenzoate (PCMB), a reagent known to modify Cys residues. 38 Thus, the
novoenzyme, while still reactive, is remarkably different from the starting Ser protease enzyme.
If anything, we might have expected that the Cys-novoenzyme would be more reactive than
the starting Ser-protease. Isn’t a thiol a better nucleophile? A very similar study was conducted
by Bender et al., who reported very similar findings. So what is going on? Consider that an enzyme
active site is not reactive due to the presence of an isolated nucleophile – it is an integrated
chemical environment. The nucleophile is balanced by general acids and bases as well as
neighboring residues that may isolate it from (or force it into contact with) solvent. Another way
to think of this is that making the nucleophile more reactive may prevent release of the product:
rather than just stabilizing the transition state (as the enzyme has evolved to do), the novoenzyme
may stabilize the product or starting material instead. So what use is this strategy? Here, all we
have succeeded in is reducing the catalytic activity of a perfectly good protease. Perhaps its value
lies in engineering new reactivity.
The strategy above can be used to introduce novel functional groups that may introduce new
reactivity. Displacement of the thioacetate form of subtilisin with hydrogen selenide generates a
selenoenzyme. The selenoenzyme is similar to the thioenzyme: unreactive with amide substrates,
but reactive with activated esters. Seleno- and thioesters tend to undergo aminolysis more readily
than oxoesters. Hilvert proposed that the selenoenzyme could therefore be used for amide
formation, rather than hydrolysis.
X      kNH2/kOH    relative
O      19          1
S      7,400       389
Se     27,000      1,421
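The relative column in the table is simply each aminolysis/hydrolysis ratio divided by the value for the oxoester. A minimal Python sketch of that arithmetic follows, using only the three ratios given in the table; the variable and function names are illustrative and do not come from the original study.

```python
# Aminolysis/hydrolysis selectivity (kNH2/kOH) for each ester type X,
# copied from the table in the text.
selectivity = {"O": 19, "S": 7400, "Se": 27000}

def relative_selectivity(ratios, reference="O"):
    """Normalize each kNH2/kOH ratio to the reference ester (oxoester by default)."""
    ref = ratios[reference]
    return {x: round(value / ref) for x, value in ratios.items()}

relative = relative_selectivity(selectivity)
for x, value in relative.items():
    print(f"{x:>2}: {value:,}")
```

Read this way, the thioester and selenoester favor aminolysis over hydrolysis by roughly two to three orders of magnitude more than the oxoester does, which is the basis of the proposal to use these enzymes for amide formation.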
Subtiligase is an engineered form of subtilisin which demonstrated that the thioenzyme could
indeed be superior at amide formation, rather than hydrolysis as observed in the native case.
Subtiligase was generated using a combination of strategies. The active site Ser221 was
converted to a Cys through SDM (Ser221Cys), followed by other mutations found to disfavor
hydrolysis (Pro225Ala). Finally, a library of P1 and P1’ substrates was screened to improve the
reaction. The resulting system was able to effect ligation of the P1 and P1’ substrates with >95%
Although the subtiligase is very efficient, and results in a native peptide product (thus, it is a
native ligation) there are some key disadvantages. The primary limitation is that the subtiligase
enzyme must be engineered and the identity of the target and substrate is not general – but must
be carefully sought out. This fact limits the use of subtiligase to specific sequences. However, it
is still a remarkable example of protein engineering and bioconjugate chemistry that illustrates
the profound changes that can be made to enzyme function with seemingly subtle changes.
Clarke & Lowe have reported methods for conversion of Cys to Ser or Gly using a Norrish
Type II cleavage reaction. 43 Using the cysteine protease papain, they first alkylate the active
site with a bromomethyl ketone. The conjugate is then cleaved photochemically to generate a
thioaldehyde. The thioaldehyde is unstable in water, and rapidly hydrolyzes to an aldehyde
which can be reduced to Ser, or further reacted. 43 Treatment of the enzyme with bromomethyl
ketone alone irreversibly inhibits the enzyme; however, photolysis regenerates enzyme activity
(~75%). Subsequent cycles of bromomethyl ketone treatment and photolysis are still able to regenerate
catalytic activity, but with diminishing amounts of activity with each cycle. Reduction of the
photolyzed enzyme with NaBH4 resulted in radiolabeling of only the active site Ser residue,
although only 0.2 mol Ser per mol protein was labeled. The Ser-novoenzyme is not catalytically active.
The Norrish Type II cleavage likely proceeds by the following mechanism. A Type II
cleavage involves the 1,6 abstraction of an H atom at the γ-position (with respect to the
Note that a Type I cleavage proceeds by cleavage of the carbonyl carbon-Cα bond as shown
22.214.171.124.4 Cys->Dha (dehydroalanine)
The reaction of serine proteases with PMSF can also result in other side reactions which can
be useful for modification of an active site. Reaction of trypsin with PMSF was used to convert a
Ser residue to dehydroalanine (Dha) via a β-elimination. Formation of a cyclic oxazoline
intermediate is also possible, which can undergo nucleophilic attack or hydrolysis. Dha or the
oxazoline can be reacted with a thiol to generate a Michael addition product.
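[Scheme: reaction of the active site Ser with PMSF (2-propanol), followed by β-elimination to Dha or 5-exo-tet cyclization to the oxazoline; these intermediates react further by hydrolysis (H2O), SN2 attack, or Michael addition of a thiol.]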
Note that formation of Dha in this way is not very efficient and requires harsh conditions. A
milder strategy was reported by Lawton et al. using a quinone diimide reagent. Even this
method is not highly efficient; conversion of 12-50% of Cys residues to Dha was reported.
fluoride. This is an improvement over PMSF as it generates a tosylate leaving group. Tosyl
chloride works on model peptides, but the tosyl fluoride gives better selectivity at the active site
of Ser proteases. The resulting Dha-enzyme can be derivatized using several reagents.
Note that the basic elimination conditions are still harsh enough to induce β-elimination at
other sites in the protein (0.1 M NaOH, 35 °C, 20 hr). The anhydroenzyme is catalytically
inactive, but the protein was not denatured by the protocol. The anhydroenzyme can also react
with intramolecular nucleophiles, such as the ε-NH2 groups of Lys residues.
Because Dha is useful for conjugation strategies (through the introduction of a new orthogonal
reactive group), mild conditions for its generation in proteins continue to be an area of
investigation. Davis and coworkers have reported a comprehensive study exploring a range of
strategies for generation of Dha from Cys proteases. 47 Among several strategies, they found that
generation of a sulfonium leaving group was the most efficient. They then designed a water-
soluble version of the reagent for use with native conditions. Note that these studies did not
explore proteins with additional Cys residues, so it is not clear if this strategy can be used on sites
other than the active site nucleophile.
Note that the application of sulfonyl fluorides as tools for click chemistry is also discussed in
Sec 5.2.7. The reactivity of the sulfonyl fluoride with proteins in solution is extremely sluggish
when there is no complexation between the two molecules. However, complexation allows
nucleophilic attack on the sulfonyl fluoride. This fits well with the discussion of PMSF above, but can
also be exploited in other protein modification or labeling strategies.
126.96.36.199 Activity-based proteomic profiling (ABPP)
Affinity-based protein labeling has been adapted as a strategy to identify new protein species
that carry the same enzymatic functions. In essence, an affinity-based label for an enzyme active
site is used to deliver a tracer to any active site of a particular enzyme class. So, one might label
all Serine proteases using a derivative of DPF that contains a tracer. Combining this approach
with mass spectrometry has been termed “Activity-based proteomic profiling” or ABPP.
Cravatt and coworkers have done precisely this using DPF analogs to capture esterase
enzymes. 50 Here the DPF group is used as a “warhead”, and a tag which can be used to capture
the labelled enzyme is incorporated through a linker. This allows treatment of cell lysate with the
reagent, and affinity-enrichment for the tag provides a selection of the target proteins. A related
strategy employs epoxide substrates as a capture strategy for Cysteine proteases, such as
A large variety of labels have been designed for ABPP strategies. One should be cautious in
the application of these strategies. There are two critical components for a successful ABPP
labeling event. First the specific association of the reagent with the active site must occur to
place the reactive warhead in proximity to the active site. Second, the reaction between the
enzyme and the probe must take place resulting in a covalent probe-protein conjugate. If either of
these steps does not occur, the strategy will fail. If the association of the probe is not specific,
false positives will occur due to promiscuous reaction. Consider if a probe is able to associate
with the active site, but the reaction between the enzyme and the probe is slow on the time scale
of binding/unbinding. This situation will result in diffusion of the probe from the desired site of
labeling. If the probe remains reactive, it may also result in false positives.
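The competition between reaction and unbinding described above can be put in rough numerical terms. The sketch below assumes a simple two-fate model that is not taken from the text: once bound, a probe either reacts covalently (first-order rate constant k_react) or dissociates (k_off), so the fraction of binding events that end in labeling is k_react / (k_react + k_off). The function name and the example rate constants are hypothetical.

```python
def labeling_fraction(k_react, k_off):
    """Fraction of probe-binding events that end in a covalent conjugate.

    Assumes a bound probe has only two fates: covalent reaction (k_react)
    or dissociation from the active site (k_off). Both are first-order
    rate constants in the same units (e.g. 1/s).
    """
    if k_react < 0 or k_off < 0:
        raise ValueError("rate constants must be non-negative")
    total = k_react + k_off
    if total == 0:
        raise ValueError("at least one rate constant must be positive")
    return k_react / total

# Hypothetical values: a fast-reacting probe vs. a slow-reacting probe
fast = labeling_fraction(k_react=10.0, k_off=1.0)
slow = labeling_fraction(k_react=0.01, k_off=1.0)
print(f"fast probe: {fast:.2f} of binding events label the enzyme")
print(f"slow probe: {slow:.2f} of binding events label the enzyme")
```

In this picture a slow-reacting probe mostly dissociates before it can label the active site; if it is still reactive after leaving, it is free to label other proteins.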
Quinone methide ligands have been used in several ABPP strategies. Probes for the
detection of phosphatase enzymes were developed for use in proteomic screening. Shen et al.
adapted the quinone methide strategy as part of an amino acid analog to detect protein tyrosine
phosphatases (PTPases).
[Scheme: PTPase action on a fluorinated, affinity-tagged probe generates a quinone methide, which is captured by a nucleophile (Nu).]
Quinone methides have been applied to ABPP strategies to identify glycosidase enzymes;
however, there are questions about the efficiency of the method. Tsai et al. reported a quinone
methide probe strategy, and were able to identify labeled proteins in their me[...]. However,
later work by other groups has suggested that the lifetime of the quinone methide is so long that
it can diffuse out of the active site and label unintended targets (false positives). These authors
were able to adapt the strategy by using it to detect enzyme activity at the scale of tissues and
whole cells, where the diffusion problem is essentially minimized.
Alternative covalent labels for phosphatases have been explored using phosphonate analogs of
phosphotyrosine (pTyr). Widlanski first identified α-bromobenzylphosphonates (α-BBP) as
irreversible inhibitors of PTPase enzymes. Since then, other groups have improved on the
strategy by generating a tagged probe for ABPP, or amino acid analogs for peptide
incorporation. Related approaches have used vinyl sulfonates as an electrophile to capture the
PTPase active site nucleophile.
[Scheme: α-bromobenzylphosphonate (α-BBP) probes and their reaction with the PTPase active site Cys.]
5.1.3 The Edman degradation
One arguably unique reactive site in a peptide chain is the α-NH2 group. However, while
chemically distinct, the α-NH2 group shares very similar reactivity to the ε-NH2 group of Lys. In
fact, many alkylation reactions will randomly react at either of these sites. However, there are
some chemistries known to differentiate between the α- and ε-NH2 groups – in some cases with
exquisite differences in reactivity. The Edman degradation is perhaps the best known and most
venerable of these reactions.
The Edman degradation is performed by reacting phenylisothiocyanate with a peptide
containing a free amino terminus. The result is a thiourea conjugate which, when treated with
strong acid (TFA), undergoes a cyclization reaction selectively at α-NH2 groups. The resulting
product is a phenylthiohydantoin containing the side chain of the N-terminal amino acid.
This compound can be compared to standards for other amino acids to determine the identity of
the amino terminal residue. Sequential application of this reaction allows sequencing
from the N- to the C-terminus upon isolation of the phenylthiohydantoin derivatives.
Note that the reaction is selective at the α-NH2 site as this is the only location in a typical
peptide sequence where a favorable cyclization can occur.
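The sequential logic of Edman sequencing can be illustrated with a short Python sketch. It models only the bookkeeping (one N-terminal residue released per cycle), not the chemistry; the function name and the example pentapeptide are hypothetical.

```python
def edman_cycles(peptide, n_cycles):
    """Return the residues released, in order, by n_cycles of Edman degradation.

    The peptide is a string of one-letter codes written N- to C-terminus.
    Each cycle removes only the N-terminal residue (identified in practice
    as its phenylthiohydantoin derivative) and leaves a peptide one residue
    shorter with a new free alpha-NH2 group.
    """
    released = []
    remaining = peptide
    for _ in range(n_cycles):
        if not remaining:
            break  # nothing left to degrade
        released.append(remaining[0])   # N-terminal residue is cleaved off
        remaining = remaining[1:]       # shortened peptide enters the next cycle
    return released, remaining

# Hypothetical pentapeptide, three cycles
released, remaining = edman_cycles("GASFK", 3)
print(released, remaining)
```

The order in which residues are released is the N- to C-terminal sequence. A Lys residue within the chain is carried along unchanged in this model, consistent with cyclization occurring only at the α-NH2 site.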
The primary utility of the Edman degradation has been in the sequencing of short peptide
sequences or the compositional analysis of peptides. As an example, the Edman degradation was
critical to the determination of peptide hormone structures, such as oxytocin. 63
5.1.4 Selective α-NH2 acylation
Acidic acylation of proteins can be performed (HBr, AcOH), which will selectively acylate
oxygen nucleophiles, as amine groups will become protonated under these conditions. Once
O-acetylation has taken place, alkaline conditions will deprotonate the amines and restore their
nucleophilicity. For the special case of N-terminal Ser and Thr residues, this will result in an O->N
acyl migration. Any remaining O-acyl groups will be hydrolyzed under extensive alkaline
treatment. N-terminal acetylation is a common protein modification.
5.1.5 α-NH2 Transamination
Transamination, and glyoxal-based chemistry in general, has had a resurgence in the past
decade. 66 Sodium periodate at micromolar concentrations can be used to selectively
oxidize N-terminal Ser and Thr residues to glyoxal residues. 67 Conversely, the oxidized
glyoxylate in the presence of metal ions will oxidize the N-terminal residue. 68
5.1.5.1 Oxidation of N-terminal Ser and Thr
The N-terminal Ser and Thr residues are unique among proteins in their display of 1,2-amino
alcohol functional groups. Thus, chemistries that can target the 1,2-amino alcohol functionality can be
exploited for N-terminal modification of these residues. The most well-known of these methods
is the use of sodium periodate to oxidize the N-terminus to a glyoxal group. 67
The resulting glyoxal is a unique reactive handle that can be used for derivatization with
hydrazine or aminooxy reagents. Treatment of proteins with sodium periodate can result in
oxidation of Met residues to sulfoxides (when performed above pH 5). Although the sulfoxide can be
reduced, this may detract from use of periodate for labeling.
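One practical way to check whether periodate oxidation of the N-terminus has occurred is to look for the expected mass change of the protein. The Python sketch below estimates that shift from elemental compositions. It rests on assumptions not stated in the text: the N-terminal unit is written as H2N-CH(R)-CO-, the oxidized terminus as the free aldehyde OHC-CO- (not its hydrate), and any Met oxidation is ignored. The names are illustrative.

```python
# Monoisotopic atomic masses (Da)
MASS = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}

def formula_mass(formula):
    """Monoisotopic mass of a composition given as {element: count}."""
    return sum(MASS[el] * n for el, n in formula.items())

# Assumed N-terminal unit before oxidation, H2N-CH(R)-CO-
N_TERMINAL = {
    "Ser": {"C": 3, "H": 6, "N": 1, "O": 2},   # R = CH2OH
    "Thr": {"C": 4, "H": 8, "N": 1, "O": 2},   # R = CH(OH)CH3
}
# Assumed oxidized terminus, OHC-CO- (aldehyde form, not hydrated)
OXIDIZED = {"C": 2, "H": 1, "O": 2}

def oxidation_mass_shift(residue):
    """Mass change (Da) of the protein when the N-terminal residue is oxidized."""
    return formula_mass(OXIDIZED) - formula_mass(N_TERMINAL[residue])

for residue in ("Ser", "Thr"):
    print(f"{residue}: {oxidation_mass_shift(residue):+.3f} Da")
```

Under these assumptions the shift corresponds to a net loss of CH5N for Ser and C2H7N for Thr; a hydrated aldehyde, or a Met sulfoxide formed as a side reaction, would change the observed value.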
Transamination strategies can be used to oxidize the N-terminal residue of a protein or
peptide using relatively mild conditions. These conditions avoid the use of periodate as an
oxidant, and instead use glyoxa