Biology C2005 Lecture 4
Unlike the catch-all category of LIPIDS, NUCLEIC
ACIDS are biopolymers par excellence. There are 2 types, DNA and
RNA, the monomers are nucleotides, that have nitrogen-containing rings
and 5-carbon sugars. There are four types of monomers in each polymer.
We will discuss them in detail, but not for a few weeks yet.
::Proteins: Amino
acids are the monomers (20)::
PROTEINS. These are the
most important class of macromolecules in the cell, and we will discuss
them now in detail. The monomers that make up proteins are the amino acids,
of which there are 20. The same 20 in E. coli and in elephants and in
tomatoes.
The general structure of an amino acid is:
Note the central carbon atom, to which 4 different groups are attached:
an amino group (drawn by convention at the left), a carboxylic acid group
(put at the right side), a hydrogen, and a side chain, or R-group. Only
the R-group varies among the 20 different amino acids. This is the side
chain, and so there are 20 different side chains. Look at the amino
acids and peptides handout for some of the side chains. Your texts
and hard copy handout show al l 20, and you should examine all 20.
Out of laziness, I drew the general amino acid incorrectly:
Actually at neutral pH, the molecule is charged, because the carboxylic
acid group is an acid, and the amine group is a base, so more accurately:
(also see 3-D
picture)
Let's take this opportunity to discuss the charge
on organic molecules a bit more. In living systems, the carboxylic acid
group is mostly charged and the amine is mostly charged, but that is a
pH7, the cellular pH under most circumstances. Is an acid always charged
in aqueous solution? No. It depends on the pH of the environment. In the
laboratory we do not have to keep things at pH 7, as it is in the cell.
We can vary the environment at will, adding strong acids such as hydrochloric
acid as a source of hydrogen ions (lowering the pH), or a strong base
such as sodium hydroxide (raising the pH). The strength of an acid is
a measure of how readily it gives up a proton. Carboxylic acids are always
in equilibrium with the hydrogen ions (protons) in the solution, so if
the hydrogen ion concentration is high (acidic) then the equilibrium will
shift toward the protonated (uncharged) species. At pH 3 an amino acid
carboxyl group is protonated about half the time; for each pH unit this
proportion of protonated species will drop by a factor of 10, so very
little of the carboxyl group is protonated at the neutral pH of 7 found
in most cells. A similar situation pertains to the amine base end: at
a very low H+ ion concentration (e.g., 10-11M H+, a high pH of 11), it
will tend to lose its extra proton, but at pH y,it will mostly remain
protonated, with a positive charge. ly. high pH .
So at pH7, most amino acids are neutral (no net charge), but they are
highly charged nonetheless.
Now, what are some of these 20 different side groups?
Here are 2 charged side group, e.g.:
asp: R= -CH2-COO- , there is a second carboxyl group on this amino acid)
lys: R= -CH2-CH2-CH2-CH2-NH3+ , there's a second amine on lysine, so lysine
will have 3 charged groups, and a net charge of +1 (two +'s and one -)
at pH7.
There is a convention for numbering amino acid carbons; actually
it's a lettering. It starts from the central carbon, called alpha: so
lys has (count with me) an alpha, beta, gamma, delta, EPSILON-amino group
as well as an alpha-amino group (and an alpha-carboxyl).
The average molecular weight of an aminoacid is ~120,but the range is
from 75 to 203.
The smallest amino acid (a.a.) is glycine (gly), MW = 75. Here the side
chain is merely hydrogen.
The largest is tryptophan (trp), MW = 203 [-CH2- bridge to a 5-membered
ring containing a N + a fused 6-membered ring] and fairly hydrophobic.
Look over the structures of the 20 amino acids in the textbook. It is
the properties of the functional groups on
the 20 different side chains of the 20 different
amino acids that determine the function of a protein, so they are all-important.
The handout shows all 20 aa's, but without indicating the ionization of
the acidic and basic groups. We will discuss many of the side chains within
the context of the discussion as we go along.
There are a couple that deserve special mention: arginine contains a functional
group that is not on your list; it is -NH-CH2(NH2)NH2, called the guanido
group. The guanido group is a strong base, even stronger than an ordinary
amine, so it is positively charged at pH7 (like lysine). Proline has a
side chain that folds back and forms a covalent bond to the amine nitrogen
of the amino acid, thus producing a ring structure.
You should be able to recognize the properties of the side chains as polar
or non-polar, charged or not charged. You will not be responsible for
recalling a specific amino acid structure from the English name or vice
versa, but given the structure you should know how it behaves. {Q&A}.
::Stereoisomers::
Now let's consider the structure of an amino acid
in 3 dimensions:
When carbon forms 4 single bonds, it makes them spaced equally apart from
each other in space, in the form of a tetrahedron as in this representation
of glycine [a model with 2 white groups is shown].
Now consider this other molecule of an amino acid [again with 2 white
groups], with 2 H's of glycine, e.g. Are these the same molecule, that
is, are they distinguishable or are they indistinguishable?
They are indistinguishable, since I can rotate them and superimpose their
atoms.
But now suppose I make this alanine instead of glycine. I replace one
H with a [blue] -CH3 group on each molecule [I am being sure to make them
stereoisomers].
I can no longer superimpose them. They are both alanine, as they have
the same four groups attached to the central carbon. But in three dimensions
they are actually mirror images of each other. See [Purves6ed
2.21a]. We call one D-ala and one L-ala. See [Purves6ed
2.21b].
This one is D, or is it this one... ? I can't remember .. it's not too
important here.
What is important is that in general, you have this situation, the possibility
of two stereoisomers, whenever there is what
is called an ASYMMETRIC carbon atom in a molecule,
that is, a carbon with four different groups attached.
These stereo isomers are sometimes called optical
isomers, since the two forms, in solution, will bend a beam of polarized
light one way or the other. Thus the D designation originally meant dextro,
or to the right), whereas L stood for levo, to the left.
All amino acids except glycine have an asymmetric carbon, which is the
alpha-carbon. So we can draw 19 of the amino acids in 2 stereoisomeric
forms.
So do we really have 39 a.a.'s? No. All the stereoisomeric forms of the
amino acids in proteins are L-amino acids,
so we only have to worry about 20.
Note that the sugars we discussed, like glucose, have several asymmetric
carbon atoms. Aside from L and D designations, the sugar stereoisomers
are given different English names (e.g., D-glucose, D-mannose, L-rhamnose,
etc.).
::Polypeptides, peptide
bond:: Polymerization of aa's
OK, now let's string these L-amino acids together,
polymerize them. The bond that connects two amino acids is an AMIDE
bond (-CO-NH-) between the carboxyl of one amino acid and the amino group
of the next. Once again, a molecule of water is removed in the formation
of the connecting bond:
In the special case of proteins, this amide bond is called a PEPTIDE
BOND, and the resulting product a PEPTIDE,
a dipeptide (or we could go on to a tri-peptide, oligo-peptide, or finally,
POLY-PEPTIDE). (See also polypeptide
handout). Also see [Purves6ed
3.4], and another picture.
By convention, the amino group is written on the left for an amino acid
and also for a peptide.
In the tripeptide in the diagram, note the peptide bond (boxed), and the
repeating unit, or aa "residue" [circled]. Residue refers to what's left
of the amino acid monomer after it ahs been incorporated into a polypeptide,
which is most of it: it just lacks one H at what was the amino end and
one OH at what used to be the carboxyl end. Note also that the charged
amine and carboxyl groups no longer exist inside the polypeptide, having
been replaced by the amide, an uncharged functional group.
Almost all polypeptides have 2 ends, the amino
end and the carboxyl end, which do
remain charged at pH7. .
The "backbone" of the polypeptide is defined
as all of the atoms except the side chains.
The only free amino and carboxyl atoms of
the backbone are at the 2 ends.
The side chains then, stick out of this backbone (also see polypeptide
handout).
The length of polypeptides is commonly 100-1000 amino acids, but smaller
and larger ones also can be found.
Each and every protein molecule in the cell has an identity defined by
its particular sequence of amino acids. Each E. coli cell contains about
3 million polypeptide molecules, but only about 3000 different ones. Each
of these individual protein types has a name to go along with its chemical
identity.
Some examples of polypeptides, taken not from E. coli, but from more familiar
organisms include:
hemoglobin, which carries oxygen in red blood cells;
egg albumin, a nutrient in the white of a hen's egg;
keratin, providing toughness in skin, fingernails, and wool;
collagen, providing a strong connection between cells in tendons;
beta-galactosidase, which helps digest the milk sugar lactose.
::Primary (1o)=
linear sequences of AAs::
Each of these proteins contains a polypeptide with
a particular sequence of amino acids, usually all 20 are represented,
although not at all equally. Unlike polysaccharides, this sequence usually
exhibits no obvious regularity, or repeating subsequence:
This linear sequence
of amino acids is called the primary (1o)
structure of a protein.
::Methods: Paper
chromatography, electrophoresis, fingerprinting::
I will discuss a bit now some methodology
used in the purification of amino acids and proteins. We bring in some
selected lab methods from time to time for two reasons: First, the behavior
of molecules in experimental situations helps you to understand their
behavior in nature; and second, the methodology is interesting in its
own right as an example of how science is done.
Our first topic of methodology is directed at the question of how we get
to know this primary structure, this sequence of amino acids in a polypeptide?
One way to determine the sequence of a polypeptide is to chemically degrade
it in a stepwise fashion, starting at the carboxyl end. First you must
purify the polypeptide in question away from the other 3000 polypeptides
in the cell; we will discuss that process a little later.
The degradation of the polypeptide back to its
free monomer aa's is a form of HYDROLYSIS, a reverse of the dehydration
that accompanied the formation of the peptide bond. The controlled hydrolysis
of amino acid residues from the carboxyl end of a polypeptide is a form
of enzymatic hydrolysis; an enzyme, called carboxypeptidase, itself a
polypeptide, catalyzes this hydrolysis; it does not happen by itself.
We will learn more about enzymes next week. After the carboxypeptidase
is mixed with a peptide, hydrolysis begins: all the trillions of molecules
release their C-terminal amino acid in unison, synchronously, so that
in the first wave the last (original c-terminal) amino acid is released.
If the reaction is stopped at this point and the released amino acid is
separated from the main peptide and identified. By letting the reaction
proceed for increasing amounts of time, the time that amino acids are
released can be correlated with their distance form the C-terminal end.
You can get the sequence of perhaps 20 amino acids from the carboxyl terminal
in this way, before the process breaks down. Since most polypeptides are
greater than 20 amino acids in length, you first need to chop the polypeptide
into manageable pieces and then sequence each piece by subjecting it to
hydrolysis by carboxypeptidase. Here I want to concentrate on the chemical
analysis problem of separating and identifying the different amino acids
that are released by this carboxypeptidase hydrolysis.
How do you know which amino acid came off when?
Amino acids will behave sufficiently differently from each other under
certain conditions to allow the complete separation of all 20 species
from a mixture. We will discuss two methods for separation and identification
here. One way is based on the migration of amino acids in an electric
field. In PAPER ELECTROPHORESIS, an amino acid mixture is spotted onto
a sheet of filter paper, the paper is wet with a buffer salt solution
and placed between two electrodes and a high voltage (e.g., 2000 volts)
applied. At neutral pH, the acidic amino acids (asp and glu) will have
a net negative charge and will migrate toward the ANODE (+ pole) while
the basic amino acids (arg and lys) will migrate toward the CATHODE (-
pole). {Q&A}
Electrically neutral amino acids will not migrate much, unless the pH
is made acidic or basic.
A more versatile separation method is PAPER CHROMATOGRAPHY. This method
is based on the differential solubility of the different amino acids
in organic (non-polar) solvents, which in turn
is determined by the nature of the side group. The amino acid mixture
is spotted onto a filter paper; one edge of the paper is immersed in a
mixture of aqueous and non-aqueous solvents. The liquid will be drawn
up the paper by capillary action. As it rises the water in the liquid
mixture is bound by the paper (cellulose, with its many OH groups), forming
a stationary water layer, or stationary phase. The organic solvent (e.g.,
propanol) moves up without as much interaction with the solid cellulose;
it is considered the mobile phase. The amino acids will be constantly
equilibrating between being in the mobile organic phase or the stationary
water phase. The more polar the side chain, the more time the amino acid
will spend in the stationary phase. The more hydrophobic the side chain,
the more time it spend in the mobile organic phase. By using a series
of different solvents, all 20 amino acids can be separated in this way.
It works for many other organic molecules as well.
Small PEPTIDES [I emphasize peptides here, oligopeptides, not polypeptides]
can also be separated by both of these techniques; the properties of the
peptides will be a COMPOSITE of the properties of the constituent amino
acids. {Q&A}.
One of the most famous examples of the use of these
methods to analyze peptides rather than single amino acids was in the
study of sickle cell disease. Sickle cell disease is caused by an abnormal
hemoglobin protein. Hemoglobin is made up of several components, one of
which is a polypeptide called alpha-globin. The sequence of amino acids
in beta-globin from sickle cell hemoglobin differs from that of normal
alpha-globin. The nature of that difference can be determined by chopping
up the sickle cell alpha-globin into small peptides and then first separating
them along one edge of a filter paper sheet by paper electrophoresis.
The sheet is then turned 90 degrees and subjected to paper chromatography.
The result is a series of spots (after staining to visualize their positions)
representing all the sub-peptides. One peptide migrates differently in
sickle cell globin compared to normal globin. This peptide can then eluted
from the paper and sequenced. Comparison with the normal counterpart shows
that the sickle cell globin carries a single amino acid substitution.
In place of glutamic acid, it has a valine at one position in the peptide.
How could such a small change have such a large effect? The answer lies
in the 3-dimensional shape of proteins, to which we will turn next.
Most proteins can be separated into characteristic
patterns of spots this way. The procedure is called FINGERPRINTING a protein,
since the migration patterns are so characteristic.
Protein 3-dimensional structure
Now let us return to polypeptide structure.
Each polypeptide has a particular sequence of amino acids. Thus if we
could examine several molecules of the protein albumin we might find:
Molecule #1: N-met-leu-ala-asp-val-val-lys-....
Molecule #2: N-met-leu-ala-asp-val-val-lys-...
Molecule #3: N-met-leu-ala-asp-val-val-lys-... etc.
So they have the same primary structure. But as always, we must consider
structure in 3-dimensional space for a real picture of the molecule.
While the linear structure is the same, the 3-d
structure for each molecule must surely be different in solution, no?
After all, thermal motion will be buffeting this rope of strung-together
amino acids all about, so that each molecule will be expected to take
on a random configuration, no? Look at this scale model of a POLYPEPTIDE
OF 500 amino acids, a CLOTHES LINE. The dimensions are about right, but
the side chains have been left out. I have put colored parts of the rope
red to indicate polar side chains, the white
parts being apolar or hydrophobic [board]. At 37 degrees, you might imagine
this clothesline in a Jacuzzi, constantly taking on new shapes, with its
hydrophilic side chains constantly forming new hydrogen bonds to water.
This is the wrong picture. A more appropriate picture is a bundled up
rope, folded into a compact structure that withstands this thermal motion
at body temperatures [bundled rope].. red on outside ...white hydrophobic
on inside (which makes sense based onthe weak bond behavior we discussed).
OK, maybe this molecule could collapse on itself
.. after all the hydrophobic side chains will tend to aggregate. But if
we took another molecule, another linear chain, it would probably fold
a different way, after all, 500 amino acids, there must be many many ways
to get the hydrophobics inside. I could stuff the white parts of the rope
together and put them on the inside in many different ways. But if we
look for a second folded up example of this molecule, it looks like this
[second rope bundle], exactly the same as
the first (note loop count, etc.). Protein molecules exist as precisely
defined 3-dimensional structures in solutions, each molecule like the
next, superimposable.
That is, a typical polypeptide chain, having some 10,000 atoms linked
together, is folded up so that these 10,000 atoms all have the same relative
position in each and every molecule you examine. This still amazes me.
How could this be?
Well, what is holding the molecule in this shape? The four weak bond types
we discussed earlier, plus one new bond to be described in a few minutes.
Let's consider how this folding looks in more detail:
First, the flexible rope was not a good representation of even the backbone,
because the peptide bond itself imposes some constraint on structure.
The peptide bond itself has a property that influences all polypeptides
regardless of the side chains. Because of the electronegativity difference
between C and O or N, there is a partial separation of charge, one you
could have predicted.
What you may not have realized is that the partial + charge on the C and
the partial - charge on the adjacent N, imparts a partial extra bond between
those 2 atoms, and thus a partial double bond character to the C-N bond.
This partial double bond is sufficient to stop free rotation about the
C-N bond. Thus the backbone is not free to rotate around all connections,
but rather each repeat contains 6 atoms confined to one plane:
The polypeptide can be visualized as having a series of planes, each able
to rotate about one another. So a chain would be a better representation
than a rope.
::Secondary (2o)=
alpha-helix, beta sheet::
This partial separation of charge also means that
the O and the NH of the peptide bond can hydrogen bond... to water for
example. Since the NH is a hydrogen donor and the O is a hydrogen acceptor
for a hydrogen bond, we should consider the possibility that these groups
can H-bond to each other. But H-bonds require a linear orientation of
the 3 atoms involved, so certainly the NH of the very next residue cannot
H-bond to a C=O preceding it. But what about the next residue? No, still
can't make it. But by the fifth residue down you are able to line up an
NH to the O: -C=O..H-N-. i.e., there are 3 complete residues 3 in between.
So the C=O of #1 can H-bond to the HN of #5. But then also the C=O of
#2 should be able to H-bond to the HN of #6, and so on. This twisting
and H-bonding can hold the backbone in a HELIX, the so-called alpha-helix.
The alpha-helix is an example of secondary structure,
which is (my definition): structure produced by
regular repeated interactions between atoms of the backbone.
We might expect all the amino acid backbone
atoms to be in an alpha-helical conformation, but we have left out consideration
of the side chains, which can greatly influence the folding, as we will
see in a minute.
The alpha-helix is not the only form of secondary
structure, there is another, the beta-pleated sheet. In this case we once
again have the C=O and the NH of the backbone forming H-bonds to each
other, but in this case two sections of the polypeptide are aligned side
by side:
Several sections of polypeptide can line up like this, to produce a sheet
of strands. The chains are usually anti-parallel, but parallel alignments
are also possible. See text for better pictures. B:49 &
[Purves6ed3.5a].
Once
again, side chain interactions play a major role in allowing or disallowing
such secondary structures to form. But in fact, most proteins do have
extensive regions folded into alpha-helices and beta-pleated sheets.
Secondary structure consists mostly of these 2 structures.
Tertiary structure means the overall 3-dimensional
folding of a single polypeptide chain. We will continue with this most
important level of structure next time.
(C) Copyright 2001 Lawrence
Chasin and Deborah Mowshowitz Department of Biological Sciences
Columbia University New York, NY
Clickable pictures are from Purves, et. al., Life, 5th Edition,
Sinauer-Freeman's Images of Life 5.0.
A production of the Columbia
Center for New Media Teaching and Learning
|