Biology C2005 Lecture 5

::Protein 3-dimensional structure::
So the long chain of a polypeptide does not assume a random coil configuration in space, but, amazingly, folds into a precisely defined 3-dimensional shape each and every time. That is, the billions of protein molecules of a given kind, with a specific name and sequence, are all superimposable, atom for atom. Well, what is holding the molecule in this shape? The four weak bond types we discussed earlier, plus one new bond to be described in a few minutes.

Let's consider how this folding looks in more detail:

First, the flexible rope was not a good representation of even the backbone, because the peptide bond itself imposes some constraint on structure. The peptide bond itself has a property that influences all polypeptides. Because of the electronegativity difference between C and O and C and N, there is a partial separation of charge, one you could now have predicted.

What you may not have realized is that the unshared pair of electrons on the N is partly shared with the adjacent, partially positive C. This imparts a partial extra bond between those 2 atoms, and thus a partial double bond character to the C-N bond. This partial double bond is sufficient to stop free rotation about the C-N bond (remember the lack of rotation about the double bond in an unsaturated fatty acid causing a kink?). Thus the backbone is not free to rotate around all connections, but rather each repeat contains 6 atoms confined to one plane:

The polypeptide can be visualized as having a series of planes, each able to rotate about one another. So a chain would be a better representation than a rope.

::Secondary (2^o): alpha helix::
This partial separation of charge also means that the O and the NH of the peptide bond can hydrogen bond... to water for example. And since the N-H is a hydrogen donor and the O is a hydrogen acceptor for a hydrogen bond, we should consider the possibility that these groups can H-bond to each other. But H-bonds require a linear orientation of the 3 atoms involved, so certainly the N-H of the very next residue cannot H-bond to a C=O preceding it. But what about the N-H on the next residue down from a given C=O? No, still can't make it. But by the fifth residue (counting the residue carrying the C=O as the first) you are able to line up an NH to the O: -C=O..H-N-. I.e., there are 3 complete residues in between the two residues that are involved in the bonding, as shown in the diagram below. (After the turn between residues 3 and 4, the letters for the atoms have been drawn backwards, to indicate that the chain is circling around to achieve this position. Note that many atoms have been left out, so that the relevant atoms can be more clearly seen. Note also that the rectangles here delineate the amino acid residues, not the 6-atom planes discussed above.)

So the C=O of residue #1 can H-bond to the H-N of residue #5. But then also the C=O of residue #2 should be able to H-bond to the H-N of #6, and so on. This twisting and H-bonding can hold the backbone in a HELIX, the so-called alpha-helix.
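The pairing rule just described (C=O of residue n to H-N of residue n+4) can be written out as a small sketch. The function name and the residue numbering from 1 are illustrative choices, not part of the lecture, and the sketch ignores side chains, which decide whether a helix actually forms.

```python
def helix_hbond_pairs(n_residues, spacing=4):
    """List the backbone H-bonds of an idealized alpha-helix.

    Each pair is (residue supplying the C=O, residue supplying the N-H),
    following the rule in the text: residue 1 pairs with 5, 2 with 6, etc.
    """
    return [(i, i + spacing) for i in range(1, n_residues - spacing + 1)]


if __name__ == "__main__":
    for carbonyl, amide in helix_hbond_pairs(8):
        print(f"C=O of residue #{carbonyl} ... H-N of residue #{amide}")
```

Note that a stretch of only 4 residues gives no pairs at all: the helix needs at least 5 residues before the first H-bond can form.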

The alpha-helix is an example of secondary structure, which is (my definition): structure produced by regular repeated interactions between atoms of the backbone (only).

We might expect all the amino acid backbone atoms to be in an alpha-helical conformation, but we have left out consideration of the side chains, which can greatly influence the folding, as we will see in a minute.

::Secondary (2^o): beta pleated sheet::
The alpha-helix is not the only form of secondary structure; there is another, the beta-pleated sheet. In this case we once again have the C=O and the N-H of the backbone forming H-bonds to each other, but here two sections of the polypeptide are aligned side by side:

Several sections of polypeptide can line up like this, to produce a sheet of strands. The chains are usually anti-parallel, but parallel alignments are also possible. Every other residue H-bonds to a strand alongside; the side groups stick out both above and below the sheet that is formed from these H-bonded strands. Once again, side chain interactions play a major role in allowing or disallowing such secondary structures to form. But in fact, most proteins do have extensive regions folded into alpha-helices and beta-pleated sheets.

Secondary structure consists mostly of these 2 structures, the alpha helix and the beta-pleated sheet.

::Tertiary (3^o)= overall 3-D of a polypeptide::
Tertiary structure means the overall 3-dimensional folding of a single polypeptide chain. For this overall shape, interactions between side chains are very important, as are interactions between side chains and water. A generality is that the hydrophilic groups are folded to be on the outside where they can interact with water via H-bonds, while the hydrophobic side chains are collected in the inside of the structure, pushed together by hydrophobic forces. This rule is not at all 100% true, and most proteins have side chains that deviate from this generality. That is, there are hydrophobic side chains on the surface, but they are intermingled among the hydrophilic groups. And there are hydrophilic groups on the inside, where they are usually interacting with other hydrophilic groups. In fact, it is this interaction of side chains with each other that confers most of the overall 3-dimensional shape on a given polypeptide.

Pictured here are the weak bonds that were introduced earlier. The side chain interactions indicated in the diagram illustrate examples of these various bonds. Consult your text for the exact nature of the side chains:

1. ionic (lys - asp)

2. hydrophobic and VDW (phe - val)

3. H-bond (ser - ser)

4. H-bond to ionic (asp - asn)

5. van der Waals (ser - ala)

Most proteins fold into a roughly globular shape (enzymes, Hb, antibodies--see a picture of the enzyme lysozyme: a space-filling model, or showing just the backbone connections, or a ribbon model), but many take on an elongated or even fibrous shape (collagen, myosin [in muscle], fibroin [in silk]).

These are weak bonds, but in the aggregate, they are strong enough to hold the polypeptide together at least under the thermal motion conditions of physiological temperature (37 deg C). [Purves6ed 3.8]

::Sulfhydryls, disulfides::
But there is one strong bond that contributes to the folding of some proteins. This is the DISULFIDE BOND, and it differs from all these other bonds in being a covalent bond. It can only be formed between the side chains of two CYSTEINE residues. The side chain -CH2-SH contains a SULFHYDRYL group: -SH. Two sulfhydryls can react with oxygen to lose their 2 hydrogen ATOMS (H with its electron, not H+ ions, not protons) and become bound to each other in the process:

Protein-CH₂-SH + HS-CH₂-Protein + ½ O₂ ---> Protein-CH₂-S-S-CH₂-Protein + H₂O

So now the two sulfur atoms are sharing electrons in a strong covalent bond. This bond cannot be broken by mere thermal energy, and so disulfide bonds hold the two parts of the polypeptide chains that had contained the two cysteines firmly together. Not all proteins have disulfide bonds, but many do.

This reaction is an example of an oxidation-reduction reaction: the sulfhydryls are getting oxidized (here oxidation means losing hydrogen atoms), while the oxygen is getting reduced (or gaining hydrogen atoms) and ending up as water. This reaction will take place rapidly with no further help from catalysts.

(Note that it is not a hydrogen ION (proton, H+) that is being moved about here, but the hydrogen ATOM with its electron. Actually, it is the electrons that accompany the hydrogen atoms that are fundamental to the definition of oxidation/reduction, rather than the hydrogen atoms as a whole, as we shall see later. That is, oxidation is the loss of electrons, and reduction is the gain of electrons, with or without an accompanying hydrogen ion.)

The net result is tertiary structure, or the overall 3-dimensional shape of a folded-up single polypeptide. Note that there will be many regions of secondary structure within this overall tertiary structure. It is the interactions of the side chains that are to a large extent responsible for preventing the whole polypeptide from simply becoming 100% alpha-helix or 100% beta-sheet.

So now we can see that one polypeptide molecule can be folded into a compact structure and we can understand what holds it together, but why is it that there is only one structure formed and not many? Is there only one solution to the folding problem for a particular polypeptide chain?

Perhaps all possible conformations are tried in the course of folding, and only the most stable one accumulates. Can we predict the conformation from first principles? If we plug in the properties of all the amino acid side chains, how hydrophobic they are, what is the strength of an ionic bond, etc., we can ask a computer to try and try many combinations, many interactions. This is a very difficult computer problem, even for today's supercomputers, because the number of possibilities for a good-size polypeptide of say 500 amino acids is enormous (20⁵⁰⁰). But it has been tried, and so far usually the wrong structure comes out. The right structure is determined by examining crystals of the proteins, beaming X-rays through the protein crystals and calculating how they are diffracted by the atoms in the crystal. Perhaps we really don't know the right properties of the side chains. Or perhaps there is some guide to folding that is being imposed on the polypeptide as it is being polymerized in the cell, some outside influence, even a template of sorts; one can imagine a plaster mold analogy, or some kind of camera lucida.

::Denaturation, renaturation::
Well, if it is true that the folded structure of a particular protein is unique simply because it is the most stable, then if we UN-fold the polypeptide, it should be able to RE-fold into its unique structure. How could we unfold a protein, let's say one with no disulfide bonds, only weak bonds. We could consider egg albumin, for an everyday case of polypeptide denaturation. Raw egg white is a concentrated solution of this single ~500 amino acid polypeptide, that exists folded into a roughly spherical shape.

How can we denature it? Here are some examples:

Heat: thermal motion becomes too great for the weak bonds.

pH: acids and bases both work, disturbing ionic bonds.

"Chaotropic" agents, such as very high concentrations (e.g., 8M) of urea (H2N-CO-NH2) can form so many H-bonds that they compete with and disrupt interactions with water.

Organic solvents (e.g., octane, benzene, chloroform) turn the polypeptide inside out, as the hydrophobic forces disappear.

These are all DENATURING conditions. After denaturing the pure polypeptide, we could try to reverse the disruption.

Let's heat it (boil it). The sphere is now subject to faster and faster thermal motions, until finally it starts to unravel. What has happened to the egg white (the albumin polypeptide)? It has become denatured [Purves6ed 3.9]: it is no longer native, native being the structure found in the cell. No covalent bonds have been broken by this 100 deg C temperature. The bundled up rope became the open randomly coiled rope in the Jacuzzi, and this allowed many wrong bonds to form, because it exposed the hydrophobic groups normally hidden in the interior of the protein. In this concentrated solution a tangled mass of interacting polypeptide chains was produced, which resulted in a gel, a hot hard-boiled egg. So while folded up polypeptides are stable enough in their native environment inside the cell, the 3-dimensional structure is typically rather fragile: most proteins are easily denatured by heat and other treatments that can affect these weak bonds. This bundled rope in the Jacuzzi exists on the verge of becoming unraveled.

So now let's reverse the denaturation, let's cool down this hard-boiled egg and return it to normal temperatures. The gel seems to stay. We do not get back our runny egg white. A case of irreversible denaturation. But not a very fair experiment, letting all those molecules get SO tangled. Let's try a denaturation - renaturation in a more gentle, gradual way.

A fellow named Christian Anfinsen did this experiment in the 1950's. He took a protein called ribonuclease, a protein that is a digestive enzyme, a protein that helps break down the macromolecule RNA. It must be in its native structure to do this job.

Actually he had to break disulfides here to get full denaturation. So he did: he added a reducing agent, mercaptoethanol (HO-CH2-CH2-SH). In the presence of this reagent, one gets exchange among the disulfides and the sulfhydryls:

Protein-CH₂-S-S-CH₂-Protein + 2 HO-CH₂CH₂-SH --->

Protein-CH₂-SH + HS-CH₂-Protein + HO-CH₂CH₂-S-S-CH₂CH₂-OH

The protein's disulfide gets reduced (and the S-S bond cleaved), while the mercaptoethanol gets oxidized.

::Dialysis::
After disruption of the disulfide bonds in ribonuclease, Anfinsen placed the polypeptide in a sack, and added urea, H2N-CO-NH2, to the solution outside the sack. Urea will break hydrogen bonds at high concentrations (e.g., 8M). The sack is made of a semi-permeable plastic material with pores big enough to allow small molecules like urea and water to pass through but not macromolecules like albumin or ribonuclease. This process of allowing the concentrations of small molecules to change while holding the concentrations of large molecules constant is called dialysis. After allowing time for diffusion, the concentration of urea inside the sack should be the same as the concentration outside. He then checked that the protein had become denatured (e.g., by ultracentrifugation, see below).

Now he gradually dialyzed out the urea (by changing the solution outside the sack to stepwise lower and lower concentrations of urea). A dilute solution of the protein was used, and the gradual removal of the urea gave time for the polypeptide to re-fold.
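The stepwise removal of urea can be followed with a simple mass balance: once the sack and the outside solution have equilibrated, the urea concentration is the same on both sides. The sketch below assumes complete equilibration at every step and that the sack volume does not change; the volumes and the series of outside concentrations are hypothetical illustration values, not the ones Anfinsen used.

```python
def equilibrate(c_inside, c_outside, v_inside, v_outside):
    """Urea concentration (M) on both sides after full equilibration.

    Total moles of urea are conserved and end up evenly spread over
    the combined volume (sack plus outside solution).
    """
    total_moles = c_inside * v_inside + c_outside * v_outside
    return total_moles / (v_inside + v_outside)


def stepwise_dialysis(c_start, outside_concs, v_inside, v_outside):
    """Urea concentration inside the sack after each change of outside solution."""
    history = []
    c = c_start
    for c_out in outside_concs:
        c = equilibrate(c, c_out, v_inside, v_outside)
        history.append(c)
    return history


if __name__ == "__main__":
    # Hypothetical setup: 10 ml sack, 1 liter outside, urea lowered in steps.
    steps = stepwise_dialysis(8.0, [6.0, 4.0, 2.0, 0.0, 0.0],
                              v_inside=0.01, v_outside=1.0)
    for n, c in enumerate(steps, start=1):
        print(f"after change {n}: {c:.4f} M urea inside the sack")
```

Because the outside volume is much larger than the sack, the inside concentration ends up close to whatever the outside solution was set to, which is why changing the outside solution stepwise lowers the urea gradually.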

He then exposed the polypeptide to oxygen to get back the disulfide from correctly positioned cysteine side chains (+H₂O).

He got back native ribonuclease. It checked out physically, and also functionally, by the fact that it regained its ability to digest RNA.

This type of experiment has now been repeated many times for many different proteins. It works for many, fails for some. But the positive results are very important, for they prove that for many or even most proteins, all the information that is necessary for the complex and unique 3-dimensional structure is present in the primary sequence of the polypeptide chain.

That is, PRIMARY STRUCTURE DETERMINES TERTIARY STRUCTURE. This conclusion was a major step in biochemistry and earned Anfinsen a Nobel Prize.

::Chaperonins::
That said, it must be added that in the past 5-10 years, it has become apparent that some special proteins, called chaperonins, can help certain other proteins to fold within the cell. It seems that these chaperonins may be needed not so much for initial folding, but when proteins denature inside the cell: for instance, after they have traversed a membrane, with its hydrophobic environment. Or after cells have been exposed briefly to slightly elevated temperature (called heat shock), when a few of the least stable proteins may start to denature. The role, the generality, and the mechanism by which these proteins aid other proteins in folding correctly are not yet well understood. However, these cases do not really detract from the general principle that primary sequence CAN determine all higher order structures.

::Quaternary (4^o)= association of multiple polypeptides::
Tertiary structure describes folding of a single polypeptide, and while many proteins do consist of a single chain, most are composed of several distinct polypeptide chains. The association of these separate chains is known as QUATERNARY STRUCTURE.

The number of polypeptides in a protein can be 2, 4, 8 or higher. Or 3 (rarer).

These chains are folded up in 3-dimensions, assuming a tertiary structure, and then are stuck to each other. What keeps them stuck together? The same answer as usual: those weak bonds we keep discussing, and more rarely, the covalent disulfides.

Proteins with quaternary structure are called MULTIMERIC proteins. Individual polypeptides are called SUB-UNITS (of the protein).

One polypeptide chain can be considered a monomer, relatively speaking. A protein with 4 chains is a tetramer, and so on. The subunits can be identical (called HOMOPOLYMERIC) or they can be different polypeptides (or HETEROPOLYMERIC).

Now we can distinguish a "protein" from a "polypeptide". In its native form, the macromolecule is called a protein, and may consist of one or more polypeptides, depending on the protein.

E.g., Hemoglobin, Hb, has the structure α₂β₂, consisting of 4 polypeptides, 2 alpha chains and 2 beta chains, of MW 16,000 each. So the MW of the Hb protein (a tetramer) is 64,000. [Purves6ed 3.7]

If you denature a multimeric protein, the MW will change (unless the subunits are held together with disulfide bonds and you don't disrupt them), e.g., for Hb the MW changes from 64,000 to 16,000.
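The hemoglobin arithmetic above can be checked with a short sketch. The function name and the (copies, MW) data layout are illustrative; the subunit MW of 16,000 is the round figure used in these notes.

```python
def multimer_mw(subunits):
    """MW of a native multimeric protein.

    subunits is a list of (number of copies, MW of one polypeptide) pairs.
    """
    return sum(copies * mw for copies, mw in subunits)


# Hemoglobin: 2 alpha chains and 2 beta chains of MW 16,000 each.
hemoglobin = [(2, 16_000), (2, 16_000)]

if __name__ == "__main__":
    print("native tetramer MW:", multimer_mw(hemoglobin))
    print("MW seen after denaturation (single chains):", hemoglobin[0][1])
```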

The subunits of some multimeric proteins are held together by disulfide bonds (in addition to the usual weak bonds). For example, the antibody molecule, immunoglobulin, is a tetramer of two identical "heavy" chains (H) and two identical "light" chains (L), or H₂L₂ and it includes S-S bonds between the H and L chains. You must denature AND reduce the disulfides to get the individual subunit polypeptides dissociated from each other.

So the surfaces of polypeptides have also evolved to allow interaction with other particular subunits but not with other proteins in general.

Consider now Sickle Cell Disease again. Hemoglobin is a tetramer of 2 pairs of identical sub-units: α₂β₂. Glu --> val was the a.a. change comparing normal Hb to sickle Hb (HbS). The result is that the tetramers inappropriately interact, presumably via hydrophobic interactions that in normal Hb are precluded by the charged glu. In HbS this position is valine, which creates a more hydrophobic patch of surface. These patches can now get stuck together by hydrophobic forces, aided by the fact that each HbS molecule has two such patches (one for each beta chain) and that the concentration of Hb molecules inside a red blood cell is very high (red blood cells can almost be viewed as bags of Hb). You get long chains of tetramers, and these long arrays can distort the shape of the RBC (red blood cell) into a sickle shape. This shape is not as hydrodynamic as the original disc shape, and the RBCs can now clog small capillaries, the manifestation of the disease. One a.a. out of the 146 in each beta chain was responsible. Once again we see that proteins are fragile, are often only on the brink of stability.

PROSTHETIC GROUPS: There are some NON-amino acid components of proteins that are so tightly bound they are considered part of the protein. These small molecules are usually essential for the function of the protein. For example, in hemoglobin, the "heme" groups are actually organic ring compounds with an iron atom at their center, and it is this iron atom that actually binds the oxygen that is carried by the hemoglobin protein. Some of the vitamins become prosthetic groups (e.g. riboflavin). See B: 427 for heme structure.

PROTEIN PURIFICATION (SEPARATIONS)

::Protein purification methods: Ultracentrifugation::
While we are on the subject of proteins, let's take some time once again to discuss methodology. In this case, the purification of individual proteins, which involves their separation from all the other proteins in the cell. Much of what we want to know about proteins requires that we have a pure preparation containing only protein molecules of one homogeneous type. Since there are 3000 different types of protein molecules in E. coli, our task will be to separate one away from all 2999 others, to purify it.

[The word separate sometimes causes confusion at this point. In the context of purifications, "separate" is used as a relatively passive action, operating on a mixture without altering the components greatly, e.g., to separate the wheat from the chaff. Our primary objective here will not be to cleave molecules ("I'm gonna separate your head from your body"), although some cleavages may occur in the course of an experiment (e.g., cleavage of the disulfides of immunoglobulin in order to effect a separation of the individual subunits).]

How can we proceed to purify a protein? Well, what makes one protein different from another?

Can you proffer some characteristics?: size (MW), charge (net), shape, hydrophobicity (solubility), surface binding ability....

Yes, all these are used in what is still a challenging task for any biochemistry laboratory, the purification of its favorite protein.

Here is one sometimes useful method: ULTRACENTRIFUGATION

(Ultra means >20,000 rpm; 60,000 rpm is common. Compare a Ferrari redlining at 6000 rpm; this is ten times faster. You need a vacuum chamber so that no heat is produced by air friction.)

Diagram of tube, spin, distribution of molecules ...

A mixture of molecules will be subject to two main forces in the ultracentrifuge as it starts to spin (ignoring buoyant force):
Causing sedimentation is the centrifugal force = m(omega)²r (which is proportional to the mass or MW of a protein).
m = mass, omega = angular velocity, and r = distance from the center of rotation.
Opposing sedimentation = friction = f_oV. f_o = frictional coefficient, a constant for any particular protein; it is at a minimum for a sphere and higher for less compact shapes like cigars or pancakes. V = velocity of the molecule as it moves away from the center of rotation.

Soon after accelerating, V increases to a point where no further acceleration takes place, as the forces on the molecule are balanced. It continues to sediment, but at a constant velocity.

Now at this point, at this velocity: Centrifugal Force = Frictional force (there's no net force, no acceleration, but constant velocity)

So at this point (soon achieved): m(omega)²r = f_oV

And: V = m(omega)²r/f_o,

where f_o = a frictional coefficient dependent on shape (to visualize the effect of shape on friction, compare the velocity of a falling feather vs. a tiny pebble of equal weight, dropped in the fluid of air).

Higher f_o = more friction.

If we assume a spherical shape, then we can estimate a MW (Assume f_o, and then measure V and r, so we can solve for m, or the MW)

On the other hand, if we know the MW, we can get information about shape (via f_o).

Sedimentation velocity is often expressed as a sedimentation coefficient, s, measured in Svedbergs, which takes the centrifugation conditions into account: s = V/(omega)²r, and so m = sf_o.
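The relations above (V = m(omega)²r/f_o, s = V/(omega)²r, m = sf_o) can be tried out numerically. This is a sketch under the same simplification as the text, i.e., buoyant force is ignored; the mass, frictional coefficient, and radius are hypothetical values chosen only to be of a plausible order of magnitude, and the function names are illustrative.

```python
import math


def angular_velocity(rpm):
    """Convert revolutions per minute to radians per second."""
    return rpm * 2 * math.pi / 60


def terminal_velocity(m, omega, r, f_o):
    """Constant velocity reached when centrifugal force equals friction."""
    return m * omega ** 2 * r / f_o


def sedimentation_coefficient(v, omega, r):
    """s = V / (omega^2 * r), in seconds (1 Svedberg = 1e-13 s)."""
    return v / (omega ** 2 * r)


def mass_from_s(s, f_o):
    """Recover the mass from s and the frictional coefficient: m = s * f_o."""
    return s * f_o


if __name__ == "__main__":
    omega = angular_velocity(60_000)  # a common ultracentrifuge speed
    m = 1.0e-22     # kg, hypothetical mass of one protein molecule
    f_o = 6.0e-11   # kg/s, hypothetical frictional coefficient
    r = 0.07        # m, hypothetical distance from the center of rotation

    v = terminal_velocity(m, omega, r, f_o)
    s = sedimentation_coefficient(v, omega, r)
    print(f"omega = {omega:.1f} rad/s")
    print(f"V = {v:.3e} m/s")
    print(f"s = {s:.3e} s = {s / 1e-13:.1f} Svedbergs")
    print(f"mass recovered from s and f_o = {mass_from_s(s, f_o):.3e} kg")
```

Notice that s does not depend on the rotor speed or on r: those factors cancel, which is the sense in which s "takes the centrifugation conditions into account".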

So ultracentrifugation separates proteins on the basis of MW and shape. It is a gentle procedure (non-denaturing, can be carried out at nice low temperature (say 4 deg C, which tends to stabilize proteins) and in the presence of a buffer at pH 7 and physiological levels of salts).

You can recover your protein by punching a hole in the bottom of the centrifuge tube, and collecting the solution in a series of tubes as it drips out the bottom. Each tube can then be examined, or assayed, for the presence of the protein to be purified. For this purpose you need to be able to detect the protein in the midst of the other proteins. For example, if you were purifying Anfinsen's ribonuclease, you could measure the ability of the tube contents to catalyze the breakdown of RNA to its monomers.

How about separation on the basis of the net charge of a protein? We separated amino acids on the basis of charge in paper electrophoresis. For proteins, the solid supporting material is a gel, not paper:

GEL ELECTROPHORESIS:

There are two types -

::Native gel electrophoresis::
First: native gel electrophoresis.

Acrylamide (a monomer in this chemistry) in aqueous solution ---> polyacrylamide (P.A.G.E.). The result is a network of polymer fibers, which form a gel, with the consistency ~ Jello.

Usually a vertical apparatus, with an anode and a cathode. Apply the protein mixture to the top of a slab of this gel.

Apply voltage (~200 V).

The gel consists of a tight fiber network, so proteins have trouble migrating, negotiating their way through the tangled fibers.

Their rate of migration depends on two properties:
Their net charge and their "size" (which is proportional to MW if spherical)
Molecules with the most charge (net) (of a sign opposite to that of the far electrode) migrate to the far electrode fastest.
Molecules that are smallest (i.e., lowest MW) can worm their way through the gel fibers fastest. So the smallest and most highly charged wins the race.
After the electrophoresis has been stopped, molecules will be distributed along the gel length according to these two characteristics (MW and net charge).
[Note that molecules with a charge opposite to that of the near electrode will migrate up and off the gel, into the buffer reservoir, and be lost. Trial and error will dictate how you set up the electrophoresis if you do not know the charge on the protein you are trying to isolate.]

::SDS gel electrophoresis::
Second, a more widely used variation of gel electrophoresis: SDS PAGE.

Add sodium dodecyl sulfate, SDS (or SLS): CH₃-(CH₂)₁₁-O-SO₃⁻

[sulfate is similar in structure to phosphate, and is a strong acid]. Like a phospholipid, SDS has a highly polar end and a highly hydrophobic body.

Might you expect SDS to denature a protein? Yes. It's a detergent and a powerful denaturant. It binds all over the protein, coating every protein with a uniform negative charge. SDS is put into the gel when you form it and into the electrophoresis buffer. Now run SDS-PAGE. Where should the anode be placed? Does it matter? Yes, the protein is coated with negative charge now, so the anode is always at the bottom.

Under these denaturing conditions, the polypeptides exist as random coils, which then migrate solely on the basis of their size, which is the equivalent of a sphere for all polypeptides. Larger molecules have more difficulty finding their way through the polyacrylamide fibers. So the lowest MW wins.

One must remember to reduce the disulfides with mercaptoethanol first (usually), so as to have a truly random coil to compare.

If you run standards of known MW, you can determine the MW of your protein by comparison, and this is a very common way to assign a MW to a polypeptide. However, it is not always completely accurate, as some proteins probably do bind a bit more SDS than others.
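One common way to make this comparison is to plot log(MW) of the standards against how far each migrated and read the unknown off the fitted line. The sketch below assumes that this relationship is roughly linear over the range of the standards, which is an assumption not stated in these notes; the migration distances are hypothetical, and the function names are illustrative.

```python
import math


def fit_line(xs, ys):
    """Least-squares straight line through (x, y) points: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_mw(standards, distance):
    """Estimate MW from migration distance using standards of known MW.

    standards is a list of (migration distance, known MW) pairs.
    Assumes log10(MW) falls linearly with migration distance.
    """
    xs = [d for d, _ in standards]
    ys = [math.log10(mw) for _, mw in standards]
    slope, intercept = fit_line(xs, ys)
    return 10 ** (slope * distance + intercept)


if __name__ == "__main__":
    # Hypothetical standards: (distance migrated in cm, MW)
    standards = [(1.5, 97_000), (3.0, 66_000), (5.0, 45_000),
                 (7.5, 30_000), (9.5, 20_000)]
    unknown_distance = 6.0  # cm, hypothetical
    print(f"estimated MW: {estimate_mw(standards, unknown_distance):,.0f}")
```

The estimate is only as good as the assumption that the unknown binds SDS like the standards do, which is the caveat raised above.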

If you don't yet know what a protein does, you can just call it by its molecular weight, from SDS gels: e.g., p53, a famous protein whose absence is associated with cancer, was named this way, and the name has stuck even though quite a lot is known about its function (p in p53 stands for protein, so you have names like p27, p100 etc.).

::Gel filtration::
What if we want to know the MW of a protein in its native, even quaternary, structure?

For this we could use molecular sieve chromatography, or Sephadex, or gel filtration (these are all ~synonymous).

You start with plastic beads in a glass column with a support screen on the bottom.

Add your protein mixture to the top. Elute with a buffer. The beads are riddled with channels of a specified size. If a protein is smaller than the channel size, it enters, explores, diffuses out finally, having wasted its time in the race to the bottom of the column. Larger proteins can't fit into the channels, don't waste their time, and win the race. Intermediate sizes waste some time but less than the smaller proteins. So larger molecules come out (elute) first, and the smallest come out last. Here again, you would collect the eluted proteins in a series of tubes, and then assay each tube for the presence of the protein being purified. If you calibrate the column by noting the behavior of spherical proteins of known size, you can determine the MW of your protein by comparison, if it is also spherical. If it is not spherical it will appear to have a higher molecular weight than its true MW (imagine a pancake being excluded from a channel while a sphere of the same MW gets in).

Other methods include ion exchange chromatography, which also takes advantage of the net charge on a protein, and affinity chromatography, which takes advantage of the surface properties of a protein (which we'll discuss next). One can purify a particular protein away from all other proteins in 4-5 such steps. For more on protein separation techniques, see the protein separation handout.

(C) Copyright 2001 Lawrence Chasin and Deborah Mowshowitz Department of Biological Sciences Columbia University New York, NY
Clickable pictures are from Purves, et al., Life, 5th Edition, Sinauer-Freeman's Images of Life 5.0.
A production of the Columbia Center for New Media Teaching and Learning