CHY 431 Structure and Mechanism in Biological Chemistry

Problem Set for Protein Structure

Problem Set Due 25 February 2008

Please, please, please! When you submit your solutions, name the file after yourself! For example, Smith2, Jones2, etc.!

Undergraduates: do any SEVEN problems. Graduate students: do any EIGHT problems.

You may use any literature source you like, and discuss the problems with each other or with me. However, your answers, including any structure drawings, must be your own work and in your own words.

1. A synthetic polypeptide of 20 amino acids was designed to adopt a β-sheet structure in solution. NMR demonstrated that it did indeed adopt this conformation, forming a 3-strand antiparallel structure.

The sequence (from the N-terminal end) of this polypeptide is:

RGWSVQNGKYTNNGKTTEGR

(If you need to, you can look up the one-letter amino acid codes on our Basics page.)

  1. Draw a careful sketch showing how this polypeptide might fold, and identifying the amino acids that form the turns.

  2. Draw in the hydrogen bonds that should stabilize this structure.

Now submit the sequence to the PsiPred server, which predicts protein secondary structure. Compare your result to that from the server, attempting to explain any significant differences.

2. Myoglobin, the oxygen storage protein of muscle tissue, contains eight α-helices. In horse heart myoglobin, one of them has the sequence (from the N-terminal end):

His-Glu-Ala-Glu-Leu-Lys-Pro-Leu-Ala-Gln-Ser-His-Ala-Thr-Lys

  1. Make a cartoon showing the positions of the side-chains on the helix. (One way to do this is to draw a helical wheel: draw a circle, representing the first helical turn, and place the symbol for each amino acid appropriately on the circle. Then draw another circle outside of the first, and place the next four. Remember that the first residue of the second turn will be displaced from the last residue of the first, and so on. You could also use the helical wheel website from our protein structure pages.)

  2. Which side of the helix is likely to face the interior of the protein? Why?

(Another way to answer this question is to download a myoglobin structure, identify the helix for which the sequence is given, using Rasmol, and create a Rasmol picture.)
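If you would rather compute the wheel positions than place them by eye, the following is a minimal sketch. It assumes an ideal α-helix with 3.6 residues per turn, so each residue is rotated 100° from the one before it; the function name `wheel_angles` is made up for this illustration, and a real helix will deviate somewhat from these idealized angles.

```python
# Sketch: angular positions on a helical wheel, assuming an ideal
# alpha-helix with 3.6 residues per turn (100 degrees per residue).
DEGREES_PER_RESIDUE = 100.0  # 360 / 3.6, an idealized value


def wheel_angles(residues):
    """Return (position, residue, angle in degrees) for each residue.

    The first residue is placed at 0 degrees; each later residue is
    rotated a further 100 degrees around the helix axis.
    """
    return [
        (index + 1, residue, (index * DEGREES_PER_RESIDUE) % 360.0)
        for index, residue in enumerate(residues)
    ]


# The helix sequence given in this problem, split into residues.
helix = "His-Glu-Ala-Glu-Leu-Lys-Pro-Leu-Ala-Gln-Ser-His-Ala-Thr-Lys".split("-")

for position, residue, angle in wheel_angles(helix):
    print(f"{position:2d} {residue} {angle:5.1f}")
```

Residues whose angles fall close together lie on the same face of the helix, which is the information you need for part 2.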

3. Download a structure for one of the following proteins from the protein databank (the pdb code is in parentheses after each name). Use Rasmol or the Swiss PDB viewer to visualize the structure. Describe the types of secondary (helices, sheets) and tertiary structure (motifs, domains) that are present. Generate pictures that show clearly the kinds of features you identify.

  • Human peptidylprolyl cis/trans isomerase (1vbs)

  • Pig adenylyl kinase (3adk)

  • Human thioredoxin (1eru)

  • E. coli L-arabinose-binding protein (1abe)

4. The sequence below is that of a recombinant fungal lignin peroxidase, an enzyme that is one of the few things on the planet, short of a pulp mill, that can degrade lignin. Submit the sequence to PsiPred for prediction of its secondary structure. Retrieve the result as a PDF file, and attach it as your answer to this problem.

VIEKRATCSNGKTVGDASCCAWFDVLDDIQQNLFH
GGQCGAEAHESIRLVFHDSIAISPAMEAQGKFGGG
GADGSIMIFDDIETAFHPNIGLDEIVKLQKPFVQK
HGVTPGDFIAFAGAVALSNCPGAPQMNFFTGRAPA
TQPAPDGLVPEPFHTVDQIINRVNDAGEFDELELV
WMLSAHSVAAVNDVDPTVQGLPFDSTPGIFDSQFF
VETQLRGTAFPGSGGNQGEVESPLPGEIRIQSDHT
IARDSRTACEWQSFVNNQSKLVDDFQFIFLALTQL
GQDPNAMTDCSDVIPQSKPIPGNLPFSFFPAGKTI
KDVEQACAETPFPTLTTLPGPETSVQRIPPPPGAR
ATCSNGKTVGDASCCAWFDVLDDIQQNLFHGGQCG
AEAHESIRLVFHDSIAISPAMEAQGKFGGGGADGS
IMIFDDIETAFHPNIGLDEIVKLQKPFVQKHGVTP
GDFIAFAGAVALSNCPGAPQMNFFTGRAPATQPAP
DGLVPEPFHTVDQIINRVNDAGEFDELELVWMLSA
HSVAAVNDVDPTVQGLPFDSTPGIFDSQFFVETQL
RGTAFPGSGGNQGEVESPLPGEIRIQSDHTIARDS
RTACEWQSFVNNQSKLVDDFQFIFLALTQLGQDPN
AMTDCSDVIPQSKPIPGNLPFSFFPAGKTIKDVEQ
ACAETPFPTLTTLPGPETSVQRIPPPPGA

5. The Protein DataBank contains structures for the alcohol dehydrogenases of humans (1d1s, 1hso, 1hsz), horses (1lde), mice (1e3e) and fruit flies (1mg5).

Prepare a multiple sequence alignment for the enzymes from each type of critter. To do this, you will need the sequences in FASTA format, which can either be downloaded directly from the PDB or created from the pdb structure file by the Swiss PDB viewer.

For preparing the alignment, you can use BioEdit, JalView (use the Web Services menu), the ClustalW web site (where you can also download the software to your own PC; note that installation of BioEdit also installs ClustalW, which can be run from within BioEdit), or download another alignment editor, GeneDoc. Be sure to indicate as part of your response how you prepared the alignment, including what scoring matrix you used.
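Before loading sequences into an alignment program, it can help to check that your FASTA file contains what you expect. The sketch below reads FASTA-formatted text (a header line starting with ">" followed by one or more lines of sequence) and reports each record's length. The function name `parse_fasta` and the two short records are hypothetical and are used only for illustration; they are not real alcohol dehydrogenase sequences.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence lines are joined without breaks
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Hypothetical two-record example (not real enzyme sequences).
example = """>seqA hypothetical record
MKTAYIAK
QRQISFVK
>seqB hypothetical record
MKTAYLAK
"""

for header, sequence in parse_fasta(example):
    print(header, len(sequence))
```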

6. Select the enzymes from each of two different critters in Problem #5 and use the Swiss PDB viewer to do an alignment of the backbone atoms. Use the "Magic Fit" tool. Create a picture showing the result, and indicate the rms difference in atomic positions that is calculated. Identify the regions that seem to have the worst fit.

7. Following is the sequence of bovine prion protein, the progenitor of "mad cow" disease. Use the Expasy web site to search for other proteins containing this sequence. Select at least three of your hits, identify them, and use one of the tools from the previous problem to prepare a sequence alignment. Use the BLOSUM62 matrix.

GSVVGGLGGYMLGSAMSRPLIHFGSDYEDRYYRENMHRYPNQVYYRPVDQYSNQNNFVHD
CVNITVKEHTVTTTTKGENFTETDIKMMERVVEQMCITQYQRESQAYYQRGA

Then do backbone alignments. How good are the sequence alignments? How good are the backbone alignments? Does this exercise support the idea that similar sequences should produce similar structures?
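To judge how good a backbone alignment is, it helps to know what the reported rms difference measures: the square root of the mean squared distance between paired atoms. The sketch below computes that quantity for two coordinate sets that are assumed to be already superimposed and paired atom for atom. It does not perform the fit itself and is not necessarily how the Swiss PDB viewer does its calculation; the function name `rmsd` and the toy coordinates are made up for illustration.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square distance between two paired lists of (x, y, z) points.

    Assumes the two structures are already superimposed and that the
    i-th point in coords_a corresponds to the i-th point in coords_b.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy coordinates (hypothetical, in angstroms): every atom is displaced by 1 along z.
structure_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
structure_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(structure_a, structure_b))
```

Because the deviations are squared before averaging, a few badly fitting regions can dominate the value, which is why it is worth identifying where the worst fit occurs.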

8. You are sucked through a black hole into an alternative Universe. In this Universe, amino acids are built differently than in ours. Describe thoroughly how protein structures might be affected by the following changes:

  • The amino acids all are "D" instead of "L" (R instead of S in real nomenclature)

  • Half of the amino acids are "D" and half are "L"

  • They all are β amino acids; that is, there's a CH2 in between the carboxyl and the amino-substituted carbon

A few references you might want to consult: (a) Angew. Chem. Int. Ed., 1997, 36, 1836; (b) Chem. Comm., 1977, 2015; (c) Chem. Rev., 2001, 101, 3219; (d) J. Peptide Res., 2002, 59, 18; (e) J. Am. Chem. Soc., 2000, 122, 4865; (f) Proc. Natl. Acad. Sci. USA, 2001, 98, 5487; (g) J. Am. Chem. Soc., 2007, 129, 1532.

9. Below is the sequence of a human tumor antigen in FASTA format.

>P04637|P53_HUMAN Cellular tumor antigen p53 - Homo sapiens (Human).
MEEPQSDPSVEPPLSQETFSDLWKLLPENNVLSPLPSQAMDDLMLSPDDIEQWFTEDPGP
DEAPRMPEAAPPVAPAPAAPTPAAPAPAPSWPLSSSVPSQKTYQGSYGFRLGFLHSGTAK
SVTCTYSPALNKMFCQLAKTCPVQLWVDSTPPPGTRVRAMAIYKQSQHMTEVVRRCPHHE
RCSDSDGLAPPQHLIRVEGNLRVEYLDDRNTFRHSVVVPYEPPEVGSDCTTIHYNYMCNS
SCMGGMNRRPILTIITLEDSSGNLLGRNSFEVRVCACPGRDRRTEEENLRKKGEPHHELP
PGSTKRALPNNTSSSPQPKKKPLDGEYFTLQIRGRERFEMFRELNEALELKDAQAGKEPG
GSRAHSSHLKSKKGQSTSRHKKLMFKTEGPDSD

Follow the steps in the tutorial to build a homology model of this protein. Save the output from the modeling as a .pdf file (use the button near the top of the output page) and submit that file as your answer to this question.

TUTORIAL

10. Download the pdb files for cytochrome c from tuna (5cyt) and rice (1ccr). Identify the residues that seem to be involved in binding the heme in each (using Rasmol, for example), and describe any differences. Then do an alignment of the sequences of the two proteins, and determine how many differences in sequence occur away from the heme binding site. Several investigators have suggested that evolution should produce variations most rapidly in regions of proteins that are least essential for their function. Do your observations agree with this generalization?
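Once you have the two sequences aligned, counting the differences by hand is error-prone. The sketch below takes two aligned strings of equal length, with "-" marking gaps, and reports the alignment columns at which the residues differ. The function name `count_differences` and the short strings are hypothetical; substitute the aligned sequences from your own alignment, and remember that column numbers in an alignment are not the same as residue numbers once gaps are present.

```python
def count_differences(aligned_a, aligned_b, gap="-"):
    """Return (mismatch_columns, gap_count) for two aligned sequences.

    Columns are numbered from 1. A column with a gap in either sequence
    is counted as a gap rather than as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    mismatch_columns = []
    gap_count = 0
    for column, (res_a, res_b) in enumerate(zip(aligned_a, aligned_b), start=1):
        if res_a == gap or res_b == gap:
            gap_count += 1
        elif res_a != res_b:
            mismatch_columns.append(column)
    return mismatch_columns, gap_count


# Hypothetical aligned fragments (not the real cytochrome c sequences).
mismatches, gaps = count_differences("ACDE-FG", "ACNEHFA")
print("mismatched columns:", mismatches)
print("gapped columns:", gaps)
```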

11. Use the SCOP database (http://scop.mrc-lmb.cam.ac.uk/scop) to determine:

  1. a protein in the same superfamily as enolase, but in a different family

  2. another enzyme family related to the family of glyceraldehyde-3-phosphate

  3. the genealogy of chloramphenicol acetyltransferase

12. The folding of bovine pancreatic trypsin inhibitor (BPTI; 1bhc in the protein databank) has been studied extensively by Creighton. The native protein contains six cysteine residues, which form three disulfide bonds, and the molecule can be unfolded easily by reducing the disulfide links. The cysteines are at 5, 14, 30, 38, 51, and 55. How many different disulfide bonds could be formed during the refolding of the protein, and what are they? Creighton was able to identify some possible intermediates in the refolding that had one disulfide formed, and two disulfides formed. How many possibilities are there for these intermediates, and what are they? Use Rasmol or another viewing program to look at the structure of the native species. Which disulfide bonds actually exist in the molecule?
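After you have worked out the possibilities by hand, you may want to check your enumeration. The sketch below lists every pair of cysteines that could form a disulfide, and every set of one, two, or three disulfides in which no cysteine is used twice. It assumes only intramolecular bonds between the six positions given above; the function names are made up for this illustration.

```python
from itertools import combinations

# Cysteine positions in BPTI, as given in the problem.
CYSTEINES = [5, 14, 30, 38, 51, 55]


def possible_bonds(cysteines):
    """Every pair of cysteines that could, in principle, form a disulfide."""
    return list(combinations(cysteines, 2))


def bond_sets(cysteines, n_bonds):
    """All sets of n_bonds disulfides in which no cysteine is used twice."""
    result = []
    for bonds in combinations(possible_bonds(cysteines), n_bonds):
        used = [cys for bond in bonds for cys in bond]
        if len(used) == len(set(used)):  # each cysteine appears at most once
            result.append(bonds)
    return result


for n in (1, 2, 3):
    arrangements = bond_sets(CYSTEINES, n)
    print(n, "disulfide(s):", len(arrangements), "arrangements")
    for bonds in arrangements:
        print("   ", " + ".join(f"{a}-{b}" for a, b in bonds))
```

Compare the printed lists with your own before looking at the native structure.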

13. Bovine rhodopsin is a membrane-spanning protein involved in vision. Because the interior of a membrane is very nonpolar, membrane spanning proteins generally present hydrophobic side chains in the parts of the protein that are exposed to that environment.

Creating a hydrophobicity plot is a good way to locate such sequences.

(a) Download the sequence of bovine rhodopsin and use BioEdit to produce a plot. (Select Sequence - Protein, and choose a type of plot.)

Selecting the type of plot determines the hydrophobic scale utilized:

  • Kyte-Doolittle is a widely applied scale for delineating hydrophobic character of a protein. Regions with values above 0 are hydrophobic in character. Conversely, negative values correspond to hydrophilicity.

  • Window size refers to the number of amino acids examined at a time to determine a point of hydrophobic character.

    • Window size can be varied from 5 to 25 (default 7); one should choose a window that corresponds to the expected size of the structural motif under investigation

    • Window size of 5-7 is good for finding hydrophilic regions that are likely exposed on the surface and may possibly be antigenic.

    • Window size of 19-21 will make hydrophobic, membrane-spanning domains stand out rather clearly (typically > 1.6 on the Kyte-Doolittle scale).

    • Another way to create a hydrophobicity plot: Hydrophobicity website at Colorado State University

(b) Identify the regions of the sequence most likely to be exposed to the interior of the membrane.

(c) Submit the sequence to the PsiPred web site. What kind of structure does the site predict for those membrane-spanning sections?
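The windowed averaging described in part (a) can be sketched in a few lines, which may help in understanding what the plot shows. The code below uses the published Kyte-Doolittle hydropathy values and averages them over a sliding window, assigning each average to the residue at the window center. It is a minimal illustration, not BioEdit's implementation; the function name `hydropathy_profile` and the test sequence are hypothetical, and the 1.6 cutoff is the rule of thumb quoted above.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=19):
    """Return (center residue number, mean hydropathy) for each window.

    Residue numbers start at 1. Non-standard letters (for example X)
    are not handled and will raise KeyError.
    """
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in sequence.upper()]
    profile = []
    for start in range(len(scores) - window + 1):
        mean = sum(scores[start:start + window]) / window
        center = start + window // 2 + 1  # 1-based position of the window center
        profile.append((center, mean))
    return profile


# Hypothetical test sequence: a hydrophobic stretch flanked by charged residues.
toy = "KKDDEE" + "LLIVVALLIIAVLLIVALL" + "EEDDKK"
for center, mean in hydropathy_profile(toy, window=19):
    flag = "  <- above 1.6" if mean > 1.6 else ""
    print(f"{center:3d} {mean:6.2f}{flag}")
```

Rerunning with a window of 7 instead of 19 shows how a short window gives a noisier profile, which is why the longer window is preferred for finding membrane-spanning segments.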


Just in case anyone would like to draw some chemical structures to include in their work, here are links to some free structure-drawing programs:

WinDrawChem WinPlot ChemSketch


This page last modified 9:27 AM on Monday February 4th, 2008.
Webmaster, Department of Chemistry, University of Maine, Orono, ME 04469