CHY 431 Structure and Mechanism in Biological Chemistry

Problem Set for Protein Structure

Problem Set Due 27 February 2009 (the last day before spring break). If you really, really want to work on this over break, I will accept your work on 17 March. Let me know if you are going to choose this option.

Please, please, please! When you submit your solutions, name the file after yourself! For example, Smith2, Jones2, etc.!

Undergraduates: do any SEVEN problems. Graduate students: do any EIGHT problems.

You may use any literature source you like, and discuss the problems with each other or with me. However, your answers must be in your own words, and any structure drawings must be your own.

1. A synthetic polypeptide of 20 amino acids was designed to adopt a β-sheet structure in solution. NMR demonstrated that it did indeed adopt this conformation, forming a 3-strand antiparallel structure.

The sequence (from the N-terminal end) of this polypeptide is:

RGWSVQNGKYTNNGKTTEGR

(If you need to, you can look up the one-letter amino acid codes on our Basics page.)

  1. Draw a careful sketch showing how this polypeptide might fold, and identify the amino acids that form the turns. (Keep in mind the generalities about β-strands and turns we described in class.)

  2. Draw in the hydrogen bonds that should stabilize this structure.

Now submit the sequence to the NNPredict server, which predicts protein secondary structure. Compare your result to that from the server, attempting to explain any significant differences. [NNPredict seems to be giving access problems; if you have trouble, try the PsiPred Server.]

2. Myoglobin, the oxygen storage protein of muscle tissue, contains eight α-helices. In sperm whale myoglobin, one of them has the sequence (from the N-terminal end):

Glu-Ala-Glu-Leu-Lys-Pro-Leu-Ala-Gln-Ser-His-Ala-Thr (residues 83-95)

  1. Make a cartoon showing the positions of the side-chains on the helix. (One way to do this is to draw a helical wheel: draw a circle, representing the first helical turn, and place the symbol for each amino acid appropriately on the circle. Then draw another circle outside of the first, and place the next four. Remember that the first residue of the second turn will be displaced from the last residue of the first, and so on. You could also use the helical wheel website from our protein structure pages.)

  2. Which side of the helix is likely to face the interior of the protein? Why?

(Another way to answer this question is to download the myoglobin structure, identify the helix for which the sequence is given, using Rasmol, and create a Rasmol picture.)
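If you would like to check your helical wheel numerically, here is a minimal Python sketch. It assumes an ideal α-helix with 3.6 residues per turn, so that each residue is rotated 100° from the one before it; real helices deviate somewhat from this. The function name `wheel_angles` is just an illustrative choice.

```python
# Ideal alpha-helix: 3.6 residues per turn, so each residue is rotated
# 360 / 3.6 = 100 degrees from the previous one (an idealization).
DEGREES_PER_RESIDUE = 100

def wheel_angles(residues, first_number=1):
    """Return (residue number, name, angle in degrees) for each residue."""
    return [
        (first_number + i, name, (i * DEGREES_PER_RESIDUE) % 360)
        for i, name in enumerate(residues)
    ]

helix = "Glu-Ala-Glu-Leu-Lys-Pro-Leu-Ala-Gln-Ser-His-Ala-Thr".split("-")
for number, name, angle in wheel_angles(helix, first_number=83):
    print(f"{name}{number}: {angle:3d} deg")
```

Residues whose angles cluster on one side of the circle lie on the same face of the helix, which is the information you need for part 2.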

3. Download a structure for one of the following proteins from the protein databank (the pdb code is in parentheses after each name). Use Rasmol or the Swiss PDB viewer to visualize the structure. Describe the types of secondary (helices, sheets) and tertiary structure (motifs, domains) that are present. Generate pictures that show clearly the kinds of features you identify.

tRNA splicing endonuclease (1a79)
Human antitrypsin (1atu)
Human serum transferrin (1d3k)
E. coli L-arabinose-binding protein (1abe)
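As a cross-check on what you see in the viewer, many PDB-format files list their assigned secondary structure in HELIX and SHEET records near the top of the file. The sketch below simply counts those records; it assumes the file you downloaded contains them (not every entry does), and the function name and the file name in the usage comment are illustrative only.

```python
from collections import Counter

def count_secondary_records(pdb_text):
    """Count HELIX and SHEET records in the text of a PDB-format file.

    Each HELIX record describes one helix; each SHEET record describes
    one strand of a sheet.
    """
    counts = Counter()
    for line in pdb_text.splitlines():
        record = line[:6].strip()  # record name occupies columns 1-6
        if record in ("HELIX", "SHEET"):
            counts[record] += 1
    return counts

# Usage with a downloaded file (the file name is an assumption):
# with open("1abe.pdb") as handle:
#     print(count_secondary_records(handle.read()))
```

This only tells you how many elements were annotated; you still need the viewer to describe motifs and domains.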

4. Virnau and coworkers [PLoS Computational Biology, 2006, 2 (9), e122; http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.0020122] discuss the function and evolution of knots in proteins. Summarize the main points of their paper.

5. The Protein DataBank contains structures for the alcohol dehydrogenases of humans (1d1s, 1hso, 1hsz), horses (1lde), mice (1e3e) and fruit flies (1mg5).

Prepare a multiple sequence alignment for the enzymes from each type of critter. To do this, you will need the sequences in FASTA format, which can either be downloaded directly from the PDB or created from the PDB structure file by the Swiss PDB viewer.
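In case the FASTA format is new to you: each record is a header line beginning with ">" followed by one or more lines of sequence. The Python sketch below reads such text into a dictionary. The function name `read_fasta` and the two records are made up for illustration; they are not the alcohol dehydrogenase sequences.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a {header: sequence} dictionary."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line)
    # Join the wrapped sequence lines of each record into one string.
    return {name: "".join(parts) for name, parts in records.items()}

# Made-up records, for illustration only.
example = """>demo_1 made-up record for illustration
MKTAYIAK
QRQISFVK
>demo_2 another made-up record
MSTNPKPQ
"""
for name, seq in read_fasta(example).items():
    print(name, len(seq))
```

A quick check like this (number of records, length of each sequence) helps confirm that you have one complete sequence per enzyme before you run the alignment.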

For preparing the alignment, you can use BioEdit, JalView (use the Web Services menu), the ClustalW web site (where you can also download the software to your own PC; note that installation of BioEdit also installs ClustalW, which can be run from within BioEdit), or download another alignment editor, GeneDoc. Be sure you indicate as part of your response how you prepared the alignment, including what scoring matrix you used.

6. Select the enzymes from each of two different critters in Problem #5 and use the Swiss PDB viewer to create an alignment of the backbone atoms. Use the "Magic Fit" tool. Create a picture showing the result, and indicate the rms difference in atomic positions that is calculated. Identify the regions that seem to have the best and worst fits.

7. Following is the sequence of bovine prion protein, the progenitor of "mad cow" disease. Use the Expasy web site to search for other proteins containing this sequence. Select at least three of your hits, identify them, and use one of the tools from Problem #5 to prepare a sequence alignment. Use the BLOSUM62 matrix.

GSVVGGLGGYMLGSAMSRPLIHFGSDYEDRYYRENMHRYPNQVYYRPVDQYSNQNNFVHD
CVNITVKEHTVTTTTKGENFTETDIKMMERVVEQMCITQYQRESQAYYQRGA

Download PDB files for the three you selected and do backbone alignments (Swiss PDB Viewer). How good are the sequence alignments? How good are the backbone alignments? Does this exercise support the idea that similar sequences should produce similar structures?
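When you judge how good a backbone alignment is, it may help to recall what an rms difference in atomic positions means. The sketch below computes it for two sets of coordinates that are assumed to be already superposed, with atom i in one list matched to atom i in the other. It is not a description of how the Swiss PDB Viewer does its fit, and the coordinates are made up.

```python
import math

def rmsd(coords_a, coords_b):
    """RMS distance between two equal-length lists of (x, y, z) points.

    Assumes the two structures are already superposed and that atom i in
    one list corresponds to atom i in the other.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Made-up coordinates, in angstroms, purely for illustration.
a = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
b = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (3.0, 0.0, 1.0)]
print(rmsd(a, b))
```

Because the squared distances are averaged, a few badly fitting regions can dominate the value, which is why it is worth looking at where the best and worst fits occur and not only at the single number.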

8. You are sucked through a black hole into an alternative Universe. In this Universe, amino acids are built differently than in ours. Describe thoroughly how protein structures might be affected by the following changes:

A few references you might want to consult: (a) Angew. Chem. Int. Ed., 1997, 36, 1836; (b) Chem. Comm., 1977, 2015; (c) Chem. Rev., 2001, 101, 3219; (d) J. Peptide Res., 2002, 59, 18; J. Am. Chem. Soc., 2000, 122, 4865; Proc. Natl. Acad. Sci. USA, 2001, 98, 5487; J. Am. Chem. Soc., 2007, 129, 1532.

9. Below is the sequence of a human tumor antigen in FASTA format.

>P04637|P53_HUMAN Cellular tumor antigen p53 - Homo sapiens (Human).
MEEPQSDPSVEPPLSQETFSDLWKLLPENNVLSPLPSQAMDDLMLSPDDIEQWFTEDPGP
DEAPRMPEAAPPVAPAPAAPTPAAPAPAPSWPLSSSVPSQKTYQGSYGFRLGFLHSGTAK
SVTCTYSPALNKMFCQLAKTCPVQLWVDSTPPPGTRVRAMAIYKQSQHMTEVVRRCPHHE
RCSDSDGLAPPQHLIRVEGNLRVEYLDDRNTFRHSVVVPYEPPEVGSDCTTIHYNYMCNS
SCMGGMNRRPILTIITLEDSSGNLLGRNSFEVRVCACPGRDRRTEEENLRKKGEPHHELP
PGSTKRALPNNTSSSPQPKKKPLDGEYFTLQIRGRERFEMFRELNEALELKDAQAGKEPG
GSRAHSSHLKSKKGQSTSRHKKLMFKTEGPDSD

Follow the steps in the tutorial to build a homology model of this protein. Save the output from the modeling as a .pdf file (use the button near the top of the output page) and submit that file as your answer to this question.

TUTORIAL

10. Download the pdb files for cytochrome C from tuna (5cyt) and rice (1ccr). Identify the residues (Rasmol?) that seem to be involved in binding the heme in each, and describe any differences. Then do an alignment of the sequences of the two proteins, and determine how many differences in sequence occur away from the heme binding site. Several investigators have suggested that evolution should produce variations most rapidly in regions of proteins that are least essential for their function. Do your observations agree with this generalization?
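Once you have the two sequences aligned, counting the differences by eye is error-prone. Here is a minimal Python sketch that lists the alignment columns where two aligned sequences differ, skipping columns with a gap. The function name and the short sequences are made up for illustration; they are not the tuna or rice sequences.

```python
def count_differences(aligned_a, aligned_b, gap="-"):
    """Return the 1-based alignment columns where two aligned sequences
    differ, skipping columns that contain a gap in either sequence."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    return [
        column
        for column, (a, b) in enumerate(zip(aligned_a, aligned_b), start=1)
        if gap not in (a, b) and a != b
    ]

# Made-up aligned fragments, for illustration only.
first = "GDVEKGKK-IF"
second = "GDVAKGKKTIF"
print(count_differences(first, second))
```

Note that the positions returned are alignment columns, not residue numbers; you will need to map them back to the numbering of each protein before deciding whether a difference is near the heme binding site.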

11. Use the SCOP database (http://scop.mrc-lmb.cam.ac.uk/scop) to determine:

  1. a protein in the same superfamily as enolase, but in a different family

  2. another enzyme family related to the family of glyceraldehyde-3-phosphate

  3. the genealogy of chloramphenicol acetyltransferase

12. The folding of bovine pancreatic trypsin inhibitor (BPTI; 1bhc in the protein databank) has been studied extensively by Creighton. The native protein contains six cysteine residues, which form three disulfide bonds, and the molecule can be unfolded easily by reducing the disulfide links. The cysteines are at 5, 14, 30, 38, 51, and 55. How many different disulfide bonds could be formed during the refolding of the protein, and what are they? Creighton was able to identify some possible intermediates in the refolding that had one disulfide formed, and two disulfides formed. How many possibilities are there for these intermediates, and what are they? Use Rasmol or another viewing program to look at the structure of the native species. Which disulfide bonds actually exist in the molecule?
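If you want to check your hand enumeration, the counting can be sketched in a few lines of Python. This assumes only that any two cysteines can pair and that each cysteine takes part in at most one disulfide; it says nothing about which species are actually observed. The function name `disulfide_sets` is an illustrative choice.

```python
from itertools import combinations

cysteines = [5, 14, 30, 38, 51, 55]

# Every possible disulfide is an unordered pair of cysteines.
bonds = list(combinations(cysteines, 2))

def disulfide_sets(n_bonds):
    """All sets of n_bonds disulfides in which no cysteine is used twice."""
    result = []
    for chosen in combinations(bonds, n_bonds):
        used = [residue for bond in chosen for residue in bond]
        if len(used) == len(set(used)):  # no cysteine appears in two bonds
            result.append(chosen)
    return result

print(len(bonds), "possible disulfide bonds")
for n in (1, 2, 3):
    print(n, "disulfide(s):", len(disulfide_sets(n)), "combinations")
```

Printing the contents of `disulfide_sets(2)` gives the explicit list of two-disulfide species to compare with your own.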

13. Bovine rhodopsin is a membrane-spanning protein involved in vision. Because the interior of a membrane is very nonpolar, membrane spanning proteins generally present hydrophobic side chains in the parts of the protein that are exposed to that environment.

Creating a hydrophobicity plot is a good way to locate such sequences.

(a) Download the sequence of bovine rhodopsin and use BioEdit to produce a plot. (Select Sequence - Protein, and choose a type of plot.)

Selecting the type of plot determines the hydrophobic scale utilized:

(b) Identify the regions of the sequence most likely to be exposed in the interior of the membrane.

(c) Submit the sequence to the NNPredict web site. What kind of structure does the site predict for those membrane-spanning sections?
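To see what a hydrophobicity plot is doing under the hood, here is a minimal Python sketch of a sliding-window average using the Kyte-Doolittle hydropathy scale. The choice of scale, the 19-residue window, and the function name are assumptions made for illustration; BioEdit's own plots may use other scales or settings. The demonstration sequence is made up and is not rhodopsin.

```python
# Kyte-Doolittle hydropathy values (more positive = more hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydropathy_profile(sequence, window=19):
    """Mean hydropathy for each full window; returns (center, mean) pairs."""
    values = [KYTE_DOOLITTLE[residue] for residue in sequence.upper()]
    half = window // 2
    profile = []
    for start in range(len(values) - window + 1):
        mean = sum(values[start:start + window]) / window
        profile.append((start + half + 1, mean))  # 1-based center residue
    return profile

# Made-up sequence for illustration only, not rhodopsin.
demo = "MKKDE" + "LLIVAVLLIAFGLVLAVII" + "RRKDE"
for center, mean in hydropathy_profile(demo, window=19):
    print(center, round(mean, 2))
```

Stretches where the windowed average stays strongly positive over roughly twenty residues are the usual candidates for membrane-spanning segments. The sketch raises a KeyError for any letter outside the twenty standard amino acids.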

Here are my answers to problems 2, 3, and 10, as examples of what I am looking for. CLICK.


Just in case anyone would like to draw some chemical structures to include in their work, here are links to some free structure-drawing programs:

WinDrawChem WinPlot ChemSketch


This page last modified 12:42 PM on Tuesday February 10th, 2009.
Webmaster, Department of Chemistry, University of Maine, Orono, ME 04469