Predicting Protein Structure

Whether we call it predicting or folding, the problem is the same one. Given the primary structure of a protein, how do we find the tertiary structure theoretically (that is, without X-rays)?

Why do we want to predict protein structure?

In a recent review of the state of the art, Baker [Science, 2005, 310, 638] makes the point that prediction and design are complementary processes that employ essentially the same methodology.

Predicting Secondary Structure is relatively easy. For example:

For greater accuracy we need:

Results tend to be right 70-75% of the time.
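A figure such as "right 70-75% of the time" is usually read as a per-residue score: the fraction of residues whose predicted state (helix H, strand E, or coil C) matches the experimentally observed one. The sketch below assumes that reading. The function name `q3_accuracy` and the two state strings are hypothetical and are not output from any real server.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Return the fraction of residues whose predicted state matches the observed one.

    Both arguments are strings of per-residue states, e.g. H (helix),
    E (strand) and C (coil), one character per residue.
    """
    if len(predicted) != len(observed):
        raise ValueError("state strings must have the same length")
    if not predicted:
        raise ValueError("state strings must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(predicted)


# Hypothetical 12-residue example, not taken from a real protein.
predicted = "CCHHHHHHCCEE"
observed = "CHHHHHHCCCEE"
print(f"{q3_accuracy(predicted, observed):.0%}")  # 10 of 12 residues agree
```

Scored this way, a prediction can be mostly correct and still misplace the ends of a helix or strand, which is where most of the errors in this toy example fall.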

To illustrate secondary structure prediction and the other prediction methodologies, I have used the sequence of the amyloid precursor protein (APP); one of its proteolysis products is the Alzheimer's amyloid protein:

MLPGLALLLLAAWTARALEVPTDGNAGLLAEPQIAMFCGRLNMHMNVQNGKWDSDPSGTKT
CIDTKEGILQYCQEVYPELQITNVVEANQPVTIQNWCKRGRKQCKTHPHFVIPYRCLVGEFVSD
ALLVPDKCKFLHQERMDVCETHLHWHTVAKETCSEKSTNLHDYGMLLPCGIDKFRGVEFVCC
PLAEESDNVDSADAEEDDSDVWWGGADTDYADGSEDKVVEVAEEEEVAEVEEEEADDDE
DDEDGDEVEEEAEEPYEEATERTTSIATTTTTTTESVEEVVREVCSEQAETGPCRAMISRWYF
DVTEGKCAPFFYGGCGGNRNNFDTEEYCMAVCGSAMSQSLLKTTQEPLARDPVKLPTTAAST
PDAVDKYLETPGDENEHAHFQKAKERLEAKHRERMSQVMREWEEAERQAKNLPKADKKAVI
QHFQEKVESLEQEAANERQQLVETHMARVEAMLNDRRRLALENYITALQAVPPRPRHV
FNMLKKYVRAEQKDRQHTLKHFEHVRMVDPKKAAQIRSQVMTHLRVIYERMNQSLSLLYNVPA
VAEEIQDEVDELLQKEQNYSDDVLANMISEPRISYGNDALMPSLTETKTTVELLPVNGEFSLDDLQ
PWHSFGADSVPANTENEVEPVDARPAADRGLTTRPGSGLTNIKTEEISEVKMDAEFRHDSGYEV
HHQKLVFFAEDVGSNKGAIIGLMVGGVVIATVIVITLVMLKKKQYTSIHHGVVEVDAAVTPEERHLS
KMQQNGYENPTYKFFEQMQN

Submitted to the PsiPred server (link on the main page), this sequence returned the following result by email:

PsiPred Prediction
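A sequence listed on a page in wrapped lines, like the APP listing above, generally has to be joined into one string before it is pasted into a prediction server. The following is a minimal sketch of that clean-up step, assuming the server accepts FASTA-style input. The function name `to_fasta` and the header `APP_fragment` are hypothetical, and only the first two lines of the listing are used to keep the example short.

```python
def to_fasta(header: str, wrapped_sequence: str, width: int = 60) -> str:
    """Join a line-wrapped one-letter sequence and return it as a FASTA record."""
    # Drop line breaks and any other whitespace left over from the listing.
    sequence = "".join(wrapped_sequence.split()).upper()
    # Accept only the 20 standard amino acid one-letter codes.
    unexpected = sorted(set(sequence) - set("ACDEFGHIKLMNPQRSTVWY"))
    if unexpected:
        raise ValueError(f"unexpected characters in sequence: {unexpected}")
    # Re-wrap at a fixed width so every line has the same length.
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return "\n".join([f">{header}"] + lines) + "\n"


# First two lines of the APP listing, as they appear on the page.
listing = """
MLPGLALLLLAAWTARALEVPTDGNAGLLAEPQIAMFCGRLNMHMNVQNGKWDSDPSGTKT
CIDTKEGILQYCQEVYPELQITNVVEANQPVTIQNWCKRGRKQCKTHPHFVIPYRCLVGEFVSD
"""
record = to_fasta("APP_fragment", listing)
print(record)
```

The character check is there because a stray digit or punctuation mark copied along with the sequence is easier to catch before submission than to diagnose from a failed job.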

Going after tertiary structure is MUCH more difficult.

The difficulty is related to the overall protein folding problem.

Levinthal's Paradox [J. Chim. Phys., 1968, 65, 44] suggests that, given the size of the conformation space available to a protein of any significant size, finding a particular conformation starting from any other conformation could take longer than the age of the Universe.

Consider a polypeptide with 100 amino acid residues and two equally possible conformations per residue: it has 2^100, or about 1.3 × 10^30, possible conformations.
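The arithmetic behind this estimate can be sketched in a few lines. The sampling rate of 10^12 conformations per second is an assumption chosen here for illustration; it does not come from Levinthal's paper or from this page, and the function names are hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def conformation_count(residues: int, states_per_residue: int) -> int:
    """Number of chain conformations if every residue picks its state independently."""
    return states_per_residue ** residues


def exhaustive_search_years(residues: int, states_per_residue: int,
                            rate_per_second: float) -> float:
    """Years needed to visit every conformation once at the given sampling rate."""
    seconds = conformation_count(residues, states_per_residue) / rate_per_second
    return seconds / SECONDS_PER_YEAR


# 100 residues, two conformations each, as in the text.
n = conformation_count(100, 2)
# ASSUMPTION: one conformation sampled per picosecond (1e12 per second).
years = exhaustive_search_years(100, 2, 1e12)

print(f"{n:.2e} conformations")   # about 1.27e+30
print(f"{years:.1e} years")       # about 4.0e+10
```

Under that assumed rate the enumeration takes roughly 4 × 10^10 years, a few times the age of the Universe (about 1.4 × 10^10 years). Allowing three conformations per residue instead of two makes the count, and the time, larger by many more orders of magnitude.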

Hence, simply searching the conformation space of a protein for the "best" tertiary structure is futile, even with a supercomputer.

Several methodologies have been developed for dealing with this problem.

We will take a brief look at:

Comparative Modeling consists of four steps:

Two general methodologies are used for finding the template:

  • Sequence comparison plus distance geometry data from NMR

    • Experimental data allow use of less than optimum alignments

    • However, they require that the protein have been isolated and purified

    Software for implementing homology or comparative modeling includes

    • Modeller, from Andrej Sali's lab (link on main page), available for free download

    • SwissMod (the SWISS-MODEL server), accessible from the Swiss PDB Viewer (Deep View)

    Reading the raw APP sequence into Deep View, and submitting it to SwissMod brought the result:

    SwissMod Results

    Satyavan Singh in my group at Maine is applying homology modeling to the structure of a quinone reductase from a brown rot fungus. The sequence was determined at the USDA Forest Products Laboratory in Madison by Ken Hammel and his group:

    MCFPSKRRKDGSPEEGGRIKRSRSAQEPAESTNTPAPPTSTGTKPTTTQTTDTTMSSPRLAIVIYTMYGH
    VAKLAEAIKSGIEGAGGNASIFQVAETLSPEILNLVKAPPKPDYPVMDPLDLKNYDGFLFGIPTRYGNFP
    VQWKAFWDSTGPLWASTALCGKYAGLFVSTGSPGGGQESTLMAAMSTLVHHGVIYVPLGYKYTFAQLANL
    TEVRGGSPWGAGTFANSDGSRQPTPLELEIANLQGKSFYEYVARVKW

    A first pass through the SwissMod site found three template proteins, each with approximately 40% sequence identity to our sequence. When modeled against those templates, our sequence produced the structure:

    Quinone Reductase from Brown Rot Fungus, G. trabeum

    The enzyme is known to require FAD as a cofactor; our next step is docking FAD into this structure.
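The "approximately 40% sequence identity" quoted above for the templates is a number computed from a pairwise alignment of target and template. The sketch below shows one common way to compute it, assuming the two sequences are already aligned with `-` marking gaps and that identity is counted over the columns where neither sequence has a gap. Servers may use a different denominator, so this is not necessarily how SwissMod arrives at its figure. The function name and both aligned fragments are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of gap-free alignment columns in which the two residues are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns where both sequences have a residue.
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not pairs:
        raise ValueError("no gap-free columns to compare")
    identical = sum(a == b for a, b in pairs)
    return 100.0 * identical / len(pairs)


# Hypothetical aligned fragments, for illustration only.
target = "MKLAIVIYSG-HV"
template = "MRLAVVIYTGNHV"
print(f"{percent_identity(target, template):.0f}% identity")  # 9 of 12 columns match
```

At identities near 40%, the alignment itself is usually reliable enough for comparative modeling, but loops and insertions that have no counterpart in the template remain the least certain parts of the model.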


    This page last modified 1:26 PM on Friday February 3rd, 2006.
    Webmaster, Department of Chemistry, University of Maine, Orono, ME 04469