Automated function prediction currently takes one of four general approaches:
Here, "interact" means to synthesize a substrate for, or to bind to, another protein, creating a functional complex.
We consider an example or two from each of these categories.
Evolutionary - The Rosetta Stone

| Inferred Pathways in E. coli |
|---|
| A = shikimate; B = purines; C, D = predicted from A, B |
Each symbol represents an enzyme in the given pathway: for example, AroB is dehydroquinate synthase (3CLH); AroD is dehydroquinate dehydratase (3JS3); PurK is a phosphoribosylaminoimidazole carboxylase....
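
As a rough illustration of the Rosetta Stone idea as it is generally described (two separate proteins are predicted to be functionally linked when their homologs appear fused into one protein in another genome), the sketch below scans alignment hits on candidate fusion proteins. The function name `rosetta_links`, the protein names, and the alignment coordinates are all hypothetical, and requiring non-overlapping alignment regions is an assumption of this sketch rather than a rule taken from the text.

```python
from itertools import combinations

def rosetta_links(fusion_hits):
    """Infer links between query proteins that align to separate,
    non-overlapping regions of the same protein in another genome.

    fusion_hits maps a candidate fusion protein to a list of
    (query_protein, start, end) alignments on that fusion protein.
    """
    links = set()
    for hits in fusion_hits.values():
        for (p1, s1, e1), (p2, s2, e2) in combinations(hits, 2):
            non_overlapping = e1 < s2 or e2 < s1
            if p1 != p2 and non_overlapping:
                links.add(frozenset((p1, p2)))
    return links

# Hypothetical data: fusionX covers protA and protB side by side,
# while the two hits on protY overlap and so are not treated as a fusion.
example_hits = {
    "fusionX": [("protA", 1, 350), ("protB", 360, 600)],
    "protY": [("protC", 1, 300), ("protD", 50, 200)],
}
predicted = rosetta_links(example_hits)
```

With this toy input, only the pair protA and protB is reported as linked.
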
Genomics - Inferring Genes Belonging to the Same Operon
In prokaryotes, operon organization means that a series of adjacent genes is transcribed into a single mRNA, which codes for the synthesis of multiple proteins.
| Possible Gene Arrangements |
|---|
| RBS = ribosome binding site; UTR = untranslated region |
| Threshold (bp) | Non-annotated genes with links | Non-annotated genes linked to one or more annotated genes |
|---|---|---|
| 0 | 474 | 217 (14%) |
| 25 | 786 | 412 (27%) |
| 50 | 913 | 521 (34%) |
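
The thresholds in the table can be read as a maximum intergenic distance: two neighboring genes are linked when the gap between them is no larger than the threshold. A minimal sketch of that counting follows. It assumes that only adjacent genes on the same strand are linked and that "linked to an annotated gene" means a direct neighbor; neither detail is stated in the text. The `Gene` type, the function names, and the coordinates are hypothetical.

```python
from typing import NamedTuple

class Gene(NamedTuple):
    name: str
    start: int       # first base of the gene (inclusive)
    end: int         # last base of the gene (inclusive)
    strand: str      # "+" or "-"
    annotated: bool  # True if the gene already has a functional annotation

def link_by_distance(genes, threshold_bp):
    """Return pairs of adjacent same-strand genes whose gap is <= threshold_bp."""
    ordered = sorted(genes, key=lambda g: g.start)
    links = []
    for left, right in zip(ordered, ordered[1:]):
        gap = right.start - left.end - 1  # negative when the genes overlap
        if left.strand == right.strand and gap <= threshold_bp:
            links.append((left, right))
    return links

def count_linked_unannotated(genes, threshold_bp):
    """Count non-annotated genes with any link, and those linked to an annotated gene."""
    with_links = set()
    linked_to_annotated = set()
    for a, b in link_by_distance(genes, threshold_bp):
        for gene, partner in ((a, b), (b, a)):
            if not gene.annotated:
                with_links.add(gene.name)
                if partner.annotated:
                    linked_to_annotated.add(gene.name)
    return len(with_links), len(linked_to_annotated)

# Hypothetical genes: gaps of 10 bp, 40 bp, and 100 bp (the last pair on opposite strands).
toy_genes = [
    Gene("g1", 1, 300, "+", True),
    Gene("g2", 311, 600, "+", False),
    Gene("g3", 641, 900, "+", False),
    Gene("g4", 1001, 1300, "-", True),
]
counts = {t: count_linked_unannotated(toy_genes, t) for t in (0, 25, 50)}
```

On the toy data, raising the threshold from 0 to 25 to 50 bp increases the number of non-annotated genes with links from 0 to 1 to 2, the same qualitative trend as in the table.
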
Genomics - Phylogenetic Profiles
The PLEX database stores:
The database can be searched iteratively using either a sequence or a phylogenetic profile, and it provides links for gene neighbors and Rosetta stones.
I submitted the sequence for a fungal quinone oxidoreductase modeled by our group:
MCFPSKRRKD GSPEEGGRIK RSRSAQEPAE STNTPAPPTS TGTKPTTTQT
TDTTMSSPRL AIVIYTMYGH VAKLAEAIKS GIEGAGGNAS IFQVAETLSP
EILNLVKAPP KPDYPVMDPL DLKNYDGFLF GIPTRYGNFP VQWKAFWDST
GPLWASTALC GKYAGLFVST GSPGGGQEST LMAAMSTLVH HGVIYVPLGY
KYTFAQLANL TEVRGGSPWG AGTFANSDGS RQPTPLELEI ANLQGKSFYE
YVARVKW
| The Phylogenetic Profile |
|---|
| Organisms |
| Enzymes |
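
To make the idea of a phylogenetic profile concrete, the sketch below encodes each protein as a presence/absence vector across a set of genomes and retrieves proteins whose profiles match a query. It illustrates the general technique and is not how PLEX itself is implemented. The genome and protein names, the function names, and the use of Hamming distance as the similarity measure are assumptions of this sketch.

```python
def build_profile(genomes_with_homolog, genomes):
    """Binary vector: 1 if the protein has a homolog in the genome, else 0."""
    return tuple(1 if g in genomes_with_homolog else 0 for g in genomes)

def hamming(p, q):
    """Number of genomes in which two profiles disagree."""
    return sum(a != b for a, b in zip(p, q))

def similar_profiles(query_name, profiles, max_distance=0):
    """Names of other proteins whose profile is within max_distance of the query's."""
    query = profiles[query_name]
    return sorted(
        name
        for name, profile in profiles.items()
        if name != query_name and hamming(query, profile) <= max_distance
    )

# Hypothetical homolog occurrences across four hypothetical genomes.
genomes = ["genome1", "genome2", "genome3", "genome4"]
homologs = {
    "protA": {"genome1", "genome3"},
    "protB": {"genome1", "genome3"},
    "protC": {"genome2"},
    "protD": {"genome1", "genome3", "genome4"},
}
profiles = {name: build_profile(found, genomes) for name, found in homologs.items()}
exact = similar_profiles("protA", profiles)
near = similar_profiles("protA", profiles, max_distance=1)
```

Here protB shares protA's profile exactly, and protD differs in a single genome, so it appears only when one mismatch is allowed.
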