Genomes and Genes
Genome: the entire collection of genes encoded by a particular organism
Gene: ????
[Picture: A Gene (Autry)]
Not really....
The classic definition, based on the central dogma: a section of DNA coding for a protein
But: RNA is now known to have structural, catalytic, and regulatory roles of its own
Since Jacob and Monod in 1960, we have known that genes include regulatory regions that are not transcribed into RNA and thence into protein
So, let's try this.
Gene: a complete chromosomal segment responsible for making a functional product
OK, then, how do you recognize a gene when you meet one on the street?
Five criteria are commonly used:
Open reading frames (ORFs)
- ORF = string of codons bounded by start and stop signals
- In a random sequence, about one codon in 21 is a stop codon (3 of the 64 possible codons are stops)
- A stretch of 50 or more codons with no stop is therefore unlikely to occur by chance, and usually is an ORF
- Works best for prokaryotes, which have relatively few introns
- Critters like humans tend to have small exons tucked in between large introns, making ORFs difficult to find
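The ORF criterion above can be sketched in a few lines of Python. This is only an illustration, not a real gene finder: the function names `find_orfs` and `prob_no_stop` are made up for this example, the scan covers only the forward strand, it assumes ATG as the start signal and no introns, and the 50-codon cutoff is simply the rule of thumb given in the list.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"  # assumption: only the standard start codon is considered


def find_orfs(seq, min_codons=50):
    """Return (start, end, frame) for each ATG...stop run on the forward strand.

    start and end are 0-based positions; end is just past the stop codon.
    min_codons counts the codons before the stop, including the ATG.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the ATG that opened the current ORF, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None:
                if codon == START_CODON:
                    start = i
            elif codon in STOP_CODONS:
                n_codons = (i - start) // 3
                if n_codons >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs


def prob_no_stop(n_codons):
    """Chance that n random codons in a row contain no stop (61 of 64 are not stops)."""
    return (61 / 64) ** n_codons
```

With equal base frequencies, `prob_no_stop(50)` comes out to roughly 0.09, which is why a stop-free run of 50 codons is already suggestive, and longer runs much more so.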
Sequence features
- ORFs often contain non-random use of codons, or specific patterns, which can be analyzed statistically to locate genes
- However, current software catches fewer than 50% of exons, and only about 20% of complete genes
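As a rough illustration of what "non-random use of codons" means in practice, the sketch below tallies how often each codon appears in a candidate ORF. The function name `codon_frequencies` is hypothetical, and real gene-finding software uses far more elaborate statistical models than a simple tally; this only shows the raw quantity such models start from.

```python
from collections import Counter


def codon_frequencies(orf):
    """Return {codon: fraction of all codons} for a sequence read in frame 0.

    Trailing bases that do not fill a whole codon are ignored.
    """
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3
    codons = [orf[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(codons)
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}
```

Comparing these fractions with the codon usage of known genes from the same organism is one way a skewed, gene-like pattern might be flagged.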
Sequence conservation
- Comparing sequences among species sometimes works because Mother Nature tends to stick with a good idea when she finds one
- This approach requires sequences from related organisms with appropriate evolutionary distance
- Complications: regulatory regions tend to be highly conserved; evolutionary relationships may not be what you think
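One simple way to put a number on conservation is percent identity between two sequences that have already been aligned. The sketch below assumes the alignment was done elsewhere, with "-" marking gaps; the function name `percent_identity` is made up for this example, and skipping gapped positions is just one of several possible conventions.

```python
def percent_identity(a, b):
    """Percent of aligned, ungapped positions at which two sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # Keep only positions where neither sequence has a gap
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)
```

A high score by itself does not prove that a region is a gene, since, as noted above, regulatory regions can be highly conserved as well.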
Evidence for transcription
- Work backwards, from product to gene
- For example, microarrays containing chromosome segments can be hybridized to cDNA
- Problems: some conserved ORFs are not expressed; protein products of many genes have not been identified
Gene inactivation
- Disrupt a segment of DNA you think is a gene, or silence its transcript (for example, by RNA interference), and see what, if anything, goes missing
- Problem: many genes make products that do not have an obvious impact on the phenotype
- For example, only about 1/6 of yeast genes have an obvious impact on the phenotype when the yeast is grown in rich media
Many other difficulties exist:
- Overlapping reading frames
- Overlapping transcriptional units - an exon of one gene contained within the intron of another
- Overlapping protein-coding and RNA-coding genes
- Alternative gene splicing
- Pseudogenes
- Transposons ("hopping genes")
Genomics is such fun!
Although the complete genomes of some 800 organisms, mostly bacteria, have been sequenced, to my knowledge, there is no organism for which the complete genome has been explicated; that is, for which all genes and what they code for have been identified.
This page last modified 10:04 AM on Friday December 24th, 2010.
Webmaster, Department of Chemistry, University of Maine, Orono, ME 04469