Databases and Sequence Alignments with BioEdit or Jalview

We shall access GenBank by going through the Entrez gateway at NCBI.

A link to Entrez is on our main course page. Here's another. This takes you to the gateway for all NCBI databases:

Entrez Gateway

We're going to search for, and compare, the genes for insulin in humans (Homo sapiens) and chimpanzees (Pan troglodytes).

Select Nucleotide from the left-hand column. The page pictured below will appear.

Nucleotide Search Page

In the drop-down box at the top, change Nucleotide to Gene. Then choose Advanced Search:

Choose First Search Term

In the drop-down box at the left, select Organism; then type homo sapiens in the blank. Click the Add to Search Box button:

Now change the drop-down box to Gene Name, and type INS in the blank. Click Add to Search Box:

Click the SEARCH button:
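For the curious: the search we just built with the drop-down boxes is an ordinary Entrez query string. Here is a minimal Python sketch of that string and of the matching URL for NCBI's E-utilities esearch service. It only builds the text; it does not contact NCBI. The function names build_gene_term and build_gene_search_url are made up for this illustration, and using Python at all is an assumption, since the course itself works through the web pages.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities search service.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_gene_term(organism, gene_name):
    """Combine the two search terms the way the Advanced Search page does.

    (Hypothetical helper, named for this example only.)
    """
    return f"{organism}[Organism] AND {gene_name}[Gene Name]"


def build_gene_search_url(organism, gene_name):
    """Return an esearch URL that queries the Gene database."""
    params = {"db": "gene", "term": build_gene_term(organism, gene_name)}
    return ESEARCH + "?" + urlencode(params)


# Show the query term and the URL for the human insulin search.
print(build_gene_term("homo sapiens", "INS"))
print(build_gene_search_url("homo sapiens", "INS"))
```

Swapping in "pan troglodytes" for the organism gives the corresponding chimpanzee search.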

Several versions of the gene are listed: the entire gene, including introns; the gene broken into two segments; and the complete coding sequence (cds) of the mRNA (cDNA). The cDNA is the sequence obtained by reverse transcription of the mRNA, and it is the easiest to work with, since it contains only the coding information, with no introns. Click on it, and then choose FASTA:

At the upper right, click on Send, choose Coding Sequence, and then click on Create File. Provide a filename with the extension .fasta or .fas. You have now saved the gene sequence on your computer. I like to keep a separate folder for my sequences to make finding them easy.
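A FASTA file is plain text: a header line beginning with ">" followed by one or more lines of sequence. If you'd like to check a saved file outside BioEdit or Jalview, a minimal Python sketch of a reader might look like the one below. The function name read_fasta is made up for this example, and the record shown is a toy placeholder, not the real insulin sequence.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines.

    (Hypothetical helper, named for this example only.)
    """
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence may span several lines
    if header is not None:
        yield header, "".join(chunks)


# Toy record for illustration only (NOT the real insulin sequence).
example = """>toy_record placeholder
ATGGCC
CTGTGG
""".splitlines()

for name, seq in read_fasta(example):
    print(name, len(seq))
```

To read a file you saved, pass the open file instead of the toy lines, for example open("my_sequence.fasta"), where the filename is whatever you chose when saving.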

To explore further, click on GenBank, just above the sequence:

This is the full GenBank entry describing the gene.

For a graphical representation, click Graphics:

Repeat the process using Pan troglodytes as the organism.
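Once both sequences are saved, the simplest possible comparison is to count the positions at which they agree. The Python sketch below does just that, and it assumes the two sequences are already aligned and of equal length; it is not an alignment method, and it is no substitute for the alignment that BioEdit or Jalview performs. The function name percent_identity and the two short sequences are made up for illustration; they are not real insulin data.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of positions at which two equal-length sequences match.

    Assumes the sequences are already aligned (hypothetical helper).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be the same length (align them first)")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    # Compare position by position, ignoring upper/lower case.
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


# Toy sequences for illustration only; they differ at the last position.
toy_a = "ATGGCCCTGT"
toy_b = "ATGGCCCTGA"
print(percent_identity(toy_a, toy_b))  # 9 of 10 positions match -> 90.0
```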

Protein sequences can be obtained similarly: just choose Protein from the Entrez gateway page.

