Gene Finding

With advances in high-throughput sequencing technologies, complete genomic sequences of novel species are becoming increasingly abundant. Given a new genome, one of the most important tasks is determining the structure of its protein-coding genes. This procedure, known as gene prediction or gene finding, is essential for genome characterization and enables downstream bioinformatics applications such as functional annotation.

This functionality can be found under Genome Analysis → Gene Finding.

Two gene finding strategies are available:

  • Eukaryotic Gene Finding: This functionality is based on AUGUSTUS, a program designed to predict genes in eukaryotic genomic sequences. It is among the most accurate programs for the species it has been trained on.

  • Prokaryotic Gene Finding: This functionality is based on Glimmer, a system for finding genes in microbial genomes (bacteria, archaea, and viruses). Glimmer uses Interpolated Markov Models (IMMs) to identify coding regions and distinguish them from noncoding DNA.
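
To give a feel for how an interpolated Markov model can separate coding from noncoding DNA, the following is a minimal sketch and not Glimmer's actual implementation. It blends nucleotide probabilities from Markov chains of increasing order, trusting a longer context more the more often it was seen in training, and then compares a "coding" model with a "noncoding" model by log-odds. The class name ToyIMM, the parameters pseudo and trust, the count-based weighting rule, and the tiny training strings are all assumptions made for illustration; Glimmer derives its interpolation weights differently and is trained on real sequence data.

```python
import math
from collections import defaultdict

BASES = "ACGT"  # assumption: sequences are uppercase and contain only these letters


class ToyIMM:
    """Toy interpolated Markov model over DNA (illustrative only)."""

    def __init__(self, max_order=3, pseudo=1.0, trust=10.0):
        self.max_order = max_order  # longest context length used
        self.pseudo = pseudo        # pseudocount, keeps probabilities above zero
        self.trust = trust          # hypothetical knob: counts needed to trust a context
        # counts[k][context][base] = times `base` followed the length-k `context`
        self.counts = [defaultdict(lambda: defaultdict(int))
                       for _ in range(max_order + 1)]

    def train(self, sequences):
        for seq in sequences:
            for i, base in enumerate(seq):
                for k in range(min(i, self.max_order) + 1):
                    self.counts[k][seq[i - k:i]][base] += 1

    def prob(self, context, base):
        """P(base | context), interpolated from order 0 up to max_order."""
        p = 1.0 / len(BASES)  # start from a uniform distribution
        for k in range(min(len(context), self.max_order) + 1):
            ctx = context[len(context) - k:]
            table = self.counts[k].get(ctx, {})
            total = sum(table.values())
            # Smoothed estimate from the order-k counts alone.
            ml = (table.get(base, 0) + self.pseudo) / (total + self.pseudo * len(BASES))
            # Weight grows with the number of observations of this context.
            lam = total / (total + self.trust)
            p = lam * ml + (1.0 - lam) * p
        return p

    def log_score(self, seq):
        """Log-probability of the whole sequence under this model."""
        return sum(
            math.log(self.prob(seq[max(0, i - self.max_order):i], base))
            for i, base in enumerate(seq)
        )


def log_odds(seq, coding_model, noncoding_model):
    """Positive values mean `seq` looks more like the coding training data."""
    return coding_model.log_score(seq) - noncoding_model.log_score(seq)


# Made-up training data, purely to exercise the code.
coding = ToyIMM()
coding.train(["ATGGCC" * 20])
noncoding = ToyIMM()
noncoding.train(["AT" * 60])

print(log_odds("GCCATGGCC", coding, noncoding))
print(log_odds("TATATATA", coding, noncoding))
```

Because every order uses the same weight for all four bases, the interpolated values still sum to one for a given context. Falling back to shorter contexts when a long one is rare is the reason for interpolating in the first place: it lets the model use long-range signal where training data supports it without overfitting where it does not.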