Gene Ontology Mapping

Mapping is the process of retrieving GO terms associated with the hits obtained by the BLAST search. OmicsBox performs four different mapping steps:

  1. BLAST result accessions are used to retrieve gene names or symbols making use of two mapping files provided by the NCBI (gene_info, gene2accession). Identified gene names are then searched in the species-specific entries of the gene-product table of the GO database.

  2. GenBank identifiers (gi), the primary BLAST hit IDs, are used to retrieve UniProt IDs making use of a mapping file from PIR (Non-redundant Reference Protein Database) including PSD, UniProt, Swiss-Prot, TrEMBL, RefSeq, GenPept and PDB.

  3. Accessions are searched directly in the dbxref table of the GO database.

  4. BLAST result accessions are searched directly in the gene-product table of the GO database.
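The four lookups above can be pictured as a chain of table lookups. The following Python sketch is only an illustration under stated assumptions: every table, identifier, and the function name map_hit_to_go are hypothetical toy data, and combining the four routes into one set is an assumption, not a description of the OmicsBox internals.

```python
# Toy illustration of the four mapping routes.
# All tables and identifiers are hypothetical placeholders.

# Step 1: accession -> gene name (stand-in for gene_info / gene2accession),
# then gene name -> GO terms (stand-in for the GO gene-product table).
ACCESSION_TO_GENE = {"ACC_0001": "geneA"}
GENE_TO_GO = {"geneA": {"GO:0000001"}}

# Step 2: hit identifier -> UniProt ID (stand-in for the PIR mapping file),
# then UniProt ID -> GO terms.
HIT_ID_TO_UNIPROT = {"ACC_0002": "UP_0002"}
UNIPROT_TO_GO = {"UP_0002": {"GO:0000002"}}

# Step 3: accession -> GO terms via the dbxref table.
DBXREF_TO_GO = {"ACC_0003": {"GO:0000003"}}

# Step 4: accession -> GO terms directly via the gene-product table.
GENE_PRODUCT_TO_GO = {"ACC_0001": {"GO:0000004"}}


def map_hit_to_go(accession):
    """Collect candidate GO terms for one hit accession from all four routes."""
    terms = set()
    gene = ACCESSION_TO_GENE.get(accession)
    if gene is not None:
        terms |= GENE_TO_GO.get(gene, set())
    uniprot_id = HIT_ID_TO_UNIPROT.get(accession)
    if uniprot_id is not None:
        terms |= UNIPROT_TO_GO.get(uniprot_id, set())
    terms |= DBXREF_TO_GO.get(accession, set())
    terms |= GENE_PRODUCT_TO_GO.get(accession, set())
    return terms
```

In this sketch a hit that matches none of the tables simply yields an empty set, which corresponds to a sequence without mapping results.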

Figure 1: Mapping options

  • Run Mapping. Mapping will start.

  • Remove Mapping. Delete Mapping results for the selected sequences.

The mapping step needs protein IDs to run. Make sure you ran BLAST against a protein database, using the program that matches your input:

  • blastx: if you have nucleotide sequences

  • blastp: if you have protein sequences

Show Individual Mapping Results

For each sequence, it is possible to see the mapping results individually.

  1. Show Mapping Results. A new table will be displayed (see figure 3). The resulting table shows the GO mapping results for a particular sequence. See Table section to manipulate/extract the results from this table.

  2. Show GO Descriptions. GO ID, description, type, and definition are given for all GO terms associated with the selected sequence. The GO ID is linked to the AmiGO browser at the Gene Ontology site while the show option displays the DAG representation of the GO term.

  3. Annotate Sequence. This function allows changing annotation parameters for the selected sequence and re-running automatic annotation.

  4. Change Annotation and Description. This function edits the annotation of the selected sequence and allows typing and deleting of annotations or the sequence description. A manual annotation check-box (see figure 5 in Gene Ontology Annotation section) is available for marking sequences with manual annotation. The sequence will get the pink label on the Main Sequence Table.

  5. Make Graph of GO-Mapping-Results with Annotation Score. Displays a DAG with all GO terms related to one sequence. Shows all the GOs from the mapping step as well as final annotations (highlighted). The wizard (figure 4) allows filtering the hits which will be taken into account (see Gene Ontology Graphs section for more details about visualization in OmicsBox):

    1. Hit Filter. Nodes can be filtered by number of hits: only nodes with more than a given number of BLAST hits will be shown in the graph.

    2. HSP-Hit Coverage CutOff. Includes only those hits that are covered by the HSP to at least the given percentage.
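The two filters can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name filter_graph_nodes, the dictionary keys, and the coverage formula (HSP length divided by hit length) are hypothetical, and the sketch ignores any propagation of hits along the DAG that the application may perform.

```python
from collections import Counter


def filter_graph_nodes(hits, min_hits, min_coverage_pct):
    """Return GO terms supported by more than min_hits sufficiently covered hits.

    Each hit is a dict with the hypothetical keys "hit_length",
    "hsp_length" and "go_terms".
    """
    counts = Counter()
    for hit in hits:
        # Assumed definition: share of the hit that is covered by the HSP.
        coverage = 100.0 * hit["hsp_length"] / hit["hit_length"]
        if coverage < min_coverage_pct:
            continue  # HSP-Hit Coverage CutOff: skip poorly covered hits
        counts.update(set(hit["go_terms"]))
    # Hit Filter: keep only nodes with MORE than min_hits supporting hits.
    return {go_id for go_id, n in counts.items() if n > min_hits}


# Placeholder data for illustration only.
example_hits = [
    {"hit_length": 200, "hsp_length": 180, "go_terms": ["GO:A", "GO:B"]},
    {"hit_length": 200, "hsp_length": 160, "go_terms": ["GO:A"]},
    {"hit_length": 400, "hsp_length": 100, "go_terms": ["GO:A", "GO:B"]},
]
```

With a coverage cutoff of 50, the third example hit (25% coverage) is ignored before the hits per node are counted.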

Figure 2: Show Mapping Results

Figure 3: Mapping Results for sequence C02006A02

Figure 4: Single Graph Drawing Configuration


If a BLAST result is successfully mapped to one or several GO terms, these will be shown in the GOs column of the Main Sequence Table and the sequence row will turn light-green. Assigned GOs can be reviewed in the BLAST results table (see figure 8 of the Show BLAST Results section).

Four different charts are available to summarise the mapping step:

  • GO Mapping Distribution: Shows the distribution of the number of Gene Ontology candidate terms assigned to each sequence during the GO Mapping step.

  • EC Distribution for Blast Hits (figure 5): Shows the distribution of evidence codes associated with the GO pool obtained for the BLAST hits.

  • EC Distribution for Sequences (figure 6): Shows the distribution of GO evidence codes for the functional terms obtained during the mapping step. It gives an idea of how many annotations derive from automatic/computational annotation and how many from manual curation.

  • DB Resources of Mapping (figure 7): Shows the distribution of the number of annotations (GO terms) retrieved from the different source databases, e.g. UniProt, PDB, TAIR.

IEA (Inferred from Electronic Annotation) commonly dominates the mapping results. However, the contribution of this (and any other) evidence code to the annotation finally assigned to the query set can be modulated at the annotation step.
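What the evidence code charts summarise can be reproduced with a simple count. The sketch below is an illustration only: the input format (pairs of GO ID and evidence code), the function names, and the example values are assumptions, not an OmicsBox API or real mapping output.

```python
from collections import Counter


def evidence_code_distribution(mappings):
    """Count how often each evidence code occurs in (go_id, evidence_code) pairs."""
    return Counter(code for _go_id, code in mappings)


def fraction(counts, code):
    """Return the share of all counted annotations that carry the given code."""
    total = sum(counts.values())
    return counts[code] / total if total else 0.0


# Placeholder mapping results for illustration only.
example_mappings = [
    ("GO:A", "IEA"),
    ("GO:B", "IEA"),
    ("GO:C", "IDA"),
    ("GO:D", "IEA"),
]
example_counts = evidence_code_distribution(example_mappings)
```

Here fraction(example_counts, "IEA") gives the share of electronically inferred annotations in the toy data, the kind of imbalance the charts make visible.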

Mapping Statistics Graphs:

Figure 5: Evidence Code Distribution of BLAST hits

Figure 6: Evidence Code Distribution for sequences

Figure 7: DB Resources of Mapping

Export Mapping Results

A tab-separated text file with the corresponding mapping results can be exported (File > Export > Export Mapping Results).
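An exported tab-separated file can be read with standard tools. The following Python sketch assumes nothing about the column layout, which is not described here; the function name read_tab_separated and the sample content are hypothetical stand-ins for a real export.

```python
import csv
import io


def read_tab_separated(handle):
    """Return all non-empty rows of a tab-separated text stream as lists of strings."""
    return [row for row in csv.reader(handle, delimiter="\t") if row]


# In-memory stand-in for an exported file; the real columns may differ.
sample = io.StringIO("seq_1\tGO:0000001\nseq_2\tGO:0000002\n")
rows = read_tab_separated(sample)
```

For a file on disk, the same function can be given a handle opened with open(path, newline="", encoding="utf-8"), where path is the location chosen in the export dialog.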