Gene Ontology Mapping

Introduction

Mapping is the process of retrieving GO terms associated with the hits obtained by the BLAST search. OmicsBox performs four different mapping steps:

  1. BLAST result accessions are used to retrieve gene names or symbols making use of two mapping files provided by the NCBI (gene_info, gene2accession). Identified gene names are then searched in the species-specific entries of the gene-product table of the GO database.

  2. GenBank identifiers (gi), the primary BLAST hit IDs, are used to retrieve UniProt IDs making use of a mapping file from PIR (Non-redundant Reference Protein Database) including PSD, UniProt, Swiss-Prot, TrEMBL, RefSeq, GenPept and PDB.

  3. Accessions are searched directly in the dbxref table of the GO database.

  4. BLAST result accessions are searched directly in the gene-product table of the GO database.
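The four lookups above can be pictured with a minimal Python sketch. It is only an illustration and not the OmicsBox implementation. The dictionaries are toy stand-ins for the NCBI mapping files, the PIR mapping file and the GO database tables, and every identifier in them is made up. The sketch also assumes that UniProt IDs found in step 2 are then looked up for GO terms and that the results of all four steps are merged, neither of which is stated above. The species-specific restriction of step 1 is left out for brevity.

```python
# Toy stand-ins for the real resources; every identifier here is hypothetical.
GENE2ACCESSION = {"ACC001": "geneA"}        # accession -> gene name (NCBI files)
GENE_PRODUCT = {                            # GO gene-product table
    "geneA": {"GO:0000001"},
    "ACC004": {"GO:0000004"},
}
GI2UNIPROT = {"gi|111": "U00001"}           # gi -> UniProt ID (PIR mapping file)
UNIPROT_GO = {"U00001": {"GO:0000002"}}     # assumed lookup for UniProt IDs
DBXREF = {"ACC003": {"GO:0000003"}}         # GO dbxref table


def map_hit(accession, gi=None):
    """Collect candidate GO terms for one BLAST hit (illustrative only)."""
    terms = set()

    # Step 1: accession -> gene name -> gene-product table.
    gene_name = GENE2ACCESSION.get(accession)
    if gene_name is not None:
        terms |= GENE_PRODUCT.get(gene_name, set())

    # Step 2: gi -> UniProt ID (the follow-up GO lookup is an assumption).
    uniprot_id = GI2UNIPROT.get(gi)
    if uniprot_id is not None:
        terms |= UNIPROT_GO.get(uniprot_id, set())

    # Step 3: accession searched directly in the dbxref table.
    terms |= DBXREF.get(accession, set())

    # Step 4: accession searched directly in the gene-product table.
    terms |= GENE_PRODUCT.get(accession, set())

    return terms
```

Calling map_hit("ACC003", gi="gi|111") on these toy tables collects the terms found through steps 2 and 3, while an accession that appears in none of the tables yields an empty set.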

Please cite:

Gotz S., Garcia-Gomez JM., Terol J., Williams TD., Nagaraj SH., Nueda MJ., Robles M., Talon M., Dopazo J. and Conesa A. (2008). High-throughput functional annotation and data mining with the Blast2GO suite. Nucleic acids research, 36(10), 3420-35.

Figure 1: Mapping options

Run Blast2GO Mapping

The Blast2GO Mapping functionality can be found under functional analysis → Blast2GO Mapping.

The Blast2GO Mapping options are:

  • Run Mapping. Mapping will start using the online mapping database.

  • Remove Mapping. Delete Mapping results for the selected sequences.

  • Run GO Mapping (Direct DB Connection). It is possible to run Mapping using a Blast2GO local database by setting up MongoDB.

The mapping step needs protein IDs to run. Make sure that BLAST was run against a protein database:

  • blastx, if the input consists of nucleotide sequences

  • blastp, if the input consists of protein sequences
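When it is unclear which type of sequences a data set contains, a quick check of the alphabet can help to pick the program. The following Python sketch is a rough heuristic and not part of OmicsBox. The function name and the 90% threshold are assumptions chosen for illustration.

```python
NUCLEOTIDE_LETTERS = set("ACGTUN")


def suggest_blast_program(sequence, threshold=0.9):
    """Suggest blastx for nucleotide input and blastp for protein input.

    Heuristic: if at least `threshold` of the letters are A, C, G, T, U or N,
    the sequence is treated as nucleotide. The threshold is an assumption.
    """
    residues = [c for c in sequence.upper() if c.isalpha()]
    if not residues:
        raise ValueError("sequence contains no letters")
    nucleotide_fraction = sum(c in NUCLEOTIDE_LETTERS for c in residues) / len(residues)
    return "blastx" if nucleotide_fraction >= threshold else "blastp"
```

Very short or low-complexity protein sequences can be misclassified by such a check, so the result should be treated as a hint only.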

Results

Show Mapping Results

For each sequence, it is possible to see the mapping results individually.

  1. Show Mapping Results. A new table will be displayed (see figure 3). The resulting table shows the GO mapping results for a particular sequence. See the Table section to manipulate or extract the results from this table.

  2. Show GO Descriptions. GO ID, description, type, and definition are given for all GO terms associated with the selected sequence. The GO ID is linked to the AmiGO browser at the Gene Ontology site while the show option displays the DAG representation of the GO term.

  3. Annotate Sequence. This function allows changing annotation parameters for the selected sequence and re-running automatic annotation.

  4. Change Annotation and Description. This function edits the annotation of the selected sequence and allows annotations or the sequence description to be typed in or deleted. A manual annotation check-box (see figure 5 in the Gene Ontology Annotation section) is available for marking sequences with manual annotation. The sequence will get the pink label on the Main Sequence Table.

  5. Make Graph of GO-Mapping-Results with Annotation Score. Displays a DAG with all GO terms related to one sequence. Shows all the GOs from the mapping step as well as final annotations (highlighted). The wizard (figure 4) allows filtering the hits that will be taken into account (see the Gene Ontology Graphs section for more details about visualization in OmicsBox):

    1. Hit Filter. Nodes can be filtered by number of hits: only nodes with more than a given number of BLAST hits will be shown in the graph.

    2. HSP-Hit Coverage CutOff. Includes only those hits that are covered by the HSP to at least the given percentage.
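The two wizard filters can be sketched in Python as follows. This is only an illustration of the idea and not the OmicsBox code. It assumes that the HSP-hit coverage is the HSP length divided by the hit length, expressed as a percentage, and the field names hsp_length and hit_length are hypothetical.

```python
def filter_nodes_by_hits(node_hit_counts, min_hits):
    """Keep only GO nodes supported by more than `min_hits` BLAST hits."""
    return {go_id: n for go_id, n in node_hit_counts.items() if n > min_hits}


def hsp_hit_coverage(hsp_length, hit_length):
    """Percentage of the hit covered by the HSP (assumed definition)."""
    return 100.0 * hsp_length / hit_length


def filter_hits_by_coverage(hits, cutoff_percent):
    """Keep only hits whose HSP-hit coverage reaches the cutoff."""
    return [
        hit for hit in hits
        if hsp_hit_coverage(hit["hsp_length"], hit["hit_length"]) >= cutoff_percent
    ]
```

For example, with a cutoff of 50, a hit of length 100 with an HSP of length 80 is kept, while a hit of length 100 with an HSP of length 30 is dropped.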

For mapping statistics charts, see the Charts and Statistics page of this user manual.


Figure 2: Show Mapping Results

Figure 3: Mapping Results for sequence C02006A02

Figure 4: Single Graph Drawing Configuration

Export Mapping Results

A tab-separated text file with the corresponding mapping results can be exported (File > Export > Export Mapping Results).
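An exported file can be read with any tool that understands tab-separated text. The following is a minimal Python sketch using the standard csv module. It assumes that the first line of the export is a header row. The column names in the usage example (SeqName, GO) are hypothetical, so check the header of an actual export before relying on them.

```python
import csv


def read_mapping_export(handle):
    """Read a tab-separated export into a list of dicts keyed by column header."""
    reader = csv.DictReader(handle, delimiter="\t")
    return list(reader)


def load_mapping_export(path):
    """Open an exported file and parse it with read_mapping_export."""
    with open(path, newline="", encoding="utf-8") as handle:
        return read_mapping_export(handle)
```

A call such as load_mapping_export("mapping_results.txt") returns one dictionary per row, so a value can be accessed as rows[0]["SeqName"] if the export contains a column with that name.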