Gene Ontology Mapping
Mapping is the process of retrieving GO terms associated with the hits obtained by the BLAST search. OmicsBox performs four different mapping steps:
- BLAST result accessions are used to retrieve gene names or symbols making use of two mapping files provided by the NCBI (gene_info, gene2accession). Identified gene names are then searched in the species-specific entries of the gene-product table of the GO database.
- GenBank identifiers (gi), the primary BLAST hit IDs, are used to retrieve UniProt IDs using a mapping file from PIR (Non-redundant Reference Protein Database) that includes PSD, UniProt, Swiss-Prot, TrEMBL, RefSeq, GenPept and PDB.
- Accessions are searched directly in the dbxref table of the GO database.
- BLAST result accessions are searched directly in the gene-product table of the GO database.
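The four lookups above can be sketched as a chain of dictionary queries. This is only a minimal illustration, not the OmicsBox implementation: the tables are in-memory stand-ins with invented contents, the function name `map_hit` is hypothetical, the species-specific filtering of the first step is left out, and two points the text does not state are assumed here, namely that UniProt IDs from the second step are looked up in the gene-product table and that the results of all four steps are merged as a union.

```python
from typing import Dict, Optional, Set

# Toy stand-ins for the resources named above. All contents are invented.
ACCESSION_TO_GENE: Dict[str, str] = {"ACC_0001": "geneA"}   # gene_info / gene2accession
GI_TO_UNIPROT: Dict[str, str] = {"12345": "UP_0001"}        # PIR mapping file
GENE_PRODUCT_GO: Dict[str, Set[str]] = {                    # GO gene-product table
    "geneA": {"GO:0000001"},
    "UP_0001": {"GO:0000002"},
    "ACC_0002": {"GO:0000003"},
}
DBXREF_GO: Dict[str, Set[str]] = {"ACC_0003": {"GO:0000004"}}  # GO dbxref table


def map_hit(accession: str, gi: Optional[str] = None) -> Set[str]:
    """Collect candidate GO terms for one BLAST hit (hypothetical helper)."""
    terms: Set[str] = set()

    # Step 1: accession -> gene name -> gene-product table.
    gene = ACCESSION_TO_GENE.get(accession)
    if gene is not None:
        terms |= GENE_PRODUCT_GO.get(gene, set())

    # Step 2: gi -> UniProt ID, then a gene-product lookup (assumed target).
    if gi is not None:
        uniprot_id = GI_TO_UNIPROT.get(gi)
        if uniprot_id is not None:
            terms |= GENE_PRODUCT_GO.get(uniprot_id, set())

    # Step 3: accession searched directly in the dbxref table.
    terms |= DBXREF_GO.get(accession, set())

    # Step 4: accession searched directly in the gene-product table.
    terms |= GENE_PRODUCT_GO.get(accession, set())

    return terms
```

A hit that matches none of the tables simply yields an empty set, which corresponds to a sequence row that stays unmapped.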
Figure 1: Mapping options
- Run Mapping. Starts the mapping.
- Remove Mapping. Deletes the mapping results for the selected sequences.
The mapping step needs protein IDs to run. Make sure you ran BLAST against a protein database:
- blastx, if you have nucleotide sequences
- blastp, if you have protein sequences
If a BLAST result is successfully mapped to one or more GO terms, these are shown in the GOs column of the Main Sequence Table and the sequence row turns light green. Assigned GOs can be reviewed in the BLAST Results table (see the Show BLAST Results section and BLAST figure 8 of that section).
Four different charts are available to summarise the mapping step:
- GO Mapping Distribution: Shows the distribution of the number of candidate Gene Ontology terms assigned to each sequence during the GO mapping step.
- EC Distribution for Blast Hits (figure 5): Shows the distribution of evidence codes associated with the pool of GO terms obtained for the BLAST hits.
- EC Distribution for Sequences (figure 6): Shows the distribution of GO evidence codes for the functional terms obtained during the mapping step. It indicates how many annotations derive from automatic (computational) annotation and how many from manual curation.
- DB Resources of Mapping (figure 7): Shows the distribution of the number of annotations (GO terms) retrieved from the different source databases, e.g. UniProt, PDB, TAIR, etc.
IEA (inferred from electronic annotation) usually dominates the mapping results. However, the contribution of this (and any other) type of annotation to the annotation finally assigned to the query set can be adjusted at the annotation step.
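Conceptually, an evidence code distribution is a tally of the codes attached to the mapped GO terms. The following is a minimal sketch of that counting, assuming the mapped terms are available as (GO ID, evidence code) pairs. The function names and the sample pairs are hypothetical, and this is not how OmicsBox computes its charts internally.

```python
from collections import Counter
from typing import Iterable, Tuple


def evidence_code_distribution(mapped_terms: Iterable[Tuple[str, str]]) -> Counter:
    """Count how often each evidence code occurs among (go_id, code) pairs."""
    return Counter(code for _go_id, code in mapped_terms)


def electronic_fraction(distribution: Counter) -> float:
    """Share of annotations carrying the IEA code (0.0 for an empty tally)."""
    total = sum(distribution.values())
    return distribution.get("IEA", 0) / total if total else 0.0


# Invented sample data, for illustration only.
sample = [
    ("GO:0000001", "IEA"),
    ("GO:0000002", "IEA"),
    ("GO:0000003", "IDA"),
    ("GO:0000004", "ISS"),
]
distribution = evidence_code_distribution(sample)
iea_share = electronic_fraction(distribution)
```

A high IEA share in such a tally corresponds to the situation described above, where electronic annotations make up most of the mapping results.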
Mapping Statistics Graphs:
Figure 5: Evidence Code Distribution of BLAST hits
Figure 6: Evidence Code Distribution for sequences
Figure 7: DB Resources of Mapping
Export Mapping Results
A tab-separated text file with the corresponding mapping results can be exported (File > Export > Export Mapping Results).
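The exported file can be read with any tool that handles tab-separated text. The sketch below uses Python's standard csv module and assumes the file starts with a header line. The column names in the sample ("Sequence", "GO") are invented for illustration, since the actual column layout of the export is not described here.

```python
import csv
import io
from typing import Dict, List, TextIO


def read_mapping_export(handle: TextIO) -> List[Dict[str, str]]:
    """Parse a tab-separated export into one dict per row, keyed by header."""
    return list(csv.DictReader(handle, delimiter="\t"))


# Invented sample content standing in for a real export file.
SAMPLE = "Sequence\tGO\nseq1\tGO:0000001\nseq2\tGO:0000002\n"
rows = read_mapping_export(io.StringIO(SAMPLE))
```

For a real export, the same function can be given a file opened with `open(path, newline="", encoding="utf-8")`, where `path` points to the exported file.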