Functional Annotation Analysis Steps
This section provides a quick run-through of a basic functional annotation process in OmicsBox. The individual analysis steps and more advanced features are described in detail in the remaining sections of this documentation.
Go to File > Load > Load Sequences > Load Fasta File and select the .fasta file containing your set of sequences in FASTA format. Alternatively, you can load the example sequences into OmicsBox by choosing File > Load > Load Example Sequences. Please download the example files to try and test OmicsBox: b2g_example_files.zip
Click on Functional Analysis (toolbar) > Blast. In the Blast Configuration Dialog (BLAST) select the way in which Blast will be executed (CloudBlast, NCBI Blast or Local Blast), the type of Blast mode which is appropriate for your sequence type (Blastx for nucleotide and Blastp for protein data) and the taxonomies you want to blast against. Click Next for the advanced settings and to choose where to save the Blast results, and click Run to start the Blast search.
Once your BLAST analysis is finished, visualize your results at Functional Analysis (toolbar) > Charts and Statistics > Blast.
On the Main Sequence Table, right-click on a sequence to open the Single Sequence Menu (BLAST). Select Show BLAST Result to open the BLAST Browser for that sequence.
If you are running BLAST using CloudBlast, we recommend running blastx-fast or blastp-fast, as these modes are faster and consume fewer computation units.
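Before loading a file and choosing between Blastx and Blastp, it can help to check how many sequences the FASTA file contains and whether they look like nucleotide or protein data. The following is a minimal sketch that runs outside OmicsBox; the function names, the 0.9 threshold and the example sequences are illustrative assumptions, not part of the OmicsBox software.

```python
# Letters accepted as nucleotide symbols in this sketch (assumption: no other IUPAC codes).
NUCLEOTIDE_LETTERS = set("ACGTUN")


def read_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def guess_sequence_type(sequence, threshold=0.9):
    """Guess 'nucleotide' or 'protein' from the fraction of nucleotide letters."""
    letters = [c for c in sequence.upper() if c.isalpha()]
    if not letters:
        return "unknown"
    fraction = sum(c in NUCLEOTIDE_LETTERS for c in letters) / len(letters)
    return "nucleotide" if fraction >= threshold else "protein"


# Hypothetical example data, not taken from the OmicsBox example files.
example = """>seq1 hypothetical example
ATGGCGTACGTTAGC
ATGCCGTA
>seq2 hypothetical example
MKTAYIAKQRQISFVKSHFSRQ
"""

records = read_fasta(example)
for header, seq in records:
    print(header.split()[0], len(seq), guess_sequence_type(seq))
```

Sequences reported as nucleotide would be searched with Blastx (or blastx-fast), and sequences reported as protein with Blastp (or blastp-fast). The guess is a heuristic only and can be wrong for short or unusual sequences.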
Click the InterPro icon to open the corresponding wizard. If InterProScan is executed via the EMBL-EBI web service, please provide a valid email address. This is not needed if InterProScan is run via CloudIPS. Running InterProScan (IPS) is highly recommended, as it improves the quality of the annotations. Once the InterProScan results are retrieved, use Merge InterProScan GOs to Annotation to add the GO terms obtained through motifs/domains to the current annotations. InterProScan can be run in parallel with BLAST.
Click on Functional Analysis > Blast2GO Mapping > Run GO Mapping to start mapping GO terms. Mapped sequences will turn green. Once mapping is completed, visualize your results at Functional Analysis > Charts and Statistics > Mapping.
Click on Functional Analysis > Blast2GO Annotation > Run Annotation to open the Annotation Configuration Window. Click Next to change the evidence codes and finally click Run to start the annotation. Annotated sequences will turn blue.
Once the annotation is completed, visualize your results at Functional Analysis > Charts and Statistics > Annotation.
On the Main Sequence Table, right-click on a sequence to open the Single Sequence Menu. Select Make Graph of GO-Mapping with Annotation Score to visualize the annotation on the GO DAG for that sequence.
If desired, modify the annotation by right-clicking on a sequence and selecting Change Annotation and Description, or reduce it to a GO-Slim representation with Functional Analysis > Blast2GO Annotation > GO-Slim > Run GO-Slim (online).
During the annotation process, Enzyme Codes (EC) are also assigned whenever a GO-term/EC number equivalence is available.
OmicsBox provides tools for the statistical analysis of GO term frequency differences between two sets of sequences. Go to Functional Analysis > Enrichment Analysis > Enrichment Analysis (Fisher's Exact Test) and a new dialog window will open (Fisher's Exact Test). Select a .txt file or an ID list containing the sequence IDs for a subset of sequences. A test-set example file can be downloaded from the OmicsBox website. If desired, select a second set of sequences as the reference set. If no reference set is provided, all annotations of the corresponding project will be used as the reference. Click Run to start the analysis. A table containing the results of this analysis will be displayed in a new tab.
Click on the Make Enriched Graph icon to visualize the results of the Fisher's Exact Test on the GO DAG.
Click on Show Bar Chart to obtain a bar chart representation of GO frequencies.
The results can be reduced to the most specific GO terms using the corresponding icon and saved in text format (Save as Text).
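To illustrate the statistic behind the enrichment analysis, the following sketch computes a one-sided Fisher's exact test p-value for a single GO term from a 2x2 contingency table. It is a generic textbook calculation, not the OmicsBox implementation: the function name and the counts are hypothetical, the test set and the reference set are assumed to be disjoint, and no multiple-testing correction is applied.

```python
from math import comb


def fisher_exact_enrichment(test_with, test_total, ref_with, ref_total):
    """One-sided (over-representation) Fisher's exact test p-value.

    test_with / test_total: sequences annotated with the GO term / all sequences in the test set.
    ref_with / ref_total:   the same counts for the reference set.
    """
    a = test_with                # test set, with term
    b = test_total - test_with   # test set, without term
    c = ref_with                 # reference set, with term
    d = ref_total - ref_with     # reference set, without term
    n = a + b + c + d
    row1 = a + b                 # size of the test set
    col1 = a + c                 # all sequences carrying the term
    # Sum the hypergeometric probabilities of observing a or more annotated
    # sequences in the test set.
    upper = min(row1, col1)
    numerator = sum(comb(col1, k) * comb(n - col1, row1 - k) for k in range(a, upper + 1))
    return numerator / comb(n, row1)


# Hypothetical counts: 3 of 4 test sequences and 1 of 4 reference sequences carry the term.
p_value = fisher_exact_enrichment(3, 4, 1, 4)
print(round(p_value, 4))  # 17/70, about 0.2429
```

A small p-value indicates that the GO term is more frequent in the test set than would be expected by chance given the reference set.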
OmicsBox can visualize the combined annotation for a group of sequences on the GO DAG. To select the group of sequences for the combined graph, go to Functional Analysis > Tools > Select > Select Sequences, then choose Select by Features and Select by Name or ID. You can use the Demo Test Set used previously for this. Alternatively, you can select sequences using the sequence check boxes of the Main Sequence Table. Then go to Functional Analysis > Gene Ontology Graphs > Make Combined Graph and click Run.
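A list of sequence IDs, as used for the test set of the enrichment analysis or for selecting sequences by name or ID, can be prepared with a few lines of code. This sketch assumes a plain text file with one sequence ID per line, which is an assumption about the expected layout rather than a documented requirement; the file name and the IDs are hypothetical.

```python
import os
import tempfile


def write_id_list(ids, path):
    """Write sequence IDs to a text file, one per line, dropping duplicates but keeping order."""
    unique_ids = list(dict.fromkeys(ids))
    with open(path, "w", encoding="utf-8") as handle:
        for seq_id in unique_ids:
            handle.write(seq_id + "\n")
    return unique_ids


def read_id_list(path):
    """Read sequence IDs back from a text file, skipping blank lines."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


# Hypothetical IDs written to a temporary directory for demonstration.
with tempfile.TemporaryDirectory() as tmp:
    id_file = os.path.join(tmp, "test_set_ids.txt")
    written = write_id_list(["seq1", "seq7", "seq1", "seq42"], id_file)
    loaded = read_id_list(id_file)

print(loaded)
```

The IDs in the file must match the sequence names shown in the Main Sequence Table for the selection to find them.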
File > Save saves the current OmicsBox project as a .box file.
File > Export allows exporting the generated data in many different formats.
File > Export Annotations exports the current annotation results as an .annot file, or generates a custom-formatted annotation file as a .txt file.
The enrichment analysis results can be exported in various formats from the Fisher's Exact Test Result Viewer. "Save as Text" exports the results as a tab-separated text file.
To export GO graphs use the sidebar of the corresponding graph viewer. Graphs can be saved/exported in .pdf, .png, .svg and .txt.
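Exported annotations can be processed further outside OmicsBox. The sketch below reads an .annot file under the assumption that each line is tab-separated, with the sequence ID in the first column, a GO ID (prefixed with GO:) or an enzyme code (prefixed with EC:) in the second, and an optional description in the third. This layout is an assumption that should be checked against an actual export; the function name and the example values are placeholders.

```python
import csv
import io
from collections import defaultdict


def parse_annot(handle):
    """Collect GO terms and enzyme codes per sequence from tab-separated annotation lines."""
    go_terms = defaultdict(set)
    enzyme_codes = defaultdict(set)
    # QUOTE_NONE keeps quotation marks in descriptions from being interpreted by the csv module.
    for row in csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE):
        if len(row) < 2 or not row[0].strip():
            continue  # skip blank or incomplete lines
        seq_id, term = row[0].strip(), row[1].strip()
        if term.startswith("GO:"):
            go_terms[seq_id].add(term)
        elif term.startswith("EC:"):
            enzyme_codes[seq_id].add(term)
    return go_terms, enzyme_codes


# Placeholder content standing in for an exported .annot file.
example = io.StringIO(
    "seq1\tGO:0005524\thypothetical description\n"
    "seq1\tEC:2.7.11.1\n"
    "seq2\tGO:0016020\n"
)
go_terms, enzyme_codes = parse_annot(example)
for seq_id in sorted(go_terms):
    print(seq_id, sorted(go_terms[seq_id]), sorted(enzyme_codes.get(seq_id, [])))
```

For a real export, replace the `io.StringIO` object with an opened file, for example `open("project.annot", newline="", encoding="utf-8")`, where the file name is again hypothetical.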
At the beginning of the analysis, or after finishing any step, a general chart showing the analysis state of the entire dataset can be obtained.
It shows the number of sequences that belong to each state (Functional Analysis > Charts and Statistics).
The data distribution can be visualized in two different charts, one as a bar chart and the other as a pie chart (Figures 1 and 2).
The charts show the following states:
Total: The total number of sequences in the project (only in the bar chart).
Without Analysis: Sequences that have not been processed or that have been reset in the BLAST menu (Functional Analysis > Blast > Remove Blast Results).
With Only InterProScan: Sequences that have only InterProScan results and nothing else.
Without Blast Hits: Sequences that have been sent to BLAST but no hits have been found.
With Blast: Sequences that completed the BLAST step successfully or that have been reset in the Mapping menu (Functional Analysis > Blast2GO Mapping > Remove Mapping).
With Mapping: Sequences that completed the Mapping step successfully or that have been reset in the Annotation menu (Functional Analysis > Blast2GO Annotation > Remove Annotation).
With GO Annotation: Sequences that completed the Annotation step successfully.
With Manual Annotation: Manually annotated sequences before or after executing the annotation step.
With GO-Slim Annotation: Sequences with GO-Slim Annotation.
Each state is assigned a specific colour.
It is also possible to see the progress of the analysis (Figure 3).
Of the 1000 sequences, 700 have BLAST results.
The chart suggests that some analyses are still to be completed, such as mapping, annotation and especially InterProScan.
Once all the analysis steps have been executed, the Analysis Progress chart should be similar to the one in Figure 4.
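The counts behind the data distribution and progress charts amount to a tally of sequences per state. The following sketch illustrates that bookkeeping with the state names listed above. The per-sequence labels are hypothetical: only the figure of 700 sequences with BLAST results out of 1000 comes from the example, and the remaining 300 are arbitrarily placed in Without Analysis for illustration.

```python
from collections import Counter

# State names as listed for the data distribution charts.
STATES = [
    "Without Analysis",
    "With Only InterProScan",
    "Without Blast Hits",
    "With Blast",
    "With Mapping",
    "With GO Annotation",
    "With Manual Annotation",
    "With GO-Slim Annotation",
]


def summarize_states(sequence_states):
    """Return the number of sequences per state, plus the project total."""
    counts = Counter(sequence_states)
    summary = {"Total": len(sequence_states)}
    for state in STATES:
        summary[state] = counts.get(state, 0)
    return summary


def percent(part, total):
    """Percentage of `part` in `total`, or 0.0 for an empty project."""
    return 100.0 * part / total if total else 0.0


# Hypothetical project: one state label per sequence.
sequence_states = ["With Blast"] * 700 + ["Without Analysis"] * 300
summary = summarize_states(sequence_states)
print(summary["Total"], summary["With Blast"])
print(percent(summary["With Blast"], summary["Total"]))  # 70.0
```

In this simplified model each sequence has exactly one state, so the per-state counts add up to the total.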
Figure 1: Data Distribution Bar chart
Figure 2: Data Distribution Pie chart
Figure 3: Analysis Progress Chart
Figure 4: Analysis Progress Chart after running all analysis