Functional Annotation Analysis Steps
This section provides a quick run-through of a basic functional annotation process in OmicsBox. The individual analysis steps and more advanced features are described in detail in the remaining sections of this documentation.
Go to File > Load > Load Sequences > Load Fasta File and select the .fasta file containing your set of sequences in FASTA format. Alternatively, you can load the example sequences into OmicsBox by choosing File > Load > Load Example Sequences. Example files to try and test OmicsBox can be downloaded here: b2g_example_files.zip
Click on Blast from the Side Panel. In the Blast Configuration Dialog (BLAST) select the way in which Blast will be executed (CloudBlast, NCBI Blast or Local Blast), the type of Blast mode which is appropriate for your sequence type (Blastx for nucleotide and Blastp for protein data) and the taxonomies you want to blast against. Click Next for the advanced settings and to choose where to save the Blast results, and click Run to start the Blast search.
Once your BLAST analysis is finished, visualize your results from the Side Panel > Charts.
On the Main Sequence Table, right-click on a sequence to open the Single Sequence Menu (BLAST). Select Show BLAST Result to open the BLAST Browser for that sequence.
If you are running BLAST using CloudBlast, we recommend running blastx-fast or blastp-fast, as it is faster and consumes fewer computation units.
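Before choosing between Blastx (nucleotide data) and Blastp (protein data), it can help to check what a FASTA file actually contains. The following is a minimal sketch that runs outside OmicsBox and is not part of it: `read_fasta`, `guess_blast_mode` and the 0.9 threshold are hypothetical names and values chosen for illustration, and the nucleotide check is a simple letter-frequency heuristic.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Assumption: a sequence made almost entirely of these letters is nucleotide data.
NUCLEOTIDE_LETTERS = set("ACGTUN")


def guess_blast_mode(sequence, threshold=0.9):
    """Return 'blastx' for nucleotide-like sequences and 'blastp' otherwise."""
    letters = [c for c in sequence.upper() if c.isalpha()]
    if not letters:
        raise ValueError("empty sequence")
    fraction = sum(c in NUCLEOTIDE_LETTERS for c in letters) / len(letters)
    return "blastx" if fraction >= threshold else "blastp"


# Usage with a small in-memory example instead of a real file.
example = [">seq1 example nucleotide", "ACGTACGN", ">seq2 example protein", "MKVLAAGIVG"]
for name, seq in read_fasta(example):
    print(name, len(seq), guess_blast_mode(seq))
```

To check a real file, pass an open file handle instead of the list, for example `read_fasta(open("my_sequences.fasta"))`, where the file name is a placeholder.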
Click on the InterPro icon on the Side Panel to open the corresponding Wizard. If InterProScan is executed via the EMBL-EBI web service, please provide a valid email address. This is not needed if InterProScan is run via CloudIPS. It is highly recommended to run IPS in order to improve the quality of the annotations. Once InterProScan results are retrieved, use Merge GOs to add GO terms obtained through motifs/domains to the current annotations. InterProScan can be run in parallel with BLAST.
Click on Run GO Mapping from the Side Panel to start mapping GO terms. Mapped sequences will turn green. Once Mapping is completed, visualize your results at the Side Panel > Charts.
Click on Run GO Annotation on the Side Panel to open the Annotation Configuration Window. Click Next to change the evidence codes and finally click Run to start the annotation. Annotated sequences will turn blue.
Once the annotation is completed you are able to visualize your results with Charts.
On the Main Sequence Table, right-click on a sequence to open the Single Sequence Menu. Select GO-Mapping Graph with Annotation Score to visualize the annotation on the GO DAG for that sequence.
If desired, modify the annotation by right-clicking and selecting Change Annotation and Description, or reduce it to a GO-Slim representation with Run GO-Slim from the Side Panel under the Functional Analysis menu.
During the annotation process, Enzyme Codes (EC) are also assigned whenever a GO-term/EC number equivalence is available.
OmicsBox provides tools for the statistical analysis of GO term frequency differences between two sets of sequences. On the Side Panel go to Functional Analysis > Enrichment Analysis. A new dialog window will open in which you can choose between Fisher's Exact Test and GSEA. Select a .txt file or an ID list containing the sequence IDs for a subset of sequences. A test-set example file can be downloaded from the OmicsBox website. Select a second set of sequences as a reference set if desired. If no reference set is provided, all annotations of the corresponding project will be used as the reference. Click Run to start the analysis. A table containing the results of this analysis will be displayed in a new tab.
Click on the Make Enriched Graph icon to visualize the results of the Fisher's Exact Test on the GO DAG.
Click on Show Bar Chart to obtain a bar chart representation of GO frequencies.
The results can be reduced to the most specific GO terms using the corresponding icon and saved in text format (Export Table).
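To illustrate what the Fisher's Exact Test computes for a single GO term, the sketch below builds the 2x2 table (test set versus reference set, annotated with the term or not) and evaluates a one-sided p-value for over-representation. It is a generic textbook calculation, not OmicsBox's implementation: the function names, the toy annotations and the choice of a one-sided test are assumptions, and multiple-testing correction is left out.

```python
from math import comb


def contingency(term, annotations, test_ids, reference_ids):
    """Count sequences with and without `term` in the test and reference sets."""
    test_with = sum(1 for s in test_ids if term in annotations.get(s, ()))
    ref_with = sum(1 for s in reference_ids if term in annotations.get(s, ()))
    return (test_with, len(test_ids) - test_with,
            ref_with, len(reference_ids) - ref_with)


def fisher_exact_greater(test_with, test_without, ref_with, ref_without):
    """One-sided Fisher's exact test: probability of seeing at least
    `test_with` annotated sequences in the test set by chance."""
    n_test = test_with + test_without
    n_with = test_with + ref_with
    total = n_test + ref_with + ref_without
    upper = min(n_test, n_with)
    # Sum hypergeometric probabilities over all tables at least as extreme.
    numerator = sum(
        comb(n_with, k) * comb(total - n_with, n_test - k)
        for k in range(test_with, upper + 1)
    )
    return numerator / comb(total, n_test)


# Hypothetical toy data: sequence ID -> set of GO terms.
annotations = {
    "s1": {"GO:0005524"}, "s2": {"GO:0005524"}, "s3": {"GO:0005524"},
    "s4": {"GO:0016301"}, "s5": set(), "s6": {"GO:0005524"},
    "s7": set(), "s8": {"GO:0016301"},
}
test_ids = ["s1", "s2", "s3", "s4"]
reference_ids = ["s5", "s6", "s7", "s8"]

table = contingency("GO:0005524", annotations, test_ids, reference_ids)
print(table, fisher_exact_greater(*table))
```

Here the reference set is passed explicitly and does not overlap the test set; how overlapping sequences are treated when the whole project serves as the reference is not covered by this sketch.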
OmicsBox can visualize the combined annotation for a group of sequences on the GO DAG. To select the group of sequences, go to Selection > Select Sequences on the Side Panel, then choose Select by Features and Select by Name or ID. You can use the Demo Test Set used previously for this. Alternatively, you can select sequences using the checkboxes of the Main Sequence Table. Then go to Functional Analysis > Combined Graph on the Side Panel and click Run.
File > Save saves the current OmicsBox project as a .box file.
Export allows exporting the generated data in many different formats.
Export GO Annotations exports the current annotation results as a .annot file, or generates a custom-formatted annotation file as a .txt file.
The enrichment analysis results can be exported in various formats from the Fisher's Exact Test Result Viewer. Export Table exports the results as a tab-separated text file.
To export GO graphs use the sidebar of the corresponding graph viewer. Graphs can be saved/exported in .png and .txt.
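An exported .annot file can be post-processed with a few lines of code. The sketch below assumes the commonly seen layout of one tab-separated record per line (sequence ID, then a GO ID or EC code, then an optional description); this layout is an assumption here, so check it against your own export. `parse_annot` is a hypothetical helper name.

```python
from collections import defaultdict


def parse_annot(lines):
    """Collect GO terms and enzyme codes per sequence from .annot-style lines.

    Assumed layout (not confirmed by this document):
    sequence ID <TAB> GO ID or EC code <TAB> optional description
    """
    go_terms = defaultdict(set)
    enzyme_codes = defaultdict(set)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2 or not fields[0]:
            continue  # skip blank or malformed lines
        seq_id, term = fields[0], fields[1].strip()
        if term.startswith("GO:"):
            go_terms[seq_id].add(term)
        elif term.startswith("EC:"):
            enzyme_codes[seq_id].add(term)
    return dict(go_terms), dict(enzyme_codes)


# Usage with hypothetical in-memory lines instead of a real export.
example = [
    "seq1\tGO:0005524\tATP binding\n",
    "seq1\tEC:2.7.11.1\n",
    "seq2\tGO:0016301\n",
]
go, ec = parse_annot(example)
print(go)
print(ec)
```

For a real file, pass an open handle, for example `parse_annot(open("my_project.annot"))`, where the file name is a placeholder.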
At any point, whether at the beginning or after finishing any step, you can generate a general chart that shows the state of the analysis for the entire dataset.
The chart shows the number of sequences in each state (Tools > General Charts).
The data distribution can be visualized in two different charts, one as a bar chart and the other as a pie chart (Figures 1 and 2).
These are the different states we are going to find in the charts:
Total: The total amount of sequences in the project (only in the bar chart).
Without Analysis: Sequences that have not been processed or that have been reset in the BLAST menu (Functional Analysis > Blast > Remove Blast Results).
With Only InterProScan: Sequences that only have InterProScan and nothing else.
Without Blast Hits: Sequences that have been sent to BLAST but no hits have been found.
With Blast: Sequences that completed the BLAST step successfully or that have been reset in the Mapping menu (Functional Analysis > Blast2GO Mapping > Remove Mapping).
With Mapping: Sequences that completed the Mapping step successfully or that have been reset in the Annotation menu (Functional Analysis > Blast2GO Annotation > Remove Annotation).
With GO Annotation: Successful sequences after Annotation step.
With Manual Annotation: Manually annotated sequences before or after executing the annotation step.
With GO-Slim Annotation: Sequences with GO-Slim Annotation.
Each state is assigned a specific colour.
It is also possible to see the progress of the analysis (Figure 3).
In this example, 700 of the 1000 sequences have BLAST results.
The chart suggests that some analyses are still to be completed, such as mapping, annotation and especially InterProScan.
Once all the analysis steps have been executed the Analysis Progress chart should be similar to the one in Figure 4.
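The numbers behind such a progress chart are simple proportions of the project total. As a minimal sketch using only the figures quoted above (700 of 1000 sequences with BLAST results), the hypothetical helper `progress_percentages` converts per-state counts into percentages; it is an illustration and not how OmicsBox computes its charts.

```python
def progress_percentages(state_counts, total):
    """Convert per-state sequence counts into percentages of the project total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return {state: 100.0 * count / total for state, count in state_counts.items()}


# Figures taken from the example in the text: 700 of 1000 sequences have BLAST results.
total_sequences = 1000
with_blast = 700
shares = progress_percentages({"With Blast": with_blast}, total_sequences)
still_without_blast = total_sequences - with_blast

print(shares)
print(still_without_blast)
```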
Figure 1: Data Distribution Bar chart
Figure 2: Data Distribution Pie chart
Figure 3: Analysis Progress Chart
Figure 4: Analysis Progress Chart after running all analysis