InterPro provides functional analysis of proteins by classifying them into families and predicting domains and important sites. To classify proteins in this way, InterPro uses predictive models, known as signatures, provided by several different databases (referred to as member databases) that make up the InterPro consortium. InterPro combines protein signatures from these member databases into a single searchable resource, capitalising on their individual strengths to produce a powerful integrated database and diagnostic tool.
Please Cite InterProScan:
Blum M, Chang H, Chuguransky S, Grego T, Kandasaamy S, Mitchell A, Nuka G, Paysan-Lafosse T, Qureshi M, Raj S, Richardson L, Salazar GA, Williams L, Bork P, Bridge A, Gough J, Haft DH, Letunic I, Marchler-Bauer A, Mi H, Natale DA, Necci M, Orengo CA, Pandurangan AP, Rivoire C, Sigrist CJA, Sillitoe I, Thanki N, Thomas PD, Tosatto SCE, Wu CH, Bateman A and Finn RD. The InterPro protein families and domains database: 20 years on. Nucleic Acids Research, Nov 2020 (doi: 10.1093/nar/gkaa977).
The InterPro annotation functionality in OmicsBox retrieves domain/motif information for each sequence. The corresponding GO terms are then transferred to the sequences and merged with any existing GO terms. InterProScan results can be viewed through the Single Sequence Menu (right-click on a sequence) and saved in TXT and XML format (figure 4). When working with nucleotide sequences, OmicsBox translates them to the longest open reading frame and then sends them to InterProScan.
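As an illustration of the translation step for nucleotide input, the following is a minimal Python sketch of a longest-open-reading-frame search. It is not the OmicsBox implementation: it assumes the standard genetic code, treats an open reading frame as the longest stretch without a stop codon in any of the six frames, and the function names `reverse_complement` and `longest_orf_protein` are hypothetical.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = dict(zip(
    (a + b + c for a in BASES for b in BASES for c in BASES), AMINO))


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def longest_orf_protein(nucleotides, both_strands=True):
    """Translate all frames and return the longest stop-free peptide.

    Assumption: an ORF is any stretch between stop codons; a start
    codon is not required. Unknown codons are translated as 'X'.
    """
    seq = nucleotides.upper().replace("U", "T")
    strands = [seq, reverse_complement(seq)] if both_strands else [seq]
    best = ""
    for strand in strands:
        for frame in range(3):
            protein = "".join(
                CODON_TABLE.get(strand[i:i + 3], "X")
                for i in range(frame, len(strand) - 2, 3))
            # Stop codons ('*') split the translation into candidate ORFs.
            for segment in protein.split("*"):
                if len(segment) > len(best):
                    best = segment
    return best
```

Whether OmicsBox requires a start codon or searches both strands is not stated here, so the `both_strands` switch and the stop-to-stop definition should be read as assumptions of this sketch.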
The following options can be found under functional analysis → InterProScan.
Run InterProScan. Start sending sequences to the EBI or OmicsBox Cloud.
Merge InterProScan GOs to Annotation. Add GO terms obtained through motifs/domains to the current annotations.
Remove InterProScan. Delete InterProScan results for the selected sequences.
There are two options to run InterProScan in OmicsBox, either with CloudIPS or via the public web service at EBI.
CloudIPS is a cloud-based OmicsBox community resource for fast and reliable InterPro analysis for everything from small to big data sets. It allows executing the original InterPro algorithms against up-to-date databases in our dedicated computing cloud. This is a high-performance, secure and cost-optimized solution for your analysis.
The public EMBL-EBI InterPro web service scans your sequences against InterPro's signatures. Performance and results depend on the EBI web server.
Figure 1: Run InterProScan Options
The last page lets you save the InterProScan results in different file formats: tab-separated values (TSV), XML (the default output), GFF3, and the input (query) sequence itself (figure 4).
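A saved TSV file can be post-processed outside OmicsBox. The sketch below collects GO terms per sequence, assuming the file follows the standard InterProScan TSV column layout (sequence ID in column 1, GO annotations in column 14, separated by `|`, with `-` for missing values). Whether the OmicsBox export matches this layout exactly is an assumption, and `go_terms_by_sequence` is a hypothetical helper name.

```python
from collections import defaultdict


def go_terms_by_sequence(tsv_text):
    """Map each sequence ID to the set of GO terms found in the TSV text.

    Assumes the standard InterProScan TSV layout: column 1 holds the
    sequence ID and column 14 holds '|'-separated GO terms.
    """
    terms = defaultdict(set)
    for line in tsv_text.splitlines():
        columns = line.split("\t")
        if len(columns) < 14:
            continue  # row without a GO column
        go_field = columns[13]
        if go_field in ("", "-"):
            continue  # no GO terms for this match
        for item in go_field.split("|"):
            # Some versions append the source, e.g. "GO:0005524(InterPro)".
            terms[columns[0]].add(item.split("(")[0].strip())
    return dict(terms)
```

For example, reading a file with `open(path).read()` and passing the text to `go_terms_by_sequence` returns a dictionary such as `{"seq1": {"GO:0004672", "GO:0005524"}}`.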
Once InterProScan has finished, the results of each sequence can be viewed via the context menu (figure 5). The sequences will turn violet if no other analysis has been executed before.
InterProScan can only be performed if the sequence table contains the actual sequence information (loaded from a FASTA file). Be careful if you created the project from a BLAST XML file or loaded a .annot file, as such a project may not contain the sequences themselves.
To add the sequences to the current OmicsBox project, see the Add sequences to existing OmicsBox project section.
For InterPro statistics charts, see the Charts and Statistics page in the user manual.
Figure 2: Selection of Member Databases
Figure 3: Selection of Member Databases
Figure 4: Save InterProScan Results
Figure 5: InterProScan Results
Merge InterProScan GOs to Annotation
The GO terms obtained from InterProScan can now be added to the existing annotations based on the BLAST results. This option is available from the InterProScan submenu.
Once the merge has finished, a distribution chart is displayed in the Results menu showing the number of GOs that have been added to (or confirmed in) the current annotation results.
Figure 6: Statistics after merging InterProScan to GO Annotation
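The distinction between added and confirmed GO terms can be pictured with simple set operations. The following Python sketch is only a conceptual illustration for a single sequence: `merge_go_terms` is a hypothetical function, and any additional filtering OmicsBox may apply during the merge is not modelled.

```python
def merge_go_terms(existing, interpro):
    """Merge InterProScan GO terms into an existing annotation set.

    Returns (merged, added, confirmed), where 'added' are terms new to
    the annotation and 'confirmed' are terms already present.
    """
    existing = set(existing)
    interpro = set(interpro)
    added = interpro - existing      # new terms contributed by InterProScan
    confirmed = interpro & existing  # terms found by both sources
    return existing | interpro, added, confirmed
```

Summing the sizes of `added` and `confirmed` over all sequences would give the kind of counts shown in the distribution chart.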