Retrieve Blast Top-Hit
This feature retrieves the sequence information of the top BLAST hits in an OmicsBox project, which can help to improve the annotation of a dataset.
A possible use case is a so-called "Double-Blast": the top-hit sequences from a first BLAST run replace the original sequence data, and a second run is then performed against a different database. Imagine an RNA-seq dataset in which a high percentage of sequences have no alignments against a protein database (e.g. blastx against NR). The sequences without hits (the red ones) can be selected and extracted into a new project. These sequences are first blasted against a set of EST sequences. This feature is then used to replace the initially unaligned sequences with their top-hit ESTs. Finally, the initial blastx search against the protein database is repeated with the replaced sequences.
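The replacement step of the Double-Blast scenario can be sketched outside of OmicsBox as follows. This is only an illustration under stated assumptions, not how OmicsBox implements the feature: it assumes the first run was exported in the standard 12-column BLAST+ tabular format (-outfmt 6), and the function names top_hits and replace_with_top_hits as well as the e-value cutoff are hypothetical.

```python
import csv
import io


def top_hits(tabular_text, max_evalue=1e-3):
    """Map each query ID to its first alignment that passes the e-value cutoff.

    Assumes BLAST+ tabular output (-outfmt 6) with the default columns, where
    column 1 is the query ID, column 2 the subject ID and column 11 the e-value.
    """
    hits = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        # Skip comment lines and rows that do not have the 12 default columns.
        if len(row) < 12 or row[0].startswith("#"):
            continue
        query_id, subject_id, evalue = row[0], row[1], float(row[10])
        # Keep only the first significant alignment per query.
        if evalue <= max_evalue and query_id not in hits:
            hits[query_id] = subject_id
    return hits


def replace_with_top_hits(queries, hits, subject_sequences):
    """Replace each query sequence with its top-hit sequence where one is known.

    Queries without a significant hit, or whose hit sequence is not available,
    keep their original sequence.
    """
    replaced = {}
    for query_id, sequence in queries.items():
        subject_id = hits.get(query_id)
        replaced[query_id] = subject_sequences.get(subject_id, sequence)
    return replaced


# Hypothetical first run against ESTs: read1 has two hits, read2 only a weak one.
first_run = "\n".join([
    "read1\tEST_A\t98.5\t200\t3\t0\t1\t200\t1\t200\t1e-50\t380",
    "read1\tEST_B\t90.0\t180\t18\t0\t1\t180\t5\t184\t1e-20\t200",
    "read2\tEST_C\t75.0\t50\t12\t1\t1\t50\t1\t50\t0.5\t30",
])
hits = top_hits(first_run)
second_run_input = replace_with_top_hits(
    {"read1": "ACGT", "read2": "TTTT"},
    hits,
    {"EST_A": "ACGTACGTAA"},
)
```

In this sketch read1 is replaced by the sequence of EST_A, while read2 keeps its original sequence because its only alignment does not pass the cutoff. The sequences in second_run_input would then be the queries of the repeated blastx search.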
Run Retrieve Blast Top-Hit
The feature can be found under functional analysis → Blast → Retrieve Blast Top-Hit.
Data can be obtained from the NCBI, Ensembl or UniProt web services and either stored in a new project or used to replace the existing IDs/sequences (Figure 1).
Filters Applied to Top-Hit: For each Top-Hit (the first significant alignment of an already performed BLAST), the filters in the bottom part of the dialog are applied, and the hits that pass are then looked up online in the corresponding database.
Figure 1: Retrieve Blast Top-Hit Dialog.
Depending on the configuration, either a new project is generated or the current one is modified.
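For readers who want to look up a single top-hit accession by hand, the three web services mentioned above each offer a public FASTA endpoint. The following is a minimal sketch and not a description of what OmicsBox does internally: the helper names fasta_url and fetch_fasta are hypothetical, the choice of the NCBI database ("protein") is an assumption that has to match the type of accession, and the example accession is for illustration only.

```python
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def fasta_url(source, accession, ncbi_db="protein"):
    """Build the FASTA download URL for one accession.

    source is one of "ncbi", "ensembl" or "uniprot". ncbi_db is only used for
    NCBI and is an assumption; nucleotide accessions would need "nuccore".
    """
    source = source.lower()
    if source == "ncbi":
        # NCBI E-utilities EFetch endpoint.
        params = urlencode(
            {"db": ncbi_db, "id": accession, "rettype": "fasta", "retmode": "text"}
        )
        return "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?" + params
    if source == "ensembl":
        # Ensembl REST sequence endpoint, FASTA requested via content-type.
        return (
            "https://rest.ensembl.org/sequence/id/"
            + quote(accession)
            + "?content-type=text/x-fasta"
        )
    if source == "uniprot":
        # UniProt REST endpoint for a single UniProtKB entry.
        return "https://rest.uniprot.org/uniprotkb/" + quote(accession) + ".fasta"
    raise ValueError("unknown source: " + source)


def fetch_fasta(source, accession, timeout=30):
    """Download the FASTA record as text (requires network access)."""
    with urlopen(fasta_url(source, accession), timeout=timeout) as response:
        return response.read().decode("utf-8")


# Building the URL needs no network access; fetch_fasta performs the request.
example_url = fasta_url("uniprot", "P69905")
```

Keeping URL construction separate from the download makes it easy to check which request would be sent before contacting a service. When many accessions are retrieved, the usage policies and rate limits of each service should be respected.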