  1. Set-to-Sense (Based on Best-Blast-Hit): Convert all selected sequences whose Best-Blast-Hit has a negative reading frame to antisense, i.e. each query sequence is replaced by its reverse complement (e.g. ATTG -> CAAT). The tag "_antisense" is appended to the sequence names. Use the batch rename function to undo the name change.
  2. Translate Longest ORF: Convert all selected sequences to the protein sequence of their longest ORF. The tag "_ORF" is added to the sequence names. Use the batch rename function to undo the name change. The user may select the reading frame and the genetic code (which depends on the species) to be used for the prediction.
  3. Search Loaded Annotations in Another Annotation Set: Compare a set of annotations for a given group of sequences against the annotations already loaded in OmicsBox.
  4. Find Duplicated Sequences: Mark as selected or directly remove all sequences in the dataset which have the exact same sequence string.
  5. Find Similar Sequences: Detect, select and/or remove similar sequences within one project.
  6. Batch Rename: Perform a batch rename of all selected sequences by converting, replacing or adding text to the current sequence name. See the linked page for a detailed explanation of how to use this tool.
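
The first two operations in the list above can be illustrated with a short Python sketch. This is not OmicsBox code. The function names are hypothetical, and the ORF definition used here (from the first ATG to the next in-frame stop codon or the end of the sequence, forward frames only, standard genetic code) is an assumption made for illustration; the tool's own ORF rules and genetic code options may differ.

```python
# Illustrative sketch only; names and ORF rules are assumptions, not OmicsBox internals.

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def set_to_sense(name, seq, best_hit_frame):
    """Reverse-complement and tag the sequence if its best hit frame is negative."""
    if best_hit_frame < 0:
        return name + "_antisense", reverse_complement(seq)
    return name, seq


def translate(seq):
    """Translate complete codons; unknown codons (e.g. containing N) become 'X'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def longest_orf_protein(seq, frames=(0, 1, 2)):
    """Return the longest protein starting at M in the given forward frames."""
    best = ""
    for frame in frames:
        # Stop codons translate to '*', so splitting on '*' yields open segments.
        for segment in translate(seq[frame:]).split("*"):
            start = segment.find("M")
            if start != -1 and len(segment) - start > len(best):
                best = segment[start:]
    return best


print(reverse_complement("ATTG"))  # CAAT
print(set_to_sense("contig1", "ATTG", -2))
print(longest_orf_protein("CCATGGCTTGATT"))
```

To search the reverse strand as well, the same function could be applied to `reverse_complement(seq)`; whether and how the tool does this is not stated here.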

Find Similar Sequences

This function allows searching for similar sequences within a dataset. The search for similar sequences is done via BLAT alignments. The function searches a list of sequences against itself and reports all alignments above a certain similarity percentage. It is possible to remove similar sequences from the project or to extract a less redundant result dataset into a new project.
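
The all-against-all comparison and the redundancy reduction described above can be sketched in a few lines of Python. This sketch does not run BLAT: it substitutes the `difflib.SequenceMatcher` ratio from the Python standard library as a rough similarity score, which is not an alignment-based identity percentage. The function names, the greedy keep-first strategy and the threshold value are all assumptions made for illustration.

```python
# Illustrative sketch only: difflib stands in for BLAT, which is what the tool uses.
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a, b):
    """Rough similarity score in [0, 1]; not a BLAT identity percentage."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def find_similar_pairs(seqs, threshold):
    """Compare the dataset against itself and report pairs at or above the threshold."""
    return [
        (name_a, name_b)
        for (name_a, seq_a), (name_b, seq_b) in combinations(seqs.items(), 2)
        if similarity(seq_a, seq_b) >= threshold
    ]


def reduce_redundancy(seqs, threshold):
    """Greedily keep a sequence only if it is not similar to one already kept."""
    kept = {}
    for name, seq in seqs.items():
        if all(similarity(seq, other) < threshold for other in kept.values()):
            kept[name] = seq
    return kept


dataset = {
    "a": "ACGTACGTAC",
    "b": "ACGTACGTAA",  # differs from "a" only in the last base
    "c": "TTTTGGGGCC",
}
print(find_similar_pairs(dataset, threshold=0.8))
print(list(reduce_redundancy(dataset, threshold=0.8)))
```

The pairwise loop is quadratic in the number of sequences, so it only suits small datasets; an alignment tool such as BLAT uses indexing to make the same self-comparison feasible at scale.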

Find Duplicated Sequences

This function allows you to quickly identify and remove redundant sequences (sequences with exactly the same sequence string) within a dataset.
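
A minimal Python sketch of exact-duplicate detection follows. It is not OmicsBox code: the function names are hypothetical, and it assumes that the comparison is case-sensitive and that the first occurrence of each sequence string is the one kept.

```python
# Illustrative sketch only; keep-first and case-sensitive matching are assumptions.

def find_duplicates(seqs):
    """Return names of sequences whose string was already seen earlier in the dataset."""
    seen = set()
    duplicates = []
    for name, seq in seqs.items():
        if seq in seen:
            duplicates.append(name)
        else:
            seen.add(seq)
    return duplicates


def remove_duplicates(seqs):
    """Return a copy of the dataset without the later copies of repeated sequences."""
    duplicates = set(find_duplicates(seqs))
    return {name: seq for name, seq in seqs.items() if name not in duplicates}


dataset = {"s1": "ATGC", "s2": "GGCC", "s3": "ATGC"}
print(find_duplicates(dataset))    # ['s3']
print(remove_duplicates(dataset))  # {'s1': 'ATGC', 's2': 'GGCC'}
```

Because membership is checked against a set of sequence strings, the whole dataset is processed in a single pass.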