Figure 1: Tools options
Data distribution bar chart: Bar chart showing the number of sequences with Blast (with or without hits), GO Mapping and GO Annotation results.
Data distribution pie chart: Pie chart showing the number of sequences with Blast (with or without hits), GO Mapping and GO Annotation results.
Analysis Progress: Bar chart showing the cumulative number of sequences with Blast hits, InterProScan, GO Mapping and GO Annotation results.
Sequence Length Distribution: Area chart showing the number of sequences for each sequence length.
Figure 2: Project Statistics
Figure 3: Data Distribution Bar Chart
Figure 4: Analysis Progress
Figure 5: Sequence Length
Find Duplicated Sequences
This function quickly identifies and removes redundant sequences (sequences that are exactly the same) within a dataset.
For all sequences in the dataset that share the exact same sequence string, it is possible to mark them as selected, remove them directly, or create an ID-List.
Figure 6: Find Duplicated Sequences wizard
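The idea behind duplicate detection can be illustrated with a short Python sketch. It is not the tool's actual implementation: the function names and the ID-to-sequence dictionary are assumptions made for illustration, and sequences are compared as exact strings, as described above.

```python
from collections import defaultdict

def find_duplicates(sequences):
    """Group sequence IDs that share the exact same sequence string.

    `sequences` is a hypothetical dict mapping sequence ID -> sequence string.
    Only groups with more than one member are returned.
    """
    groups = defaultdict(list)
    for seq_id, seq in sequences.items():
        groups[seq].append(seq_id)
    return [ids for ids in groups.values() if len(ids) > 1]

def remove_duplicates(sequences):
    """Keep only the first occurrence of each distinct sequence string."""
    seen = set()
    unique = {}
    for seq_id, seq in sequences.items():
        if seq not in seen:
            seen.add(seq)
            unique[seq_id] = seq
    return unique

example = {"seq1": "ATGGCC", "seq2": "ATGGCA", "seq3": "ATGGCC"}
duplicate_groups = find_duplicates(example)   # candidates for an ID-List
deduplicated = remove_duplicates(example)     # dataset without redundant entries
```

Here `duplicate_groups` plays the role of the ID-List of redundant sequences, while `deduplicated` corresponds to removing them directly.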
Find Similar Sequences
This function searches for similar sequences within a dataset using BLAT alignments. It searches a list of sequences against itself and reports all alignments above a certain similarity percentage. It is possible to remove similar sequences from the project or to extract a less redundant result dataset into a new project.
Figure 7: Find Similar Sequences wizard
Set to Sense (Based on Best-Blast-Hit)
Convert all selected sequences whose Best-Blast-Hit has a negative reading frame to anti-sense, i.e. the query sequences are converted to their reverse complement (e.g. ATTG -> CAAT). The tag "_antisense" is added to the end of the sequence names. Use the batch rename function to undo the name change.
Figure 8: Set to Sense wizard
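As a minimal sketch of the conversion described above (not the tool's own code), the following Python snippet computes the reverse complement and appends the "_antisense" tag when the best hit lies in a negative frame. The function names and the `best_hit_frame` parameter are hypothetical.

```python
# Translation table mapping each base to its complement (both cases).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Complement every base, then reverse the string."""
    return seq.translate(COMPLEMENT)[::-1]

def set_to_sense(name, seq, best_hit_frame):
    """Hypothetical helper: flip the sequence if its best hit is in a negative frame."""
    if best_hit_frame < 0:
        return name + "_antisense", reverse_complement(seq)
    return name, seq

new_name, new_seq = set_to_sense("contig1", "ATTG", best_hit_frame=-2)
```

With the example from the text, ATTG becomes CAAT and the name becomes "contig1_antisense"; sequences with a positive frame are returned unchanged.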
Batch Rename
Perform a batch rename of all selected sequences by converting, replacing or adding text to the current sequence name. Follow the link here for a detailed explanation of how to use this tool.
Figure 9: Batch Rename wizard
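The kind of operation a batch rename performs can be sketched in Python as follows. This is an illustration only: the function and its parameters (`pattern`, `replacement`, `prefix`, `suffix`) are assumptions and do not correspond to the wizard's actual options.

```python
import re

def batch_rename(names, pattern=None, replacement="", prefix="", suffix=""):
    """Hypothetical batch rename: optional regex replacement, then add text."""
    renamed = []
    for name in names:
        if pattern is not None:
            name = re.sub(pattern, replacement, name)
        renamed.append(prefix + name + suffix)
    return renamed

names = ["contig1_antisense", "contig2"]
# Undo a tag that was appended to the end of the sequence names.
restored = batch_rename(names, pattern=r"_antisense$")
# Add text in front of every name.
prefixed = batch_rename(names, prefix="sampleA_")
```

Replacing a trailing tag with an empty string is one way to undo the name changes made by the other tools in this section.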
Translate Longest ORF
Convert all selected sequences to their longest ORF protein sequence. The tag "_ORF" is added to the sequence names. Use the batch rename function to undo the name change. The user may select the reading frame and the genetic code, depending on the species, to be used for the prediction.
Figure 10: Translate Longest ORF wizard
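A rough Python sketch of a longest-ORF translation is shown below. It rests on several assumptions that the tool may not share: an ORF is taken to start at the first ATG and end before the next in-frame stop codon (or at the end of the sequence), only forward frames are searched, and the standard genetic code is used by default. All names are hypothetical.

```python
from itertools import product

# Standard genetic code in the usual T, C, A, G codon ordering ("*" = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(seq, code=STANDARD_CODE):
    """Translate complete codons; unknown codons become 'X'."""
    return "".join(
        code.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )

def longest_orf_protein(seq, frames=(0, 1, 2), code=STANDARD_CODE):
    """Return the longest M-initiated protein among the chosen forward frames."""
    seq = seq.upper()
    best = ""
    for frame in frames:
        # Split the frame's translation at stop codons and inspect each piece.
        for segment in translate(seq[frame:], code).split("*"):
            start = segment.find("M")
            if start != -1 and len(segment) - start > len(best):
                best = segment[start:]
    return best

protein = longest_orf_protein("CCATGGCTTAAATGAAACCCGGGTAG")
orf_name = "contig1" + "_ORF"  # tag added to the sequence name
```

Passing a different `frames` tuple or a different codon dictionary mirrors the choice of reading frame and genetic code offered to the user.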