RFAM

Introduction

The Rfam database is a collection of RNA families, each represented by multiple sequence alignments, consensus secondary structures and covariance models (CMs). The families in Rfam break down into three broad functional classes: non-coding RNA genes, structured cis-regulatory elements and self-splicing RNAs. These functional RNAs often have a conserved secondary structure which may be better preserved than the RNA sequence. The CMs used to describe each family are a slightly more complicated relative of the profile hidden Markov models (HMMs) used by Pfam. CMs can simultaneously model RNA sequence and structure in an elegant and accurate fashion (Rfam description from: http://rfam.xfam.org/).

Please cite: Nawrocki, E. P., Burge, S. W., Bateman, A., Daub, J., Eberhardt, R. Y., Eddy, S. R., Floden, E. W., Gardner, P. P., Jones, T. A., Tate, J., et al. (2014). Rfam 12.0: updates to the RNA families database. Nucleic Acids Research, gku1063.

This functionality can be found under Functional Analysis → Coding Potential → Run Rfam. A dialog appears (see image below). Sequences longer than a given length can be skipped during the analysis.


Figure 1: Rfam Dialog
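
The effect of the length option can be illustrated with a minimal Python sketch. It is not OmicsBox code: the function name `split_by_length`, the `max_length` parameter and the example sequences are hypothetical, and the sketch assumes that "longer than" means strictly greater than the chosen length.

```python
def split_by_length(sequences, max_length):
    """Partition {sequence_id: sequence} into (to_analyse, skipped).

    Assumption: a sequence is skipped only when it is strictly longer
    than max_length.
    """
    to_analyse = {}
    skipped = {}
    for seq_id, seq in sequences.items():
        if len(seq) > max_length:
            skipped[seq_id] = seq
        else:
            to_analyse[seq_id] = seq
    return to_analyse, skipped


# Hypothetical input: one short and one long sequence.
sequences = {"seq1": "ACGU" * 10, "seq2": "ACGU" * 1000}
to_analyse, skipped = split_by_length(sequences, max_length=100)
print("analysed:", list(to_analyse))
print("skipped:", list(skipped))
```

Skipping very long sequences keeps the run time manageable, since the analysis time depends on the amount of sequence submitted.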

Click on the Run button to start the analysis. It may take a while, depending on the number of sequences and on the EMBL-EBI servers.

Results Table

Once the Rfam analysis has started, a table with the corresponding results is displayed in a new tab. Rows are coloured according to the outcome: red if Rfam found no hits for the sequence, orange if it found at least one. White rows are sequences that have not been analysed yet. The details of each hit of a sequence can be consulted via the context menu, in the same way as Blast results.


Figure 2: Rfam Table Results

Sidebar options

The sidebar lists all actions that can be performed on the Rfam results, including options for the visual display of the results:

  1. Hit Distribution: This chart shows the distribution of the number of sequences with hits in the Rfam analysis.

  2. Biotypes Pie Chart: This pie chart shows how the sequences are distributed among the Rfam families.

  3. Biotypes Distribution: The same information as the pie chart, shown as a bar chart.

  4. E-Value Distribution: This chart plots the distribution of E-values for the Rfam hits.

  5. Create GFF: This creates a GFF file for the Rfam results.

  6. Open as Treemap: This visualisation shows the Rfam families (or hierarchical, tree-structured data in general) as a set of nested rectangles.

Image rfam_hit_dist



Image rfam_biotypes_dist

Image rfam_biotype_bar_chart

Image rfam_evalue_dist

Figure 3: Rfam Statistics Graphs and Visualization
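
The statistics behind these charts are simple counts over the hits. The Python sketch below shows one plausible way to derive them; it is not the OmicsBox implementation. The structure of `hits_by_sequence` and its field names are hypothetical, and binning E-values by order of magnitude is an assumption about how such a chart is typically built.

```python
import math
from collections import Counter


def hits_per_sequence_distribution(hits_by_sequence):
    """Map a hit count to the number of sequences with that many hits."""
    return Counter(len(hits) for hits in hits_by_sequence.values())


def family_counts(hits_by_sequence):
    """Count hits per Rfam family (basis for a pie or bar chart)."""
    return Counter(
        hit["family"]
        for hits in hits_by_sequence.values()
        for hit in hits
    )


def evalue_histogram(hits_by_sequence):
    """Bin E-values by order of magnitude, i.e. floor(log10(E))."""
    bins = Counter()
    for hits in hits_by_sequence.values():
        for hit in hits:
            if hit["evalue"] > 0:  # log10 is undefined for 0
                bins[math.floor(math.log10(hit["evalue"]))] += 1
    return bins


# Hypothetical results for three sequences; seq3 has no hits.
hits_by_sequence = {
    "seq1": [{"family": "tRNA", "evalue": 3e-10},
             {"family": "5S_rRNA", "evalue": 2e-5}],
    "seq2": [{"family": "tRNA", "evalue": 5e-3}],
    "seq3": [],
}
print(hits_per_sequence_distribution(hits_by_sequence))
print(family_counts(hits_by_sequence))
print(evalue_histogram(hits_by_sequence))
```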


Additionally, like many other results in OmicsBox, the Rfam results can be displayed as a Treemap, which shows the Rfam families (or hierarchical, tree-structured data in general) as a set of nested rectangles.

Image rfam_tree_map

Figure 4: Rfam Tree Map