Impact Factor: 9.957

Computational analysis of noncoding RNAs

Abstract

Noncoding RNAs have emerged as key players in the cell. Understanding their surprisingly diverse range of functions is challenging for experimental and computational biology. Here, we review computational methods to analyze noncoding RNAs. The topics covered include basic and advanced techniques to predict RNA structures, annotation of noncoding RNAs in genomic data, mining RNA‐seq data for novel transcripts and prediction of transcript structures, computational aspects of microRNAs, and database resources.

These authors contributed equally

WIREs RNA 2012. doi: 10.1002/wrna.1134

This article is categorized under:
RNA Structure and Dynamics > RNA Structure, Dynamics, and Chemistry
RNA Evolution and Genomics > Computational Analyses of RNA
Regulatory RNAs/RNAi/Riboswitches > Regulatory RNAs

Outline of the main topics covered in this article. Many topics overlap, depend on each other or share similar concepts. The most important of these interconnections are shown by arrows.

Growth of miRBase and Rfam over the past 8 years. For miRBase, the number of miRNAs is shown; for Rfam, the number of structure families and the number of sequences found to be members of an Rfam family. (The drop in the number of Rfam sequences in 2011 is the result of the reorganization of some large families and the elimination of pseudogenes.)

Reconstructing transcript models from RNA‐seq data. Two splice isoforms of an RNA are shown, for which the RNA‐seq experiment generated short sequence fragments. One approach (left) to reconstructing the transcripts is to map the fragments to a reference genome. Spliced reads that span exon boundaries can be used to infer the connectivity graph. The paths through this graph correspond to the different isoforms. Alternatively, the transcripts can be reconstructed by de novo assembly of the reads into transcripts (right). If a reference genome is available, the assembled RNA transcripts can be mapped to it afterwards to obtain the intron‐exon structure of the isoforms.
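The statement that paths through the connectivity graph correspond to isoforms can be illustrated with a minimal sketch. It assumes the graph is acyclic and is already given as an adjacency list; the exon names (`E1`, `E2`, `E3`) and the function `enumerate_isoforms` are hypothetical and are not taken from any tool discussed in the article. Real transcript assemblers additionally weigh paths by read coverage, which is omitted here.

```python
from typing import Dict, List


def enumerate_isoforms(graph: Dict[str, List[str]], start: str, end: str) -> List[List[str]]:
    """Return every path from start to end in an acyclic exon connectivity graph."""
    paths = []
    stack = [[start]]  # each stack entry is a partial path
    while stack:
        path = stack.pop()
        node = path[-1]
        if node == end:
            paths.append(path)
            continue
        # Extend the partial path along every edge supported by spliced reads.
        for nxt in graph.get(node, []):
            stack.append(path + [nxt])
    return sorted(paths)


# Hypothetical toy graph: exon E2 is skipped in one of the two isoforms.
splice_graph = {"E1": ["E2", "E3"], "E2": ["E3"], "E3": []}
isoforms = enumerate_isoforms(splice_graph, "E1", "E3")
for isoform in isoforms:
    print("-".join(isoform))
```

In this toy case the two enumerated paths are the exon-inclusion and the exon-skipping isoform.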

Kinetic folding pathways. (a) Schematic energy landscape and associated barrier tree. A barrier tree shows the local minima and the minimum energy barriers between them. (Reprinted with permission from Ref 81. Copyright 2010 RNA Society.) (b) Barrier tree of the small RNA xbix. (c) Exact folding kinetics of xbix starting from the open chain, shown as the probability of the local minima over time. While the minimum free energy (MFE) structure is finally most prominent, other ‘intermediary’ structures (2, 3, and 4) are temporarily more probable. (Reprinted with permission from Ref 82. Copyright 2004 IOP Publishing Ltd.)
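One common way to formalize such kinetics is a master equation over the local minima of the barrier tree. The form below is a sketch under stated assumptions, not necessarily the exact model behind the figure: the symbols $p_i(t)$ (probability of minimum $i$), $E_i$ (its energy), $E^{\ddagger}_{ij}$ (the barrier energy between minima $i$ and $j$), and the rate prefactor $k_0$ are introduced here for illustration.

```latex
\frac{dp_i(t)}{dt} = \sum_{j \neq i} \bigl( k_{j \to i}\, p_j(t) - k_{i \to j}\, p_i(t) \bigr),
\qquad
k_{i \to j} = k_0 \exp\!\left( -\frac{E^{\ddagger}_{ij} - E_i}{RT} \right)
```

With symmetric barriers, $E^{\ddagger}_{ij} = E^{\ddagger}_{ji}$, these rates satisfy detailed balance, so the long-time limit is the Boltzmann distribution and the lowest-energy minimum ends up most probable, while minima that are reached over low barriers from the open chain can dominate temporarily.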

RNA/RNA interactions. (a) Simple hybridization, no internal structure of RNAs. The simplest interaction prediction approaches predict only the hybridization at a single site without considering internal structure. (b) Hybridization and restricted internal structure as in the co‐folding model. Interactions can occur at several sites, however only between external bases. When concatenating the two interacting RNAs, the structure of all inter‐ and intramolecular base‐pairs is pseudoknot‐free, such that it can be predicted from the concatenation by (a variant of) Zuker's algorithm. (c) Interaction structure as predictable by the most complex dynamic programming algorithms. Such structures are free of pseudoknots, crossing interactions, and zig‐zags (see d). (d) Zig‐zag. Intramolecular stems in each RNA cover a common interaction as well as interactions to the outside of the stems.

Tertiary structure prediction. Example prediction from the MC‐Fold and MC‐Sym pipeline (Ref 68). (a) Secondary structure including canonical (bold lines) and noncanonical base‐pairs (non‐bold lines) as predicted by MC‐Fold. (b) Tertiary structure predicted from secondary structure (a) by MC‐Sym. The prediction (blue) is compared to the experimental structure (gold).

Pseudoknot types. (a) The simplest type of pseudoknot (H‐type) formed by two crossing stems. The most efficient algorithms predict only this most common form of pseudoknot. (b) Three‐chain or kissing hairpin. Two hairpin loops are connected by one or more base pairs. (c) Three‐knot. Three stems cross each other. This configuration is predicted only by the expensive algorithm of Rivas and Eddy (Ref 51). (d) Four‐chain, closed by a fifth stem. This complex motif cannot be predicted by the algorithm of Rivas and Eddy, but would require an even more costly algorithm. (e) Canonical pseudoknot. A pseudoknot formed by two perfect stems of canonical base pairs that are maximally extended, that is, they cannot be extended further by canonical base pairs; the figure indicates the latter by the dashed ‘conflict’‐arcs between non‐canonical base pairs AA, GA, CA, and CC (from left to right). The most space‐efficient pseudoknot prediction algorithm (Ref 52) predicts only canonical pseudoknots.
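What all of these pseudoknot types share is that at least two base pairs cross. A minimal sketch of that check follows, assuming a structure is given as a list of 0-based position pairs; the function name `has_pseudoknot` and the example coordinates are hypothetical. It only detects crossings and does not classify the pseudoknot type or predict anything.

```python
from typing import Iterable, Tuple


def has_pseudoknot(pairs: Iterable[Tuple[int, int]]) -> bool:
    """Return True if any two base pairs (i, j) and (k, l) cross: i < k < j < l."""
    ordered = sorted((min(i, j), max(i, j)) for i, j in pairs)
    for a in range(len(ordered)):
        i, j = ordered[a]
        for b in range(a + 1, len(ordered)):
            k, l = ordered[b]
            # Pairs are sorted by opening position, so k >= i always holds here.
            if i < k < j < l:
                return True
    return False


# Nested stem (pseudoknot-free) versus two crossing stems (H-type arrangement).
nested = [(0, 9), (1, 8), (2, 7)]
h_type = [(0, 10), (1, 9), (5, 15), (6, 14)]
print(has_pseudoknot(nested), has_pseudoknot(h_type))
```

The quadratic scan is chosen for readability; it is adequate for single structures of moderate length.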

Principles of comparative analysis for RNA structure prediction. (a) A short sequence that can fold into a hairpin is aligned to three other sequences with different mutation patterns. Mutated bases are indicated by a lightning symbol. The affected base pair is shown in blue, green, and red for the case of a consistent mutation, a compensatory double mutation, or an inconsistent mutation that destroys the base pair, respectively. (b) Sequence‐based alignment versus structural alignment. Consensus structure predicted for two aligned sequences. First, the alignment is optimized to match the sequences, resulting in a poor consensus structure with few conserved base‐pairs (green). Second, the two sequences are aligned to optimize a common structure, resulting in a much better consensus structure with more conserved pairs. (All structures are shown in ‘dot/bracket’ notation, in which base‐pairs are indicated by brackets and unpaired positions are shown as dots.)
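The dot/bracket notation and the three mutation classes can be made concrete with a short sketch. It assumes gap-free aligned sequences of equal length and treats G-U wobble pairs as valid alongside Watson-Crick pairs; that pairing set, the function names, and the example sequences are illustrative assumptions rather than details from the article.

```python
from typing import List, Tuple

# Assumption: Watson-Crick pairs plus G-U wobble pairs count as valid base pairs.
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def parse_dot_bracket(structure: str) -> List[Tuple[int, int]]:
    """Convert a dot/bracket string into a sorted list of 0-based base pairs."""
    stack, pairs = [], []
    for pos, ch in enumerate(structure):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced closing bracket")
            pairs.append((stack.pop(), pos))
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r}")
    if stack:
        raise ValueError("unbalanced opening bracket")
    return sorted(pairs)


def classify_pair(ref: str, other: str, i: int, j: int) -> str:
    """Classify how an aligned sequence supports base pair (i, j) of the reference."""
    a, b = other[i], other[j]
    if (a, b) not in CANONICAL:
        return "inconsistent"  # the mutation destroys the base pair
    changed = (ref[i] != a) + (ref[j] != b)
    return ["conserved", "consistent", "compensatory"][changed]


ref = "GGGAAAUCC"
pairs = parse_dot_bracket("(((...)))")
for other in ("GGGAAAUCU", "GCGAAAUGC", "GGGAAAUCA"):
    print(other, [classify_pair(ref, other, i, j) for i, j in pairs])
```

Here one changed base that keeps the pair intact counts as consistent, and two changed bases that keep it intact count as compensatory, matching the usage in the caption.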

Principles of RNA structure prediction. (a) RNA secondary structure can be represented as an outerplanar graph (right). The backbone is arranged as a circle and base pairs are represented as arcs. The faces of this graph correspond to different structural elements. This formalization is the basis for most structure prediction algorithms. Any structure can be uniquely decomposed into these basic elements, which are independent of each other. This allows for efficient folding algorithms based on the ‘dynamic programming’ principle, which breaks the problem down into smaller subproblems (see also Box 1). (b) Example of energy evaluation of a small RNA structure. Thermodynamic folding algorithms assign free energies to the structural elements. In the example shown, two stacks and a symmetric interior loop stabilize the structure (negative free energy), while the hairpin loop destabilizes the structure (positive free energy). The total free energy of the structure is the sum of the energies of all structural elements.
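The dynamic programming principle can be sketched with the simplest folding recursion, Nussinov-style base-pair maximization. This is a deliberately simplified stand-in: it counts base pairs instead of summing the loop-based free energies described in (b), so it is not the thermodynamic algorithm itself. The pairing set (including G-U), the minimum hairpin loop size of 3, and the function name `max_base_pairs` are assumptions made for illustration.

```python
# Assumption: Watson-Crick pairs plus G-U wobble pairs may form.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_base_pairs(seq: str, min_loop: int = 3) -> int:
    """Maximum number of nested base pairs, with at least min_loop unpaired bases per hairpin."""
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] holds the optimum for the subsequence seq[i..j]; short spans stay 0.
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case 1: position j is unpaired
            for k in range(i, j - min_loop):  # case 2: position j pairs with k
                if (seq[k], seq[j]) in PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    inner = best[k + 1][j - 1]
                    score = max(score, left + 1 + inner)
            best[i][j] = score
    return best[0][n - 1]


print(max_base_pairs("GGGAAAUCC"))
```

Each table entry is computed from entries for shorter subsequences, which is possible because the substructures on either side of a base pair are independent of each other. Energy-based algorithms keep the same overall scheme but replace the pair count with free energy contributions of the structural elements.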
