WIREs RNA
Impact Factor: 4.928

Computational Biology in microRNA


MicroRNAs (miRNAs) are a class of small endogenous noncoding RNA species that regulate gene expression post‐transcriptionally by forming imperfect base pairs with the 3′ untranslated regions of messenger RNAs. Since the 1993 discovery of the first miRNA, lin‐4, in worms, a vast number of studies have been dedicated to functionally characterizing miRNAs, with a special emphasis on their roles in cancer. A single miRNA can potentially target ∼400 distinct genes, and there are over 1000 distinct endogenous miRNAs in the human genome. Thus, miRNAs are likely involved in virtually all biological processes and pathways, including carcinogenesis. However, functionally characterizing miRNAs hinges on the accurate identification of their mRNA targets, which has been a challenging problem because of imperfect base‐pairing and condition‐specific miRNA regulatory dynamics. In this review, we survey the current state‐of‐the‐art computational methods for predicting miRNA targets, which are divided into three main categories: (1) sequence‐based methods that primarily utilize the canonical seed‐match model, evolutionary conservation, and binding energy; (2) expression‐based target prediction methods that use the increasingly available miRNA and mRNA expression data measured for the same sample; and (3) network‐based methods that aim to identify miRNA regulatory modules, which reflect the synergism of miRNAs in conferring a global impact on the biological system of interest. We hope that the review will serve as a good reference for newcomers to the ever‐growing miRNA research field as well as for veterans, who will appreciate the detailed review of the technicalities, strengths, and limitations of each representative computational method.

WIREs RNA 2015, 6:435–452. doi: 10.1002/wrna.1286

This article is categorized under: Regulatory RNAs/RNAi/Riboswitches > Regulatory RNAs
A bipartite graph illustrates the many‐to‐many relationship between miRNAs and their mRNA targets. Notably, the graph can also be considered a directed acyclic graph (DAG), in which miRNAs only regulate genes and genes are only regulated by miRNAs. This very property enables exact inference in several frameworks.
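The many‐to‐many structure described in this caption can be sketched with two dictionaries, one per direction of lookup. This is only an illustration: the miRNA and gene names are hypothetical placeholders, and `invert` is a helper introduced here, not part of any method covered by the review.

```python
from collections import defaultdict

# Hypothetical toy edges: each miRNA maps to the set of genes it targets.
targets_of = {
    "miR-A": {"GENE1", "GENE2"},
    "miR-B": {"GENE2", "GENE3"},
}


def invert(targets_of):
    """Build the gene -> regulating miRNAs view of the same bipartite graph."""
    regulators_of = defaultdict(set)
    for mirna, genes in targets_of.items():
        for gene in genes:
            regulators_of[gene].add(mirna)
    return dict(regulators_of)


regulators_of = invert(targets_of)

# Every edge points from a miRNA to a gene, so the two node sets are disjoint
# and no directed cycle can form.
is_bipartite = set(targets_of).isdisjoint(regulators_of)
```

Here one miRNA reaches several genes and `GENE2` is shared by two miRNAs, which is the many‐to‐many relationship the figure depicts.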
TargetScore schematics. (a) Cartoon of a miRNA overexpression or transfection experiment. A miRNA mimic of interest or a control hairpin is transfected into a cell. True target genes are expected to exhibit decreased expression relative to the control cell. In addition to the expression fold‐change (xf) due to miRNA transfection, the input data also include sequence‐based scores (x1, x2, …, xL). All input variables are continuous. (b) For each score type of gene n (xn), we used a Variational Bayesian‐Gaussian Mixture Model (VB‐GMM) to infer the posterior distribution of the binary target status (znk), given the observed feature scores xn ∈ (x1, x2, …, xL). The plate model indicates a repeating pattern of the generative model for all of the N genes.
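As a rough illustration of how a two‐component mixture turns a continuous score into a posterior target probability, the sketch below fits a one‐dimensional Gaussian mixture to fold‐changes. It deliberately swaps in plain EM for the variational Bayesian inference that TargetScore uses, handles a single feature only, and assumes that the component with the lower mean represents targets. The function names and the `min_var` floor are assumptions made for this example.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def target_posteriors(fold_changes, n_iter=100, min_var=1e-3):
    """Fit a two-component 1-D Gaussian mixture by EM (not VB).

    Returns, per gene, the posterior probability of belonging to the
    component with the lower mean (assumed here to be the target class).
    """
    xs = list(fold_changes)
    n = len(xs)
    grand_mean = sum(xs) / n
    var0 = max(sum((x - grand_mean) ** 2 for x in xs) / n, min_var)
    means = [min(xs), max(xs)]  # start the components at the two extremes
    variances = [var0, var0]
    weights = [0.5, 0.5]
    resp = [[0.5, 0.5] for _ in xs]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each gene.
        for i, x in enumerate(xs):
            dens = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(dens) or 1e-300
            resp[i] = [d / total for d in dens]
        # M-step: re-estimate weights, means, and variances.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var_k = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk
            variances[k] = max(var_k, min_var)
    target = 0 if means[0] <= means[1] else 1
    return [r[target] for r in resp]


# Hypothetical log fold-changes: four repressed genes, six unchanged genes.
log_fc = [-2.1, -1.9, -2.0, -1.8, 0.1, -0.1, 0.0, 0.2, -0.2, 0.05]
posteriors = target_posteriors(log_fc)
```

A variational treatment would additionally place priors on the mixture parameters; that part is omitted here.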
PanMiRa schematics. (a) The pan‐cancer study enabled by TCGA was performed on 12 distinct cancer types, each containing hundreds of patient samples measured for mRNA/miRNA expression, DNA methylation, and copy number. (b) Suppose target RNA expression ($y_{i,t,d}$) of gene i in sample t of cancer type d is a function of DNA methylation ($x^{\mathrm{DM}}_{i,t,d}$), copy number ($x^{\mathrm{CN}}_{i,t,d}$), and miRNA regulation ($x^{\mathrm{miR}}_{k,t,d}$). The expression change across samples for the target RNA is modelled as the response variable in a multivariate linear regression framework using the input variables indicated above. (c) The resulting linear coefficients $\beta^{\mathrm{miR}}_{i,k,d}$ indicate the corresponding interaction between miRNA k and target gene i in cancer type d and are transformed into z‐scores, which are subsequently subjected to local false discovery rate (locfdr) estimation. (d) The joint posteriors for the recurrent interactions given the z‐scores are inferred by empirical Bayes using the probabilistic quantities obtained from the locfdr procedure above.
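The regression step in panels (b) and (c) can be sketched for a single gene, miRNA, and cancer type as ordinary least squares with three predictors. This is a minimal sketch under stated assumptions: the caption does not say how coefficients are converted to z‐scores, so dividing each coefficient by its standard error is an assumption made here; the locfdr and empirical Bayes steps are not shown; and `solve` and `fit_with_z_scores` are names introduced for the example.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_with_z_scores(X, y):
    """Ordinary least squares; returns coefficients and coefficient / standard error."""
    n, p = len(y), len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    beta = solve(xtx, xty)
    resid = [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(X, y)]
    sigma2 = sum(e * e for e in resid) / (n - p)
    z = []
    for i in range(p):
        unit = [1.0 if j == i else 0.0 for j in range(p)]
        inv_ii = solve(xtx, unit)[i]  # i-th diagonal entry of (X'X)^-1
        z.append(beta[i] / math.sqrt(sigma2 * inv_ii))
    return beta, z


# Simulated samples (hypothetical data): columns are intercept, DNA methylation,
# copy number, and miRNA expression; the miRNA represses the gene.
rng = random.Random(0)
X, y = [], []
for _ in range(60):
    dm, cn, mir = rng.random(), rng.random(), rng.random()
    X.append([1.0, dm, cn, mir])
    y.append(1.0 - 0.5 * dm + 0.8 * cn - 1.2 * mir + rng.gauss(0, 0.05))

beta, z = fit_with_z_scores(X, y)
```

A strongly negative z‐score for the miRNA column is the kind of evidence that would then be passed to the false discovery rate step.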
ProMISe schematics. (a) The expressed seed match of mRNA i for miRNA k is defined as the product of the number of target sites ci,k and the total expression of mRNA i. The probability of mRNA i ‘attracting’ miRNA k takes into account the expression of miRNA k and the total expression of the other mRNAs (xt) that carry a compatible seed match for miRNA k. (b) Conversely, the expressed seed of miRNA k for mRNA i is the number of target sites that miRNA k can recognize on mRNA i multiplied by the expression of miRNA k. The probability of miRNA k targeting mRNA i considers the total expression of mRNA i with respect to the expression (z) of all the other miRNAs that can recognize the target sites of mRNA i.
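One plausible reading of the two competition terms in this caption is a pair of normalizations over the site‐count matrix, sketched below. This is an assumption‐laden simplification: the caption does not give the formulas, so normalizing the expressed seed match over all mRNAs (panel a) and the expressed seed over all miRNAs (panel b) is a guess at the general shape, and the further role of miRNA and mRNA expression mentioned in the caption is not modelled. The function names are introduced for this example.

```python
def mrna_competition(site_counts, mrna_expr):
    """Share of miRNA k's expressed seed matches that sit on mRNA i.

    site_counts[i][k] is the number of sites for miRNA k on mRNA i;
    mrna_expr[i] is the expression of mRNA i.
    """
    n, m = len(site_counts), len(site_counts[0])
    totals = [sum(site_counts[i][k] * mrna_expr[i] for i in range(n)) for k in range(m)]
    return [
        [site_counts[i][k] * mrna_expr[i] / totals[k] if totals[k] > 0 else 0.0
         for k in range(m)]
        for i in range(n)
    ]


def mirna_competition(site_counts, mirna_expr):
    """Share of the expressed seeds aimed at mRNA i that belong to miRNA k."""
    n, m = len(site_counts), len(site_counts[0])
    totals = [sum(site_counts[i][k] * mirna_expr[k] for k in range(m)) for i in range(n)]
    return [
        [site_counts[i][k] * mirna_expr[k] / totals[i] if totals[i] > 0 else 0.0
         for k in range(m)]
        for i in range(n)
    ]


# Hypothetical toy input: two mRNAs (rows) and two miRNAs (columns).
c = [[1, 0],
     [2, 1]]
x = [10.0, 5.0]  # mRNA expression
z = [4.0, 2.0]   # miRNA expression

p_mrna = mrna_competition(c, x)
p_mirna = mirna_competition(c, z)
```

In this toy case the two mRNAs split miRNA 0 evenly because one has a single site at high expression and the other has two sites at half the expression.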
Canonical miRNA Watson‐Crick base pairing to the 3′ UTR of the mRNA target site. The most critical region is a 6mer site termed the ‘seed’, which occurs at positions 2–7 from the 5′ end of the miRNA. Three other variations centring on the 6mer seed are also known to be (more) conserved: 7mer‐m8 site, a seed match + a Watson‐Crick match to miRNA nucleotide 8; 7mer‐t1A site (also written 7mer‐A1), a seed match + a downstream A in the 3′ UTR; 8mer, a seed match + both m8 and t1A. Site efficacy has also been proposed to follow the order 8mer > 7mer‐m8 > 7mer‐t1A > 6mer. ORF, open reading frame; (NNNNN), the additional nucleotides to the shortest 19 nt miRNA; [A|N], A or other nucleotides; Poly(A), polyadenylated tail.
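The four site types defined in this caption can be found by scanning a 3′ UTR for the reverse complement of miRNA positions 2–7 and then checking the two flanking nucleotides. The sketch below assumes both sequences are written 5′ to 3′, requires perfect Watson‐Crick matches (no G:U wobble), and ignores conservation and binding energy. `classify_sites` and `EFFICACY_RANK` are names introduced for this example.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Proposed efficacy order from the caption; a lower rank means a stronger site.
EFFICACY_RANK = {"8mer": 0, "7mer-m8": 1, "7mer-t1A": 2, "6mer": 3}


def reverse_complement(rna):
    """Reverse complement of an RNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def classify_sites(mirna, utr):
    """Return (start, site_type) for every seed match in the UTR.

    Both sequences are given 5'->3'; DNA input (T) is converted to RNA (U).
    """
    mirna = mirna.upper().replace("T", "U")
    utr = utr.upper().replace("T", "U")
    seed_site = reverse_complement(mirna[1:7])  # pairs with miRNA nt 2-7
    m8_base = COMPLEMENT[mirna[7]]              # pairs with miRNA nt 8
    sites = []
    start = utr.find(seed_site)
    while start != -1:
        end = start + len(seed_site)
        has_m8 = start > 0 and utr[start - 1] == m8_base
        has_t1a = end < len(utr) and utr[end] == "A"
        if has_m8 and has_t1a:
            kind = "8mer"
        elif has_m8:
            kind = "7mer-m8"
        elif has_t1a:
            kind = "7mer-t1A"
        else:
            kind = "6mer"
        sites.append((start, kind))
        start = utr.find(seed_site, start + 1)
    return sites


# let-7a sequence with two made-up UTR fragments.
LET7A = "UGAGGUAGUAGGUUGUAUAGUU"
strong = classify_sites(LET7A, "AAACUACCUCAGGG")
weak = classify_sites(LET7A, "GGGUACCUCGGG")
```

Note that the m8 match lies immediately 5′ of the seed match on the mRNA, while the t1A adenosine lies immediately 3′ of it, because the two strands pair in antiparallel orientation.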
RACER schematics. (a) The mRNA expression of gene g (yg) is modelled as a function of the following input variables (left to right): TF‐binding signals (bg,TF), DNA methylation (mg), copy number variation (ng), miRNA–mRNA interactions implicated by sequence‐based seed matches (cg,miR) in the 3′ UTR regions of the mRNA, and miRNA expression (zmiR). (b) Two‐stage regression analysis. At stage 1, RACER estimates the sample‐specific TF and miRNA activities for each sample t. At stage 2, RACER uses the inferred regulatory activities of TFs and miRNAs to estimate the interaction scores wg,TF and wg,miR between gene g and each TF and between gene g and each miRNA, respectively, across all of the T samples.
PicTar schematic view. Given a 3′ UTR and a set of K miRNA sequences, PicTar uses a (K + 1)‐state HMM to infer whether each segment of the 3′ UTR represents a seed match to one of the K miRNAs or background (BKG). As a simple illustration, only three miRNAs (miR1, 2, 3) are shown in the HMM on the right.
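The idea of labelling each UTR segment as background or as a site for one of K miRNAs can be sketched as a Viterbi‐style dynamic program over K + 1 states. This is a toy, not PicTar's actual model: it assumes exact site matches, a uniform background nucleotide probability of 0.25, and a single hypothetical `site_prior` shared by all miRNAs, whereas the real method learns its parameters and also uses conservation. Sites are supplied as they appear on the mRNA.

```python
import math


def segment_utr(utr, sites, site_prior=0.05):
    """Most probable segmentation of a UTR into background and miRNA sites.

    sites maps a miRNA name to its site sequence as written on the mRNA.
    Returns a list of (start, end, label) segments covering the UTR.
    """
    k = len(sites)
    # Background state: stay in background and emit one nucleotide.
    log_bg = math.log(1.0 - k * site_prior) + math.log(0.25)
    # Site state: enter with probability site_prior, emit the exact site.
    log_site = math.log(site_prior)
    n = len(utr)
    best = [0.0] + [-math.inf] * n
    back = [None] * (n + 1)
    for end in range(1, n + 1):
        best[end] = best[end - 1] + log_bg
        back[end] = (end - 1, "BKG")
        for name, site in sites.items():
            start = end - len(site)
            if start >= 0 and utr[start:end] == site:
                score = best[start] + log_site
                if score > best[end]:
                    best[end] = score
                    back[end] = (start, name)
    # Trace back from the 3' end to recover the best path.
    segments = []
    pos = n
    while pos > 0:
        start, label = back[pos]
        segments.append((start, pos, label))
        pos = start
    segments.reverse()
    return segments


def site_hits(utr, sites, site_prior=0.05):
    """Keep only the segments labelled as miRNA sites."""
    return [s for s in segment_utr(utr, sites, site_prior) if s[2] != "BKG"]


# Hypothetical input: one let-7 7mer-m8 site embedded in a short UTR fragment.
hits = site_hits("AAACUACCUCGGGG", {"let-7": "CUACCUC"})
```

Because a site segment spans several nucleotides, this is closer to a semi‐Markov decoding than to a textbook single‐symbol HMM; the competition between miRNAs for overlapping sites falls out of taking the maximum over states.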
