How to cite this WIREs title:
WIREs Data Mining Knowl Discov
Impact Factor: 2.541

Multivariate random forests

Abstract

Random forests have emerged as a versatile and highly accurate classification and regression methodology, requiring little tuning and providing interpretable outputs. Here, we briefly outline the genesis of, and motivation for, the random forest paradigm as an outgrowth from earlier tree‐structured techniques. We elaborate on aspects of prediction error and attendant tuning parameter issues. However, our emphasis is on extending the random forest schema to the multiple response setting. We provide a simple illustrative example from ecology that showcases the improved fit and enhanced interpretation afforded by the random forest framework.

© 2011 John Wiley & Sons, Inc. WIREs Data Mining Knowl Discov 2011 1 80‐87 DOI: 10.1002/widm.12

This article is categorized under:

Algorithmic Development > Hierarchies and Trees
Algorithmic Development > Ensemble Methods
Technologies > Machine Learning
Technologies > Prediction

Cross‐validated prediction error profiles for trees grown to maximal size on the (a) splice site identification data and (b) letter recognition data from mlbench.

(a) species and (b) environment characteristics of the four habitats.

Metric multidimensional scaling of the spider data based on the multivariate random forest (MRF) proximity matrix. Sites are designated 1‐28. Colors of sites and convex hulls indicate partitioning around medoids (PAM) cluster membership; the two sites in gray have negative silhouette widths, suggesting low clustering confidence. Letters A‐D are located at the cluster means.

MRF variable importance measures.

Multivariate regression tree (MRT) for the spider data. Terminal nodes are indicated by colored dots. Barplots show average abundances of the 12 species at each terminal node. The number of sites in each terminal node is given below the barplots.

Prediction error (PE) profiles for (a) MRT and (b) MRF. (a) Cross‐validated errors are plotted against tree size. The vertical bars represent PE standard error. The orange spot indicates final tree size as selected by the 1‐SE rule or overall PE minimum. (b) Out‐of‐bag (OOB) errors are plotted against number of trees in MRF. The dotted line is the minimum error rate.
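
The 1‐SE rule mentioned in this caption can be illustrated with a minimal sketch. This is not the article's code: the function name `select_tree_size_1se` and the numbers in the usage example are hypothetical, chosen only for illustration. The sketch assumes the usual form of the rule, which picks the smallest tree whose cross‐validated error is no more than one standard error above the minimum error.

```python
from typing import Sequence


def select_tree_size_1se(
    sizes: Sequence[int],
    cv_errors: Sequence[float],
    cv_ses: Sequence[float],
) -> int:
    """Return the smallest tree size whose CV error lies within one
    standard error of the minimum CV error (hypothetical helper)."""
    if not sizes or not (len(sizes) == len(cv_errors) == len(cv_ses)):
        raise ValueError("inputs must be non-empty and of equal length")

    # Index of the tree size with the overall minimum CV error.
    best = min(range(len(sizes)), key=lambda i: cv_errors[i])

    # Threshold: minimum error plus its own standard error.
    threshold = cv_errors[best] + cv_ses[best]

    # Among all sizes at or below the threshold, take the smallest tree.
    return min(s for s, e in zip(sizes, cv_errors) if e <= threshold)


def select_tree_size_min(sizes: Sequence[int], cv_errors: Sequence[float]) -> int:
    """Return the tree size with the overall minimum CV error."""
    best = min(range(len(sizes)), key=lambda i: cv_errors[i])
    return sizes[best]


# Made-up error profile, for illustration only.
sizes = [1, 2, 3, 4, 5, 6]
cv_errors = [1.00, 0.80, 0.62, 0.58, 0.57, 0.60]
cv_ses = [0.06, 0.05, 0.05, 0.05, 0.04, 0.05]

size_1se = select_tree_size_1se(sizes, cv_errors, cv_ses)
size_min = select_tree_size_min(sizes, cv_errors)
```

With these illustrative numbers the two criteria can disagree: the 1‐SE rule favors a smaller tree than the overall minimum when the error profile is nearly flat around its lowest point.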

Related Articles

Classification and regression trees
