How to cite this WIREs title:
WIREs Comp Stat

Model exploration using conditional visualization


Abstract

Ideally, statistical parametric model fitting is followed by summary tables that show predictor contributions, visualizations that assess model assumptions and goodness of fit, and test statistics that compare models. In contrast, modern machine‐learning fits are usually black boxes: they offer high‐performing predictions but suffer from an interpretability deficit. We examine how the paradigm of conditional visualization can be used to address this deficit, specifically to explain predictor contributions, assess goodness of fit, and compare multiple competing fits. We compare visualizations from techniques including trellis, condvis, visreg, lime, partial dependence, and ice plots. Our examples use random forest fits, but all techniques presented are model agnostic.

This article is categorized under:
Statistical and Graphical Methods of Data Analysis > Statistical Graphics and Visualization
Statistical Learning and Exploratory Methods of the Data Sciences > Exploratory Data Analysis
Statistical Learning and Exploratory Methods of the Data Sciences > Modeling Methods

Random forest fit relating volume to hightemp, season, and dayType for the RailTrail dataset
For the RailTrail random forest fit relating volume to hightemp, lowtemp, and precip, plot (a) compares the lime explainer fit in blue to the random forest fit in red, varying hightemp. Plot (b) shows the weights for hightemp in the lime local linear approximation. Both plots fix lowtemp = 32 and precip = 0
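
The local linear approximation described in this caption can be sketched in a few lines. The following is a minimal illustration of the general idea only (perturb around a case, weight perturbations by proximity, fit a weighted linear model), not the lime package's implementation. The function `local_linear_explainer`, its parameters, the Gaussian-type kernel, and the stand-in model `toy_predict` are all assumptions made for illustration; the case (54, 32, 0) is taken from the captions.

```python
import numpy as np

def local_linear_explainer(predict, x0, scales, kernel_width=1.0,
                           n_samples=500, seed=0):
    """Fit a proximity-weighted linear surrogate to `predict` around x0.

    Returns the intercept (surrogate prediction at x0) followed by one
    local slope per predictor. Hypothetical helper, for illustration.
    """
    rng = np.random.default_rng(seed)
    x0 = np.asarray(x0, dtype=float)
    scales = np.asarray(scales, dtype=float)

    # Perturb the case of interest; each predictor gets its own spread.
    X_pert = x0 + rng.normal(size=(n_samples, x0.size)) * scales
    y = predict(X_pert)

    # Proximity weights: perturbations near x0 count more (assumed kernel).
    d = np.sqrt((((X_pert - x0) / scales) ** 2).sum(axis=1))
    w = np.exp(-(d ** 2) / kernel_width ** 2)

    # Weighted least squares on centred predictors.
    A = np.column_stack([np.ones(n_samples), X_pert - x0])
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return coef

# Hypothetical stand-in for a fitted black-box model (not the article's fit).
def toy_predict(X):
    hightemp, lowtemp, precip = X[:, 0], X[:, 1], X[:, 2]
    return (300.0 + 4.0 * hightemp - 0.05 * (hightemp - 75.0) ** 2
            - 1.5 * lowtemp - 80.0 * precip)

# Explain the case (hightemp, lowtemp, precip) = (54, 32, 0).
coef = local_linear_explainer(toy_predict, x0=[54, 32, 0.0],
                              scales=[10.0, 10.0, 0.5])
print(coef)
```

Because the surrogate is fitted only where the weights are large, its slopes describe the model near the chosen case and may differ markedly at other cases. This sketch perturbs predictors freely, so values such as negative precip can occur; a real explainer would typically constrain or bin them.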
A lime explainer plot for the RailTrail random forest fit relating volume to hightemp, lowtemp, and precip. Cases 16, 4, and 26 have predictor values (hightemp, lowtemp, precip) of (a) (54, 32, 0), (b) (96, 61, 0), and (c) (81, 65, 1.4)
Condvis plots of the RailTrail random forest fit relating volume to hightemp, dayType, lowtemp (LT), precip (P), cloudcover (C), and season (S). The section variables are hightemp and dayType and the conditioning values of (LT, P, C, S) are (a) (32, 0, 3.6, spring), (b) (61, 0, 2.6, summer), and (c) (65, 1.4, 10, summer). Weekend points and fits are in purple; weekday points are in orange
The conditioning values of lowtemp and precip used in Figure , marked with a cross. Red (green, blue) points near the red (green, blue) cross are visible in Figure a–c
Condvis plots of the RailTrail random forest fit relating volume to hightemp, lowtemp (LT), and precip (P). The section variable is hightemp and the conditioning values of (LT, P) are (a) (32, 0), (b) (61, 0), and (c) (65, 1.4)
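
A conditional section of this kind can be illustrated as follows: hold the conditioning predictors at fixed values, evaluate the fit along a grid of the section variable, and show only observations that lie close to the conditioning point. This is a hedged sketch rather than the condvis package's code. The helpers `section_fit` and `near_section`, the scaled Euclidean distance, the threshold, and the stand-in model `toy_predict` are assumptions; the three observations and the conditioning values (32, 0) come from the captions.

```python
import numpy as np

def section_fit(predict, grid, conditioning, section_col, n_cols):
    """Evaluate the fit along `grid` with all other predictors held fixed."""
    other = [c for c in range(n_cols) if c != section_col]
    X_sec = np.empty((len(grid), n_cols))
    X_sec[:, other] = conditioning   # same conditioning values on every row
    X_sec[:, section_col] = grid     # vary only the section variable
    return predict(X_sec)

def near_section(X, conditioning, section_col, threshold=1.0):
    """Flag observations whose conditioning predictors are near the section.

    Uses a scaled Euclidean distance (an assumed similarity measure).
    """
    other = [c for c in range(X.shape[1]) if c != section_col]
    Z = X[:, other]
    scale = Z.std(axis=0)
    scale[scale == 0] = 1.0          # guard against constant predictors
    dist = np.sqrt((((Z - conditioning) / scale) ** 2).sum(axis=1))
    return dist <= threshold

# Hypothetical stand-in for a fitted model (not the article's random forest).
def toy_predict(X):
    return 5.0 * X[:, 0] - 2.0 * X[:, 1] - 100.0 * X[:, 2]

# Columns: hightemp, lowtemp, precip.
X = np.array([[54.0, 32.0, 0.0], [96.0, 61.0, 0.0], [81.0, 65.0, 1.4]])
conditioning = np.array([32.0, 0.0])        # (LT, P) = (32, 0), as in panel (a)
grid = np.linspace(40, 100, 7)

fit = section_fit(toy_predict, grid, conditioning, section_col=0, n_cols=3)
visible = near_section(X, conditioning, section_col=0)
```

Plotting `fit` against `grid` and overlaying only the rows where `visible` is true gives one section; repeating this for other conditioning values gives the remaining panels.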
Visreg plots of the RailTrail random forest fit relating volume to hightemp, lowtemp (LT), and precip (P). The section variable is hightemp and the conditioning values of (LT, P) are (a) (32, 0), (b) (61, 0), and (c) (65, 1.4)
An ice plot with partial dependence curve outlined in yellow for hightemp, for the random forest fit to the RailTrail dataset, relating volume to hightemp, lowtemp, and precip
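
The relationship between ice curves and the partial dependence curve can be made concrete with a short sketch: each ice curve varies one predictor for a single observation while its other predictors stay at their observed values, and the partial dependence curve is the pointwise average of those curves. The function names and the stand-in model `toy_predict` are assumptions for illustration; the three observations are the cases listed in the lime explainer caption.

```python
import numpy as np

def ice_curves(predict, X, feature, grid):
    """Return an (n_observations, n_grid) array, one ICE curve per row of X."""
    X = np.asarray(X, dtype=float)
    curves = np.empty((X.shape[0], len(grid)))
    for j, value in enumerate(grid):
        X_mod = X.copy()
        X_mod[:, feature] = value    # overwrite the feature for every observation
        curves[:, j] = predict(X_mod)
    return curves

def partial_dependence(curves):
    """Average the ICE curves pointwise to get the partial dependence curve."""
    return curves.mean(axis=0)

# Hypothetical stand-in for a fitted model (not the article's random forest).
def toy_predict(X):
    return 5.0 * X[:, 0] - 2.0 * X[:, 1] - 100.0 * X[:, 2]

# Columns: hightemp, lowtemp, precip.
X = np.array([[54.0, 32.0, 0.0], [96.0, 61.0, 0.0], [81.0, 65.0, 1.4]])
grid = np.linspace(40, 100, 7)

curves = ice_curves(toy_predict, X, feature=0, grid=grid)
pd_curve = partial_dependence(curves)
```

With an additive stand-in model such as this one, the ice curves are parallel; curves that cross or fan out in a real fit would indicate interaction between the plotted predictor and the others. Nothing here depends on the model type, since only a `predict` function is required.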

