How to cite this WIREs title:
WIREs Data Mining Knowl Discov
Impact Factor: 2.541

Subgroup identification for precision medicine: A comparative review of 13 methods



Abstract

Natural heterogeneity in patient populations can make it very hard to develop treatments that benefit all patients. As a result, an important goal of precision medicine is identification of patient subgroups that respond to treatment at a much higher (or lower) rate than the population average. Despite there being many subgroup identification methods, there is no comprehensive comparative study of their statistical properties. We review 13 methods and use real‐world and simulated data to compare the performance of their publicly available software using seven criteria: (a) bias in selection of subgroup variables, (b) probability of false discovery, (c) probability of identifying correct predictive variables, (d) bias in estimates of subgroup treatment effects, (e) expected subgroup size, (f) expected true treatment effect of subgroups, and (g) subgroup stability. The results show that many methods fare poorly on at least one criterion.

This article is categorized under:
Technologies > Machine Learning
Algorithmic Development > Hierarchies and Trees
Algorithmic Development > Statistics
Application Areas > Health Care

Plots of variable selection frequencies in Table . Each frequency value is marked with a short vertical bar; horizontal lines connect the smallest and largest selection frequencies for each method; dashed vertical lines mark two simulation standard errors around unbiasedness level of 0.10
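
As a rough illustration of the two-standard-error band mentioned in this caption, the sketch below assumes that each selection frequency is a binomial proportion over independent simulation trials, so that its standard error under unbiased selection is sqrt(p0 * (1 - p0) / n). The function name `unbiasedness_band` and the trial count of 1000 are hypothetical; the page does not state how many trials were run or how the standard error was computed.

```python
import math


def unbiasedness_band(p0: float, n_trials: int, k: float = 2.0) -> tuple[float, float]:
    """Return (low, high) limits k standard errors around the level p0.

    Assumes a binomial proportion estimated from n_trials independent
    simulation trials (an assumption, not taken from the article).
    """
    se = math.sqrt(p0 * (1.0 - p0) / n_trials)
    return p0 - k * se, p0 + k * se


# Hypothetical setting: unbiasedness level 0.10 and 1000 simulation trials.
low, high = unbiasedness_band(0.10, 1000)
print(f"band: [{low:.4f}, {high:.4f}]")
```

Under these assumptions, a method whose selection frequencies all fall inside the band would be consistent with unbiased variable selection; frequencies outside it would suggest selection bias.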
Survival curves of SeqBT subgroup (in green) and its complement for heart data. Sample size and treatment effect (log relative risk of treated vs. untreated) printed beside and below each node
Survival curves of SIDES subgroup (in green) and its complement for heart data. Sample size and treatment effect (log relative risk of treated vs. untreated) printed beside and below each node
Survival curves of PRIM subgroup (in green) and its complement for heart data. Sample size and treatment effect (log relative risk of treated vs. untreated) printed beside and below each node
MOBm tree for heart data. Sample size and treatment effect (log relative risk of treated vs. untreated) printed beside and below each node. Node with selected subgroup is in green color
MOBc tree for heart data. Sample size and treatment effect (log relative risk of treated vs. untreated) printed beside and below each node. Node with selected subgroup is in green color
Glin tree for heart data. Sample size printed beside node and treatment effect (log relative risk of treated vs. untreated) and name of linear prognostic variable printed below node. Node with selected subgroup is in green color
Gcon tree for heart data. Sample size and treatment effect (log relative risk of treated vs. untreated) printed beside and below each node. Node with selected subgroup is in green color
SeqBT subgroup (in green) for breast cancer data; sample sizes and estimated treatment effects (log relative risks) beside and below nodes
PRIM subgroup (in green) for breast cancer data; sample sizes and estimated treatment effects (log relative risks) beside and below nodes
SIDES subgroup (in green) for breast cancer data; sample sizes and estimated treatment effects (log relative risks) beside and below nodes
Gcon subgroup (in green) for breast cancer data; sample sizes and estimated treatment effects (log relative risks) beside and below nodes
Conditional median relative bias of estimated treatment effects
Conditional median true treatment effect of subgroups
Conditional mean subgroup size as proportion of test observations
Probability of selecting a predictive variable at the first split for tree methods. For nontree methods, it is the probability that a predictive variable is among the selected variables
Plots of probability of false discovery (Type I error). For Gcon, MOBc and VT, the probabilities are upper bounds. Vertical dotted lines mark the 0.05 level

