This Title All WIREs
How to cite this WIREs title:
WIREs Data Mining Knowl Discov
Impact Factor: 4.476

Learning decision trees through Monte Carlo tree search: An empirical evaluation

Full article on Wiley Online Library:   HTML PDF

Can't access this content? Tell your librarian.

Abstract Decision trees (DTs) are a widely used prediction tool, owing to their interpretability. Standard learning methods follow a locally optimal approach that trades off prediction performance for computational efficiency. Such methods can however be far from optimal, and it may pay off to spend more computational resources to increase performance. Monte Carlo tree search (MCTS) is an approach to approximate optimal choices in exponentially large search spaces. We propose a DT learning approach based on the Upper Confidence Bound applied to tree (UCT) algorithm, including procedures to expand and explore the space of DTs. To mitigate the computational cost of our method, we employ search pruning strategies that discard some branches of the search tree. The experiments show that proposed approach outperformed the C4.5 algorithm in 20 out of 31 datasets, with statistically significant improvements in the trade‐off between prediction performance and DT complexity. The approach improved locally optimal search for datasets with more than 1,000 instances, or for smaller datasets likely arising from complex distributions. This article is categorized under: Algorithmic Development > Hierarchies and Trees Application Areas > Data Mining Software Tools Fundamental Concepts of Data and Knowledge > Data Concepts
Two nodes of the search tree, n and n′, with their states, s(n) and s(n′). Executing the action a(n′) on s(n) adds a split to leaf , resulting in two new leaves L and R. The depth of a search node corresponds to the number of upstream actions that led to it, and is equal to the number of splits of its decision tree (DT). For example, node n′ has search depth equal to 3, and s(n′) has three internal nodes (in black) and four leaves (in gray)
[ Normal View | Magnified View ]
Effect of bootstrapping on the C4.5b policy as a function of dataset size. The plots display the difference in average F1 score between (a) the C4.5b and validation‐set (VS) performance policies, and (b) the C4.5b and C4.5 policies. Each dot does represents a dataset. Bootstrapping the induction and validation sets during simulation helped improve the performance for datasets with less than 730 examples
[ Normal View | Magnified View ]
Difference between the F1 obtained with Upper Confidence Bound applied to tree (UCT)–decision tree (DT) algorithm using the validation‐set performance policy and the C4.5 algorithm. Pruning methods are (a) none, (b) value‐based, (c) decision tree‐based, and (d) both. Each point represents a dataset
[ Normal View | Magnified View ]
Maximum search depth of Upper Confidence Bound applied to tree (UCT)–decision tree (DT) algorithm using the validation‐set performance policy (a) without search pruning, (b) with value‐based pruning, (c) with decision tree‐based pruning, and (d) with both methods. Each dot represents a dataset
[ Normal View | Magnified View ]
Illustration of one iteration of the UCT‐DT approach
[ Normal View | Magnified View ]

Browse by Topic

Fundamental Concepts of Data and Knowledge > Data Concepts
Application Areas > Data Mining Software Tools
Algorithmic Development > Hierarchies and Trees

Access to this WIREs title is by subscription only.

Recommend to Your
Librarian Now!

The latest WIREs articles in your inbox

Sign Up for Article Alerts