Home
This Title All WIREs
WIREs RSS Feed
How to cite this WIREs title:
WIREs Data Mining Knowl Discov
Impact Factor: 2.111

Evolutionary design of decision trees

Full article on Wiley Online Library:   HTML PDF

Can't access this content? Tell your librarian.

Decision tree (DT) is one of the most popular symbolic machine learning approaches to classification with a wide range of applications. Decision trees are especially attractive in data mining. It has an intuitive representation and is, therefore, easy to understand and interpret, also by nontechnical experts. The most important and critical aspect of DTs is the process of their construction. Several induction algorithms exist that use the recursive top‐down principle to divide training objects into subgroups based on different statistical measures in order to achieve homogeneous subgroups. Although being robust and fast, generally providing good results, their deterministic and heuristic nature can lead to suboptimal solutions. Therefore, alternative approaches have developed which try to overcome the drawbacks of classical induction. One of the most viable approaches seems to be the use of evolutionary algorithms, which can produce better DTs as they are searching for globally optimal solutions, evaluating potential solutions with regard to different criteria. We review the process of evolutionary design of DTs, providing the description of the most common approaches as well as referring to recognized specializations. The overall process is first explained and later demonstrated in a step‐by‐step case study using a dataset from the University of California, Irvine (UCI) machine learning repository. © 2012 Wiley Periodicals, Inc.

Figure 1.

An example of a DT (predicting whether a person makes more than 50k, from Adult dataset11).

[ Normal View | Magnified View ]
Figure 2.

The general process of DT construction.

[ Normal View | Magnified View ]
Figure 3.

The general process of EAs (for DT construction).

[ Normal View | Magnified View ]
Figure 4.

A decision tree's direct representation as a tree data structure. A node is a quadruple of values: node type, testing attribute or decision class, and splitting function (testing operator and testing value).

[ Normal View | Magnified View ]
Figure 5.

An example of the cross‐over operator on a tree‐based representation of a DT.

[ Normal View | Magnified View ]
Figure 6.

Some examples of the mutation operator on a tree‐based representation of a DT.

[ Normal View | Magnified View ]
Figure 7.

The overview of the knowledge discovery in data process.

[ Normal View | Magnified View ]
Figure 8.

Introns and duplicated leaves.

[ Normal View | Magnified View ]
Figure 9.

The fitness surface of initial population (generation 0) with regard to accuracy and tree size (note: the lower fitness is the better).

[ Normal View | Magnified View ]
Figure 10.

The fitness surface of population at generation 100 with regard to accuracy and tree size (note: the lower fitness is the better).

[ Normal View | Magnified View ]
Figure 11.

DT's confusion matrix template.

[ Normal View | Magnified View ]
Figure 12.

DT example.

[ Normal View | Magnified View ]
Figure 13.

Fitness distribution within population before and after selection; no genetic operators have been used here yet (note: the lower fitness is, the better).

[ Normal View | Magnified View ]
Figure 14.

Comparison of fitness distribution within two different populations: in initial population (generation 0) and after 1000 generations (note: lower fitness is better).

[ Normal View | Magnified View ]
Figure 15.

The obtained solution: decision tree induced within the described evolutionary process for the Adult dataset.

[ Normal View | Magnified View ]

Related Articles

Evolutionary design of decision trees for medical application
The use of decision trees for cost‐sensitive classification: an empirical study in software quality prediction
Classification and regression trees

Browse by Topic

Technologies > Machine Learning
Technologies > Classification

Access to this WIREs title is by subscription only.

Recommend to Your
Librarian Now!

The latest WIREs articles in your inbox

Sign Up for Article Alerts