How to cite this WIREs title:
WIREs Data Mining Knowl Discov
Impact Factor: 2.111

Mining uncertain data


As an important data mining and knowledge discovery task, association rule mining searches for implicit, previously unknown, and potentially useful pieces of information (in the form of rules revealing associative relationships) that are embedded in the data. In general, the association rule mining process comprises two key steps. The first step, which mines frequent patterns (i.e., frequently occurring sets of items) from data, is more computationally intensive than the second step, which uses the mined frequent patterns to form association rules. Many early algorithms mined frequent patterns from traditional transaction databases of precise data, such as shopping market basket data, in which the contents of the databases are known. However, we live in an uncertain world, in which uncertain data can be found almost everywhere. Hence, in recent years, researchers have paid more attention to frequent pattern mining from probabilistic databases of uncertain data. In this paper, we review recent algorithmic developments in mining uncertain data in these probabilistic databases for frequent patterns. © 2011 John Wiley & Sons, Inc. WIREs Data Mining Knowl Discov 2011, 1:316–329. DOI: 10.1002/widm.31
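To make the frequent-pattern step concrete for uncertain data, the following is a minimal Python sketch of mining by expected support. It assumes the common model in which each item in a transaction carries an independent existential probability, so that the expected support of a pattern is the sum, over all transactions, of the product of its items' probabilities. The database `toy_db`, the function names, and the threshold are illustrative assumptions, not the article's D2 or any of the reviewed algorithms; the brute-force enumeration stands in for the tree-based and other methods the article surveys.

```python
from itertools import combinations
from math import prod

# Toy probabilistic database (hypothetical values, not the article's D2):
# each transaction maps an item to its existential probability.
toy_db = [
    {"a": 0.9, "b": 0.8, "c": 0.5},
    {"a": 0.6, "c": 0.5},
    {"b": 0.5, "c": 1.0},
]


def expected_support(pattern, db):
    """Expected support of `pattern`, assuming items are independent.

    For each transaction, multiply the probabilities of the pattern's
    items (a missing item contributes probability 0), then sum the
    products over all transactions.
    """
    return sum(prod(t.get(item, 0.0) for item in pattern) for t in db)


def frequent_patterns(db, minsup, max_len=3):
    """Brute-force search for patterns with expected support >= minsup.

    Enumerates every item combination up to `max_len`; suitable only
    for toy data, not a substitute for the reviewed algorithms.
    """
    items = sorted({item for t in db for item in t})
    result = {}
    for k in range(1, max_len + 1):
        for pattern in combinations(items, k):
            support = expected_support(pattern, db)
            if support >= minsup:
                result[pattern] = support
    return result
```

Calling `frequent_patterns(toy_db, minsup=1.0)` on this toy data keeps only the three singleton patterns, since every larger combination has expected support below 1.0. The second step of association rule mining would then form rules from the returned patterns.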

Figure 1.

A traditional transaction database D1 of precise data.

Figure 2.

A probabilistic database D2 of uncertain data, in which items in each transaction are independent.

Figure 3.

Possible worlds for D2.

Figure 4.

An FP‐tree for capturing the contents of D1.

Figure 5.

The global UF‐tree for capturing the contents of D2 (for mining all frequent patterns).

Figure 6.

A UF‐tree for capturing the contents of {d}‐projected database for D2 (i.e., contents of only transactions containing the singleton pattern {d}).

Figure 7.

The global UF‐tree for capturing the contents of D2 (for mining frequent patterns that satisfy a SUC constraint C_SUC).

Figure 8.

An SUF‐tree (with a sliding window of w = 3 batches) for capturing the contents of D3.

Figure 9.

A probabilistic dataset D3 containing streams of uncertain data.

Figure 10.

An extended H‐struct for capturing the contents of D2.

Figure 11.

Some samples of instantiated ‘possible worlds’ of D2.

Figure 12.

Frequent patterns mined from D2 based on expected support and probabilistic support.

Figure 13.

A probabilistic database D4 of uncertain data, in which items in each transaction are mutually exclusive.

