How to cite this WIREs title:
WIREs Data Mining Knowl Discov
Impact Factor: 2.111

Process discovery from event data: Relating models and logs through abstractions

Event data are collected in logistics, manufacturing, finance, health care, customer relationship management, e‐learning, e‐government, and many other domains. The events found in these domains typically refer to activities executed by resources at particular times and for particular cases (i.e., process instances). Process mining techniques are able to exploit such data. In this article, we focus on process discovery. However, process mining also includes conformance checking, performance analysis, decision mining, organizational mining, predictions, recommendations, and so on. These techniques help to diagnose problems and improve processes. All process mining techniques involve both event data and process models. Therefore, a typical first step is to automatically learn a control‐flow model from the event data. This is very challenging, but in recent years, many powerful discovery techniques have been developed. These techniques are not easy to compare because they use different representations and make different assumptions. Users often need to resort to trying different algorithms in an ad‐hoc manner. Developers of new techniques are often trying to solve specific instances of a more general problem. Therefore, we aim to unify existing approaches by focusing on log and model abstractions. These abstractions link observed and modeled behavior: concrete behaviors recorded in event logs are related to possible behaviors represented by process models. Hence, such behavioral abstractions provide an “interface” between the two. We discuss four discovery approaches involving three abstractions and different types of process models (Petri nets, block‐structured models, and declarative models). The goal is to provide a comprehensive understanding of process discovery and to show how to develop new techniques. Examples illustrate the different approaches, and pointers to software are given.
The discussion of abstractions and process representations also serves to reflect on the gap between the process mining literature and commercial process mining tools. This helps users select an appropriate process discovery technique. Moreover, structuring the role of internal abstractions and representations helps to broaden the view and facilitates the creation of new discovery approaches.
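
To make the notion of a log abstraction concrete, the following minimal sketch groups raw events into traces (one per case, ordered by timestamp) and then computes a directly‐follows abstraction of the resulting log. The event data, the function names `to_traces` and `directly_follows`, and the choice of the directly‐follows relation as the abstraction are assumptions made for illustration only; they are not taken from the article.

```python
from collections import Counter, defaultdict

# Hypothetical events (case id, activity, timestamp), invented for illustration.
events = [
    ("c1", "a", 1), ("c2", "a", 2), ("c1", "b", 3), ("c2", "c", 4),
    ("c1", "c", 5), ("c2", "b", 6), ("c1", "d", 7), ("c2", "d", 8),
]

def to_traces(events):
    """Group events by case and order each case's activities by timestamp."""
    by_case = defaultdict(list)
    for case, activity, timestamp in events:
        by_case[case].append((timestamp, activity))
    return {
        case: tuple(activity for _, activity in sorted(case_events))
        for case, case_events in by_case.items()
    }

def directly_follows(traces):
    """Count how often activity x is directly followed by activity y."""
    counts = Counter()
    for trace in traces:
        counts.update(zip(trace, trace[1:]))
    return counts

traces = to_traces(events)
dfg = directly_follows(traces.values())

for (x, y), n in sorted(dfg.items()):
    print(f"{x} -> {y}: {n}")
```

An abstraction of this kind discards information on purpose: two logs with different traces can yield the same directly‐follows counts, which is one reason results shown as directly‐follows graphs need careful interpretation.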

This article is categorized under:

  • Algorithmic Development > Spatial and Temporal Data Mining
  • Application Areas > Business and Industry
  • Technologies > Machine Learning
  • Application Areas > Data Mining Software Tools
Overview positioning the different types of process mining and the role of log abstractions and model abstractions
Screenshots from Disco and Celonis showing that results from informal models, such as a filtered directly‐follows graph, must be interpreted with care. (a) Disco showing the full directly‐follows graph, (b) Celonis showing the full directly‐follows graph, (c) Disco showing a filtered directly‐follows graph, and (d) Celonis showing a filtered directly‐follows graph
A simple synthetic event log and two formal models derived from it. (a) Frequency distribution of the traces in the event log, (b) Petri net discovered using the Alpha miner, and (c) process tree (visualized in BPMN style) discovered using the inductive miner
Approach based on abstractions
A declarative model
Declarative notations: Eight example constraints
Process tree →(a, ↺(∧(b, c), →(e, f)), d)
A Petri net with six transitions {t1, t2, …, t6} and seven places {p1, p2, …, p7}. The initial marking M_init = [p1] is shown. M_final = [p7] is the final marking
Extracting an event log (right) from a collection of events (left). Each event in an event log has a case, activity, and timestamp
The control‐flow perspective is the basis for the other process perspectives (left). Independent of the perspectives included, process mining techniques can be used in online and offline settings (right)
The four basic types of process mining: process discovery (van der Aalst, ), conformance checking (van der Aalst, ), process reengineering (van der Aalst, Adriansyah, & van Dongen, ) (changing the process model), and operational support (van der Aalst et al., ) (influencing the process without reengineering it)
Screenshots of five different process mining tools. (a) Visual inductive miner (ProM) showing a process tree, (b) ILP miner (ProM) showing a Petri net, (c) Heuristic miner (ProM) showing a C‐net, (d) Disco (Fluxicon) showing a directly‐follows graph, and (e) Celonis showing a directly‐follows graph
