How to cite this WIREs title:
WIREs Data Mining Knowl Discov
Impact Factor: 7.250

Time series motif discovery: dimensions and applications

Time series motifs are repeated segments in a long time series that, if they exist, carry precise information about the underlying source of the time series. Motif discovery in time series data has received significant attention in the data mining community since its inception, principally because motif discovery is meaningful and more likely to succeed when the data is large. Algorithms for motif discovery generally deal with three aspects: the definition of the motifs, domain-based preprocessing, and, finally, the algorithmic steps. Typical definitions of motifs are based on either the similarity or the support of the motifs. Domains impose preprocessing requirements for meaningful motif finding, such as data alignment, interpolation, and transformation. Motif discovery algorithms vary based on exact or approximate evaluation of the definition. In addition, algorithms require different representations [Symbolic Aggregate approXimation (SAX), discrete Fourier transform (DFT), etc.] and similarity measures [correlation, dynamic time warping (DTW) distance, etc.] for time series segments. In this paper, we discuss these three facets in detail with examples taken from the literature. We briefly describe a set of applications of time series motifs in various domains and elaborate on one application in entomology: analyzing insect behavior. This article is categorized under: Algorithmic Development > Spatial and Temporal Data Mining; Technologies > Structure Discovery and Clustering
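To make the similarity-based notion of a motif concrete, the following is a minimal brute-force sketch in Python. It assumes one common formulation (the closest pair of non-overlapping, z-normalized subsequences under Euclidean distance); the function names `znorm`, `euclidean`, and `motif_pair` are illustrative and are not taken from the article, and the quadratic search is far slower than the algorithms the article surveys.

```python
import math

def znorm(seg):
    """Z-normalize a segment; a constant segment maps to all zeros."""
    m = len(seg)
    mean = sum(seg) / m
    std = math.sqrt(sum((x - mean) ** 2 for x in seg) / m)
    if std == 0:
        return [0.0] * m
    return [(x - mean) / std for x in seg]

def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def motif_pair(series, m):
    """Return (distance, i, j) for the closest pair of non-overlapping
    length-m subsequences, compared after z-normalization."""
    n = len(series) - m + 1
    subs = [znorm(series[i:i + m]) for i in range(n)]
    best = (math.inf, -1, -1)
    for i in range(n):
        # Starting j at i + m skips overlapping (trivial) matches.
        for j in range(i + m, n):
            d = euclidean(subs[i], subs[j])
            if d < best[0]:
                best = (d, i, j)
    return best

# Toy series (made up for illustration) with one pattern planted twice.
pattern = [1, 4, 2, 8, 5]
series = [0.3, 0.1, 0.9] + pattern + [0.2, 0.7, 0.15, 0.6] + pattern + [0.05, 0.4]
dist, first, second = motif_pair(series, m=len(pattern))
```

Here `first` and `second` are the start positions of the two planted copies. Excluding overlapping pairs matters: without it, a subsequence and its one-step shift would almost always be reported as the "motif".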
Top: The output steam flow telemetry of the Steamgen dataset has a motif of length 640 beginning at locations 589 and 8895. Bottom: By overlaying the two motifs we can see how remarkably similar they are to each other.
The motif of length 400 found in an EPG trace of length 78,254. Inset: Using the motifs as templates, we can find several other occurrences in the same dataset.
The motif of length 480 found in the insect telemetry shown in Figure . Although the two instances occur minutes apart they are uncannily similar.
An electrical penetration graph of insect behavior. The data is complex and highly nonstationary, with wandering baseline, noise, dropouts, etc.
A schematic diagram showing the apparatus used to record insect behavior.
Left: A scatter plot where each point represents the Euclidean distance (x‐axis) and the dynamic time warping (DTW) distance (y‐axis) of a pair of time series. Some data points had values greater than 12 and they were truncated for clarity. Right: A zoom‐in of the plot on the left.
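For readers unfamiliar with the two distances compared in this figure, here is a minimal Python sketch of the textbook dynamic-programming DTW next to Euclidean distance. It assumes squared pointwise costs with a final square root and an optional Sakoe-Chiba style band; the `window` parameter and the function names are illustrative assumptions, not the setup used to produce the figure.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dtw(a, b, window=None):
    """DTW distance via dynamic programming.
    window limits how far the warping path may stray from the diagonal;
    None means unconstrained."""
    n, m = len(a), len(b)
    w = max(n, m) if window is None else max(window, abs(n - m))
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - w), min(m, i + w) + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            # Extend the cheapest of the three admissible predecessor cells.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return math.sqrt(cost[n][m])

# Two made-up sequences: the same bump, shifted by one sample.
a = [0, 0, 1, 2, 1, 0, 0]
b = [0, 1, 2, 1, 0, 0, 0]
ed = euclidean(a, b)            # penalizes the shift
dd = dtw(a, b)                  # warping absorbs the shift
dd_band0 = dtw(a, b, window=0)  # zero-width band follows the diagonal only
```

For equal-length inputs the diagonal is always an admissible warping path, so under this formulation the DTW distance never exceeds the Euclidean distance, and a zero-width band reduces DTW to Euclidean distance.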
Motif distance is usually much less than the nearest neighbor distance, and thus weak, low‐dimensional representations are often sufficient for motif discovery.
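As an illustration of what a weak, low-dimensional representation can look like, the sketch below implements a simplified form of SAX, the symbolic representation named in the abstract: z-normalize, reduce with piecewise aggregate approximation (PAA), then discretize using breakpoints that split the standard normal distribution into equiprobable regions. It assumes the segment length is divisible by the word length `w`; the function names and the default four-letter alphabet are illustrative choices.

```python
import math
from statistics import NormalDist

def paa(seg, w):
    """Piecewise aggregate approximation: the mean of each of w equal frames.
    Assumes len(seg) is divisible by w."""
    frame = len(seg) // w
    return [sum(seg[k * frame:(k + 1) * frame]) / frame for k in range(w)]

def sax(seg, w, alphabet="abcd"):
    """Convert a segment into a SAX word of length w."""
    m = len(seg)
    mean = sum(seg) / m
    std = math.sqrt(sum((x - mean) ** 2 for x in seg) / m)
    z = [(x - mean) / std if std > 0 else 0.0 for x in seg]
    a = len(alphabet)
    # Breakpoints cut the standard normal into a equiprobable regions.
    breakpoints = [NormalDist().inv_cdf(k / a) for k in range(1, a)]
    word = ""
    for v in paa(z, w):
        # The symbol index is the number of breakpoints lying below v.
        word += alphabet[sum(v > bp for bp in breakpoints)]
    return word

ramp_word = sax([1, 1, 2, 2, 3, 3, 4, 4], w=4)
scaled_word = sax([10, 10, 20, 20, 30, 30, 40, 40], w=4)
```

Both ramps map to the same word because z-normalization removes offset and scale, so segments with the same shape can be grouped by comparing short strings before any exact distance is computed.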
Top: An EEG trace of length 10,000. Bottom‐left: First motif found using definition 2 for length 256 and correlation 0.8. Bottom‐right: First motif found using definition 1 for length 256. The correlation between the repetitions is 0.87.
A two‐dimensional toy dataset with clusters of points that potentially represent motifs.
