How to cite this WIREs title:
WIREs Comp Stat

Parallel coordinate and parallel coordinate density plots

Abstract

The parallel coordinate plot (PCP), which represents a p‐dimensional data point in Cartesian coordinates by a polyline (or curve) intercepting p parallel axes, is a viable tool for hyperdimensional data visualization. It enables the human visual system to spot informative patterns in complex data and to gain a better understanding of the underlying geometry of hyperdimensional objects. Correlated records, conceptual clusters, and outliers are easy to discern with the PCP. The parallel coordinate density plot integrates the PCP with density estimation techniques to visualize concentrated information instead of the profiles themselves, thereby mitigating the visual clutter that burdens the plot when it holds a few thousand records. In this article, we give an overview of the PCP, its generalizations, the use of orthogonal bases to smooth out the system, and density estimation techniques to overcome the visual cluttering limitations inherent in the plot. We discuss the duality theorem and its usability in identifying patterns visually or by automatic means. We discuss the effect of scaling the data and the profiles. We provide some visualization examples on different datasets.

WIREs Comp Stat 2011, 3:134–148. DOI: 10.1002/wics.145

This article is categorized under: Statistical and Graphical Methods of Data Analysis > Statistical Graphics and Visualization

The min–max envelope for the parallel coordinate plot: (a) TPCP, (b) the incorrect min–max envelope, and (c) the correct min–max envelope. The Q1Q3 envelope: (d) TPCP, (e) the incorrect Q1Q3 envelope, and (f) the correct Q1Q3 envelope.

Pollen data visualization with the enhanced traditional PCP. The enhancements use the α‐channel (a), the frequency plot (b), line density estimation (c), KDE‐GPCP (d), and QGPCP (e). The effect of axis permutations on these enhancement techniques is shown in (f–j). Unlike the other methods, KDE‐GPCP and QGPCP show a consistent high‐intensity band across the whole plot, which corresponds to the hidden pattern.
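
The idea of showing where profiles concentrate rather than drawing every profile can be illustrated with a small frequency grid between two adjacent axes. This is only a minimal sketch under stated assumptions: the function name `segment_density` and its grid parameters are hypothetical, the segments are sampled at a fixed number of horizontal positions rather than rasterized exactly, and it is not the line density estimation, KDE‐GPCP, or QGPCP method shown in the figure.

```python
def segment_density(left, right, n_columns=20, n_rows=20):
    """Count how many profile segments pass through each grid cell.

    left[i] and right[i] are the values of record i on two adjacent axes.
    The grid has n_rows vertical bins and n_columns horizontal sample
    positions, the first on the left axis and the last on the right axis.
    """
    if len(left) != len(right):
        raise ValueError("both axes must hold the same number of records")
    if n_columns < 2 or n_rows < 1:
        raise ValueError("need at least two columns and one row")
    lo = min(min(left), min(right))
    hi = max(max(left), max(right))
    span = (hi - lo) or 1.0  # avoid division by zero for constant data
    grid = [[0] * n_columns for _ in range(n_rows)]
    for a, b in zip(left, right):
        for col in range(n_columns):
            t = col / (n_columns - 1)      # 0 on the left axis, 1 on the right
            height = a + (b - a) * t       # linear interpolation along the segment
            row = min(int((height - lo) / span * n_rows), n_rows - 1)
            grid[row][col] += 1
    return grid


# Three records close to 0 and one record at 1 on both axes.
left = [0.0, 0.1, 0.2, 1.0]
right = [0.0, 0.1, 0.2, 1.0]
grid = segment_density(left, right, n_columns=3, n_rows=4)
```

In this toy input the lowest row of the grid collects three segments in every column and the highest row collects one, so mapping the counts to color intensity would highlight the concentrated band instead of the individual lines.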

Parallel planes: Three‐dimensional parallel coordinates for the wine data (178 observations in 13 variables with three known classes. The data can be obtained from: http://archive.ics.uci.edu/ml/datasets/Wine). The classes are brushed with different colors and viewed from different perspectives.

Animated PCP: Different views of the word EUREKA in parallel coordinates; each color corresponds to one letter. Because the letters lie on a flat and are separated from each other, they appear separated in the plot both on the axes and between the axes, and there is perhaps one projection with zero variance between the axes, both without rotation and after rotation.

Scatterplot of the word EUREKA after brushing each letter with a different color. It shows that the letters lie on a flat but are distant from each other.

The point ↔ line duality: how line crossings in the PCP reflect the correlation levels in the original data. The more strongly negatively correlated two variables are, the more the lines cross between the corresponding axes, and vice versa. Compare the line crossings between the two perfectly negatively correlated variables x1, x2 with those between the two perfectly positively correlated variables x6, x7.
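
The link between correlation and line crossings described in this caption can be checked numerically. The following is a minimal sketch, not code from the article: the helper names `count_crossings` and `dual_point` are hypothetical, and the layout assumes the first axis sits at horizontal position 0 and the second at `axis_distance`. Two segments cross between the axes exactly when the order of the two records on the left axis is reversed on the right axis. For records lying on a line x2 = m·x1 + b with m ≠ 1, all segments share the point (d/(1 − m), b/(1 − m)), where d is the axis distance.

```python
from itertools import combinations


def count_crossings(left, right):
    """Count pairs of profiles whose segments cross between two adjacent axes."""
    if len(left) != len(right):
        raise ValueError("both axes must hold the same number of records")
    crossings = 0
    for i, j in combinations(range(len(left)), 2):
        # The order on the left axis is reversed on the right axis.
        if (left[i] - left[j]) * (right[i] - right[j]) < 0:
            crossings += 1
    return crossings


def dual_point(slope, intercept, axis_distance=1.0):
    """Point shared by all segments of records on x2 = slope * x1 + intercept.

    Assumes the first axis at horizontal position 0 and the second at
    axis_distance; for slope == 1 the segments are parallel and no finite
    shared point exists.
    """
    if slope == 1:
        raise ValueError("segments are parallel; no finite shared point")
    return (axis_distance / (1 - slope), intercept / (1 - slope))


x1 = [0.0, 1.0, 2.0, 3.0, 4.0]
negative = [4.0 - v for v in x1]   # perfectly negatively correlated with x1
positive = [v + 1.0 for v in x1]   # perfectly positively correlated with x1

neg_crossings = count_crossings(x1, negative)   # every one of the 10 pairs crosses
pos_crossings = count_crossings(x1, positive)   # no pair crosses
meeting_point = dual_point(-1.0, 4.0)           # (0.5, 2.0), midway between the axes
```

With a slope of −1 the shared point falls halfway between the two axes, which is why perfectly negatively correlated variables produce a single dense crossing, while a slope of 1 gives parallel segments with no crossing at all.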

Examples of some patterns that can be discovered using GPCP. The scatterplots of these patterns and their images in piecewise PCP and smooth PCP are shown in the first, second, and third columns, respectively. Patterns that can be discovered in GPCP based on the gaps on and between the parallel axes are shown in the first row. Patterns that can be discovered based on the gap on one axis only are shown in the second and third rows. Those that can be discovered based on the line (or curve) crossings only are shown in the third row.

Visualizing the so‐called pollen data (the ASA challenge dataset of 3848 observations on five variables, of which only 128 observations form the word EUREKA hidden in the hypersphere). The data can be obtained from http://lib.stat.cmu.edu/datasets/pollen.data. The green profiles represent the data on the surface of the hypersphere, whereas the red profiles represent the hidden pattern.

Visualizing the five‐dimensional observations (1, 3, 1, 4, 2) and (3, 1, 4, 2, 1) using the TPCP. The red profile line corresponds to the first observation and the green profile corresponds to the second observation.
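
The construction in this caption can be sketched in a few lines: each observation becomes a polyline whose j-th vertex lies on the j-th parallel axis at the height of the j-th coordinate. The helper names `pcp_vertices` and `min_max_scale`, the unit axis spacing, and the choice of min–max scaling are assumptions made for illustration; the article discusses scaling in general and this sketch does not claim to reproduce its specific choices.

```python
def pcp_vertices(observation, axis_spacing=1.0):
    """Return the polyline vertices (x, y) of one observation.

    Axis j is placed at horizontal position j * axis_spacing and the
    vertex height is the value of the j-th coordinate.
    """
    return [(j * axis_spacing, float(value)) for j, value in enumerate(observation)]


def min_max_scale(rows):
    """Rescale every variable to [0, 1] using its own minimum and maximum.

    A constant variable is mapped to 0.5 to avoid dividing by zero.
    """
    columns = list(zip(*rows))
    lows = [min(column) for column in columns]
    highs = [max(column) for column in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.5 for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


first = (1, 3, 1, 4, 2)
second = (3, 1, 4, 2, 1)

first_profile = pcp_vertices(first)
second_profile = pcp_vertices(second)
scaled = min_max_scale([first, second])
```

Here `first_profile` is `[(0.0, 1.0), (1.0, 3.0), (2.0, 1.0), (3.0, 4.0), (4.0, 2.0)]`. Because only two observations are present, per-variable min–max scaling sends each coordinate to 0 or 1, which shows how strongly the choice of scaling can change the appearance of the profiles.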

The CIE on the automobile dataset with different values of c. The class separation increases as the c‐value decreases, and the class centroids are very well separated in all variables. The gaps on the axes can assist in determining the variables with high discriminant power in this data.
