
STATIS and DISTATIS: optimum multitable principal component analysis and three way metric multidimensional scaling

STATIS is an extension of principal component analysis (PCA) tailored to handle multiple data tables that measure sets of variables collected on the same observations or, in a variant called dual‐STATIS, multiple data tables in which the same variables are measured on different sets of observations. STATIS proceeds in two steps. First, it analyzes the similarity structure between the data tables and derives from this analysis an optimal set of weights, which are used to compute a linear combination of the data tables, called the compromise, that best represents the information common to the different data tables. Second, the PCA of this compromise gives an optimal map of the observations. Each of the data tables also provides a map of the observations that is in the same space as the optimum compromise map. In this article, we present STATIS, explain the criteria that it optimizes, review the recent inferential extensions to STATIS, and illustrate it with a detailed example.
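The two steps described above can be sketched in a few lines of NumPy. This is only a minimal illustration under stated assumptions, not the article's reference implementation: the abstract does not specify the preprocessing or the similarity measure, so the sketch assumes column centering, normalization of each table to unit Frobenius norm, and the RV coefficient as the between-table similarity. The function name `statis` and its return values are hypothetical.

```python
import numpy as np

def statis(tables):
    """Sketch of STATIS for a list of (I x J_k) arrays sharing the same I rows."""
    # Assumed preprocessing: center columns, scale each table to unit Frobenius norm.
    prepped = []
    for X in tables:
        Xc = X - X.mean(axis=0)
        prepped.append(Xc / np.linalg.norm(Xc))

    # One (I x I) cross-product matrix per table.
    S = [X @ X.T for X in prepped]
    K = len(S)

    # Step 1: between-table similarity (RV coefficients, an assumption here).
    C = np.empty((K, K))
    for a in range(K):
        for b in range(K):
            C[a, b] = np.sum(S[a] * S[b]) / (np.linalg.norm(S[a]) * np.linalg.norm(S[b]))

    # Table weights: first eigenvector of C, rescaled to sum to one.
    _, evecs = np.linalg.eigh(C)          # eigenvalues in ascending order
    u = evecs[:, -1]
    u = u * np.sign(u.sum())              # resolve the arbitrary sign
    alpha = u / u.sum()

    # Compromise: weighted sum of the cross-product matrices.
    S_plus = sum(a * Sk for a, Sk in zip(alpha, S))

    # Step 2: PCA (eigendecomposition) of the compromise.
    lam, U = np.linalg.eigh(S_plus)
    order = np.argsort(lam)[::-1]
    lam, U = lam[order], U[:, order]
    keep = lam > 1e-10 * lam[0]           # drop numerically null dimensions
    lam, U = lam[keep], U[:, keep]
    F = U * np.sqrt(lam)                  # compromise factor scores

    # Each table projected into the compromise space (partial factor scores).
    partial = [Sk @ U / np.sqrt(lam) for Sk in S]
    return alpha, F, partial, lam
```

With these definitions, the weighted sum of the partial factor scores (using the weights `alpha`) reproduces the compromise factor scores `F`, which is what places every table's map in the same space as the compromise map.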

We also review, and present in a common framework, the main developments of STATIS such as (1) X‐STATIS or partial triadic analysis (PTA), which is used when all data tables collect the same variables measured on the same observations (e.g., at different times or locations), (2) COVSTATIS, which handles multiple covariance matrices collected on the same observations, (3) DISTATIS, which handles multiple distance matrices collected on the same observations and generalizes metric multidimensional scaling to three‐way distance matrices, (4) Canonical‐STATIS (CANOSTATIS), which generalizes discriminant analysis and combines it with DISTATIS to analyze multitable discriminant analysis problems, (5) power‐STATIS, which uses alternative criteria to find STATIS optimal weights, (6) ANISOSTATIS, which extends STATIS to give specific weights to each variable rather than to each whole table, (7) (K + 1)‐STATIS (or external‐STATIS), which extends STATIS (and PLS methods and Tucker inter‐battery analysis) to the analysis of the relationships between several data sets and one external data set, and (8) double‐STATIS (or DO‐ACT), which generalizes (K + 1)‐STATIS and analyzes two sets of data tables, together with STATIS‐4, which generalizes double‐STATIS to more than two sets of data.

These recent developments are illustrated by small examples. WIREs Comput Stat 2012, 4:124–167. doi: 10.1002/wics.198
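As an illustration of the DISTATIS entry in the list above, the sketch below shows the transformation that underlies metric multidimensional scaling: a matrix of squared distances is double‐centered into a cross‐product matrix, whose eigendecomposition gives factor scores. The abstract does not spell out this step, so the code assumes squared Euclidean distances and equal masses for the observations; the function names `double_center` and `mds_scores` are hypothetical.

```python
import numpy as np

def double_center(D_sq):
    """Turn an (I x I) matrix of squared distances into a cross-product matrix."""
    I = D_sq.shape[0]
    # Centering matrix with equal masses 1/I (an assumption of this sketch).
    Xi = np.eye(I) - np.ones((I, I)) / I
    return -0.5 * Xi @ D_sq @ Xi

def mds_scores(D_sq, n_dims=2):
    """Classical (metric) MDS factor scores from squared distances."""
    S = double_center(D_sq)
    lam, U = np.linalg.eigh(S)                 # eigenvalues in ascending order
    order = np.argsort(lam)[::-1][:n_dims]     # keep the largest n_dims
    lam, U = lam[order], U[:, order]
    # Clip tiny negative eigenvalues caused by rounding before the square root.
    return U * np.sqrt(np.clip(lam, 0, None))
```

With several distance matrices collected on the same observations, each one would be converted this way, and the resulting cross‐product matrices could then be weighted and combined into a compromise in the manner the abstract describes for STATIS.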

Figure 1.

The different steps of STATIS.

Figure 2.

Inner product map. Projection of the assessors onto the first and second components. Note that all assessors have positive coordinates on the first dimension because all the inner products are positive.

Figure 3.

Compromise of the 10 tables. (a) Factor scores (wines). (b) Assessors' partial factor scores projected into the compromise as supplementary elements. Each assessor is represented by a dot, and for each wine a line connects the wine factor scores to the partial factor scores of a given assessor for this wine. (λ1 = 0.053, τ1 = 63%; λ2 = 0.008, τ2 = 9%).

Figure 4.

Partial factor scores and variable loadings for the first two dimensions of the compromise space. The loadings have been re‐scaled to have a variance equal to the singular values of the compromise analysis.

Figure 5.

Contributions of the tables to the compromise. The sizes of the assessors' icons are proportional to their contributions to Components 1 and 2.

Figure 6.

Partial inertias of the studies. The sizes of the assessors' icons are proportional to their explained inertia for Components 1 and 2.

Figure 7.

Supplementary table: Chemical components of the wines. Supplementary partial scores and loadings (cf. Figure 3a).

Figure 8.

Projection of the supplementary table into the inner product map.

Figure 9.

Bootstrap ratio plot for Components 1 and 2.

Figure 10.

Bootstrap confidence ellipses plotted on Components 1 and 2.

Figure 11.

X‐STATIS or partial triadic analysis. Biplot of the wines and the four common variables.

Figure 12.

Canonical‐STATIS. PCA of the inner product map: the assessors.

Figure 13.

Canonical‐STATIS. Left panel: Compromise factor scores for the three wine regions. Right panel: Compromise factor scores for the three wine regions with bootstrapped 95% confidence intervals.

Figure 14.

ANISOSTATIS. Map of the factor scores for Dimensions 1 and 2 using Criterion 1 of ANISOSTATIS (because the α weights are very similar for all three criteria of ANISOSTATIS, this map also corresponds to the solutions obtained with Criteria 2 and 3).

Figure 15.

(K + 1)‐STATIS. (a) Compromise factor scores. (b) Factor scores for the H matrix: Chemical components of the wines (compare with Figures 3a and 7).
