# STATIS and DISTATIS: optimum multitable principal component analysis and three way metric multidimensional scaling

Advanced Review

Published Online: Jan 05 2012

DOI: 10.1002/wics.198


Abstract

STATIS is an extension of principal component analysis (PCA) tailored to handle multiple data tables that measure sets of variables collected on the same observations or, in a variant called dual‐STATIS, multiple data tables in which the same variables are measured on different sets of observations. STATIS proceeds in two steps. First, it analyzes the similarity structure between the data tables and derives from this analysis an optimal set of weights; these weights are used to compute a linear combination of the data tables, called the compromise, that best represents the information common to the different tables. Second, the PCA of this compromise gives an optimal map of the observations. Each data table also provides a map of the observations that lies in the same space as the optimum compromise map. In this article, we present STATIS, explain the criteria that it optimizes, review the recent inferential extensions to STATIS, and illustrate it with a detailed example.
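The two steps described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the article's reference implementation: the function name `statis`, the column centering, the scaling of each table to unit total sum of squares, and the use of the RV coefficient as the between-table similarity are all choices made here for the example, since the abstract does not specify them.

```python
import numpy as np

def statis(tables, n_components=2):
    """Hypothetical helper: a bare-bones STATIS on K tables sharing the same rows."""
    # Preprocessing (an assumption of this sketch): center each column and
    # scale each table to unit total sum of squares, then form the
    # observations-by-observations cross-product matrix of each table.
    cross_products = []
    for X in tables:
        X = np.asarray(X, dtype=float)
        Xc = X - X.mean(axis=0)
        Xc = Xc / np.linalg.norm(Xc)
        cross_products.append(Xc @ Xc.T)
    S = np.stack(cross_products)                # shape (K, I, I)

    # Step 1: between-table similarity (RV coefficients) and optimal weights.
    flat = S.reshape(len(tables), -1)
    inner = flat @ flat.T                       # pairwise matrix inner products
    norms = np.sqrt(np.diag(inner))
    rv = inner / np.outer(norms, norms)
    _, vecs = np.linalg.eigh(rv)                # eigenvalues in ascending order
    u = vecs[:, -1]                             # first eigenvector of the RV matrix
    u = u * np.sign(u.sum())                    # resolve the sign indeterminacy
    alpha = u / u.sum()                         # weights rescaled to sum to 1
    compromise = np.tensordot(alpha, S, axes=1) # weighted sum of cross-products

    # Step 2: PCA (eigendecomposition) of the compromise.
    lam, V = np.linalg.eigh(compromise)
    order = np.argsort(lam)[::-1][:n_components]
    lam, V = lam[order], V[:, order]
    scores = V * np.sqrt(lam)                   # compromise map of the observations

    # Map of each table in the compromise space (partial factor scores).
    partial = S @ (V / np.sqrt(lam))            # shape (K, I, n_components)
    return alpha, scores, partial
```

With these definitions, the weighted average of the per-table maps reproduces the compromise map, which is one way to see why the table maps and the compromise map live in the same space.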
We also review, and present in a common framework, the main developments of STATIS such as (1) X‐STATIS or partial triadic analysis (PTA), which is used when all data tables collect the same variables measured on the same observations (e.g., at different times or locations), (2) COVSTATIS, which handles multiple covariance matrices collected on the same observations, (3) DISTATIS, which handles multiple distance matrices collected on the same observations and generalizes metric multidimensional scaling to three‐way distance matrices, (4) Canonical‐STATIS (CANOSTATIS), which generalizes discriminant analysis and combines it with DISTATIS to analyze multitable discriminant analysis problems, (5) power‐STATIS, which uses alternative criteria to find STATIS optimal weights, (6) ANISOSTATIS, which extends STATIS to give specific weights to each variable rather than to each whole table, (7) (K + 1)‐STATIS (or external‐STATIS), which extends STATIS (and PLS methods and Tucker inter‐battery analysis) to the analysis of the relationships of several data sets and one external data set, and (8) double‐STATIS (or DO‐ACT), which generalizes (K + 1)‐STATIS and analyzes two sets of data tables, and STATIS‐4, which generalizes double‐STATIS to more than two sets of data. These recent developments are illustrated by small examples.

WIREs Comput Stat 2012, 4:124–167. doi: 10.1002/wics.198

This article is categorized under:

- Statistical and Graphical Methods of Data Analysis > Dimension Reduction
- Statistical Learning and Exploratory Methods of the Data Sciences > Exploratory Data Analysis
- Statistical and Graphical Methods of Data Analysis > Analysis of High Dimensional Data
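As a hedged illustration of how DISTATIS, item (3) among the developments listed in the abstract, relates to metric multidimensional scaling: each distance matrix can be turned into a cross-product matrix by double centering, after which the matrices can be compared and combined in the same way as cross-products derived from ordinary data tables. The sketch below assumes squared Euclidean distances, equal masses for the observations, and an optional rescaling by the first eigenvalue; the function name `distance_to_cross_product` is hypothetical and none of these details are stated in the abstract.

```python
import numpy as np

def distance_to_cross_product(D, normalize=True):
    """Hypothetical helper: double-center one squared-distance matrix."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    # Centering matrix with equal masses (an assumption of this sketch).
    centering = np.eye(n) - np.ones((n, n)) / n
    # Classical metric MDS transformation of squared distances.
    S = -0.5 * centering @ D @ centering
    if normalize:
        # Rescale so that the largest eigenvalue equals 1, which makes
        # matrices with different overall scales comparable.
        S = S / np.linalg.eigvalsh(S)[-1]
    return S
```

For squared Euclidean distances, the unnormalized result equals the cross-product matrix of the column-centered coordinates, so the eigendecomposition of a single such matrix is classical metric multidimensional scaling.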