How to cite this WIREs title:
WIREs Comp Stat

Data analysis on nonstandard spaces

Full article available on Wiley Online Library.


Abstract: Writing on data analysis on nonstandard spaces is a substantial task, with a huge body of literature to cover, from parametric to nonparametric methods, from shape spaces to Wasserstein spaces. In this survey we convey simple ideas (e.g., Fréchet means) and more complicated ones (e.g., empirical process theory) that are common to many approaches, with a focus on their interaction with one another. Indeed, this field is growing fast, and it is imperative to develop a mathematical viewpoint, drawing power and diversity from a higher level of abstraction, for example, by introducing generalized Fréchet means. While many problems have found ingenious solutions (e.g., Procrustes analysis for principal component analysis [PCA] extensions on shape spaces, and diffusion on the frame bundle to mimic anisotropic Gaussians), more problems emerge, often more difficult ones (e.g., topology and geometry influencing limiting rates, and defining generic intrinsic PCA extensions). Throughout this survey, we point out some open problems that will, it seems, keep mathematicians, statisticians, computer scientists, and data scientists busy for a while. This article is categorized under: Statistical and Graphical Methods of Data Analysis > Analysis of High Dimensional Data
Empirical variances of intrinsic sample means, multiplied by sample size, versus sample size. (a) On the circle. Solid blue: a smeary distribution. Solid green and red: distributions with lower probability density at the antipode of the mean; their asymptotic variances are shown dashed. Solid cyan and purple: distributions close to the smeary distribution, but with zero density around the antipode of the mean. Solid black: the height of the Euclidean variance, where all curves start. (b) On the three‐spider, with three points of equal weight. Two lie on different legs at distance 1 from the origin, and one lies at distance 2 + 3t on the third leg, so that their mean lies at t on the third leg. The number labeling each curve denotes t. Solid black: the height of the Euclidean variance, which all curves eventually slightly overshoot.
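
The placement of the mean in panel (b) can be checked numerically. The following is a minimal Python sketch, not taken from the article. It assumes the three‐spider is modeled as three half‐lines glued at the origin, with the geodesic distance between points on different legs equal to the sum of their distances from the origin. The encoding of points as (leg, radius) pairs and all function names are hypothetical choices made for illustration.

```python
def spider_distance(p, q):
    """Geodesic distance between two points given as (leg, radius) pairs."""
    (leg_p, r_p), (leg_q, r_q) = p, q
    # Same leg: move along the leg. Different legs: pass through the origin.
    return abs(r_p - r_q) if leg_p == leg_q else r_p + r_q


def frechet_function(x, sample):
    """Mean squared distance from x to the (equally weighted) sample points."""
    return sum(spider_distance(x, p) ** 2 for p in sample) / len(sample)


def frechet_mean(sample, n_legs=3):
    """Minimizer of the Frechet function on a spider with n_legs legs.

    Restricted to one leg, the Frechet function is a quadratic in the radius
    whose unconstrained minimizer is the average of the signed radii
    (positive on that leg, negative on all others); it is clamped at 0.
    """
    n = len(sample)
    candidates = []
    for leg in range(n_legs):
        signed_sum = sum(r if l == leg else -r for l, r in sample)
        candidates.append((leg, max(0.0, signed_sum / n)))
    # Pick the best per-leg candidate. If the mean is the origin, every
    # candidate has radius 0 and the returned leg label is arbitrary.
    return min(candidates, key=lambda x: frechet_function(x, sample))


def caption_sample(t):
    """The three points described in panel (b) for a given t >= 0."""
    return [(0, 1.0), (1, 1.0), (2, 2.0 + 3.0 * t)]


if __name__ == "__main__":
    for t in (0.0, 0.1, 0.5, 1.0):
        print(t, frechet_mean(caption_sample(t)))
```

Under these assumptions, the signed average on the third leg is (2 + 3t - 1 - 1) / 3 = t, which agrees with the caption; for t = 0 the mean sits at the origin, where the legs meet.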

