How to cite this WIREs title:
WIREs Comp Stat

Aggregated inference


Aggregated inference on distributed data is becoming increasingly important as the volume of data collected across industries grows. When data cannot be gathered at a central location, modeling and inference must be carried out on the distributed data, and aggregated statistical inference is a major tool for doing so. In the literature, problems in the regression setting (more generally, M‐estimation) have been studied extensively. There are at least two popular techniques for distributed estimation: (a) averaging the estimators computed at the local sites and (b) the one‐step approach, which combines the simple averaging estimator with a single step of the classical Newton's method (using the local Hessian matrices) to produce a “one‐step” estimator. It has been proved that, under certain assumptions, these estimators enjoy the same asymptotic properties as the centralized estimator, that is, the estimator that would be obtained if all the data were available at a central location. We review these two estimation approaches. In Big‐Data problems, dividing the data among multiple machines and then using aggregation to solve the estimation problem in parallel can speed up computation with little loss in the quality of the estimators. We discuss potential extensions to other models, such as support vector machines and principal component analysis. Numerical examples are omitted because of space limitations; they can be found readily in the literature.

This article is categorized under:
Statistical Learning and Exploratory Methods of the Data Sciences > Knowledge Discovery
Statistical Learning and Exploratory Methods of the Data Sciences > Modeling Methods
Statistical Models > Fitting Models
Statistical and Graphical Methods of Data Analysis > Modeling Methods and Algorithms
Diagram of aggregated statistical estimation: the first step obtains the local parameter estimates from the local data, and the second step computes the centralized estimate from the local estimates
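
The two estimation techniques described in the abstract can be illustrated with a minimal sketch. It is not taken from the article: it assumes an ordinary least-squares loss (a simple M‐estimator), simulated data, equal-sized shards standing in for the local machines, and hypothetical function names (`local_ols`, `averaging_estimator`, `one_step_estimator`). The one‐step update here averages the local gradients and local Hessians evaluated at the averaging estimator, which is one common way to form the Newton step.

```python
import numpy as np

def local_ols(X, y):
    """Least-squares estimate computed from one machine's shard."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def averaging_estimator(shards):
    """(a) Simple average of the local estimates."""
    return np.mean([local_ols(X, y) for X, y in shards], axis=0)

def one_step_estimator(shards, theta_bar):
    """(b) One Newton step from theta_bar, using averaged local
    gradients and Hessians of the loss ||y - X theta||^2 / (2 n)."""
    grads, hessians = [], []
    for X, y in shards:
        n = X.shape[0]
        grads.append(-X.T @ (y - X @ theta_bar) / n)  # local gradient
        hessians.append(X.T @ X / n)                  # local Hessian
    grad = np.mean(grads, axis=0)
    hess = np.mean(hessians, axis=0)
    return theta_bar - np.linalg.solve(hess, grad)

# Simulated data (an assumption for illustration only).
rng = np.random.default_rng(0)
N, d, k = 6000, 3, 10                       # samples, dimension, machines
theta_true = np.array([1.0, -2.0, 0.5])
X = rng.normal(size=(N, d))
y = X @ theta_true + rng.normal(size=N)

# Split the data into k equal-sized shards, one per "machine".
shards = list(zip(np.array_split(X, k), np.array_split(y, k)))

theta_avg = averaging_estimator(shards)             # step (a)
theta_one = one_step_estimator(shards, theta_avg)   # step (b)
theta_central = local_ols(X, y)                     # all data in one place

print("averaging  :", theta_avg)
print("one-step   :", theta_one)
print("centralized:", theta_central)
```

Because the least-squares loss is quadratic and the shards are of equal size, the single Newton step in this sketch reproduces the centralized estimate up to floating-point error. For a general M‐estimator the agreement would be only asymptotic, under the kind of assumptions the abstract refers to.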

