Arjevani, Y., & Shamir, O. (2015). Communication complexity of distributed convex learning and optimization. In C. Cortes, N. D. Lawrence, D. D. Lee, M. Sugiyama, & R. Garnett (Eds.), Advances in Neural Information Processing Systems 28 (pp. 1756–1764). Red Hook, NY: Curran Associates, Inc.
Balcan, M.‐F., Kanchanapally, V., Liang, Y., & Woodruff, D. (2014). Improved distributed principal component analysis. arXiv preprint arXiv:1408.5823.
Battey, H., Fan, J., Liu, H., Lu, J., & Zhu, Z. (2015). Distributed estimation and inference with statistical guarantees. arXiv preprint arXiv:1509.05457.
Bickel, P. J. (1975). One‐step Huber estimates in the linear model. Journal of the American Statistical Association, 70(350), 428–434.
Boyd, S., Parikh, N., Chu, E., Peleato, B., & Eckstein, J. (2011). Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends in Machine Learning, 3(1), 1–122.
Chen, X., & Xie, M.‐g. (2014). A split‐and‐conquer approach for analysis of extraordinarily large data. Statistica Sinica, 24(4), 1655–1684.
Corbett, J. C., Dean, J., Epstein, M., Fikes, A., Frost, C., Furman, J. J., … others. (2013). Spanner: Google's globally distributed database. ACM Transactions on Computer Systems (TOCS), 31(3), 8.
El Gamal, M., & Lai, L. (2015). Are Slepian–Wolf rates necessary for distributed parameter estimation? In Communication, Control, and Computing (Allerton), 2015 53rd Annual Allerton Conference (pp. 1249–1255).
Fan, J., & Chen, J. (1999). One‐step local quasi‐likelihood estimation. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 61(4), 927–943.
Fan, J., Feng, Y., & Song, R. (2011). Nonparametric independence screening in sparse ultra‐high‐dimensional additive models. Journal of the American Statistical Association, 106(494), 544–557.
Fan, J., Han, F., & Liu, H. (2014). Challenges of big data analysis. National Science Review, 1, 293–314.
Fan, J., & Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96(456), 1348–1360.
Forero, P. A., Cano, A., & Giannakis, G. B. (2010). Consensus‐based distributed support vector machines. The Journal of Machine Learning Research, 11, 1663–1707.
Gabay, D., & Mercier, B. (1976). A dual algorithm for the solution of nonlinear variational problems via finite element approximation. Computers & Mathematics with Applications, 2(1), 17–40.
Huang, C., & Huo, X. (2015). A distributed one‐step estimator. arXiv preprint arXiv:1511.01443.
Jaggi, M., Smith, V., Takác, M., Terhorst, J., Krishnan, S., Hofmann, T., & Jordan, M. I. (2014). Communication‐efficient distributed dual coordinate ascent. In Advances in Neural Information Processing Systems (pp. 3068–3076).
Javanmard, A., & Montanari, A. (2014). Confidence intervals and hypothesis testing for high‐dimensional regression. The Journal of Machine Learning Research, 15(1), 2869–2909.
Jordan, M. I., Lee, J. D., & Yang, Y. (2018). Communication‐efficient distributed statistical inference. Journal of the American Statistical Association. https://doi.org/10.1080/01621459.2018.1429274
Lee, J. D., Sun, Y., Liu, Q., & Taylor, J. E. (2015). Communication‐efficient sparse regression: a one‐shot approach. arXiv preprint arXiv:1503.04337.
Liu, Q., & Ihler, A. T. (2014). Distributed estimation, information loss and exponential families. In Advances in Neural Information Processing Systems (pp. 1098–1106).
Mitra, S., Agrawal, M., Yadav, A., Carlsson, N., Eager, D., & Mahanti, A. (2011). Characterizing web‐based video sharing workloads. ACM Transactions on the Web (TWEB), 5(2), 8.
Nowak, R. D. (2003). Distributed EM algorithms for density estimation and clustering in sensor networks. IEEE Transactions on Signal Processing, 51(8), 2245–2253.
Ravikumar, P., Lafferty, J., Liu, H., & Wasserman, L. (2009). Sparse additive models. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 71(5), 1009–1030.
Rosenblatt, J. D., & Nadler, B. (2016). On the optimality of averaging in distributed statistical learning. Information and Inference: A Journal of the IMA, 5(4), 379–404.
Shamir, O., Srebro, N., & Zhang, T. (2014). Communication‐efficient distributed optimization using an approximate Newton‐type method. In International Conference on Machine Learning (pp. 1000–1008).
Song, Q., & Liang, F. (2015). A split‐and‐merge Bayesian variable selection approach for ultrahigh dimensional regression. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 77(5), 947–972.
Städler, N., Bühlmann, P., & Van de Geer, S. (2010). ℓ1‐penalization for mixture regression models. TEST, 19(2), 209–256.
Tibshirani, R. (1996). Regression shrinkage and selection via the LASSO. Journal of the Royal Statistical Society. Series B (Methodological), 58(1), 267–288.
Van de Geer, S., Bühlmann, P., Ritov, Y., & Dezeure, R. (2014). On asymptotically optimal confidence regions and tests for high‐dimensional models. The Annals of Statistics, 42(3), 1166–1202.
Van der Vaart, A. W. (2000). Asymptotic statistics (Vol. 3). Cambridge, UK: Cambridge University Press.
Walker, E., Hernandez, A. V., & Kattan, M. W. (2008). Meta‐analysis: Its strengths and limitations. Cleveland Clinic Journal of Medicine, 75(6), 431–439.
Wang, X., Peng, P., & Dunson, D. B. (2014). Median selection subset aggregation for parallel inference. In Advances in Neural Information Processing Systems (pp. 2195–2203).
Yang, Y., & Barron, A. (1999). Information‐theoretic determination of minimax rates of convergence. Annals of Statistics, 27(5), 1564–1599.
Zhang, C.‐H. (2010). Nearly unbiased variable selection under minimax concave penalty. The Annals of Statistics, 38(2), 894–942.
Zhang, T. (2004). Statistical behavior and consistency of classification methods based on convex risk minimization. The Annals of Statistics, 32(1), 56–85.
Zhang, Y., Duchi, J., Jordan, M. I., & Wainwright, M. J. (2013). Information‐theoretic lower bounds for distributed statistical estimation with communication constraints. In Advances in Neural Information Processing Systems (pp. 2328–2336).
Zhang, Y., Wainwright, M. J., & Duchi, J. C. (2012). Communication‐efficient algorithms for statistical optimization. In Advances in Neural Information Processing Systems (pp. 1502–1510).
Zhao, T., Cheng, G., & Liu, H. (2016). A partially linear framework for massive heterogeneous data. Annals of Statistics, 44(4), 1400–1437.
Zinkevich, M., Weimer, M., Li, L., & Smola, A. J. (2010). Parallelized stochastic gradient descent. In Advances in Neural Information Processing Systems (pp. 2595–2603).
Zou, H., & Li, R. (2008). One‐step sparse estimates in nonconcave penalized likelihood models. Annals of Statistics, 36(4), 1509–1533.