Ackermann,, M. R., Märtens,, M., Raupach,, C., Swierkot,, K., Lammersen,, C., & Sohler,, C. (2012). Streamkm++: A clustering algorithm for data streams. Journal of Experimental Algorithmics, 17, 2.4:2.1–2.4:2.30. https://doi.org/10.1145/2133803.2184450
Agarwal,, P., Har‐Peled,, S., & Varadarajan,, K. (2004). Approximating extent measures of points. Journal of the ACM, 51(4), 606–635.
Agarwal,, P., Har‐Peled,, S., & Varadarajan,, K. (2005). Geometric approximation via coresets. Combinatorial and Computational Geometry, 52, 1–30.
Agarwal,, P. K., & Har‐Peled,, S. (2001). Maintaining the approximate extent measures of moving points. In Proceedings of the 12th SODA (pp. 148–157).
Agarwal,, P. K., Jones,, M., Murali,, T. M., & Procopiuc,, C. M. (2002). A Monte Carlo algorithm for fast projective clustering. In Proceedings of ACM‐SIGMOD International Conference on management of data (pp. 418–427).
Agarwal,, P. K., & Mustafa,, N. H. (2004). k‐means projective clustering. In Proceedings of 23rd ACM SIGACT‐SIGMOD‐SIGART symposium on principles of database systems (PODS) (pp. 155–165).
Agarwal,, P. K., & Procopiuc,, C. M. (2000). Approximation algorithms for projective clustering. In Proceedings of 11th annu. ACM‐SIAM symposium on discrete algorithms (SODA) (pp. 538–547).
Agarwal,, P. K., & Procopiuc,, C. M. (2003). Approximation algorithms for projective clustering. Journal of Algorithms, 46(2), 115–139.
Agarwal,, P. K., Procopiuc,, C. M., & Varadarajan,, K. R. (2002). Approximation algorithms for k‐line center. In European symposium on algorithms (pp. 54–63).
Aggarwal,, A., Deshpande,, A., & Kannan,, R. (2009). Adaptive sampling for k‐means clustering. In Proceedings of the 25th approx (pp. 15–28).
Akavia,, A., Feldman,, D., & Shaul,, H. (2018). Secure search on encrypted data via multi‐ring sketch. In Proceedings of the ACM SIGSAC conference on computer and communications security, CCS 2018 (pp. 985–1001). Retrieved from https://arxiv.org/abs/1708.05811
Akavia,, A., Feldman,, D., & Shaul,, H. (2019). Secure data retrieval on the cloud: Homomorphic encryption meets coresets.
Anthony,, M., & Bartlett,, P. L. (1999). Neural network learning: Theoretical foundations. Cambridge, England: Cambridge University Press.
Assadi,, S., & Khanna,, S. (2017). Randomized composable coresets for matching and vertex cover. CoRR, abs/1705.08242. Retrieved from http://arxiv.org/abs/1705.08242
Bachem,, O., Lucic,, M., & Krause,, A. (2015). Coresets for nonparametric estimation–the case of dp‐means. In International conference on machine learning (icml).
Bădoiu,, M., & Clarkson,, K. L. (2003). Smaller core‐sets for balls. In Proceedings of the 14th SODA (pp. 801–802).
Bădoiu,, M., & Clarkson,, K. L. (2008). Optimal core‐sets for balls. Computational Geometry, 40(1), 14–22.
Bambauer,, J., Muralidhar,, K., & Sarathy,, R. (2013). Fool`s gold: An illustrated critique of differential privacy. Vanderbilt Journal of Entertainment %26 Technology Law, 16, 701.
Barger,, A., & Feldman,, D. (2016). k‐means for streaming and distributed big sparse data. In Proceedings of the 2016 SIAM International conference on data mining (sdm`16).
Batson,, J., Spielman,, D. A., & Srivastava,, N. (2014). Twice‐ramanujan sparsifiers. SIAM Review, 56(2), 315–334.
Baykal,, C., Liebenwein,, L., Gilitschenski,, I., Feldman,, D., & Rus,, D. (2019). Data‐dependent coresets for compressing neural networks with applications to generalization bounds. In International conference on learning representations (ICLR).
Bentley,, J. L., & Saxe,, J. B. (1980). Decomposable searching problems i. static‐to‐dynamic transformation. Journal of Algorithms, 1(4), 301–358.
Blumer,, A., Ehrenfeucht,, A., Haussler,, D., & Warmuth,, M. K. (1989). Learnability and the vapnik‐chervonenkis dimension. Journal of the ACM, 36(4), 929–965.
Boutsidis,, C., Zouzias,, A., Mahoney,, M. W., & Drineas,, P. (2015). Randomized dimensionality reduction for k‐means clustering. IEEE Transactions on Information Theory, 61(2), 1045–1062.
Braverman,, V., Feldman,, D., & Lang,, H. (2016). New frameworks for offline and streaming coreset constructions. arXiv Preprint arXiv:1612.00889.
Campbell,, T. (2018). Baysien coresets (Tech. Rep.). MIT. https://github.com/trevorcampbell/bayesian-coresets
Charikar,, M., & Guha,, S. (2005). Improved combinatorial algorithms for facility location problems. SIAM Journal on Computing, 34(4), 803–824.
Charikar,, M., Guha,, S., Tardos,, É., & Shmoys,, D. B. (2002). A constant‐factor approximation algorithm for the k‐median problem. Journal of Computer and System Sciences, 65(1), 129–149.
Chazelle,, B., Edelsbrunner,, H., Grigni,, M., Guibas,, L., Sharir,, M., & Welzl,, E. (1993). Improved bounds on weak &egr;‐nets for convex sets. In Proceedings of the twenty‐fifth annual ACM symposium on theory of computing (STOC) (pp. 495–504). ACM. doi: https://doi.org/10.1145/167088.167222
Chen,, K. (2009). On coresets for k‐median and k‐means clustering in metric and euclidean spaces and their applications. SIAM Journal on Computing, 39(3), 923–947.
Choi,, S., Kim,, T., & Yu,, W. (1997). Performance evaluation of ransac family. Journal of Computer Vision, 24(3), 271–300.
Clarkson,, K. L. (2010). Coresets, sparse greedy approximation, and the frank‐wolfe algorithm. ACM Transactions on Algorithms (TALG), 6(4), 63.
Clarkson,, K. L., & Woodruff,, D. P. (2009). Numerical linear algebra in the streaming model. In Proceedings of the 41st stoc (pp. 205–214).
Cohen,, M. B., Elder,, S., Musco,, C., Musco,, C., & Persu,, M. (2015). Dimensionality reduction for k‐means clustering and low rank approximation. In Proceedings of the forty‐seventh annual ACM on symposium on theory of computing, STOC 2015 (pp. 163–172).
Cohen,, M. B., Lee,, Y. T., Musco,, C., Musco,, C., Peng,, R., & Sidford,, A. (2015). Uniform sampling for matrix approximation. In Proceedings of the 2015 conference on innovations in theoretical computer science (pp. 181–190). New York, NY: ACM. http://doi.acm.org/10.1145/2688073.2688113
Dasgupta,, A., Drineas,, P., Harb,, B., Kumar,, R., & Mahoney,, M. W. (2008). Sampling algorithms and coresets for ℓp‐regression. In Proceedings of 19th annual ACM‐SIAM symposium on discrete algorithms (SODA) (pp. 932–941). Retrieved from http://doi.acm.org/10.1145/1347082.1347184
Dasgupta,, S., & Schulman,, L. J. (2000). A two‐round variant of em for gaussian mixtures. In Proceedings of the sixteenth conference on uncertainty in artificial intelligence (pp. 152–159).
Deshpande,, A., Rademacher,, L., Vempala,, S., & Wang,, G. (2006). Matrix approximation and projective clustering via volume sampling. In Proceedings of 17th annual ACM‐SIAM symp. on discrete algorithms (SODA) (pp. 1117–1126).
Drineas,, P., Mahoney,, M. W., & Muthukrishnan,, S. (2006). Sampling algorithms for l2 regression and applications. In Soda.
Edwards,, M., & Varadarajan,, K. (2005). No coreset, no cry: Ii. In International conference on foundations of software technology and theoretical computer science (pp. 107–115).
Effros,, M., & Schulman,, L. J. (2004). Rapid near‐optimal {VQ} design with a deterministic data net. In Proceedings of the 2004 International Symposium on Information Theory (p. 298). Chicago, Illinois: IEEE. https://doi.org/10.1109/ISIT.2004.1365336
Epstein,, D., & Feldman,, D. (2018). Quadcopter tracks quadcopter via real‐time shape fitting. IEEE Robotics and Automation Letters, 3(1), 544–550.
Feigin,, M., Feldman,, D., & Sochen,, N. (2011). From high definition image to low space optimization. In Proceedings of 3rd interternational conference on scale space and variational methods in computer vision (SSVM 2011).
Feldman,, D. (2018). Code for recent coresets papers (Tech. Rep.). Robotics & Big Data Lab, University of Haifa. https://sites.hevra.haifa.ac.il/rbd/
Feldman,, D., Fiat,, A., Kaplan,, H., & Nissim,, K. (2009). Private coresets. In Proceedings of the forty‐first annual ACM symposium on theory of computing (pp. 361–370).
Feldman,, D., Fiat,, A., Segev,, D., & Sharir,, M. (2007). Bi‐criteria linear‐time approximations for generalized k‐mean/median/center. In Proceedings of 23rd ACM symposium on computational geometry (SOCG) (pp. 19–26).
Feldman,, D., Fiat,, A., & Sharir,, M. (2006). Coresets for weighted facilities and their applications. In Focs (pp. 315–324).
Feldman,, D., Krause,, A., & Faulkner,, M. (2011). Scalable training of mixture models via coresets. In Proceedings of 25th conference on neural information processing systems (nips).
Feldman,, D., & Langberg,, M. (2011). A unified framework for approximating and clustering data. In Proceedings of 34th annual ACM symposium on theory of computing (STOC). http://arxiv.org/abs/1106.1379
Feldman,, D., Monemizadeh,, M., & Sohler,, C. (2007). A ptas for k‐means clustering based on weak coresets. In Proceedings of the 23rd ACM symposium on Computational Geometry (SocG) (pp. 11–18).
Feldman,, D., Monemizadeh,, M., Sohler,, C., & Woodruff,, D. P. (2010). Coresets and sketches for high dimensional subspace approximation problems. In Proceedings of the twenty‐first annual ACM‐SIAM symposium on discrete algorithms (pp. 630–649).
Feldman,, D., Ozer,, S., & Rus,, D. (2017). Coresets for vector summarization with applications to network graphs. In Proceedings of the 34th international conference on machine learning, ICML 2017, Sydney, NSW, Australia, 6‐11 august 2017 (pp. 1117–1125). Retrieved from http://proceedings.mlr.press/v70/feldman17a.html
Feldman,, D., Schmidt,, M., & Sohler,, C. (2013). Turning big data into tiny data: Constant‐size coresets for k‐means, pca and projective clustering. In Proceedings of the twenty‐fourth annual ACM‐SIAM symposium on discrete algorithms (pp. 1434–1453).
Feldman,, D., & Schulman,, L. J. (2012). Data reduction for weighted and outlier‐resistant clustering. In Proceedings of the twenty‐third annual ACM‐SIAM symposium on discrete algorithms (pp. 1343–1354).
Feldman,, D., Sugaya,, A., & Rus,, D. (2012). An effective coreset compression algorithm for large scale sensor networks. In Information processing in sensor networks (ipsn), 2012 ACM/IEEE 11th international conference on (pp. 257–268).
Feldman,, D., Sugaya,, A., Sung,, C., & Rus,, D. (2013). idiary: from gps signals to a text‐searchable diary. In Proceedings of the 11th ACM conference on embedded networked sensor systems (p. 6).
Feldman,, D., Sung,, C., & Rus,, D. (2012). The single pixel gps: learning big data signals from tiny coresets. In Proceedings of the 20th international conference on advances in geographic information systems (pp. 23–32).
Feldman,, D., & Tassa,, T. (2015). More constraints, smaller coresets: constrained matrix approximation of sparse big data. In Proceedings of the 21th ACM sigkdd international conference on knowledge discovery and data mining (kdd`15) (pp. 249–258).
Feldman,, D., Volkov,, M., & Rus,, D. (2016). Dimensionality reduction of massive sparse datasets using coresets. In Advances in neural information processing systems (nips).
Feldman,, D., Xiang,, C., Zhu,, R., & Rus,, D. (2017). Coresets for differentially private k‐means clustering and applications to privacy in mobile sensor networks. In Information processing in sensor networks (ipsn), 2017 16th ACM/IEEE international conference on (pp. 3–16).
Foster,, I. (1995). Designing and building parallel programs: concepts and tools for parallel software engineering. Addison‐Wesley Longman Publishing Co., Inc.
Funke,, S., & Laue,, S. (2007). Bounded‐hop energy‐efficient broadcast in low‐dimensional metrics via coresets. In Annual symposium on theoretical aspects of computer science (pp. 272–283).
Har‐Peled,, S. (2004). No coreset, no cry. In Proceedings of the 24th IARCS annual conference on foundations of software technology and theoretical computer science (FSTTCS) (pp. 324–335).
Har‐Peled,, S. (2006). Coresets for discrete integration and clustering. In 26th fsttcs (pp. 33–44).
Har‐Peled,, S., & Kushal,, A. (2005). Smaller coresets for k‐median and k‐means clustering. In Proceedings of the 25th SODA (pp. 126–134).
Har‐Peled,, S., & Kushal,, A. (2007). Smaller coresets for k‐median and k‐means clustering. Discrete %26 Computational Geometry, 37(1), 3–19. https://doi.org/10.1007/s00454-006-1271-x
Har‐Peled,, S., & Mazumdar,, S. (2004). Coresets for k‐means and k‐median clustering and their applications. In Proceedings of the 36th ACM symposium on the theory of computing (STOC) (pp. 291–300).
Har‐Peled,, S., & Varadarajan,, K. R. (2002). Projective clustering in high dimensions using coresets. In Proceedings of 18th ACM symposium on computational geometry (socg) (pp. 312–318).
Haussler,, D. (1990). Decision theoretic generalizations of the pac learning model. In Proceedings of the 1st International Workshop on Algorithmic Learning Theory (ALT) (pp. 21–41).
Haussler,, D., & Welzl,, E. (1986). Epsilon‐nets and simplex range queries. In Annual ACM symposium on computational geometry (socg).
Hellerstein,, J. (2008). Parallel programming in the age of big data. Gigaom Blog November, 9, 2008.
Hoeffding,, W. (1963). Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association, 58(301), 13–30.
IBM: What is big data? bringing big data to the enterprise. (2012). Website. Retrieved from ibm.com/software/data/bigdata/
Inaba,, M., Katoh,, N., & Imai,, H. (1994). Applications of weighted voronoi diagrams and randomization to variance‐based k‐clustering. In Symposium on computational geometry (pp. 332–339).
Indyk,, P., Mahabadi,, S., Mahdian,, M., & Mirrokni,, V. S. (2014). Composable core‐sets for diversity and coverage maximization. In Proceedings of the 33rd ACM sigmod‐sigact‐sigart symposium on principles of database systems (pp. 100–108).
Joshi,, S., Kommaraji,, R. V., Phillips,, J. M., & Venkatasubramanian,, S. (2011). Comparing distributions and shapes using the kernel distance. In Proceedings of the twenty‐seventh annual symposium on computational geometry (pp. 47–56).
Langberg,, M., & Schulman,, L. J. (2010). Universal epsilon‐approximators for integrals. In Proceedings of the 21st ACM‐SIAM symposium on discrete algorithms (SODA) (pp. 598–607).
Li,, Y., Long,, P. M., & Srinivasan,, A. (2001). Improved bounds on the sample complexity of learning. Journal of Computer and System Sciences (JCSS), 62, 516–527.
Liberty,, E. (2017). Sketches for frequent directions (Tech. Rep.) Amazon LTD. Retrieved from https://edoliberty.github.io/
Löffler,, M., & Phillips,, J. M. (2009). Shape fitting on point sets with probability distributions. In European symposium on algorithms (pp. 313–324).
Maalouf,, A., Jubran,, I., & Feldman,, D. (2019). Fast and accurate least‐mean‐squares solvers. arXiv Preprint 1906.04705.
Mahajan,, M., Nimbhorkar,, P., & Varadarajan,, K. (2009). The planar k‐means problem is np‐hard. In International workshop on algorithms and computation (pp. 274–285).
Mahoney,, M. W. (2011). Randomized algorithms for matrices and data. Foundations and Trends in Machine Learning, 3(2), 123–224.
Matouaek,, J. (2003). New constructions of weak epsilon‐nets. In Proceedings of the nineteenth annual symposium on computational geometry (pp. 129–135).
Matousek,, J. (1995). Approximations and optimal geometric divide‐an‐conquer. Journal of Computer and System Sciences, 50(2), 203–208.
McLachlan,, G., & Krishnan,, T. (2007). The em algorithm and extensions (Vol. 382). John Wiley & Sons.
Munteanu,, A., & Schwiegelshohn,, C. (2018). Coresets‐methods and history: A theoreticians design pattern for approximation and streaming algorithms. KI‐Künstliche Intelligenz, 32(1), 37–53.
Muthukrishnan,, S. (2005). Data streams: Algorithms and applications. Foundations and Trends® in Theoretical Computer Science, 1(2), 117–236.
Nasser,, S., Jubran,, I., & Feldman,, D. (2015). Low‐cost and faster tracking systems using core‐sets for pose‐estimation. arXiv preprint arXiv:1511.09120.
Ostrovsky,, R., Rabani,, Y., Schulman,, L. J., & Swamy,, C. (2006). The effectiveness of lloyd‐type methods for the k‐means problem. In Foundations of computer science, 2006. focs`06. 47th annual IEEE symposium on (pp. 165–176).
Paul,, R., Feldman,, D., Rus,, D., & Newman,, P. (2014). Visual precis generation using coresets. In Robotics and automation (icra), 2014 IEEE international conference on (pp. 1304–1311).
Phillips,, J. M. (2016). Coresets and sketches, near‐final version of chapter 49 in handbook on discrete and computational geometry. 3rd ed. CoRR, abs/1601.00617. Retrieved from http://arxiv.org/abs/1601.00617
Rosman,, G., Volkov,, M., Feldman,, D., Fisher, III, J. W., & Rus,, D. (2014). Coresets for k‐segmentation of streaming data. In Advances in neural information processing systems (nips) (pp. 559–567).
Sanabria,, J. M. (2018). Randomized matrix algorithms (Tech. Rep.). TravelPerk. Retrieved from https://github.com/jomsdev/randNLA
Segaran,, T., & Hammerbacher,, J. (2009). Beautiful data: The stories behind elegant data solutions. O`Reilly Media, Inc.
Sener,, O., & Savarese,, S. (2017). Active learning for convolutional neural networks: A core‐set approach. Stat, 1050, 27.
Shyamalkumar,, N., & Varadarajan,, K. (2007). Efficient subspace approximation algorithms. In Proceedings of the 18th ACM‐SIAM symposium on discrete algorithms (SODA) (pp. 532–540).
Tolochinsky,, E., & Feldman,, D. (2018). Coresets for monotonic functions with applications to deep learning. arXiv preprint arXiv:1802.07382.
Tutz,, G., & Binder,, H. (2007). Boosting ridge regression. Computational Statistics %26 Data Analysis, 51(12), 6044–6059.
Varadarajan,, K., & Xiao,, X. (2012a). A near‐linear algorithm for projective clustering integer points. In Proceedings of the ACM‐SIAM symposium on discrete algorithms (SODA).
Varadarajan,, K., & Xiao,, X. (2012b). On the sensitivity of shape fitting problems. In Proceedings of the 32nd annual conference on IARCS annual conference on foundations of software technology and theoretical computer science (FSTTCS) (pp. 486–497).
Yahoo! (2018). Data sketches (Tech. Rep.). Author. Retrieved from https://datasketches.github.io/