Donoho, DL. High‐dimensional data analysis: the curses and blessings of dimensionality. AMS Math Challenges Lect 2000, 1–32.
Fan, J, Li, R. Statistical challenges with high dimensionality: feature selection in knowledge discovery. In: Proceedings of the International Congress of Mathematicians, vol. 3, 2006, 595–622.
Fan, J, Lv, J. A selective overview of variable selection in high dimensional feature space (invited review article). Stat Sin 2010, 20:101–148.
Faraway, JJ. Linear Models with R. Boca Raton: CRC Press; 2004.
Ahn, J, Marron, J, Muller, KM, Chi, Y‐Y. The high‐dimension, low‐sample‐size geometric representation holds under mild conditions. Biometrika 2007, 94:760–766.
Hall, P, Marron, J, Neeman, A. Geometric representation of high dimension, low sample size data. J R Stat Soc [Ser B] 2005, 67:427–444.
Bickel, PJ, Li, B, Tsybakov, AB, van de Geer, SA, Yu, B, Valdés, T, Rivero, C, Fan, J, van der Vaart, A. Regularization in statistics. Test 2006, 15:271–344.
Hoerl, AE, Kennard, RW. Ridge regression: biased estimation for nonorthogonal problems. Technometrics 1970, 12:55–67.
Tibshirani, R. Regression shrinkage and selection via the lasso. J R Stat Soc [Ser B] 1996, 58:267–288.
Candes, E, Tao, T. The Dantzig selector: statistical estimation when p is much larger than n. Ann Stat 2007, 35:2313–2351.
Frank, IE, Friedman, JH. A statistical view of some chemometrics regression tools. Technometrics 1993, 35:109–135.
Knight, K, Fu, W. Asymptotics for lasso‐type estimators. Ann Stat 2000, 28:1356–1378.
Donoho, DL, Johnstone, IM. Ideal spatial adaptation by wavelet shrinkage. Biometrika 1994, 81:425–455.
Donoho, DL, Johnstone, IM, Kerkyacharian, G, Picard, D. Wavelet shrinkage: asymptopia? J R Stat Soc [Ser B] 1995, 57:301–369.
Chen, SS, Donoho, DL, Saunders, MA. Atomic decomposition by basis pursuit. SIAM J Sci Comput 1998, 20:33–61.
Zhao, P, Yu, B. On model selection consistency of lasso. J Mach Learn Res 2006, 7:2541–2563.
Candes, EJ, Wakin, MB, Boyd, SP. Enhancing sparsity by reweighted ℓ1 minimization. J Fourier Anal Appl 2008, 14:877–905.
Zhang, C‐H, Huang, J. The sparsity and bias of the lasso selection in high‐dimensional linear regression. Ann Stat 2008, 36:1567–1594.
Bickel, PJ, Ritov, Y, Tsybakov, AB. Simultaneous analysis of lasso and Dantzig selector. Ann Stat 2009, 37:1705–1732.
Wainwright, MJ. Sharp thresholds for high‐dimensional and noisy sparsity recovery using ℓ1‐constrained quadratic programming (Lasso). IEEE Trans Inf Theory 2009, 55:2183–2202.
Fan, J, Li, R. Variable selection via nonconcave penalized likelihood and its oracle properties. J Am Stat Assoc 2001, 96:1348–1360.
Fan, J, Peng, H. Nonconcave penalized likelihood with a diverging number of parameters. Ann Stat 2004, 32:928–961.
Zou, H. The adaptive lasso and its oracle properties. J Am Stat Assoc 2006, 101:1418–1429.
Zhang, C‐H. Nearly unbiased variable selection under minimax concave penalty. Ann Stat 2010, 38:894–942.
Efron, B, Hastie, T, Johnstone, IM, Tibshirani, R. Least angle regression. Ann Stat 2004, 32:407–499.
Rosset, S, Zhu, J. Piecewise linear regularized solution paths. Ann Stat 2007, 35:1012–1030.
Fu, WJ. Penalized regressions: the bridge versus the lasso. J Comput Graph Stat 1998, 7:397–416.
Wu, TT, Lange, K. Coordinate descent algorithms for lasso penalized regression. Ann Appl Stat 2008, 2:224–244.
Fan, J, Lv, J. Nonconcave penalized likelihood with NP‐dimensionality. IEEE Trans Inf Theory 2011, 57:5467–5484.
Mazumder, R, Friedman, JH, Hastie, T. SparseNet: coordinate descent with nonconvex penalties. J Am Stat Assoc 2011, 106:1125–1138.
Zou, H, Li, R. One‐step sparse estimates in nonconcave penalized likelihood models. Ann Stat 2008, 36:1509–1533.
Fan, J, Xue, L, Zou, H. Strong oracle optimality of folded concave penalized estimation. arXiv preprint arXiv:1210.5992, 2012.
Zhang, C‐H, Zhang, T. A general theory of concave regularization for high‐dimensional sparse estimation problems. Stat Sci 2012, 27:576–593.
Wang, L, Kim, Y, Li, R. Calibrating nonconvex penalized regression in ultra‐high dimension. Ann Stat 2013, 41:2505–2536.
Wang, Z, Liu, H, Zhang, T. Optimal computational and statistical rates of convergence for sparse nonconvex learning problems. arXiv preprint arXiv:1306.4960, 2013.
Fan, J, Lv, J. Sure independence screening for ultrahigh dimensional feature space. J R Stat Soc [Ser B] 2008, 70:849–911.
Wang, H. Forward regression for ultra‐high dimensional variable screening. J Am Stat Assoc 2009, 104:1512–1524.
Huang, J, Horowitz, JL, Ma, S. Asymptotic properties of bridge estimators in sparse high‐dimensional regression models. Ann Stat 2008, 36:587–613.
Weng, H, Feng, Y, Qiao, X. Regularization after retention in ultrahigh dimensional linear regression models. arXiv preprint arXiv:1311.5625, 2013.
Vapnik, V. The Nature of Statistical Learning Theory. New York: Springer; 1999.
Vapnik, V. Statistical Learning Theory. New York: John Wiley & Sons; 1998.
Cortes, C, Vapnik, V. Support‐vector networks. Mach Learn 1995, 20:273–297.
Cristianini, N, Shawe‐Taylor, J. An Introduction to Support Vector Machines and Other Kernel‐Based Learning Methods. Cambridge: Cambridge University Press; 2000.
Bradley, PS, Mangasarian, OL. Feature selection via concave minimization and support vector machines. In: Proceedings of the Fifteenth International Conference on Machine Learning (ICML), 1998, 82–90.
Zhu, J, Rosset, S, Hastie, T, Tibshirani, R. 1‐norm support vector machines. Adv Neural Inf Process Syst 2004, 16:49–56.
Liu, Y, Zhang, HH, Park, C, Ahn, J. Support vector machines with adaptive Lq penalty. Comput Stat Data Anal 2007, 51:6380–6394.
Zou, H, Yuan, M. The F∞‐norm support vector machine. Stat Sin 2008, 18:379–398.
Zhang, HH, Liu, Y, Wu, Y, Zhu, J. Variable selection for the multicategory SVM via adaptive sup‐norm regularization. Electron J Stat 2008, 2:149–167.
Fan, J, Fan, Y. High dimensional classification using features annealed independence rules. Ann Stat 2008, 36:2605–2637.
Wu, Y, Liu, Y. Variable selection in quantile regression. Stat Sin 2009, 19:801–817.
Shen, X, Tseng, GC, Zhang, X, Wong, WH. On Ψ‐learning. J Am Stat Assoc 2003, 98:724–734.
Liu, Y, Shen, X. Multicategory Ψ‐learning. J Am Stat Assoc 2006, 101:500–509.
Wu, Y, Liu, Y. Robust truncated hinge loss support vector machines. J Am Stat Assoc 2007, 102:974–983.
Fan, Y, Li, R. Variable selection in linear mixed effects models. Ann Stat 2012, 40:2043–2068.
Zou, H, Hastie, T. Regularization and variable selection via the elastic net. J R Stat Soc [Ser B] 2005, 67:301–320.
Liu, Y, Wu, Y. Variable selection via a combination of the L0 and L1 penalties. J Comput Graph Stat 2007, 16:782–798.
Lv, J, Fan, Y. A unified approach to model selection and sparse recovery using regularized least squares. Ann Stat 2009, 37:3498–3528.
Zou, H, Zhang, HH. On the adaptive elastic‐net with a diverging number of parameters. Ann Stat 2009, 37:1733–1751.
Wang, L, Zhu, J, Zou, H. The doubly regularized support vector machine. Stat Sin 2006, 16:589–615.