Gibaja, E, Ventura, S. Multi‐label learning: a review of the state of the art and ongoing research. Wiley Interdiscip Rev Data Mining Knowl Discov 2014, 4:411–444.

Clare, A, King, RD. Knowledge discovery in multi‐label phenotype data. In: *European Conference on Data Mining and Knowledge Discovery (2001)*, Freiburg, Germany, 2001, 42–53.

Zhang, M‐L, Zhou, Z‐H. Multilabel neural networks with applications to functional genomics and text categorization. IEEE Trans Knowl Data Eng 2006, 18:1338–1351.

Elisseeff, A, Weston, J. A kernel method for multi‐labelled classification. In: *Advances in Neural Information Processing Systems (NIPS 2001)*, Vancouver, Canada, 2001, 681–687.

McCallum, AK. Multi‐label text classification with a mixture model trained by EM. In: *AAAI 99 Workshop on Text Learning*, Orlando, Florida, 1999.

Ghamrawi, N, McCallum, A. Collective multi‐label classification. In: *ACM International Conference on Information and Knowledge Management*, Bremen, Germany, 2005, 195–200. New York, NY, USA: ACM.

Schapire, RE, Singer, Y. BoosTexter: a boosting‐based system for text categorization. Mach Learn 2000, 39:135–168.

Dembczyński, K, Waegeman, W, Cheng, W, Hüllermeier, E. On label dependence and loss minimization in multi‐label classification. Mach Learn 2012, 88:5–45.

Dembczyński, K, Cheng, W, Hüllermeier, E. Bayes optimal multilabel classification via probabilistic classifier chains. In: *ICML (2010)*, Haifa, Israel, 2010, 279–286.

Montañés, E, Quevedo, JR, del Coz, JJ. Aggregating independent and dependent models to learn multi‐label classifiers. In: *ECML/PKDD’11—Volume Part II*, Athens, Greece, 2011, 484–500. Berlin Heidelberg: Springer‐Verlag.

Montañés, E, Senge, R, Barranquero, J, Quevedo, JR, del Coz, JJ, Hüllermeier, E. Dependent binary relevance models for multi‐label classification. Pattern Recogn 2014, 47:1494–1508.

Read, J, Pfahringer, B, Holmes, G, Frank, E. Classifier chains for multi‐label classification. Mach Learn 2011, 85:333–359.

Tsoumakas, G, Katakis, I, Vlahavas, I. Mining multi‐label data. In: Data Mining and Knowledge Discovery Handbook. New York, US: Springer; 2010, 667–685.

Cheng, W, Hüllermeier, E. Combining instance‐based learning and logistic regression for multilabel classification. Mach Learn 2009, 76:211–225.

Godbole, S, Sarawagi, S. Discriminative methods for multi‐labeled classification. In: *Pacific‐Asia Conference on Knowledge Discovery and Data Mining (2004)*, Sydney, Australia, 2004, 22–30.

Fürnkranz, J, Hüllermeier, E, Loza Mencía, E, Brinker, K. Multilabel classification via calibrated label ranking. Mach Learn 2008, 73:133–153.

Qi, GJ, Hua, XS, Rui, Y, Tang, J, Mei, T, Zhang, HJ. Correlative multi‐label video annotation. In: *Proceedings of the International Conference on Multimedia*, Augsburg, Germany, 2007, 17–26. New York: ACM.

Read, J, Pfahringer, B, Holmes, G. Multi‐label classification using ensembles of pruned sets. In: *IEEE International Conference on Data Mining*, Pisa, Italy, 2008, 995–1000.

Tsoumakas, G, Vlahavas, I. Random k‐labelsets: an ensemble method for multilabel classification. In: *ECML/PKDD’07*, LNCS, Warsaw, Poland, 2007, 406–417. Berlin Heidelberg: Springer.

Dembczyński, K, Waegeman, W, Hüllermeier, E. An analysis of chaining in multi‐label classification. In: Raedt, LD, Bessière, C, Dubois, D, Doherty, P, Frasconi, P, Heintz, F, Lucas, PJF, eds. ECAI: Frontiers in Artificial Intelligence and Applications, Montpellier, France, vol. 242. Amsterdam, Netherlands: IOS Press; 2012, 294–299.

Kumar, A, Vembu, S, Menon, AK, Elkan, C. Learning and inference in probabilistic classifier chains with beam search. In: *ECML/PKDD (2012)*, Bristol, UK, 2012, 665–680.

Kumar, A, Vembu, S, Menon, AK, Elkan, C. Beam search algorithms for multilabel learning. Mach Learn 2013, 92:65–89.

Read, J, Martino, L, Luengo, D. Efficient Monte Carlo methods for multi‐dimensional learning with classifier chains. Pattern Recogn 2014, 47:1535–1546.

Read, J, Martino, L, Olmos, PM, Luengo, D. Scalable multi‐output label prediction: from classifier chains to classifier trellises. Pattern Recogn 2015, 48:2096–2109.

Luaces, Ó, Díez, J, Barranquero, J, del Coz, JJ, Bahamonde, A. Binary relevance efficacy for multilabel classification. Prog Artif Intell 2012, 4:303–313.

Read, J, Pfahringer, B, Holmes, G, Frank, E. Classifier chains for multi‐label classification. In: *ECML/PKDD’09*, LNCS, Bled, Slovenia, 2009, 254–269. Berlin Heidelberg: Springer.

Senge, R, del Coz, JJ, Hüllermeier, E. On the problem of error propagation in classifier chains for multi‐label classification. In: *Conference of the German Classification Society on Data Analysis, Machine Learning and Knowledge Discovery (2012)*, Hildesheim, Germany, 2012.

Senge, R, del Coz, JJ, Hüllermeier, E. Rectifying classifier chains for multi‐label classification. In: *LWA 2013: Lernen, Wissen & Adaptivität, Workshop Proceedings Bamberg*, Bamberg, Germany, 2013, 151–158.

Lin, C‐J, Weng, RC, Keerthi, SS. Trust region Newton method for logistic regression. J Mach Learn Res 2008, 9:627–650.

Brier, GW. Verification of forecasts expressed in terms of probability. Mon Weather Rev 1950, 78:1–3.