Christen, P. Privacy‐preserving data linkage and geocoding: current approaches and research directions. In: Workshop on Privacy Aspects of Data Mining, Hong Kong; 2006.
Clifton, C, Kantarcioglu, M, Doan, A, Schadow, G, Vaidya, J, Elmagarmid, A, Suciu, D. Privacy‐preserving data integration and sharing. In: ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery, Paris; 2004, 19–26.
Christen, P. Data Matching: Concepts and Techniques for Record Linkage, Entity Resolution, and Duplicate Detection. Data‐Centric Systems and Applications. Berlin: Springer; 2012.
Elmagarmid, AK, Ipeirotis, PG, Verykios, VS. Duplicate record detection: a survey. IEEE Trans Knowl Data Eng 2007, 19:1–16.
Herzog, T, Scheuren, F, Winkler, W. Record linkage. WIRES Comput Stat 2010, 2:535–543.
Naumann, F, Herschel, M. An Introduction to Duplicate Detection. Synthesis Lectures on Data Management, vol 3. San Rafael: Morgan and Claypool; 2010.
Vaidya, J, Clifton, C, Zhu, M, eds. Privacy Preserving Data Mining. Advances in Information Security, vol 19. Berlin: Springer; 2006.
Hall, R, Fienberg, S. Privacy‐preserving record linkage. In: Privacy in Statistical Databases, Corfu, Greece. Lecture Notes in Computer Science, vol 6344. Berlin: Springer; 2010, 269–283.
Vatsalan, D, Christen, P, Verykios, VS. A taxonomy of privacy‐preserving record linkage techniques. Inform Syst 2013, 38:946–969.
Herzog, T, Scheuren, F, Winkler, W. Data Quality and Record Linkage Techniques. Berlin: Springer Verlag; 2007.
Christen, P. A survey of indexing techniques for scalable record linkage and deduplication. IEEE Trans Knowl Data Eng 2012, 24:1537–1555.
Christen, P. A comparison of personal name matching: techniques and practical issues. In: Workshop on Mining Complex Data, Hong Kong; 2006.
Cohen, WW, Ravikumar, P, Fienberg, S. A comparison of string distance metrics for name‐matching tasks. In: Workshop on Information Integration on the Web, Acapulco, Guerrero, Mexico; 2003, 73–78.
Cohen, WW, Richman, J. Learning to match and cluster large high‐dimensional data sets for data integration. In: ACM SIGKDD, Edmonton, Canada; 2002, 475–480.
Verykios, VS, Elmagarmid, A, Houstis, E. Automating the approximate record‐matching process. Inform Sci 2000, 126:83–98.
Christen, P. Automatic training example selection for scalable unsupervised record linkage. In: PAKDD, Osaka, Japan. Lecture Notes in Artificial Intelligence, vol 5012, Springer; 2008, 511–518.
Trepetin, S. Privacy‐preserving string comparisons in record linkage systems: a review. Inform Secur J: A Global Persp 2008, 17:253–266.
Quantin, C, Bouzelat, H, Allaert, F, Benhamiche, A, Faivre, J, Dusserre, L. How to ensure data quality of an epidemiological follow‐up: quality assessment of an anonymous record linkage procedure. Int J Med Inform 1998, 49:117–122.
Quantin, C, Bouzelat, H, Allaert, F‐A, Benhamiche, A‐M, Faivre, J, Dusserre, L. Automatic record hash coding and linkage for epidemiological follow‐up data confidentiality. Methods Inf Med 1998, 37:271–277.
Quantin, C, Bouzelat, H, Dusserre, L. Irreversible encryption method by generation of polynomials. Med Inform Internet Med 1996, 21:113–121.
Schneier, B. Applied Cryptography: Protocols, Algorithms, and Source Code in C. 2nd ed. New York: John Wiley and Sons; 1996.
Song, D, Wagner, D, Perrig, A. Practical techniques for searches on encrypted data. In: IEEE Symposium on Security and Privacy, Berkeley, CA; 2000, 44–55.
Durham, E, Xue, Y, Kantarcioglu, M, Malin, B. Quantifying the correctness, computational complexity, and security of privacy‐preserving string comparators for record linkage. Inf Fusion 2012, 13:245–259.
Karakasidis, A, Verykios, VS. Advances in privacy preserving record linkage. In: E‐Activity and Innovative Technology. Advances in Applied Intelligence Technologies Book Series. Hershey, PA: IGI Global; 2010, 22–34.
Verykios, VS, Karakasidis, A, Mitrogiannis, V. Privacy preserving record linkage approaches. Int J Data Mining Model Manag 2009, 1:206–221.
Churches, T, Christen, P. Some methods for blindfolded record linkage. BMC Med Inform Decis Mak 2004, 4:9.
Al‐Lawati, A, Lee, D, McDaniel, P. Blocking‐aware private record linkage. In: International Workshop on Information Quality in Information Systems, Baltimore, MD; 2005, 59–68.
Krawczyk, H, Bellare, M, Canetti, R. HMAC: keyed‐hashing for message authentication. In: Internet RFCs, United States; 1997.
Karakasidis, A, Verykios, VS, Christen, P. Fake injection strategies for private phonetic matching. In: International Workshop on Data Privacy Management, Leuven, Belgium; 2011.
Navarro, G. A guided tour to approximate string matching. ACM Comput Surv 2001, 33:31–88.
Atallah, M, Kerschbaum, F, Du, W. Secure and private sequence comparisons. In: Workshop on Privacy in the Electronic Society. New York: ACM; 2003, 39–44.
Bloom, B. Space/time trade‐offs in hash coding with allowable errors. Commun ACM 1970, 13:422–426.
Kuzu, M, Kantarcioglu, M, Durham, E, Malin, B. A constraint satisfaction cryptanalysis of Bloom filters in private record linkage. In: Privacy Enhancing Technologies. Berlin: Springer; 2011, 226–245.
Schnell, R, Bachteler, T, Reiher, J. Privacy‐preserving record linkage using Bloom filters. BMC Med Inform Decis Mak 2009, 9:1.
Schnell, R, Bachteler, T, Reiher, J. Private record linkage with Bloom filters. In: Proceedings of Statistics Canada Symposium 2010: Social Statistics: The Interplay among Censuses, Surveys and Administrative Data, Ottawa; 2010, 304–309.
Schnell, R, Bachteler, T, Reiher, J. A novel error‐tolerant anonymous linking code. Technical report. Working Paper Series No. WP‐GRLC‐2011‐02. Nürnberg, Germany: German Record Linkage Center; 2011.
Durham, EA. A Framework for Accurate, Efficient Private Record Linkage. Ph.D. thesis, Faculty of the Graduate School of Vanderbilt University, Nashville, TN; 2012.
Vatsalan, D, Christen, P. An iterative two‐party protocol for scalable privacy‐preserving record linkage. In: AusDM, CRPIT, vol 134, Sydney, Australia; 2012.
Spoerri, A, Schmidlin, K, Schnell, R, Clough‐Gorr, K. Privacy preserving probabilistic record linkage (P3RL) from A to Z: a GRLS example from SwissLinkage. In: Statistics Canada Symposium, Strategies for Standardization of Methods and Tools, Ottawa, Canada; 2011, 128.
Fair, M. Generalized record linkage system: Statistics Canada's record linkage software. Austrian J Stat 2004, 33:37–53.
Scannapieco, M, Figotin, I, Bertino, E, Elmagarmid, A. Privacy preserving schema and data matching. In: ACM SIGMOD, Beijing, People's Republic of China; 2007, 653–664.
Hjaltason, G, Samet, H. Properties of embedding methods for similarity searching in metric spaces. IEEE Trans Pattern Anal Mach Intell 2003, 25:530–549.
Yakout, M, Atallah, M, Elmagarmid, A. Efficient private record linkage. In: IEEE ICDE, Shanghai, People's Republic of China; 2009, 1283–1286.
Pang, C, Gu, L, Hansen, D, Maeder, A. Privacy‐preserving fuzzy matching using a public reference table. Intell Patient Manag 2009, 189:71–89.
Vatsalan, D, Christen, P, Verykios, VS. An efficient two‐party protocol for approximate matching in private record linkage. In: AusDM, CRPIT, vol 121, Ballarat, Australia; 2011.
Hernandez, MA, Stolfo, SJ. The merge/purge problem for large databases. In: ACM SIGMOD, San Jose, CA; 1995, 127–138.
Karakasidis, A, Verykios, VS. A sorted neighborhood approach to multidimensional privacy preserving blocking. In: IEEE International Conference on Data Mining, Workshops (ICDMW), Brussels, Belgium; 2012, 937–944.
Vatsalan, D, Christen, P. Sorted nearest neighborhood clustering for efficient private blocking. In: PAKDD, Gold Coast, Australia; 2013.
Draisbach, U, Naumann, F, Szott, S, Wonneberg, O. Adaptive windows for duplicate detection. In: IEEE International Conference on Data Engineering (ICDE). Washington DC: IEEE; 2012, 1073–1083.
Inan, A, Kantarcioglu, M, Bertino, E, Scannapieco, M. A hybrid approach to private record linkage. In: IEEE ICDE, Cancun, Mexico; 2008, 496–505.
Inan, A, Kantarcioglu, M, Ghinita, G, Bertino, E. Private record matching using differential privacy. In: International Conference on Extending Database Technology, Lausanne, Switzerland; 2010, 123–134.
Sweeney, L. K‐anonymity: a model for protecting privacy. Int J Uncertain Fuzziness Knowl Based Syst 2002, 10:557–570.
Dwork, C. Differential privacy. In: Automata, Languages and Programming. Lecture Notes in Computer Science, vol 4052. Berlin: Springer; 2006, 1–12.
Sadinle, M, Hall, R, Fienberg, S. Approaches to multiple record linkage. In: Proceedings of International Statistical Institute, Dublin; 2011, 1064–1071.
Christen, P, Goiser, K. Quality and complexity measures for data linkage and deduplication. In: Guillet, F, Hamilton, H, eds., Quality Measures in Data Mining. Studies in Computational Intelligence, vol 43. Berlin: Springer; 2007, 127–151.
Guazzelli, A, Zeller, M, Lin, W‐C, Williams, G. PMML: an open standard for sharing models. R J 2009, 1:60–65.
Bhattacharya, I, Getoor, L. Collective entity resolution in relational data. ACM Trans Knowl Discov Data 2007, 1:5.
Dong, X, Halevy, A, Madhavan, J. Reference reconciliation in complex information spaces. In: ACM SIGMOD, Baltimore, MD; 2005, 85–96.
Herschel, M, Naumann, F, Szott, S, Taubert, M. Scalable iterative graph duplicate detection. IEEE Trans Knowl Data Eng 2011, 24:2094–2108.
Kalashnikov, D, Mehrotra, S. Domain‐independent data cleaning via analysis of entity‐relationship graph. ACM Trans Database Syst 2006, 31:716–767.
Barone, D, Maurino, A, Stella, F, Batini, C. A privacy‐preserving framework for accuracy and completeness quality assessment. In: Emerging Paradigms in Informatics, Systems and Communication, Quaderni Disco; 2009, 83.