How to cite this WIREs title:
WIREs Comp Stat

Data masking for disclosure limitation


Abstract

Protecting confidentiality is essential to the functioning of systems for collecting and disseminating data on individuals and enterprises that are necessary for evidence‐based public policy formulation. Deidentification of records, defined as removing obvious identifiers such as name and address, is not sufficient to protect confidentiality. Microdata have characteristics that lead to increased disclosure risk, such as the existence of identification files, geographical detail, outliers, many or detailed attribute variables, or longitudinal or panel structure in the data. Data stewardship organizations can lower disclosure risk through disclosure limitation methods and through the construction of synthetic data. Both record and attribute suppression can be represented by matrix masks, as can perturbation through noise addition and data swapping. Sampling and aggregation also have matrix mask representations. Distinct from masking methods, synthetic data construction considers the microdata to be a realization of some statistical model. It then replaces the true microdata with samples generated according to the model. The released data consist of records of individual synthetic units rather than records for the actual units. The organization must recognize uncertainty in both model form and values of model parameters. This argues for the relevance of hierarchical and mixture models to generate the synthetic data. Synthetic data have an advantage as a disclosure limitation method over masked data because they are easier for the user to analyze.

Copyright © 2009 Wiley Periodicals, Inc., A Wiley Company

This article is categorized under: Data: Types and Structure > Data Preparation and Processing
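The matrix mask idea mentioned in the abstract can be illustrated with a small sketch. It assumes the common formulation in which the released data are Y = AXB + C, where A acts on records (rows), B acts on attributes (columns), and C is an additive displacement; the abstract itself does not spell out this formula. The toy data, the helper names (`matmul`, `mask`, and so on), and the noise scales are all hypothetical, and writing a single-attribute swap as a sum of two masks is one possible representation, not necessarily the article's.

```python
import random

def matmul(P, Q):
    """Multiply two matrices given as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]

def matadd(P, Q):
    """Add two matrices of the same shape."""
    return [[p + q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def mask(X, A=None, B=None, C=None):
    """Return A X B + C; A and B default to identities, C to zero."""
    A = identity(len(X)) if A is None else A
    B = identity(len(X[0])) if B is None else B
    Y = matmul(matmul(A, X), B)
    return Y if C is None else matadd(Y, C)

# Hypothetical microdata: 4 records x 3 attributes (age, income, household size).
X = [[34.0, 52000.0, 2.0],
     [51.0, 87000.0, 4.0],
     [29.0, 41000.0, 1.0],
     [62.0, 230000.0, 3.0]]  # the last record is an income outlier

# Record suppression (sampling has the same form): A keeps only some rows.
A_suppress = [[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]]
records_suppressed = mask(X, A=A_suppress)

# Attribute suppression: B keeps age and household size, drops income.
B_suppress = [[1.0, 0.0],
              [0.0, 0.0],
              [0.0, 1.0]]
attributes_suppressed = mask(X, B=B_suppress)

# Aggregation: A averages all four records into one row.
aggregated = mask(X, A=[[0.25, 0.25, 0.25, 0.25]])

# Noise addition: C holds random perturbations (none on household size).
rng = random.Random(0)
noise_sd = [2.0, 5000.0, 0.0]  # assumed scales, for illustration only
C_noise = [[rng.gauss(0.0, s) for s in noise_sd] for _ in X]
noised = mask(X, C=C_noise)

# Data swapping: exchange income between records 0 and 1, written here as
# X * B_keep + (P X B_swap), with P a row permutation.
P = [[0.0, 1.0, 0.0, 0.0],
     [1.0, 0.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0, 1.0]]
B_keep = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
B_swap = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
swapped = mask(X, B=B_keep, C=mask(X, A=P, B=B_swap))
```

In this sketch each method differs only in which of A, B, and C departs from its default, which is the sense in which the methods share one representation.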

Two methods of generating restricted data from source data.
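The synthetic data route described in the abstract, treating the microdata as a realization of a statistical model and releasing draws from that model, can be sketched for a single attribute as follows. The normal model, the function name `synthesize`, and the income values are assumptions made only for illustration. A normal fit is a poor match for skewed income data, and the abstract argues for hierarchical and mixture models. Parameter uncertainty is reflected only crudely, by drawing the mean from its approximate sampling distribution before generating records.

```python
import random
import statistics

def synthesize(values, n_synth, rng):
    """Draw synthetic values from a normal model fitted to `values`.

    The mean is itself drawn at random to reflect (part of) the
    uncertainty in the model parameters; the standard deviation is
    plugged in as estimated, which understates that uncertainty.
    """
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    drawn_mean = rng.gauss(mean, sd / n ** 0.5)
    return [rng.gauss(drawn_mean, sd) for _ in range(n_synth)]

rng = random.Random(42)
# Hypothetical confidential incomes.
incomes = [52000.0, 87000.0, 41000.0, 230000.0, 66000.0, 48000.0]
synthetic_incomes = synthesize(incomes, n_synth=6, rng=rng)
```

The released list contains no record of an actual unit. A user can analyze it like ordinary microdata, at the cost of depending on how well the assumed model captures the original distribution.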

The R‐U confidentiality map provides an analytical framework for examining the trade‐off between risk and utility of a disclosure limitation method.
