  How to cite this WIREs title:
WIREs Comp Stat

# Nonparametric regression with missing data


Optimal estimation of a regression function is considered when either the response or the predictor may be missed at random. Missing at random (MAR) means that the conditional probability of missing, given the response and the predictor, does not depend on the variable whose values may be missed. The mean integrated squared error (MISE) is the statistical criterion used, and the nonparametric approach means that no assumption is made about the shape of the regression function. It is shown that optimal estimation depends on which variable, the response or the predictor, is missed. For a setting with missed responses, optimal estimation is based only on complete cases, and incomplete cases can be ignored. For a setting with missed predictors, optimal estimation is based on all cases, both complete and incomplete, and the procedure includes estimation of the conditional probability of missing the predictor given the response. The proposed estimators are completely data‐driven, do not involve imputation of missing values, and adapt to the missing mechanism and to the smoothness of the estimated regression function. Theoretical results are complemented by an analysis of credit score survey data.

WIREs Comput Stat 2014, 6:265–275. doi: 10.1002/wics.1303

This article is categorized under: Statistical and Graphical Methods of Data Analysis > Nonparametric Methods
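The complete-case idea for missed responses described in the abstract can be illustrated with a small simulation. The sketch below is not the article's estimator: it substitutes a plain Nadaraya-Watson kernel smoother with a fixed bandwidth, and the regression function, noise level, observation probability, and all function names (`simulate`, `nw_estimate`, `complete_case_fit`) are assumptions made only for illustration. It generates data in which the probability of observing the response depends on the predictor alone (MAR), discards the incomplete cases, and smooths what remains.

```python
import math
import random


def simulate(n, seed=0):
    """Simulate (x, y) pairs where y is missed at random (illustrative setup).

    The probability of observing y depends on x only, so the missing
    mechanism is MAR. A missed response is stored as None.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.random()                                # predictor, uniform on [0, 1]
        y = math.sin(2 * math.pi * x) + rng.gauss(0, 0.3)
        p_observed = 0.4 + 0.5 * x                      # depends on x, never on y
        data.append((x, y if rng.random() < p_observed else None))
    return data


def nw_estimate(x0, xs, ys, bandwidth):
    """Nadaraya-Watson estimate at x0 with a Gaussian kernel."""
    num = 0.0
    den = 0.0
    for x, y in zip(xs, ys):
        w = math.exp(-0.5 * ((x0 - x) / bandwidth) ** 2)
        num += w * y
        den += w
    return num / den


def complete_case_fit(data, grid, bandwidth=0.05):
    """Estimate the regression function using complete cases only."""
    complete = [(x, y) for x, y in data if y is not None]
    xs = [x for x, _ in complete]
    ys = [y for _, y in complete]
    return [nw_estimate(g, xs, ys, bandwidth) for g in grid]


if __name__ == "__main__":
    data = simulate(2000, seed=0)
    grid = [0.1 + 0.05 * i for i in range(17)]
    for g, fit in zip(grid, complete_case_fit(data, grid)):
        print(f"x={g:.2f}  estimate={fit:+.3f}  truth={math.sin(2 * math.pi * g):+.3f}")
```

Because the observation probability here is a function of the predictor only, the conditional distribution of the response given the predictor is the same among complete cases as in the full sample, which is why dropping incomplete cases does not distort the regression function in this setup. The missed-predictor setting is not covered by this sketch; per the abstract, it requires all cases and an estimate of the conditional probability of missing the predictor given the response.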
Simulated regression data.
Analysis of credit score and grade point average (GPA) data for class B with n = 74 and N = 50. Structure of the figure is identical to Figure .
Analysis of credit score and grade point average (GPA) data for class A with n = 94 and N = 41. Credit score (X) is the predictor and GPA (Y) is the response. From the top diagram to the bottom, the x‐coordinates are X, y, Y, and x, respectively.
