A mean score method for missing and auxiliary covariate data in regression models

Bibliographic Details
Published in: Biometrika
Main Authors: REILLY, MARIE; PEPE, MARGARET SULLIVAN
Format: Text
Language: English
Published: Oxford University Press 1995
Online Access: http://biomet.oxfordjournals.org/cgi/content/short/82/2/299
https://doi.org/10.1093/biomet/82.2.299
Description
Summary: We consider regression analysis when incomplete or auxiliary covariate data are available for all study subjects and, in addition, for a subset called the validation sample, true covariate data of interest have been ascertained. The term auxiliary data refers to data not in the regression model, but thought to be informative about the true missing covariate data of interest. We discuss a method which is nonparametric with respect to the association between available and missing data, allows missingness to depend on available response and covariate values, and is applicable to both cohort and case-control study designs. The method previously proposed by Flanders & Greenland (1991) and by Zhao & Lipsitz (1992) is generalised and asymptotic theory is derived. Our expression for the asymptotic variance of the estimator provides intuition regarding performance of the method. Optimal sampling strategies for the validation set are also suggested by the asymptotic results.
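
Illustration (not part of the catalogue record or the article): the following is a minimal sketch of the mean score idea under simplifying assumptions that the summary does not state. It assumes a binary response Y, a categorical auxiliary variable Z observed for everyone, a single true covariate X observed only in the validation sample, and a logistic model for Y given X. In that setting, replacing each unvalidated subject's score by the average score of validated subjects in the same (Y, Z) stratum amounts to weighting each validated subject by n(y, z) / n_V(y, z), the ratio of all subjects to validated subjects in the stratum. The function name `mean_score_logistic`, the helper `stratum`, and the toy data are hypothetical, and the sketch does not compute the asymptotic variance discussed in the summary.

```python
import math
from collections import Counter


def mean_score_logistic(y, z, x, n_iter=50, tol=1e-10):
    """Sketch of a mean score fit of logit P(Y=1 | X) = b0 + b1 * X.

    y, z : observed for every subject (z is assumed categorical).
    x    : true covariate, None for subjects outside the validation sample.
    """
    n_all = Counter(zip(y, z))
    n_val = Counter((yi, zi) for yi, zi, xi in zip(y, z, x) if xi is not None)
    for s in n_all:
        if n_val[s] == 0:
            raise ValueError(f"no validated subjects in stratum {s}")

    # Each validated subject carries weight n(y, z) / n_V(y, z).
    rows = [(yi, xi, n_all[(yi, zi)] / n_val[(yi, zi)])
            for yi, zi, xi in zip(y, z, x) if xi is not None]

    b0 = b1 = 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for yi, xi, w in rows:
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            r = w * (yi - p)          # weighted score contribution
            g0 += r
            g1 += r * xi
            v = w * p * (1.0 - p)     # weighted information contribution
            h00 += v
            h01 += v * xi
            h11 += v * xi * xi
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det   # Newton-Raphson step
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


def stratum(yv, zv, n_total, n_x1, n_x0):
    """Artificial (y, z) stratum: n_x1 + n_x0 validated, the rest with x missing."""
    xs = [1] * n_x1 + [0] * n_x0 + [None] * (n_total - n_x1 - n_x0)
    return [(yv, zv, xv) for xv in xs]


# Made-up toy data, for illustration only.
data = (stratum(1, 1, 40, 8, 2) + stratum(1, 0, 20, 2, 8)
        + stratum(0, 1, 20, 6, 4) + stratum(0, 0, 60, 1, 9))
y, z, x = zip(*data)
b0, b1 = mean_score_logistic(y, z, x)
print(b0, b1)
```

With a binary X, the fitted probabilities reduce to weighted cell proportions, which gives a simple check on the sketch. When every subject is validated, all weights equal 1 and the fit coincides with ordinary logistic regression.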