Estimation with Incomplete Data: The Linear Case

Karthika Mohan, Felix Thoemmes, Judea Pearl

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence

Traditional methods for handling incomplete data, including Multiple Imputation and Maximum Likelihood, require that the data be Missing At Random (MAR). In most cases, however, missingness in a variable depends on the underlying value of that variable. In this work, we devise model-based methods to consistently estimate the mean, variance, and covariance given data that are Missing Not At Random (MNAR). While previous work on MNAR data requires variables to be discrete, we extend the analysis to continuous variables drawn from Gaussian distributions. We demonstrate the merits of our techniques by comparing them empirically to state-of-the-art software packages.
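To make the MNAR problem concrete, the following minimal sketch simulates a Gaussian variable whose chance of being missing depends on its own value, then compares the mean of the observed values with the mean of the full sample. It is only an illustration of the bias that motivates the work, not the estimator proposed in the paper. The function name `simulate_self_masking`, the logistic missingness mechanism, and all parameter values are assumptions chosen for the example.

```python
import math
import random
import statistics


def simulate_self_masking(n=50_000, mu=0.0, sigma=1.0, seed=0):
    """Draw Gaussian samples and hide each one with a probability
    that grows with its own value (a hypothetical MNAR mechanism)."""
    rng = random.Random(seed)
    full = [rng.gauss(mu, sigma) for _ in range(n)]
    observed = []
    for x in full:
        # Assumed mechanism: P(missing | x) = logistic(2 * x),
        # so larger values are more likely to be missing.
        p_missing = 1.0 / (1.0 + math.exp(-2.0 * x))
        if rng.random() >= p_missing:
            observed.append(x)
    return full, observed


full, observed = simulate_self_masking()
full_mean = statistics.fmean(full)
complete_case_mean = statistics.fmean(observed)

print(f"mean of full sample:     {full_mean:.3f}")
print(f"mean of observed values: {complete_case_mean:.3f}")
print(f"fraction observed:       {len(observed) / len(full):.3f}")
```

Because large values are preferentially hidden, the mean of the observed values falls well below the mean of the full sample, and collecting more data does not close the gap. Removing this kind of bias requires modeling the missingness process explicitly.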
Keywords:
Uncertainty in AI: Bayesian Networks
Uncertainty in AI: Graphical Models