Mathematics Ph.D. Dissertations


An Approach to Estimation and Selection in Linear Mixed Models with Missing Data

Date of Award


Document Type


Degree Name

Doctor of Philosophy (Ph.D.)



First Advisor

Junfeng Shang (Advisor)

Second Advisor

Hanfeng Chen (Committee Member)

Third Advisor

John Chen (Committee Member)

Fourth Advisor

Jonathan Bostic (Other)


Linear mixed models are often used to analyze multilevel, correlated, or longitudinal data. Such models can be thought of as an extension of linear models in the sense that additional random components are introduced to capture the dependency among observations. In practice, missing data occur in many disciplines, especially in longitudinal studies, where observations are taken repeatedly over time on the samples in an experiment. Our primary goal in the dissertation is to propose an approach to estimation and model selection in linear mixed models when missing data are present.

The dissertation pays particular attention to multivariate normal models. For such models, we propose an approach that encodes the missingness in an indicator matrix, and we develop likelihood-based estimators under two specific covariance structures: compound symmetric and first-order autoregressive (AR(1)).
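As a rough illustration of the two covariance structures named above, the following sketch builds each one with NumPy. The parametrization (a common variance `sigma2` and a correlation `rho`), the function names, and the use of a 0/1 indicator vector to pick out the observed entries are assumptions made for this example; the dissertation's own notation and indicator-matrix construction may differ.

```python
import numpy as np

def compound_symmetric(n, sigma2, rho):
    """n x n covariance: sigma2 on the diagonal, sigma2 * rho everywhere else."""
    return sigma2 * ((1.0 - rho) * np.eye(n) + rho * np.ones((n, n)))

def ar1(n, sigma2, rho):
    """n x n covariance with entries sigma2 * rho**|i - j|."""
    idx = np.arange(n)
    return sigma2 * rho ** np.abs(idx[:, None] - idx[None, :])

def observed_block(cov, observed):
    """Covariance restricted to observed entries.

    `observed` is a hypothetical 0/1 indicator vector (1 = observed),
    used here only to illustrate how missingness can be encoded.
    """
    keep = np.flatnonzero(observed)
    return cov[np.ix_(keep, keep)]
```

For a subject measured at three occasions with the second one missing, `observed_block(ar1(3, 1.0, 0.5), [1, 0, 1])` keeps only the rows and columns for the first and third occasions.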

Unlike existing maximum likelihood estimation (MLE), which relies on Newton-Raphson (NR), Expectation-Maximization (EM), or Fisher scoring algorithms to obtain the final estimates, we apply matrix theory to circumvent the difficulties that the inverse and the determinant of the variance-covariance matrix impose on the estimation process.
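The abstract does not say which matrix results are used. To show the kind of simplification that is possible, the sketch below codes the standard closed-form inverses and log-determinants for complete-data compound symmetric and AR(1) matrices, so that neither needs a numerical inversion. The parametrization (`sigma2`, `rho`) and the function names are assumptions for this example, and the identities apply to the complete-data matrices only, not necessarily to the matrices that arise once entries are missing.

```python
import numpy as np

def cs_inverse_and_logdet(n, sigma2, rho):
    """Closed-form inverse and log-determinant of sigma2 * ((1 - rho) * I + rho * J)."""
    scale = sigma2 * (1.0 - rho)
    inv = (np.eye(n) - rho / (1.0 + (n - 1) * rho) * np.ones((n, n))) / scale
    logdet = (n * np.log(sigma2)
              + (n - 1) * np.log(1.0 - rho)
              + np.log(1.0 + (n - 1) * rho))
    return inv, logdet

def ar1_inverse_and_logdet(n, sigma2, rho):
    """Closed-form inverse and log-determinant of sigma2 * rho**|i - j| (assumes n >= 2)."""
    # The inverse is tridiagonal: 1 at the two corner diagonal entries,
    # 1 + rho^2 on the interior diagonal, and -rho on the off-diagonals.
    inv = np.zeros((n, n))
    diag = np.full(n, 1.0 + rho ** 2)
    diag[0] = diag[-1] = 1.0
    inv[np.arange(n), np.arange(n)] = diag
    off = np.arange(n - 1)
    inv[off, off + 1] = -rho
    inv[off + 1, off] = -rho
    inv /= sigma2 * (1.0 - rho ** 2)
    logdet = n * np.log(sigma2) + (n - 1) * np.log(1.0 - rho ** 2)
    return inv, logdet
```

Both functions can be checked against `np.linalg.inv` and `np.linalg.slogdet` on small matrices.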

Numerous simulations are conducted to evaluate the proposed approach. For instance, in the comparison between the proposed method and MLE, the former yields better estimates of the variance component under the compound symmetric covariance structure and shows remarkable improvements in estimating both the variance and the autocorrelation components under AR(1).

In the study of model selection performance, where the proposed estimation approach is paired with the Schwarz Information Criterion (SIC) as the selection criterion, the simulation results demonstrate that the approach performs effectively with a moderate proportion of missing data, regardless of the missingness mechanism: missing completely at random (MCAR) or missing not at random (MNAR).
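For reference, SIC in its usual form is minus twice the maximized log-likelihood plus the number of estimated parameters times the log of the sample size, and the candidate with the smallest value is selected. The sketch below implements that general definition. The function names and the dictionary layout are assumptions for this example, and the abstract does not state how the sample size is counted when observations are missing.

```python
import math

def sic(log_likelihood, n_params, n_obs):
    """Schwarz Information Criterion: -2 * log-likelihood + n_params * log(n_obs)."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)

def select_by_sic(candidates, n_obs):
    """Return the name of the candidate with the smallest SIC, plus all scores.

    `candidates` maps a model name to a pair
    (maximized log-likelihood, number of estimated parameters).
    """
    scores = {name: sic(ll, k, n_obs) for name, (ll, k) in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores
```

The penalty term grows with both the number of parameters and the sample size, so a larger model is chosen only when its gain in log-likelihood outweighs the extra penalty.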

Two real data applications are provided to illustrate the performance of the proposed approach in practice. Drawing on the developed method, the simulations, and the applications, we close the dissertation with concluding remarks and directions for future research.