Mathematics Ph.D. Dissertations

Bootstrap-adjusted Quasi-likelihood Information Criteria for Mixed Model Selection

Date of Award

2019

Document Type

Dissertation

Degree Name

Doctor of Philosophy (Ph.D.)

Department

Mathematics/Mathematical Statistics

First Advisor

Junfeng Shang (Advisor)

Second Advisor

Hanfeng Chen (Committee Member)

Third Advisor

John Chen (Committee Member)

Fourth Advisor

Eric Worch (Other)

Abstract

Model selection has received much attention and has developed significantly in recent decades. When statistical modeling is used to analyze a data set and make predictions, it is natural to ask whether the fitted candidate model is a good one. The process of model selection investigates and answers this question. The challenge of model selection lies in choosing a suitable selection criterion, which sets the guideline for choosing the most appropriate model among a set of candidate models. To some extent, the consistency and efficiency of the selection criterion determine the performance of the final model. For longitudinal data, numerous selection criteria have been developed recently, but few of them are both consistent and efficient. Furthermore, most existing selection criteria are based on the original sample alone. As a result, the uncertainty of the data has not been taken into consideration.

In this dissertation, to achieve consistent and efficient model selection performance and to address the issue of uncertainty, we focus on the construction of two model selection criteria, named QAICb1 and QAICb2, as modifications of the Akaike information criterion (AIC, Akaike, 1973) based on the quasi-likelihood function and the bootstrap approach. Similar to some quasi-likelihood-based model selection criteria, for example the quasi-likelihood independence criterion (QIC(R), Pan, 2001), QAICb1 and QAICb2 each consist of a log quasi-likelihood function and an estimated bias correction term. The two proposed criteria are widely applicable because the use of the quasi-likelihood and the bootstrap does not rely on distributional assumptions.

We use the quasi-likelihood to extend the Kullback-Leibler (K-L, Kullback and Leibler, 1951) discrepancy, from which QAICb1 and QAICb2 are derived, and we prove that the two criteria are asymptotically equivalent and consistent estimators. In addition to the linear mixed model, we extend QAICb1 and QAICb2 to generalized linear models with random effects, because the corresponding quasi-likelihood functions are simple to construct.

To implement QAICb1 and QAICb2 in the context of longitudinal data, we employ generalized estimating equations (GEE, Pan, 2001b) to obtain the estimated parameters of the candidate models. We first compute the quasi-likelihood component from the original sample using GEE with a pre-specified working correlation matrix. We then estimate the bias correction terms of QAICb1 and QAICb2, again using GEE together with the bootstrap approach. By combining the computed quasi-likelihood function with the estimated bias correction term, we obtain the values of QAICb1 and QAICb2 for each candidate model and select the candidate with the minimum criterion value as the most appropriate model.
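
The workflow in the preceding paragraph can be sketched in a few lines of Python. This is only an illustrative sketch under loudly stated assumptions, not the dissertation's definition of QAICb1 or QAICb2, which the abstract does not give. It assumes an identity link, a unit variance function with dispersion fixed at 1, and an independence working correlation (in which case the GEE estimate reduces to ordinary least squares). The bias correction is patterned on a generic bootstrap-adjusted AIC form, and subjects are resampled non-parametrically. All function names are hypothetical.

```python
import numpy as np


def fit_independence_gee(X, y):
    # Assumption: identity link + independence working correlation,
    # so the GEE estimate coincides with ordinary least squares.
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def neg2_quasi_loglik(beta, X, y):
    # Assumption: unit variance function and dispersion 1, so the
    # quasi-likelihood is Q = -0.5 * sum((y - mu)^2) and -2Q is the
    # residual sum of squares.
    resid = y - X @ beta
    return float(resid @ resid)


def bootstrap_criterion(X, y, subject, n_boot=200, seed=0):
    """Goodness-of-fit term plus a bootstrap-estimated bias correction.

    Hypothetical illustration only; the penalty form below is an
    assumed generic bootstrap-AIC style correction.
    """
    rng = np.random.default_rng(seed)
    ids = np.unique(subject)
    rows_by_id = {i: np.flatnonzero(subject == i) for i in ids}

    # Step 1: quasi-likelihood component from the original sample.
    beta_hat = fit_independence_gee(X, y)
    goodness = neg2_quasi_loglik(beta_hat, X, y)

    # Step 2: bias correction from bootstrap samples, resampling whole
    # subjects so within-subject dependence is preserved.
    diffs = []
    for _ in range(n_boot):
        drawn = rng.choice(ids, size=ids.size, replace=True)
        rows = np.concatenate([rows_by_id[i] for i in drawn])
        Xb, yb = X[rows], y[rows]
        beta_b = fit_independence_gee(Xb, yb)
        # Fit of the bootstrap estimate on the original data minus
        # its fit on the bootstrap data it was trained on.
        diffs.append(neg2_quasi_loglik(beta_b, X, y)
                     - neg2_quasi_loglik(beta_b, Xb, yb))

    # Step 3: combine the two pieces; smaller values are preferred.
    return goodness + 2.0 * float(np.mean(diffs))
```

To compare candidates, one would call `bootstrap_criterion` once per candidate design matrix, using the same response and subject labels, and keep the candidate with the smallest value.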

Simulation studies are conducted under different settings, including both linear mixed models and generalized linear models with random effects, using the non-parametric and semi-parametric bootstrap methods. The results demonstrate that both proposed model selection criteria are, in general, more consistent and efficient than existing model selection criteria across various scenarios. Two applications using data sets from clinical trials are given at the end of the dissertation to evaluate the model selection performance of QAICb1 and QAICb2.
