Variable Selection Procedures for Generalized Linear Mixed Models in Longitudinal Data Analysis

dc.contributor.advisor Daowen Zhang, Committee Chair en_US
dc.contributor.advisor Hao Helen Zhang, Committee Co-Chair en_US
dc.contributor.advisor Dennis Boos, Committee Member en_US
dc.contributor.advisor Marie Davidian, Committee Member en_US
dc.contributor.author Yang, Hongmei en_US
dc.date.accessioned 2010-04-02T19:01:34Z
dc.date.available 2010-04-02T19:01:34Z
dc.date.issued 2007-08-03 en_US
dc.identifier.other etd-06062007-015506 en_US
dc.identifier.uri http://www.lib.ncsu.edu/resolver/1840.16/4816
dc.description.abstract Model selection is important for longitudinal data analysis, but to date little work has been done on variable selection for generalized linear mixed models (GLMMs). In this paper we propose and study a class of variable selection methods. A full likelihood (FL) approach is proposed for simultaneous model selection and parameter estimation. Because the FL approach is computationally intensive, a penalized quasi-likelihood (PQL) procedure is developed so that model selection in GLMMs can proceed in the framework of linear mixed models. Since the PQL approach produces biased parameter estimates for sparse binary longitudinal data, a two-stage penalized quasi-likelihood (TPQL) approach is proposed to correct the bias of PQL in estimation: PQL is used for model selection at the first stage, and existing software is used for parameter estimation at the second stage. A marginal approach, approximate marginal likelihood (AML), is also developed for some special types of data. A robust estimator of the standard errors of the fitted parameters is derived based on a sandwich formula. A bias correction is proposed to improve the estimation accuracy of PQL for binary data. The sampling performance of the four proposed procedures is evaluated through extensive simulations and through application to real data. In terms of model selection, all four perform similarly. As for parameter estimation, FL, AML and TPQL yield similar results. Compared with FL, the other procedures greatly reduce the computational load. The proposed procedures can be extended to longitudinal data analysis involving missing data, and the shrinkage-penalty-based approach allows them to work even when the number of observations n is less than the number of parameters d. en_US
dc.rights I hereby certify that, if appropriate, I have obtained and attached hereto a written permission statement from the owner(s) of each third party copyrighted matter to be included in my thesis, dissertation, or project report, allowing distribution as specified below. I certify that the version I submitted is the same as that approved by my advisory committee. I hereby grant to NC State University or its agents the non-exclusive license to archive and make accessible, under the conditions specified below, my thesis, dissertation, or project report in whole or in part in all forms of media, now or hereafter known. I retain all other ownership rights to the copyright of the thesis, dissertation or project report. I also retain the right to use in future works (such as articles or books) all or part of this thesis, dissertation, or project report. en_US
dc.subject variance component en_US
dc.subject Laplace approximation en_US
dc.subject generalized linear mixed model en_US
dc.subject quasi-likelihood en_US
dc.subject generalized estimation equation en_US
dc.subject approximate marginal likelihood en_US
dc.subject SCAD en_US
dc.title Variable Selection Procedures for Generalized Linear Mixed Models in Longitudinal Data Analysis en_US
dc.degree.name PhD en_US
dc.degree.level dissertation en_US
dc.degree.discipline Statistics en_US


Files in this item

Files Size Format View
etd.pdf 602.3Kb PDF View/Open
