Variable Selection Procedures for Generalized Linear Mixed Models in Longitudinal Data Analysis

dc.contributor.advisor: Daowen Zhang, Committee Chair [en_US]
dc.contributor.advisor: Hao Helen Zhang, Committee Co-Chair [en_US]
dc.contributor.advisor: Dennis Boos, Committee Member [en_US]
dc.contributor.advisor: Marie Davidian, Committee Member [en_US]
dc.contributor.author: Yang, Hongmei [en_US]
dc.date.accessioned: 2010-04-02T19:01:34Z
dc.date.available: 2010-04-02T19:01:34Z
dc.date.issued: 2007-08-03 [en_US]
dc.degree.discipline: Statistics [en_US]
dc.degree.level: dissertation [en_US]
dc.degree.name: PhD [en_US]
dc.description.abstract: Model selection is important for longitudinal data analysis. To date, however, little work has been done on variable selection for generalized linear mixed models (GLMMs). In this paper we propose and study a class of variable selection methods. A full likelihood (FL) approach is proposed for simultaneous model selection and parameter estimation. Because the FL approach is computationally intensive, a penalized quasi-likelihood (PQL) procedure is developed so that model selection in GLMMs can proceed within the framework of linear mixed models. Since the PQL approach produces biased parameter estimates for sparse binary longitudinal data, a two-stage penalized quasi-likelihood (TPQL) approach is proposed to correct the bias of the PQL estimates: PQL is used for model selection at the first stage, and existing software is used for parameter estimation at the second stage. A marginal approach for some special types of data is also developed. A robust estimator of the standard errors of the fitted parameters is derived from a sandwich formula. A bias correction is proposed to improve the estimation accuracy of PQL for binary data. The sampling performance of the four proposed procedures is evaluated through extensive simulations and through their application to real data. In terms of model selection, all four perform similarly. As for parameter estimation, FL, AML (approximate marginal likelihood), and TPQL yield similar results. Compared with FL, the other procedures greatly reduce the computational load. The proposed procedures can be extended to longitudinal data analysis involving missing data, and the shrinkage-penalty-based approach allows them to work even when the number of observations n is less than the number of parameters d. [en_US]
dc.identifier.other: etd-06062007-015506 [en_US]
dc.identifier.uri: http://www.lib.ncsu.edu/resolver/1840.16/4816
dc.rights: I hereby certify that, if appropriate, I have obtained and attached hereto a written permission statement from the owner(s) of each third party copyrighted matter to be included in my thesis, dissertation, or project report, allowing distribution as specified below. I certify that the version I submitted is the same as that approved by my advisory committee. I hereby grant to NC State University or its agents the non-exclusive license to archive and make accessible, under the conditions specified below, my thesis, dissertation, or project report in whole or in part in all forms of media, now or hereafter known. I retain all other ownership rights to the copyright of the thesis, dissertation or project report. I also retain the right to use in future works (such as articles or books) all or part of this thesis, dissertation, or project report. [en_US]
dc.subject: variance component [en_US]
dc.subject: Laplace approximation [en_US]
dc.subject: generalized linear mixed model [en_US]
dc.subject: quasi-likelihood [en_US]
dc.subject: generalized estimation equation [en_US]
dc.subject: approximate marginal likelihood [en_US]
dc.subject: SCAD [en_US]
dc.title: Variable Selection Procedures for Generalized Linear Mixed Models in Longitudinal Data Analysis [en_US]

Files

Original bundle

Name: etd.pdf
Size: 602.33 KB
Format: Adobe Portable Document Format
