Shrinkage-Based Variable Selection Methods for Linear Regression and Mixed-Effects Models

Title: Shrinkage-Based Variable Selection Methods for Linear Regression and Mixed-Effects Models
Author: Krishna, Arun
Advisors: Dr. Sujit K. Ghosh, Committee Co-Chair
Dr. Howard D. Bondell, Committee Chair
Abstract: KRISHNA, ARUN. Shrinkage-Based Variable Selection Methods for Linear Regression and Mixed-Effects Models. (Under the direction of Professors H. D. Bondell and S. K. Ghosh). In this dissertation we propose two new shrinkage-based variable selection approaches. We first propose a Bayesian selection technique for linear regression models, which allows highly correlated predictors to enter or exit the model simultaneously. The second variable selection method is for linear mixed-effects models, where we develop a new technique to jointly select the important fixed and random effects parameters. We briefly summarize each of these methods below.
The problem of selecting the correct subset of predictors within a linear model has received much attention in recent literature. Within the Bayesian framework, a popular choice of prior has been Zellner’s g-prior, which is based on the inverse of the empirical covariance matrix of the predictors. We propose an extension of Zellner’s g-prior which allows for a power parameter on the empirical covariance of the predictors. The power parameter helps control the degree to which correlated predictors are smoothed towards or away from one another. In addition, the empirical covariance of the predictors is used to obtain suitable priors over model space. In this manner, the power parameter also helps to determine whether models containing highly collinear predictors are preferred or avoided. The proposed power parameter can be chosen via an empirical Bayes method, which leads to a data-adaptive choice of prior. Simulation studies and a real data example are presented to show how the power parameter is well determined by the degree of cross-correlation among the predictors. The proposed modification compares favorably to the standard use of Zellner’s prior and an intrinsic prior in these examples.
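As a rough illustration of the powered prior described above, one way to write it is sketched below. The symbols (λ for the power parameter, g for the scale, σ² for the error variance) and the exact scaling are assumptions made for this sketch; the abstract does not fix them, and the dissertation's own parameterization may differ.

```latex
% Zellner's g-prior (standard form):
\beta \mid \sigma^2, g \;\sim\; N\!\left(0,\; g\,\sigma^2\,(X^\top X)^{-1}\right)

% Assumed powered extension, with power parameter \lambda:
\beta \mid \sigma^2, g, \lambda \;\sim\; N\!\left(0,\; g\,\sigma^2\,(X^\top X)^{\lambda}\right),
\qquad
(X^\top X)^{\lambda} = V \,\mathrm{diag}\!\left(d_1^{\lambda},\dots,d_p^{\lambda}\right) V^\top

% where X^\top X = V \,\mathrm{diag}(d_1,\dots,d_p)\, V^\top is the spectral
% decomposition with all d_j > 0.
```

Under this assumed form, λ = -1 recovers Zellner's g-prior, and λ = 0 gives a prior covariance proportional to the identity, so the coefficients are a priori independent.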
We propose a new method of simultaneously identifying the important predictors that correspond to both the fixed and random effects components in a linear mixed-effects model. A reparameterized version of the linear mixed-effects model using a modified Cholesky decomposition is proposed to aid in the selection by dropping out the random effect terms whose corresponding variance is set to zero. We propose a penalized joint log-likelihood procedure with an adaptive penalty for the selection and estimation of the fixed and random effects. A constrained EM algorithm is then used to obtain the final estimates. We further show that our penalized estimator enjoys the oracle property, in that, asymptotically, it performs as well as if the true model were known beforehand. We demonstrate the performance of our method based on a simulation study and a real data example.
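The effect of the reparameterization can be illustrated with a small sketch. It assumes the random-effects covariance is factored as Psi = D Gamma Gamma^T D, with D diagonal and Gamma unit lower-triangular. This factorization, the function name random_effects_cov, and the numeric values are hypothetical choices for illustration, not taken from the abstract.

```python
def random_effects_cov(d, gamma):
    """Build Psi = D * Gamma * Gamma^T * D (assumed factorization).

    d     : list of q nonnegative scale factors (diagonal of D)
    gamma : q x q nested list, lower triangular with ones on the diagonal
    """
    q = len(d)
    # L = D * Gamma: row i of Gamma scaled by d[i]
    L = [[d[i] * gamma[i][j] for j in range(q)] for i in range(q)]
    # Psi = L * L^T, which equals D * Gamma * Gamma^T * D because D is diagonal
    return [[sum(L[i][k] * L[j][k] for k in range(q)) for j in range(q)]
            for i in range(q)]


# Hypothetical unit lower-triangular factor for three random effects
gamma = [[1.0, 0.0, 0.0],
         [0.5, 1.0, 0.0],
         [-0.3, 0.2, 1.0]]

psi_full = random_effects_cov([1.0, 2.0, 0.5], gamma)  # all effects kept
psi_drop = random_effects_cov([1.0, 0.0, 0.5], gamma)  # second scale set to zero
```

In this sketch, setting a single diagonal entry of D to zero zeroes the whole corresponding row and column of Psi, so that random effect drops out of the model while the covariance stays positive semidefinite. This is why a penalty that shrinks those diagonal entries can perform selection on the random effects.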
Date: 2008-12-22
Degree: PhD
Discipline: Statistics

Files in this item

Files Size Format
etd.pdf 702.8Kb PDF
