Browsing by Author "Wenbin Lu, Committee Member"

Now showing 1 - 15 of 15
  • Boosting methods for variable selection in high dimensional sparse models
    (2009-08-27) Hwang, Wook Yeon; Hao Helen Zhang, Committee Member; Howard Bondell, Committee Member; Wenbin Lu, Committee Member; Subhashis Ghosal, Committee Chair
    Firstly, we propose new variable selection techniques for regression in high dimensional linear models based on a forward selection version of the LASSO, adaptive LASSO, or elastic net, called, respectively, the forward iterative regression and shrinkage technique (FIRST), adaptive FIRST, and elastic FIRST. These methods seem to work better for an extremely sparse high dimensional linear regression model. We exploit the fact that the LASSO, adaptive LASSO, and elastic net have closed form solutions when the predictor is one-dimensional. The explicit formula is then repeatedly used in an iterative fashion until convergence occurs. By carefully considering the relationship between estimators at successive stages, we develop fast algorithms to compute our estimators. The performance of our new estimators is compared with commonly used estimators in terms of predictive accuracy and errors in variable selection. It is observed that our approach has better prediction performance for highly sparse high dimensional linear regression models. Secondly, we propose a new variable selection technique for binary classification in high dimensional models based on a forward selection version of the Squared Support Vector Machines or one-norm Support Vector Machines, called the forward iterative selection and classification algorithm (FISCAL). This method seems to work better for a highly sparse high dimensional binary classification model. We suggest the squared support vector machines using the 1-norm and 2-norm simultaneously. The squared support vector machines are convex and differentiable except at zero when the predictor is one-dimensional. An iterative forward selection approach is then applied along with the squared support vector machines until a stopping rule is satisfied. We also develop a recursive algorithm for the FISCAL to reduce the computational burden. We apply the same procedure to the original one-norm Support Vector Machines.
We compare the FISCAL with other widely used binary classification approaches with regard to prediction performance and selection accuracy. The FISCAL shows competitive prediction performance for highly sparse high dimensional binary classification models.
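The closed form mentioned in the abstract above, for the LASSO with a single predictor, is the soft-thresholding formula. The following is a minimal sketch of how that formula can be reused iteratively on the current residual. It is not the thesis's FIRST algorithm: the function names, the greedy selection rule (largest decrease of the penalized objective), and the stopping tolerance are assumptions made for illustration only.

```python
import numpy as np

def soft_threshold(z, lam):
    """Soft-thresholding operator S(z, lam) = sign(z) * max(|z| - lam, 0)."""
    return np.sign(z) * max(abs(z) - lam, 0.0)

def univariate_lasso(x, r, lam):
    """Closed-form minimizer of 0.5 * ||r - x*b||^2 + lam * |b| over scalar b."""
    return soft_threshold(float(x @ r), lam) / float(x @ x)

def forward_iterative_lasso(X, y, lam, max_iter=1000, tol=1e-10):
    """Greedy forward iteration (illustrative assumption, not the FIRST method).

    At each step, every predictor is refit by the one-dimensional closed form
    against its partial residual; the single update that lowers the penalized
    objective the most is applied. Iteration stops when no update helps.
    """
    n, p = X.shape
    beta = np.zeros(p)
    resid = np.asarray(y, dtype=float).copy()
    for _ in range(max_iter):
        best_j, best_b, best_gain = -1, 0.0, tol
        for j in range(p):
            xj = X[:, j]
            # Residual with predictor j's current contribution added back.
            partial = resid + xj * beta[j]
            b_new = univariate_lasso(xj, partial, lam)
            new_resid = partial - xj * b_new
            old_obj = 0.5 * resid @ resid + lam * abs(beta[j])
            new_obj = 0.5 * new_resid @ new_resid + lam * abs(b_new)
            gain = old_obj - new_obj
            if gain > best_gain:
                best_j, best_b, best_gain = j, b_new, gain
        if best_j < 0:
            break  # no single-coordinate update improves the objective
        resid = resid + X[:, best_j] * (beta[best_j] - best_b)
        beta[best_j] = best_b
    return beta
```

Calling `forward_iterative_lasso(X, y, lam)` on a design matrix `X` and response `y` returns a coefficient vector in which predictors that were never selected remain exactly zero, which is where the variable selection comes from in this sketch.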
  • Generalized Estimators of the Attributable Benefit of an Optimal Treatment Regime
    (2008-09-12) Brinkley, Jason Scott; Wenbin Lu, Committee Member; Daowen Zhang, Committee Member; Marie Davidian, Committee Member; Anastasios Tsiatis, Committee Chair
  • Human and Machine Co-Investigate Intelligence System (HM-CII) for Fault Diagnosis and Detection in Complex Systems.
    (2010-11-02) Kim, So Yeon; Simon Hsiang, Committee Chair; Yahya Fathi, Committee Member; Shu Fang, Committee Member; Wenbin Lu, Committee Member
  • Improving the Efficiency of Tests and Estimators of Treatment Effect with Auxiliary Covariates in the Presence of Censoring
    (2008-05-30) Lu, Xiaomin; Marie Davidian, Committee Member; Anastasios A. Tsiatis, Committee Chair; Hao Zhang, Committee Member; Wenbin Lu, Committee Member
  • Joint Retirement Decisions between Husbands and Wives
    (2010-04-23) Wang, Jinjing; Wenbin Lu, Committee Member; Robert Clark, Committee Member; Melinda Sandler Morrill, Committee Chair
    This thesis uses data obtained from the Rand Health and Retirement Study (HRS) to examine whether husbands and wives decide to retire jointly or separately. It uses the new data from 1992 to 2006 to show the currently prevailing retirement patterns of older couples and estimates a joint retirement function to identify factors that affect retirement decisions. The thesis shows that 24.29% of couples in which each spouse has a career job prefer to retire jointly. In addition, it finds that both wives’ earnings before retirement and husbands’ incomes after retirement have negative effects on the decision to retire jointly, while wives’ retirement incomes have positive effects.
  • Mapping Quantitative Trait Loci in Outbred Half-sib Populations
    (2009-05-20) Gong, Xiaohua; Zhao-Bang Zeng, Committee Chair; Melissa Ashwell, Committee Member; Wenbin Lu, Committee Member; Steffen Heber, Committee Member
    Quantitative trait loci (QTL) mapping in outbred populations faces challenges unique to the divergent genetic background and complex pedigree relationships. Motivated by a dairy cattle half-sib data set from a granddaughter design, we present in this dissertation a series of endeavors to address various challenges along the analysis flow of QTL mapping. A first step is to infer the haplotypes in sires based on the observed genotypes in sires and their offspring. Our method was shown to outperform peer methods in robustness and accuracy while remaining fast. Then, to adapt the multiple interval mapping method to within-family QTL analysis, we extended the modeling framework by allowing for heteroscedastic residual variances and upgraded Windows QTL Cartographer accordingly. The convenient post-analysis result parsing in Windows QTL Cartographer and, more importantly, the improved analysis outputs due to maximum likelihood-based mixture modeling being more powerful than least squares regression reflect our efforts to deliver better methodology via practical, user friendly software. We further developed a mixed model approach for QTL mapping across multiple families that models QTL effects as both a fixed effect across families and a random effect within families. Our mixed model was shown to achieve similar or higher statistical testing performance on QTL variation than the widely used variance component modeling approach, while still allowing permutations for obtaining chromosome-wide or genome-wide significance thresholds. Moreover, the flexibility of our mixed model in constructing alternative hypothesis tests on fixed QTL effects, random QTL effects, or both was shown to offer interesting insight into the varying sources of signal that would not be unveiled by least squares regression or variance component methods.
    In concluding our comprehensive approach to QTL linkage mapping in dairy cattle populations, we continue to explore methods of fine mapping that combine linkage disequilibrium and linkage information, and further methodological improvements are being sought.
  • NE213 Scintillator Characterization Using n/gamma Digital Pulse Shape Discrimination
    (2008-02-25) Li, Andy On; Man-Sung Yim, Committee Member; Ayman I. Hawari, Committee Chair; Wenbin Lu, Committee Member
    NE213 scintillation detectors are excellent tools for use in a mixed gamma and neutron field due to their established pulse shape discrimination ability. With proper pulse shape discrimination, gamma or neutron responses may be obtained in a mixed radiation field. The neutron response needs to be deconvolved from the detector response function to obtain an energy spectrum. To perform unfolding of neutron spectra, mono-energetic responses are needed, and the responses may be obtained via experiment or simulation of the NE213 detector. In this work, response functions were tested with the unfolding of a Cf-252 spectrum. In particular, experiments were performed at the Los Alamos National Laboratory, where Cf-252 spectra were obtained. During the experiment, different pre-amplifier set-ups were tested. Namely, the pulse shape discrimination ability of the system using 50 ohm, 500 ohm, and 1000 ohm termination resistors was compared. However, linear system responses were not observed with the different settings. Thus, the Amoeba Simplex fitting routine was used to augment the charge integration pulse shape discrimination technique to separate the neutron and the gamma signals. Furthermore, a new figure of merit scheme was explored to quantify the pulse shape discrimination ability of this non-linear system. Alongside the experimental measurements of the Cf-252 spectra, the program Scinful was used to generate the mono-energetic neutron responses needed for unfolding. Both the Cf-252 neutron spectra and the Scinful responses were then used with the program FERD-PC for unfolding. Among the different termination resistors compared with the new figure of merit scheme, the 50 ohm resistor setting was observed to be superior. The resultant unfolded spectrum using the 50 ohm termination resistor shows excellent agreement between 2 and 10 MeV.
    However, even with the Amoeba Simplex method, results below 2 MeV are not accurate due to the poor pulse shape discrimination limit of the system used. Results above 10 MeV were not obtained due to the energy range of the Cf-252 source used. The excellent agreement for the unfolded spectrum between 2 and 10 MeV does allow the accuracy of Scinful's responses to be confirmed for that particular energy range. However, Scinful's accuracy for energy ranges outside of this bound is not confirmed in this work.
  • Recombineering-based Gene Tagging in Arabidopsis.
    (2010-10-26) Zhou, Rongrong; Jose Alonso, Committee Chair; David Bird, Committee Chair; Wenbin Lu, Committee Member; Steffen Heber, Committee Member; Robert Franks, Committee Member
  • Semiparametric Methods for Analysis of Randomized Clinical Trials and Arbitrarily Censored Time-to-event Data.
    (2009-04-03) Zhang, Min; Wenbin Lu, Committee Member; Marie Davidian, Committee Chair; Anastasios A. Tsiatis, Committee Co-Chair; Daowen Zhang, Committee Member
    This dissertation includes two parts. In part one, using the theory of semiparametrics, we develop a general approach to improving efficiency of inferences in randomized clinical trials using auxiliary covariates. In part two, we study "smooth" semiparametric regression analysis for arbitrarily censored time-to-event data. The primary goal of a randomized clinical trial is to make comparisons among two or more treatments. For example, in a two-arm trial with continuous response, the focus may be on the difference in treatment means; with more than two treatments, the comparison may be based on pairwise differences. With binary outcomes, pairwise odds-ratios or log-odds ratios may be used. In general, comparisons may be based on meaningful parameters in a relevant statistical model. Standard analyses for estimation and testing in this context typically are based on the data collected on response and treatment assignment only. In many trials, auxiliary baseline covariate information may also be available, and it is of interest to exploit these data to improve the efficiency of inferences. Taking a semiparametric theory perspective, we propose a broadly-applicable approach to adjustment for auxiliary covariates to achieve more efficient estimators and tests for treatment parameters in the analysis of randomized clinical trials. Simulations and applications demonstrate the performance of the methods. A general framework for regression analysis of time-to-event data subject to arbitrary patterns of censoring is proposed. The approach is relevant when the analyst is willing to assume that distributions governing model components that are ordinarily left unspecified in popular semiparametric regression models, such as the baseline hazard function in the proportional hazards model, have densities satisfying mild "smoothness" conditions.
Densities are approximated by a truncated series expansion that, for fixed degree of truncation, results in a "parametric" representation, which makes likelihood-based inference coupled with adaptive choice of the degree of truncation, and hence flexibility of the model, computationally and conceptually straightforward with data subject to any pattern of censoring. The formulation allows popular models, such as the proportional hazards, proportional odds, and accelerated failure time models, to be placed in a common framework; provides a principled basis for choosing among them; and renders useful extensions of the models straightforward. The utility and performance of the methods are demonstrated via simulations and by application to data from time-to-event studies.
  • Semiparametric Mixed Models for Censored Longitudinal Data.
    (2010-11-01) Huang, Mingyan; Daowen Zhang, Committee Chair; Hao Zhang, Committee Chair; Marie Davidian, Committee Member; Wenbin Lu, Committee Member; Richard Braham, Committee Member
  • Smooth Inference for Survival Functions with Arbitrarily Censored Data
    (2006-09-17) Doehler, Kirsten Ann; Wenbin Lu, Committee Member; Marie Davidian, Committee Chair; Anastasios Tsiatis, Committee Member; Hao Zhang, Committee Member
    We propose a new procedure for estimating the survival function of a time-to-event random variable under arbitrary patterns of censoring. Under mild smoothness assumptions, this procedure allows a unified approach to handling different kinds of censoring, while in many cases increasing efficiency. Our approach uses a seminonparametric (SNP) density to represent the density of failure times. The SNP has a flexible "parametric" representation that admits a convenient expression for the likelihood and allows it to capture arbitrary shapes through choice of a tuning parameter, which may be carried out based on standard selection criteria such as AIC and BIC. We present simulation studies to validate our proposed methods. Using right-censored and interval-censored data from popular parametric models, we compare survival function estimators based on our SNP density approach to the corresponding nonparametric estimators. We also develop a test statistic where each survival curve is estimated via our SNP density approach, and demonstrate that it has reliable operating characteristics and can result in increased power relative to nonparametric tests. The new methods are applied to a number of data sets from biomedical studies.
  • A Source-to-Sink Study of the Mekong River Delta: Hydrology, Delta Evolution, and Sediment Transport Modeling
    (2010-04-06) Xue, Zuo; Jingpu Liu, Committee Chair; Dave DeMaster, Committee Co-Chair; Elana Leithold, Committee Member; Ruoying He, Committee Member; Wenbin Lu, Committee Member
    The Mekong River is the third largest river in the Western Pacific. As the population and economy of the area boom, more and more dams are being built in the Mekong basin. Concerns about negative impacts downstream and on the delta plain from upstream damming have been raised ever since the completion in 1993 of the Manwan Dam, the first of the 13 major dams designed on the Upper Mekong. The runoff of the Lower Mekong has a closer connection with the regional precipitation and El Niño Southern Oscillation during the post-dam period (1994-2005) than during the pre-dam period (1950-1993). With ~200 new dams to be added to the basin in the next couple of decades, changes are expected in both hydrological regime and delta dynamics. The Mekong River delivers ~160 million tons of sediment per year to the South China Sea (SCS). The Mekong River Delta (MRD) has the third largest delta plain in the world. High-resolution seismic profiling and coring during 2006 and 2007 cruises reveal a low gradient, subaqueous delta system, up to 20 m thick, surrounding the modern MRD in the west of the SCS. A late Holocene sediment budget for the MRD has been determined, based on the area and thickness of deltaic sediment. Approximately 80% of Mekong delivered sediment has been trapped within the delta area, which, together with a falling sea-level, resulted in a rapidly prograding MRD over the past 3000 yr. The late Holocene evolution of the MRD has shown a morphological asymmetry indicated by a large down-drift area and a rapid progradation around Cape Camau, ~200 km downstream from the river mouth. The coupled hydrodynamic-sediment transport modeling using the Regional Ocean Modeling System (ROMS) and Community Sediment Transport Model System (CSTMS) showed that wind is a most important factor influencing the along-shelf sediment transport. This associates MRD’s asymmetric evolution with an increased wave influence during the Neoglaciation.
Coastal currents formed by the geostrophically balanced Mekong plume are strengthened by intensified winter monsoons. Wave and tidal mixing re-suspends previously deposited Mekong sediments, which are then transported southwestward to the Gulf of Thailand. These results link sediment dynamics and delta evolution with variations in monsoonal activities during the late Holocene.
  • Spectral Methods for Likelihood Approximation of Spatial Processes
    (2007-08-16) Ma, Liyun; Dennis Boos, Committee Member; Wenbin Lu, Committee Member; Richard Reynolds, Committee Member; Montserrat Fuentes, Committee Chair
  • Topics in Longitudinal Studies with Coarsened Data
    (2007-01-11) Jiang, Liqiu; Anastasios A. Tsiatis, Committee Chair; John F. Monahan, Committee Member; Wenbin Lu, Committee Member; Marie Davidian, Committee Member
    In the first part of the dissertation, we derive two methods for responders analysis in longitudinal data with random missing data. Often a binary variable is generated by dichotomizing an underlying continuous variable measured at a specific point in time according to a prespecified threshold value. Ordinarily, a logistic regression model is used to estimate the effects of covariates on the binary response. In the event that the underlying continuous measurements are from a longitudinal study, the repeated measurements are often analyzed using a repeated measures model because of the mathematical and computational convenience of available off-the-shelf software. This practical advantage motivates us to propose two methods. The first uses the repeated measures model as an imputation approach when data on the responder status are missing because patients drop out before completion of the study; we then apply the logistic regression model to the observed or otherwise imputed responder status. The second constructs estimating equations based on the relationship between the repeated measures model and the logistic regression model. Large sample properties of the resulting estimators are derived, and simulation studies are carried out to assess the performance of the estimators in two situations: when the model for the continuous repeated measurements is misspecified as multinormal although, in truth, the measurements follow a logistic distribution compatible with the logistic regression model for the probability of response; and when the logistic regression model for the probability of response is misspecified because, in truth, the longitudinal data follow a multinormal distribution. We show that the resulting estimators are robust to misspecification and apply them to data from a clinical trial on a toenail disease.
    In the second part of the dissertation, we adapt a semiparametric estimator to longitudinal data with measurement error. In longitudinal studies, we are often interested in the relationship between a primary response and the profile of repeated measurements collected over time for a subject, which can be dictated by individual random effects in the framework of a generalized linear model. For example, if the longitudinal profile is linear, the relationship between the individual intercept and slope and the primary response would be of interest. The naive method of fitting a regression model to obtain estimates of the individual random effects can lead to biased results. Li, Zhang, and Davidian (Biometrics 2004) developed conditional score approaches for generalized linear models which require no assumption on the distribution of the random effects and yield consistent inference regardless of the true distribution. However, the estimator can only be used for generalized linear models in canonical form with normally distributed measurement error. To overcome this limitation, we adopt locally efficient semiparametric estimators proposed by Tsiatis and Ma (Biometrika 2004) for functional measurement error models for use in such longitudinal studies. The distribution of random effects is allowed to be misspecified and the method will still yield consistent inference. Simulation studies are carried out to assess the performance of the estimator. We show that the estimator can give much better inference than the naive method in terms of bias and empirical coverage probability. The approach is applied to data from a study on women's bone disease.
  • Variable Selection in Semi-parametric Additive Models with Extensions to High Dimensional Data and Additive Cox Models
    (2008-06-27) Liu, Song; Hao Helen Zhang, Committee Chair; Dennis Boos, Committee Member; Wenbin Lu, Committee Member; John Monahan, Committee Member