Statistical Nonparametric and Linear Mixed Model Analyses of Oligonucleotide DNA Chips Data

Abstract

Scientists investigate the dynamic relationships among genes and their associated phenotypes through gene expression array (microarray) studies. An essential step in this task is to identify the genes that actually interact with the phenotypic outcomes. This dissertation focuses on statistical approaches to selecting informative genes. Chapter one discusses a nonparametric approach to gene selection that combines the bootstrap resampling method with the Kruskal-Wallis test (the BKW test). Principal component and clustering analyses are then performed to classify multiple disease types. Chapter two outlines and describes the steps of a statistically rigorous approach to analyzing probe-level GeneChip™ data. The approach employs classical linear mixed models and operates on a gene-by-gene basis. It can accommodate complex experiments involving many kinds of treatments and can test for their effects at the probe level. Furthermore, mismatch probe data can be incorporated in different ways or ignored altogether. Chapter three presents an empirical comparison of the linear mixed model and the Li-Wong multiplicative model on a real data set. The two models perform quite similarly across most genes, but with some interesting and important distinctions. Results are also presented from a simulation study designed to assess the inferential properties of the models, and a modified test statistic is presented for the Li-Wong model that improves Type I error control. The analysis approaches discussed here are applied to data from oligonucleotide DNA chips. However, the concepts are also applicable to data from cDNA microarrays.
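The abstract does not spell out how the BKW test combines bootstrap resampling with the Kruskal-Wallis test, so the following is only a minimal sketch of one plausible reading, not the dissertation's actual procedure. It assumes that each phenotype group is resampled with replacement, that the Kruskal-Wallis H statistic (computed here without tie correction) is evaluated on every replicate, and that a gene is scored by the fraction of replicates whose H exceeds a critical value. The function names, the default of 1000 replicates, and the default critical value 5.991 (the 95th percentile of a chi-square distribution with 2 degrees of freedom, appropriate for three groups) are all illustrative assumptions.

```python
import random
from itertools import chain


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (no tie correction) for a list of samples."""
    pooled = list(chain.from_iterable(groups))
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12.0 / (n * (n + 1)) * total - 3 * (n + 1)


def bootstrap_kw(groups, n_boot=1000, critical=5.991, seed=0):
    """Fraction of bootstrap replicates whose H exceeds `critical`.

    Assumption: each group is resampled independently, with replacement,
    at its original size. This is a hypothetical scoring rule.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_boot):
        resampled = [[rng.choice(g) for _ in g] for g in groups]
        if kruskal_wallis_h(resampled) > critical:
            hits += 1
    return hits / n_boot


# Hypothetical expression values for one gene across three disease types.
gene = [[2.1, 2.4, 1.9, 2.2, 2.0],
        [5.3, 5.1, 4.8, 5.6, 5.0],
        [9.2, 8.7, 9.5, 9.0, 8.9]]
score = bootstrap_kw(gene, n_boot=200)
```

Under these assumptions, a score near 1 indicates that the group differences persist across resamples, while a score near 0 indicates that they do not. Ranking genes by such a score is one way the selection step could be organized.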

Keywords

LINEAR, OLIGONUCLEOTIDE, NONPARAMETRIC, DNA CHIPS

Degree

PhD

Discipline

Statistics
