Statistical methods for the analysis of genetics marker and microarray data
| dc.contributor.advisor | Bruce S. Weir, Committee Chair | en_US |
| dc.contributor.advisor | Dahlia M. Nielsen, Committee Co-Chair | en_US |
| dc.contributor.advisor | Greg Gibson, Committee Member | en_US |
| dc.contributor.advisor | Russell D. Wolfinger, Committee Member | en_US |
| dc.contributor.author | Yu, Xiang | en_US |
| dc.date.accessioned | 2010-04-02T18:33:54Z | |
| dc.date.available | 2010-04-02T18:33:54Z | |
| dc.date.issued | 2004-05-18 | en_US |
| dc.degree.discipline | Bioinformatics | en_US |
| dc.degree.level | dissertation | en_US |
| dc.degree.name | PhD | en_US |
| dc.description.abstract | With the advent of high-throughput technologies in genomics, a large volume of data has accumulated, leaving bioinformaticists with the challenge of managing, analyzing, and interpreting it. The analysis of genetic marker data and the analysis of microarray data are two important aspects of current bioinformatics research. In this dissertation, we explore some statistical issues arising in these problems. We discuss two extensions of the EM algorithm to infer haplotypes from genotype data, each for a particular sampling scenario. The first applies to a random sample of both diploid and haploid individuals from the population, in which the haplotype information from the haploid individuals is incorporated into the estimation process. The second applies to a sample of parent-offspring trios, in which the dependencies between the parental and the offspring genotypes are correctly handled in the analysis. We show that each of these two modified EM algorithms performs better than the usual one when applied to its corresponding type of sample. We study experimental designs for two-color microarray experiments and resolve some of the outstanding controversies over the use of different experimental designs. We show that the loop and balanced block designs analyzed in a mixed model are more efficient than the reference designs from a statistical point of view. We also provide general guidelines on how to allocate experimental resources to get maximal efficiency using these designs. We present an application of the mixed model to identify transcription factor-gene interactions and to infer transcriptional regulatory structures in Saccharomyces cerevisiae using microarray experiments. We demonstrate that the mixed model, which pools the observations across all experiments, is a powerful approach. | en_US |
| dc.identifier.other | etd-05172004-203908 | en_US |
| dc.identifier.uri | http://www.lib.ncsu.edu/resolver/1840.16/3647 | |
| dc.rights | I hereby certify that, if appropriate, I have obtained and attached hereto a written permission statement from the owner(s) of each third party copyrighted matter to be included in my thesis, dissertation, or project report, allowing distribution as specified below. I certify that the version I submitted is the same as that approved by my advisory committee. I hereby grant to NC State University or its agents the non-exclusive license to archive and make accessible, under the conditions specified below, my thesis, dissertation, or project report in whole or in part in all forms of media, now or hereafter known. I retain all other ownership rights to the copyright of the thesis, dissertation or project report. I also retain the right to use in future works (such as articles or books) all or part of this thesis, dissertation, or project report. | en_US |
| dc.subject | Haplotype | en_US |
| dc.subject | EM algorithm | en_US |
| dc.subject | Microarray | en_US |
| dc.subject | Mixed model | en_US |
| dc.subject | Experimental design | en_US |
| dc.title | Statistical methods for the analysis of genetics marker and microarray data | en_US |
