Browsing by Author "Spencer V. Muse, Committee Chair"
Now showing 1 - 2 of 2
- Software and Methods for Analyzing Molecular Genetic Marker Data (2003-07-18) Liu, Kejun; Edward Buckler, Committee Member; Montserrat Fuentes, Committee Member; Bruce S. Weir, Committee Member; Spencer V. Muse, Committee Chair
  Genetic analysis of molecular markers has allowed biologists to ask a wide variety of questions. This dissertation explores some of the statistical and computational issues that arise in genetic marker data analysis. Chapter 1 gives an introduction to genetic marker data, as well as a brief description of each chapter. Chapter 2 presents the different genetic analyses performed on a large data set and discusses the use of microsatellites to describe the maize germplasm and to improve maize germplasm maintenance. Considerable attention is focused on how the maize germplasm is organized and how genetic variation is distributed. A novel maximum likelihood method is developed to estimate the historical contributions for maize inbred lines. Chapter 3 covers a new method for optimal selection of a core set of lines from a large germplasm collection. A simulated annealing algorithm for choosing an optimal k-subset is described and evaluated using the maize germplasm as an example; general constraints are incorporated in the algorithm, and its efficiency is compared with that of existing methods. Chapter 4 covers a two-stage strategy to partition a chromosomal region into blocks with extensive within-block linkage disequilibrium, and to select the optimal subset of SNPs that essentially captures the haplotype variation within a block. Population simulations suggest that the recursive bisection algorithm for block partitioning is generally reliable for identifying recombination hotspots. Maximal entropy theory is applied to choose an optimal subset of SNPs. The procedures are evaluated analytically as well as by simulation. The final chapter, Chapter 5, covers a new software package for genetic marker data analysis. The methods implemented in the package are listed, and a brief tutorial is included to illustrate the features of the package. Chapter 5 also describes a new method for estimating population-specific F-statistics and an extended algorithm for estimating haplotype frequencies.
- Testing Patterns of Nucleotide Substitution Rates at Multiple Genes (2002-11-17) Tao, Wenli; Spencer V. Muse, Committee Chair
  Studying patterns of nucleotide substitution rates at multiple loci can help provide clues to the evolution and function of genes. The computational cost of the maximum likelihood version of the relative ratio test becomes a concern when a large number of pairwise comparisons are performed among multiple genes. We propose a new version of the relative ratio test, comprising four procedures, based on the use of pairwise sequence distances. The first is based on a two-way ANOVA model and allows covariances between branch lengths. The second applies generalized estimating equations (GEEs) to Poisson regression in a log-linear model. The third is a nonparametric approach based on bootstrap percentile confidence intervals. The fourth is based on weighted least squares estimation with covariances. Simulation studies have been conducted to compare the Type I error rates and power of the likelihood version of the test and the first three proposed methods. For the fourth method, the formulas and the numerical steps have been derived. The ANOVA-based method is the least computationally expensive, and it has desirable Type I error rates in most cases as well as good power. The bootstrap-based method is the slowest of the four proposed methods, but it has the smallest Type I error rates and power similar to that of the ANOVA-based method. The likelihood-based method is the second slowest; it has more desirable Type I error rates than the ANOVA-based method but lower power. The GEE-based method is suitable only for very long genes, but has good statistical properties. The ANOVA-based method is applied to mtDNA sequences from a broad range of animal mitochondrial genomes. The results indicate that it is uncommon for branch lengths to be well conserved among animal mitochondrial genes.
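
The second abstract mentions a nonparametric approach based on bootstrap percentile confidence intervals. As a rough illustration of that general technique only, the sketch below computes a percentile interval for a statistic of a small sample. It is not the dissertation's procedure: the function name, its parameters, and the sample values are all hypothetical, and it resamples observations independently, whereas pairwise sequence distances are correlated and would need a resampling scheme that respects that.

```python
import random
import statistics


def bootstrap_percentile_ci(values, stat=statistics.mean,
                            n_boot=2000, alpha=0.05, seed=0):
    """Generic percentile bootstrap interval (hypothetical helper).

    Resamples `values` with replacement `n_boot` times, applies `stat`
    to each resample, and returns the empirical alpha/2 and 1 - alpha/2
    quantiles of the resulting replicates.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    replicates = sorted(
        stat([values[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    # Index of the lower and upper percentile, clamped to valid positions.
    lo_idx = max(0, int((alpha / 2) * n_boot))
    hi_idx = min(n_boot - 1, int((1 - alpha / 2) * n_boot) - 1)
    return replicates[lo_idx], replicates[hi_idx]


# Made-up numbers standing in for per-comparison ratios; not real data.
sample = [0.9, 1.1, 1.0, 1.2, 0.8, 1.05, 0.95, 1.15]
lower, upper = bootstrap_percentile_ci(sample)
print(f"95% percentile interval for the mean: ({lower:.3f}, {upper:.3f})")
```

In a testing context, one would typically check whether the value implied by the null hypothesis falls inside such an interval; the repeated resampling is also what makes this kind of method comparatively slow.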