Statistical Methods for Family-Based Association Studies for Complex Human Diseases: Single-Locus and Haplotype Methods

Abstract

Fine-mapping of disease genes is an important task in human genetics. Linkage and association analyses are the two main approaches for identifying disease susceptibility genes. In Chapter 1, we review the development of methods for disease-gene mapping over the past decades and present the rationale behind our new methods.

Family-based association analyses provide powerful tools for disease-gene mapping. The Association in the Presence of Linkage test (APL), a family-based association method, can use nuclear families with multiple affected siblings and properly infers missing parental genotypes in regions of linkage. In Chapter 2, we generalized and extended APL, using a bootstrap variance estimator, so that it can be applied to general nuclear family structures. Unlike the original APL, which can handle at most two affected siblings, the new APL can handle up to three affected siblings. We also extended APL from a single-marker test to a multiple-marker haplotype analysis. In our simulations, the new APL has a correct type I error rate and more power than other family-based association methods, such as PDT, FBAT/HBAT, and PDTPHASE, in nuclear families with missing parents. Chapter 2 also examines the robustness of APL when there are rare alleles or haplotypes and when there is population substructure such that the allele frequencies in the population deviate from the Hardy-Weinberg equilibrium (HWE) assumption.

Genes on the X chromosome play a role in many common diseases, and linkage analyses have identified regions on the X chromosome with high linkage peaks for several diseases. Currently, few family-based association methods are available for X-chromosome markers. To fill this gap, we proposed a novel family-based association method, X-APL, in Chapter 3. X-APL is a modification of APL and shares some important properties with it. X-APL can also perform haplotype analyses; it is the only family-based association test we are aware of that tests haplotypes of X-chromosome markers. Our simulation results showed that, for single-marker analysis in nuclear families, X-APL has a correct type I error rate and more power than other family-based association methods for the X chromosome, such as XS-TDT, XPDT, and XMCPDT. Chapter 3 also examines the robustness of X-APL when genotype frequencies deviate from HWE.

Linkage and family-based association analyses are often applied to the same data in order to make maximal use of family data sets. However, it is not intuitively clear under what conditions association and linkage tests performed in the same data set are correlated. In Chapter 4, we used computer simulations and theoretical arguments to estimate the correlation between linkage statistics (affected sib pair maximum LOD scores) and family-based association statistics (PDT and APL) under various hypotheses. Different types of pedigrees were studied: nuclear families with affected sib pairs, extended pedigrees, and incomplete pedigrees. Both simulation and theoretical results showed that when there is either no linkage or no association, the linkage and association statistics are not correlated. When both linkage and association are present in the data, the two tests are positively correlated.
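
As a rough illustration of the bootstrap variance idea mentioned for the generalized APL, the following minimal Python sketch resamples families with replacement and estimates the variance of a summed test statistic. It is not the APL implementation: the per-family contributions are assumed to be precomputed numbers (for example, observed minus expected allele counts in affected siblings), and the function names `bootstrap_variance` and `standardized_statistic` are hypothetical. A full APL bootstrap would also re-estimate the parental genotype distribution in each replicate, which this sketch omits.

```python
import random
import statistics


def bootstrap_variance(contributions, n_boot=2000, seed=0):
    """Estimate the variance of sum(contributions) by resampling families.

    `contributions` holds one number per family (assumed precomputed;
    hypothetical input for illustration only).
    """
    rng = random.Random(seed)
    n = len(contributions)
    totals = []
    for _ in range(n_boot):
        # Draw n families with replacement and recompute the summed statistic.
        resample = [contributions[rng.randrange(n)] for _ in range(n)]
        totals.append(sum(resample))
    # Sample variance of the statistic across bootstrap replicates.
    return statistics.variance(totals)


def standardized_statistic(contributions, n_boot=2000, seed=0):
    """Return the summed statistic divided by its bootstrap standard error."""
    total = sum(contributions)
    var = bootstrap_variance(contributions, n_boot=n_boot, seed=seed)
    if var == 0:
        raise ValueError("bootstrap variance is zero; statistic is undefined")
    return total / var ** 0.5
```

Under the null hypothesis, a standardized statistic of this kind is usually compared with a standard normal distribution. The family is the resampling unit because siblings within a family are not independent.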

Description

Keywords

complex diseases, family-based association analysis, X-linked diseases, linkage analysis, correlation

Citation

Degree

PhD

Discipline

Bioinformatics

Collections