Identifying Transcription Factor Targets and Studying Human Complex Disease Genes

Abstract

Transcription factors (TFs) have been characterized as mediators of human complex disease processes, and their target genes may also be associated with disease. Identifying potential TF targets could therefore further our understanding of the gene-gene interactions underlying complex disease. We focused on two TFs, USF1 and ZNF217, because of their biological importance, especially their known genetic association with coronary artery disease (CAD), and the availability of chromatin immunoprecipitation microarray (ChIP-chip) results. First, we used USF1 ChIP-chip data as a training dataset to develop and evaluate several kernel logistic regression prediction models. Our most accurate predictor significantly outperformed standard position weight matrix (PWM)-based prediction methods. This novel prediction method enables more accurate and efficient genome-scale identification of USF1 binding and associated target genes. Second, results from independent linkage and gene expression studies suggest that ZNF217 may also be a candidate gene for CAD. We further investigated the role of ZNF217 in CAD in three independent CAD samples with different phenotypes. Our association studies of ZNF217 identified three SNPs that were consistently associated with CAD across the three samples. Aorta expression profiling indicated that the proportion of the aorta with raised lesions was also positively correlated with ZNF217 expression. The combined evidence suggests that ZNF217 is a novel susceptibility gene for CAD. Finally, we applied our previously developed TF binding site (TFBS) prediction method to ZNF217. The performance of the prediction models for ZNF217 and USF1 is very similar, demonstrating that our TFBS prediction method can be extended to other TFs.
In summary, the results of this dissertation research are (1) evaluation of two TFs, USF1 and ZNF217, as susceptibility factors for CAD; (2) development of a generalized method for TFBS prediction; and (3) prediction of TFBSs and target genes of the two TFs, and identification of SNPs within TFBSs. This research supports the development of study designs to assess TF-based interactions in genetic susceptibility to human complex disease.

Keywords

binding site, prediction, transcription factor

Degree

PhD

Discipline

Bioinformatics