Computational Methods for Identifying and Characterizing the Human Gene Regulatory Regions and Cis-elements

dc.contributor.advisor: Leping Li, Committee Member [en_US]
dc.contributor.advisor: Bruce S. Weir, Committee Chair [en_US]
dc.contributor.advisor: William R. Atchley, Committee Member [en_US]
dc.contributor.advisor: Jeffrey L. Thorne, Committee Member [en_US]
dc.contributor.advisor: Russell D. Wolfinger, Committee Member [en_US]
dc.contributor.author: Huang, Weichun [en_US]
dc.date.accessioned: 2010-04-02T19:13:06Z
dc.date.available: 2010-04-02T19:13:06Z
dc.date.issued: 2005-11-23 [en_US]
dc.degree.discipline: Bioinformatics [en_US]
dc.degree.level: dissertation [en_US]
dc.degree.name: PhD [en_US]
dc.description.abstract: The identification of functional regulatory regions and cis-elements is a preliminary step toward the reconstruction of gene regulatory networks. Comparative genomics has proven to be a powerful approach for motif discovery. However, the accurate alignment of complex genomic sequences, especially those of mammals, remains a computational challenge. In chapter 2, we propose a novel pairwise alignment system, ACANA, to improve the alignment quality of genomic sequences. Compared with top competing alignment tools, ACANA achieves better alignment quality on divergent sequences for both local and global alignments. When applied to the upstream sequences of human-mouse orthologs, ACANA reliably detects the conserved functional regions containing most cis-elements. Statistical motif modeling is another fundamental computational approach for motif prediction in large genome sequences. In chapter 3, we introduce a mixture of optimized Markov models to reduce the false motif discovery rate in large genomic sequences. Our model not only incorporates most of the dependency information within a motif by optimizing the arrangement of motif positions, but is also flexible enough to adjust model complexity to the size of the training data. We implement the mixture model in our OMiMa system. Using OMiMa, we demonstrate that our model can improve motif prediction accuracy. Although the reconstruction of complete human gene regulatory networks remains, at present, a distant hope, it is still possible to infer some distinct features of the networks from the available data. In chapter 4, we present an example of inferring major evolutionary features of human gene regulatory networks by combining information from both gene sequence data and functional annotations. We systematically analyze the association between gene function and upstream region conservation for human-rodent orthologs. Our study shows that the upstream regulatory regions of developmental transcription regulators, such as Hox genes, are extremely conserved, while those of catalytic enzymes are significantly less conserved. We suggest that developmental and other important regulators constitute the central hub of human gene regulatory networks. [en_US]
dc.identifier.other: etd-08222005-004213 [en_US]
dc.identifier.uri: http://www.lib.ncsu.edu/resolver/1840.16/5393
dc.rights: I hereby certify that, if appropriate, I have obtained and attached hereto a written permission statement from the owner(s) of each third party copyrighted matter to be included in my thesis, dissertation, or project report, allowing distribution as specified below. I certify that the version I submitted is the same as that approved by my advisory committee. I hereby grant to NC State University or its agents the non-exclusive license to archive and make accessible, under the conditions specified below, my thesis, dissertation, or project report in whole or in part in all forms of media, now or hereafter known. I retain all other ownership rights to the copyright of the thesis, dissertation or project report. I also retain the right to use in future works (such as articles or books) all or part of this thesis, dissertation, or project report. [en_US]
dc.subject: Phylogenetic footprinting [en_US]
dc.subject: Mixed Markov models [en_US]
dc.subject: Alignment algorithm [en_US]
dc.subject: Motif identification [en_US]
dc.subject: Transcription regulation [en_US]
dc.subject: CNS [en_US]
dc.subject: Circular Markov chain [en_US]
dc.subject: Gene regulatory networks [en_US]
dc.title: Computational Methods for Identifying and Characterizing the Human Gene Regulatory Regions and Cis-elements [en_US]

Files

Original bundle

Name: etd.pdf
Size: 1.69 MB
Format: Adobe Portable Document Format