A New Method for Genetic Network Reconstruction in Expression QTL Data Sets

dc.contributor.advisor: Zhao-Bang Zeng, Committee Chair [en_US]
dc.contributor.advisor: Russell Wolfinger, Committee Member [en_US]
dc.contributor.advisor: Jung-Ying Tzeng, Committee Member [en_US]
dc.contributor.advisor: Ronald Sederoff, Committee Member [en_US]
dc.contributor.author: Duarte, Christine Woods [en_US]
dc.date.accessioned: 2010-04-02T18:53:28Z
dc.date.available: 2010-04-02T18:53:28Z
dc.date.issued: 2009-11-16 [en_US]
dc.degree.discipline: Bioinformatics [en_US]
dc.degree.level: dissertation [en_US]
dc.degree.name: PhD [en_US]
dc.description.abstract: Expression QTL (or eQTL) studies involve the collection of microarray gene expression data and genetic marker data from segregating individuals in a population in order to search for genetic determinants of differential gene expression. Previous studies have found large numbers of trans-regulated genes that link to a single locus or eQTL ``hotspot''. It would be of great interest to discover the mechanism of co-regulation for these groups of genes. However, current network reconstruction algorithms suffer from difficulties such as low power and high computational cost. A common observation for biological networks is that they have a scale-free or power-law architecture. In such an architecture, there exist highly influential nodes that have many connections to other nodes, but most nodes in the network have very few connections. If we assume that this type of architecture applies to genetic networks, then we can simplify the problem of genetic network reconstruction by focusing on discovery of the key regulatory genes at the top of the network. We introduce the concept of ``shielding'', in which a gene is conditionally independent of the QTL given the shielder gene, and we iteratively build networks from the QTL down using tests of conditional independence. We evaluate the confidence level of shielders using a two-part strategy of requiring a threshold number of genes to be shielded and requiring a high level of bootstrap support for shielders. We have performed a set of simulations to test the sensitivity and specificity of our method as a function of method parameters. We have found that our method has good performance using a significance level of 0.05 for testing the hypothesis that a gene is a shielder, with little gained by decreasing $\alpha$ further.
The shielder bootstrap confidence level depends on the desired balance between false positives and false negatives, but our recommendation is to use 80\% bootstrap support for high confidence of discovered network features. With a small sample size (100) and a large number of network genes (as many as 600), our algorithm succeeds in finding a high percentage of the key network regulators (47\% on average) with high confidence (95\% specificity on average). We have applied our network reconstruction algorithm to a yeast expression QTL data set in which microarray and marker data were collected from the progeny of a backcross of two strains of \textit{Saccharomyces cerevisiae} \cite{Brem2002}. Networks have been reconstructed for 11 of the largest eQTL hotspots in this data set. The regulation of shielder gene expression has been found to be primarily in trans, although about 10\% of shielder genes are found to be regulated in cis. Bioinformatic analysis of three networks generated different hypotheses for mechanisms of regulation of the shielded genes by the primary shielders. One common theme was that the shielders modulated the effect of transcription factors of which they were themselves targets. Overall, our method has created a large list of potentially important regulatory genes in various yeast biological processes, and further bioinformatic analysis or laboratory experiments could lead to the generation and testing of many important hypotheses. [en_US]
dc.identifier.other: etd-08172009-225656 [en_US]
dc.identifier.uri: http://www.lib.ncsu.edu/resolver/1840.16/4410
dc.rights: I hereby certify that, if appropriate, I have obtained and attached hereto a written permission statement from the owner(s) of each third party copyrighted matter to be included in my thesis, dissertation, or project report, allowing distribution as specified below. I certify that the version I submitted is the same as that approved by my advisory committee. I hereby grant to NC State University or its agents the non-exclusive license to archive and make accessible, under the conditions specified below, my thesis, dissertation, or project report in whole or in part in all forms of media, now or hereafter known. I retain all other ownership rights to the copyright of the thesis, dissertation or project report. I also retain the right to use in future works (such as articles or books) all or part of this thesis, dissertation, or project report. [en_US]
dc.subject: eQTL [en_US]
dc.subject: Bayesian networks [en_US]
dc.subject: genetic networks [en_US]
dc.subject: QTL [en_US]
dc.title: A New Method for Genetic Network Reconstruction in Expression QTL Data Sets [en_US]
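
The shielding idea described in the abstract can be illustrated with a minimal Python sketch. This is not the dissertation's implementation: the abstract does not say which conditional independence test is used, so the partial-correlation test with Fisher's z transform, all function names, and the default parameters below are assumptions chosen for illustration. The sketch also assumes every candidate gene already links to the hotspot, and it treats a failure to reject independence at level alpha as "shielded".

```python
import math
import random

def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr_pvalue(x, y, z):
    """Two-sided p-value for H0: corr(x, y | z) = 0 (Fisher z, assumed test)."""
    n = len(x)
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    r = (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))
    r = max(-0.999999, min(0.999999, r))      # keep atanh finite
    stat = math.atanh(r) * math.sqrt(n - 4)   # n - 3 - (one conditioning variable)
    return math.erfc(abs(stat) / math.sqrt(2))

def is_shielded(qtl, gene, shielder, alpha=0.05):
    """True if gene looks independent of the QTL genotype given the shielder."""
    return partial_corr_pvalue(qtl, gene, shielder) > alpha

def bootstrap_support(qtl, genes, shielder, min_shielded,
                      n_boot=200, alpha=0.05, seed=0):
    """Fraction of bootstrap resamples in which the candidate shields
    at least min_shielded of the given genes."""
    rng = random.Random(seed)
    n = len(qtl)
    hits = 0
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]   # resample individuals
        q = [qtl[i] for i in idx]
        s = [shielder[i] for i in idx]
        count = sum(is_shielded(q, [g[i] for i in idx], s, alpha) for g in genes)
        hits += count >= min_shielded
    return hits / n_boot
```

Under these assumptions, a candidate would be kept as a shielder when `bootstrap_support(...)` reaches the chosen confidence level, for example the 80% recommended in the abstract.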

Files

Original bundle

Name: etd.pdf
Size: 671.3 KB
Format: Adobe Portable Document Format
