A New Method for Genetic Network Reconstruction in Expression QTL Data Sets

Title: A New Method for Genetic Network Reconstruction in Expression QTL Data Sets
Author: Duarte, Christine Woods
Advisors: Zhao-Bang Zeng, Committee Chair
Russell Wolfinger, Committee Member
Jung-Ying Tzeng, Committee Member
Ronald Sederoff, Committee Member
Abstract: Expression QTL (or eQTL) studies involve the collection of microarray gene expression data and genetic marker data from segregating individuals in a population in order to search for genetic determinants of differential gene expression. Previous studies have found large numbers of trans-regulated genes that link to a single locus or eQTL "hotspot". It would be of great interest to discover the mechanism of co-regulation for these groups of genes. However, current network reconstruction algorithms suffer from difficulties such as low power and high computational cost. A common observation for biological networks is that they have a scale-free or power-law architecture. In such an architecture, there exist highly influential nodes that have many connections to other nodes, but most nodes in the network have very few connections. If we assume that this type of architecture applies to genetic networks, then we can simplify the problem of genetic network reconstruction by focusing on discovery of the key regulatory genes at the top of the network. We introduce the concept of "shielding", in which a gene is conditionally independent of the QTL given the shielder gene, and we iteratively build networks from the QTL down using tests of conditional independence. We evaluate the confidence level of shielders using a two-part strategy of requiring a threshold number of genes to be shielded and requiring a high level of bootstrap support for shielders. We have performed a set of simulations to test the sensitivity and specificity of our method as a function of method parameters. We have found that our method has good performance using a significance level of 0.05 for testing the hypothesis that a gene is a shielder, with little gained by decreasing α further.
The shielder bootstrap confidence level depends on the desired balance between false positives and false negatives, but our recommendation is to use 80% bootstrap support for high confidence of discovered network features. With a small sample size (100) and a large number of network genes (as many as 600), our algorithm succeeds in finding a high percentage of the key network regulators (47% on average) with high confidence (95% specificity on average). We have applied our network reconstruction algorithm to a yeast expression QTL data set in which microarray and marker data were collected from the progeny of a backcross of two strains of Saccharomyces cerevisiae (Brem et al., 2002). Networks have been reconstructed for 11 of the largest eQTL hotspots in this data set. The regulation of shielder gene expression has been found to be primarily in trans, although about 10% of shielder genes are found to be regulated in cis. Bioinformatic analysis of three networks generated different hypotheses for mechanisms of regulation of the shielded genes by the primary shielders. One common theme was that the shielders modulated the effect of transcription factors of which they were themselves targets. Overall, our method has created a large list of potentially important regulatory genes in various yeast biological processes, and further bioinformatic analysis or laboratory experiments could lead to the generation and testing of many important hypotheses.
Date: 2009-11-16
Degree: PhD
Discipline: Bioinformatics
URI: http://www.lib.ncsu.edu/resolver/1840.16/4410
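
The shielding idea described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the abstract does not say which conditional independence test is used, so the sketch assumes a partial-correlation test with a Fisher z approximation. All function names, the `min_shielded` threshold, and the `n_boot` default are hypothetical choices made for illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    denom = math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))
    return 0.0 if denom == 0 else (rxy - rxz * ryz) / denom


def fisher_p(r, n, k):
    """Two-sided p-value for a correlation r given k conditioning variables
    (Fisher z approximation; an assumption, not the thesis's stated test)."""
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
    z = math.atanh(r) * math.sqrt(n - k - 3)
    return math.erfc(abs(z) / math.sqrt(2))


def is_shielded(qtl, shielder, gene, alpha=0.05):
    """True if gene is linked to the QTL marginally but is conditionally
    independent of the QTL once the candidate shielder is accounted for."""
    n = len(qtl)
    linked = fisher_p(pearson(qtl, gene), n, 0) < alpha
    indep = fisher_p(partial_corr(qtl, gene, shielder), n, 1) >= alpha
    return linked and indep


def bootstrap_support(qtl, shielder, genes, min_shielded=2, n_boot=200, seed=0):
    """Fraction of bootstrap resamples (individuals drawn with replacement)
    in which the candidate shields at least min_shielded genes."""
    rng = random.Random(seed)
    n = len(qtl)
    hits = 0
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        q = [qtl[i] for i in idx]
        s = [shielder[i] for i in idx]
        count = sum(is_shielded(q, s, [g[i] for i in idx]) for g in genes)
        hits += count >= min_shielded
    return hits / n_boot
```

Under these assumptions, a candidate would be reported as a shielder when `bootstrap_support(...)` reaches the recommended 80% level, with `qtl` holding marker genotypes (for example 0/1) and `shielder` and each entry of `genes` holding expression values for the same individuals in the same order.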

Files: etd.pdf (671.2Kb, PDF)
