Browsing by Author "Dr. Jaewoo Kang, Committee Member"
- Data Mining Techniques to Enable Large-scale Exploratory Analysis of Heterogeneous Scientific Data (2009-04-24) Chopra, Pankaj; Dr. Steffen Heber, Committee Co-Chair; Dr. Xiaosong Ma, Committee Member; Dr. Donald L. Bitzer, Committee Chair; Dr. Ting Yu, Committee Member; Dr. Jaewoo Kang, Committee Member. Recent advances in microarray technology have enabled scientists to simultaneously gather data on thousands of genes. However, due to the complexity of genetic interactions, the function and purpose of many genes remain unclear. The cause and progression of many diseases, like cancer and Alzheimer's, are increasingly being attributed to the deregulation of critical genetic pathways. Data mining is now being extensively used on biological datasets to infer gene function and to identify genetic biomarkers for disease prognosis and treatment. There is a considerable need to design algorithms that explore and interpret the underlying microarray data from a biological perspective. In this thesis, three areas of data mining in heterogeneous biological datasets have been addressed. First, a new clustering algorithm has been designed that leverages information on known gene functions. Most conventional clustering algorithms generate only one set of clusters, irrespective of the biological context of the analysis. This is often inadequate for exploring data from different biological perspectives and gaining new insights. The new clustering model generates multiple alternative clusterings from a single dataset, each of which highlights a different aspect of the given dataset. Second, a new classification algorithm has been designed that uses gene pairings for cancer classification. This exploits the concept that, due to genetic interactions, gene pairs may be a better metric for cancer classification than single genes. Third, a meta-analysis of human and mouse cancer datasets is conducted. The results are then integrated with gene ontology and pathway knowledge to highlight pathways that are closely implicated in the cause and progression of cancer.
- Implementation of DRAND, the Distributed and Scalable TDMA Time Slot Scheduling Algorithm (2005-12-06) Min, Jeong Ki; Dr. Jaewoo Kang, Committee Member; Dr. Rudra Dutta, Committee Member; Dr. Injong Rhee, Committee Chair. Energy saving is currently the most important research problem in wireless sensor networks. TDMA is therefore considered as a scheme for improving both energy savings and system performance. The time slot scheduling algorithm is, in turn, an important issue in running a TDMA scheme. A distributed and scalable approach is required in wireless sensor networks because it is difficult and inefficient to manage many sensor nodes centrally when each node, deployed across a broad sensing field, has little memory and battery capacity. We therefore implemented DRAND, a TDMA time slot scheduling algorithm that meets the requirements listed above. Even when a scheme performs well in simulation, implementing it as a real system is a separate problem, because good simulation results do not guarantee that an implementation will work properly in the real world, where various unexpected obstacles arise. By implementing DRAND as a real system, we can check the analysis and simulation results against real experiments. The experiments use up to 42 MICA2 motes in one-hop and multi-hop tests.
- Proximity Induced Labelling Schemes for Distributed Hash Tables (2004-08-16) Warrier, Ajit Chakrapani; Dr. Jaewoo Kang, Committee Member; Dr. Khaled Harfoush, Committee Member; Dr. Injong Rhee, Committee Chair. P2P systems have recently been introduced as an unconventional approach to networking. Among them, structured P2P systems (or Distributed Hash Tables) offer benefits such as load balancing, scalability, and self-organization. Most of the earliest structured P2P systems had virtualized address spaces and therefore disregarded the underlying physical topology when creating the overlay. By incorporating knowledge of the underlying topology into the P2P system, efficient overlays can be constructed. There have been several different approaches towards this goal. The most popular approach has been reactive in nature: nodes, having been assigned their virtual identifiers in the overlay, search for good neighbors or routes towards their destination. This work, on the other hand, takes a proactive approach. Our goal is to assign identifiers to nodes so that their position in the overlay approximately reflects their position in the physical topology. Such identifiers, or Proximity Induced Labels, would then make the subsequent search for good neighbors or routes unnecessary, since these would be implicit in the overlay geometry. We introduce two such labeling techniques, one for the well-known Content Addressable Network (CAN) and the other for the binary Hypercube, based on delay information from a set of well-known nodes on the Internet called Landmarks. Our performance evaluation demonstrates that proximity induced labels can be assigned to CAN in a scalable manner without changing the CAN algorithms, leading to better performance than the conventional CAN. Also, such labeling, when combined with the high connectivity of the Hypercube, achieves highly efficient overlays at the cost of some increased node state.
- View Selection for Query-Evaluation Efficiency using Materialized Views (2005-10-22) Gupta, Shalu; Dr. Jaewoo Kang, Committee Member; Dr. Munindar P Singh, Committee Member; Dr. Rada Y Chirkova, Committee Chair. The purpose of this research is to show the use of derived data, such as materialized views, for run-time optimization of aggregate queries. In this thesis, we show the trade-off between the time taken to design the views and the query run time. We have designed a system called Query Performance Enhancement by Tuning (QPET), which implements the idea of designing and using materialized views to answer frequent aggregate queries.
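
The idea in the last abstract, answering a frequent aggregate query from precomputed derived data instead of rescanning the base table, can be illustrated with a small sketch. This is not QPET and does not reflect its design, which the abstract does not describe. The table `sales`, the summary table `mv_sales_by_region`, and both helper functions are hypothetical names chosen for illustration. SQLite has no native materialized views, so the sketch assumes a plain table built with `CREATE TABLE ... AS SELECT` as a stand-in.

```python
import sqlite3


def build_demo():
    """Create an in-memory database with a base table and a precomputed summary."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("east", 10), ("east", 5), ("west", 7), ("west", 3), ("north", 4)],
    )
    # Stand-in for a materialized view (assumption: SQLite lacks native ones).
    # Building it costs time up front; that is the design-time side of the trade-off.
    conn.execute(
        "CREATE TABLE mv_sales_by_region AS "
        "SELECT region, SUM(amount) AS total, COUNT(*) AS n "
        "FROM sales GROUP BY region"
    )
    return conn


def total_from_base(conn, region):
    """Answer the aggregate query by scanning the base table."""
    row = conn.execute(
        "SELECT SUM(amount) FROM sales WHERE region = ?", (region,)
    ).fetchone()
    return row[0]  # None when no rows match


def total_from_view(conn, region):
    """Answer the same query from the precomputed summary table."""
    row = conn.execute(
        "SELECT total FROM mv_sales_by_region WHERE region = ?", (region,)
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    conn = build_demo()
    for region in ("east", "west", "north"):
        print(region, total_from_base(conn, region), total_from_view(conn, region))
```

Both functions return the same totals, but the second reads one precomputed row per region rather than aggregating over the base rows. The sketch does not refresh the summary table when `sales` changes; a real system would have to account for that maintenance cost as well.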