Statistical Design and Analysis of High Throughput Screening Data Using Pooling Experiments and Data Mining Techniques

Remlinger, Katja Sabine

Files

etd.pdf (1.53 MB)

Date

2004-07-02

Authors

Remlinger, Katja Sabine

Advisors

Dr. Jacqueline M. Hughes-Oliver, Committee Chair

Dr. S. Stanley Young, Committee Co-Chair

Abstract

Discovering a new drug involves screening large chemical libraries to identify new and diverse active compounds, yet only a very small percentage of the compounds in a library are active. Naive screening approaches that test every compound in the library are undesirable: in addition to being expensive, they provide little information about which aspects of the chemical structure of active compounds are related to activity. This work investigates pooling experiments as one possible approach to improving screening efficiency and gaining insight into structure-activity relationships. Four pooling designs are proposed using two design criteria: optimal coverage of the chemical space and minimal collision between compounds. We evaluate each method by determining how well the design criteria are met and whether the method finds many diverse active compounds. One pooling design emerges as a winner, but all designed pools clearly outperform randomly created pools. Different approaches to analyzing the pooling designs are also investigated. Multiple trees are compared to model-based likelihood approaches with different covariate class definitions. Results show that a model-based likelihood approach with a multiple-trees-lower-bound covariate class definition gives the best performance. A second possible approach to improving screening efficiency and gaining insight into structure-activity relationships is the use of data mining techniques such as RandomForest and ChemTree. These techniques are applied to individual compounds.

Keywords

Uniform cell coverage designs, Chemical descriptors, Drug discovery

URI

http://www.lib.ncsu.edu/resolver/1840.16/3850

Degree

PhD

Discipline

Statistics

Collections

Dissertations
