Statistical Analysis of Compounds Using OBSTree and Compound Mixtures Using Nonlinear Models


A novel tree-structured data-mining tool is proposed that automatically searches large data sets for high-performance classification rules and important quantitative structure-activity relationships (QSARs). The presence or absence of multiple chemical features is used to identify more informative splitting rules. A stochastic optimization scheme, combined with a new splitting criterion and a post-trimming procedure, is developed to find globally optimal splitting variables. The algorithm can also serve as a predictive tool for estimating unknown biological activities from chemical structures. We also investigate several statistical issues in chemical mixture studies. After a thorough review of different concepts of additivity, we discuss the criteria for evaluating a concept of additivity and generalize one particular concept to more complicated studies. A nonlinear dose-response model is first developed for binary mixtures; it can be easily generalized to a mixture of $M$ chemicals. Several types of test statistics, with multiplicity adjustments, are proposed to test for interactions.



hypothesis test, statistics, data mining, chemical mixtures, nonlinear model, QSAR





