
Browsing by Author "William R. Atchley, Committee Chair"

Now showing 1 - 5 of 5
    Computational Biology of Ras Proteins
    (2008-04-07) Dellinger, Andrew Everette; William R. Atchley, Committee Chair; Carla Mattos, Committee Member; Jeffrey Thorne, Committee Member; Jon Doyle, Committee Member
    In this research, computational biology is used to elucidate how evolutionary history has changed the roles of structure and function among Ras proteins, with a focus on the Ras family. This dissertation begins with phylogenetic analyses of the Ras superfamily and Ras family. Phylogenetic trees of the Ras family were estimated using Neighbor-Joining, Weighted Neighbor-Joining, Parsimony, Quartet Puzzling, Maximum Likelihood and Bayesian methods. In nearly all cases, each clade represented a subfamily. Clade members and clade divisions were consistent among all the trees, increasing the probability of a correct estimation of the evolutionary history. Further investigation into the evolution of sequence involved decomposing sequence covariation into its respective components. The roles of the functional and structural components of covariation were the focus of several multivariate analyses. Decision tree analysis, a data mining method, found that sequence divergence in critical sites of the hydrophobic core, dimerization regions and ligand binding regions was sufficient to divide Ras subfamilies. Alignments of GDP-bound and GTP-bound crystal structures revealed that only Ral and M-Ras proteins have structural variation in the effector binding switch I regions, while all Ras structures vary in the protein binding switch II region. Di-Ras2-GDP was shown to have a unique C-terminal loop which binds to the interswitch region. Last, a common factor analysis was computed. The factors contain the set of sites that both discriminate among the subfamilies and have a unique functional or structural role, such as Ral tree-determinant sites. Finally, sequence signatures were developed for each of the families of the Ras superfamily using Boltzmann-Shannon entropy. This method was compared to the PROSITE signature, profile hidden Markov model and MEME position-specific scoring matrix methods. The Entropy method identified approximately 8% fewer proteins than the best of the other methods, MEME. Comparative analyses of these sequence signatures determined which sites and amino acids played important roles in the changes in protein function and structure among Ras families.
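The entropy-based signatures mentioned in the abstract above rest on measuring how variable each alignment column is. The following is only a minimal sketch of per-site Shannon entropy, not the dissertation's actual procedure: the function names, the gap handling, and the toy alignment are all assumptions made for illustration.

```python
import math
from collections import Counter

def column_entropy(column):
    """Shannon entropy (in bits) of one alignment column, ignoring gap characters."""
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    n = len(residues)
    counts = Counter(residues)
    # Sum of -p * log2(p) over the observed residue frequencies.
    return sum(-(k / n) * math.log2(k / n) for k in counts.values())

def site_entropies(alignment):
    """Entropy of every column in a list of equal-length aligned sequences."""
    return [column_entropy(col) for col in zip(*alignment)]

# Hypothetical toy alignment (not real Ras sequences).
toy_alignment = ["GAGK", "GAGR", "GSGK", "GTGR"]
entropies = site_entropies(toy_alignment)
print(entropies)  # invariant columns score 0; column 2 (A, A, S, T) scores 1.5 bits
```

Low-entropy columns are the natural candidates for a conserved signature, while columns that are conserved within a family but differ between families would need a per-family comparison that this sketch does not attempt.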
    Molecular and evolutionary analysis of the GATA transcription factor family
    (2003-03-03) Lowry, Jason Allen; William R. Atchley, Committee Chair; Steven L. Spiker, Committee Member; Michael D. Purugganan, Committee Member; Jeffrey L. Thorne, Committee Member
    The objective of this research has been to characterize the evolution of the GATA family of transcription factors through phylogenetic, molecular, and biochemical analyses. From a phylogenetic perspective, we address three major questions. First, does this protein family represent a monophyletic or polyphyletic group? Second, what methods of gene or modular duplication are utilized within different organisms to propagate and maintain this group of proteins? Third, what are the structural or functional constraints on evolution of the conserved DNA-binding domain? These questions are addressed through a combination of computational and molecular methods. Phylogenetic analyses provide evidence of monophyletic origin for the GATA family followed by gene duplication and modular evolution, accompanied by considerable divergence outside the conserved zinc finger DNA-binding domain. Genomic comparisons permit the tracing of GATA factor evolution and provide insight into mechanisms utilized by respective organisms. Finally, mutational and biochemical analyses enable the separation of phylogenetic and structural/functional constraints on the conserved zinc finger DNA-binding domain. The result of this research is a predictive motif for classifying potentially homologous proteins to be discovered in future studies.
    Multivariate Statistical Analysis of Protein Variation
    (2006-03-09) Zhao, Jieping; Bruce S. Weir, Committee Member; Zhao-Bang Zeng, Committee Member; Thomas M. Gerig, Committee Member; William R. Atchley, Committee Chair
    The purpose of this research is to study the solution to the protein sequence metric problem and apply it to explore the structural, functional and evolutionary aspects of the basic helix-loop-helix (bHLH) protein family. The sequence metric problem is caused by the alphabetic coding of the amino acids and has long been a hindrance to efficient protein sequence analysis. This dissertation begins by revisiting the sequence metric problem solution initiated by Atchley et al. (2005) [PNAS 102(18):6401-6]. Some of the unsolved issues, such as information loss, model robustness, and concordance between factor analysis and principal component analysis, were studied. Further, classification of the 20 amino acids was explored in the numerical factor space resolved by Atchley et al. (2005). The next two parts of the dissertation focus on computational and statistical studies of the bHLH protein family. All the protein sequence data were transformed into numerical vectors by using the amino acid factor scores from the sequence metric solution. In the second part of the dissertation, protein sequence variability at the level of statistically supported lineages (clades) was studied using stepwise discriminant analysis. Some of the important sites for clade discrimination were selected and hierarchical classification strategies for the clades were proposed. In the third part of the dissertation, 147 Arabidopsis bHLH proteins were studied by a series of multivariate analyses. Results showed that there were significant sequence differences between plant (e.g. Arabidopsis) and animal bHLH proteins, and some of the contributing discriminant sites were selected and discussed. The binding property of each of the Arabidopsis bHLH proteins was assigned by using the classification rules derived from animal bHLH proteins.
    Protein Evolution From Sequence To Structure.
    (2003-09-01) Buck, Michael Joseph; Jeff Thorne, Committee Member; Barbara Sherry, Committee Member; Michael Purugganan, Committee Member; William R. Atchley, Committee Chair
    The purpose of this research is to elucidate how natural selection shapes protein evolution. The question was addressed by exploring protein sequence evolution, 3D structural evolution, and analysis of the multidimensional nature of amino acid covariation. This thesis begins with a study of protein sequence evolution. 118 different bHLH genes in the completely sequenced Arabidopsis thaliana genome and 131 bHLH genes in the rice genome were identified and characterized using phylogenetic analysis. These plant proteins were classified into 15 distinct plant clades and were under weaker selective constraints than their animal counterparts. Additionally, it was shown that lineage-specific expansions and subfunctionalization have fashioned regulatory proteins for plant-specific functions. To further characterize the bHLH domain, a canonical 3D structure was created from solved structures. This canonical structure was used as a template for producing 3D models for other representative bHLH proteins, which were then compared, contrasted, and grouped based on structural characteristics. Structural similarities were discovered within the bHLH domain between three clades (Max, Myc, and PbHLH-LZ). In addition, structural models of the Sat proteins suggest a strong similarity to other bHLH proteins, which is in disagreement with previous functional characterization. To further understand the dimensionality of protein evolution, the independence of amino acid sites was explored using multivariate factor analysis. A matrix of pairwise normalized mutual information values was computed among amino acid sites for the serpin proteins. The normalized mutual information matrix was partitioned into orthogonal dimensions by factor analysis. Each eigenvector from the factor analysis can be interpreted as having phylogenetic or structural/functional explanations or combinations of both. This approach discerns strong amino acid covariation within several key functional regions including the RCL, shutter, and breach. In addition, this approach elucidates hydrogen bonding, hydrophobic, and electrostatic interactions within the serpin protein family.
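The pairwise normalized mutual information matrix described in the abstract above can be sketched as follows. The abstract does not say which normalization was used; dividing by the joint entropy is an assumption made here, and the function names and toy alignment are hypothetical. The factor-analysis step is not shown.

```python
import math
from collections import Counter

def entropy(symbols):
    """Shannon entropy (in bits) of a sequence of hashable symbols."""
    n = len(symbols)
    return sum(-(k / n) * math.log2(k / n) for k in Counter(symbols).values())

def normalized_mi(col_x, col_y):
    """Mutual information of two columns divided by their joint entropy.

    The choice of joint entropy as the normalizer is an assumption;
    the source abstract does not specify one.
    """
    h_xy = entropy(list(zip(col_x, col_y)))
    if h_xy == 0:
        return 0.0  # both columns invariant: no covariation to measure
    return (entropy(col_x) + entropy(col_y) - h_xy) / h_xy

def nmi_matrix(alignment):
    """Symmetric matrix of normalized MI values between all column pairs."""
    columns = list(zip(*alignment))
    return [[normalized_mi(a, b) for b in columns] for a in columns]

# Hypothetical toy alignment (not real serpin sequences).
toy_alignment = ["AKL", "AKL", "SRL", "SRV"]
matrix = nmi_matrix(toy_alignment)
for row in matrix:
    print([round(v, 3) for v in row])
```

In this toy case the first two columns covary perfectly and score 1.0, while the third column is only partly informative about the others. A matrix of this kind is what a factor analysis would then decompose.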
    Quantifying Phylogenetic Conservation in Protein Molecular Evolution
    (2006-11-02) Fernandes, Andrew Dellano; Steffen Heber, Committee Member; William R. Atchley, Committee Chair; Eric A. Stone, Committee Member; Charles E. Smith, Committee Member
    This dissertation examines the problem of quantifying amino acid conservation in protein molecular evolution. Ideally, this conservation is quantified by inferring the rate of evolution at each amino acid site of a multiple alignment. However, current rate-inference methods have three problematic assumptions. The methods assume that (a) the rates of all sites are independent, (b) the rates are drawn from a known prior distribution, and (c) the mean rate across sites is approximately one. The problems are two-fold. First, the assumptions of site-rate independence and known mean rate are contradictory. To see the contradiction, consider a two-site alignment with known rate of ~0.5 at site one. The rate at site two is unknown under the independent-sites assumption, but is ~1.5 by the assumption of known mean rate. Second, if the rates are drawn from a known prior distribution, the assumption of known distribution raises the question "which distribution?". Previous work has focused only on selecting better families of rate distributions, often at the expense of additionally parameterizing the evolutionary model. Herein, I develop a method of inferring rates requiring only the assumption of known mean rate, and not requiring additional parameterization. Thus a model of evolution based on our method is a more general framework for inferring rates than previous work. Since a known mean rate is required to distinguish evolutionary rate from time, our method is arguably the most general possible that allows rate and time to be fully and independently identified. The method is assessed by investigating conservation in the Myc, Max, and p53 transcription-factor families.
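The two-site example in the abstract above can be written out explicitly. The symbols below are notation introduced here for illustration, with r_i the rate at site i and n the number of sites, and the mean rate is treated as exactly one rather than approximately one:

```latex
\bar{r} = \frac{1}{n}\sum_{i=1}^{n} r_i = 1
\quad\Longrightarrow\quad
r_2 = n\,\bar{r} - r_1 = 2(1) - 0.5 = 1.5
```

Once the mean is fixed, knowing the rate at site one determines the rate at site two, so the two rates cannot also be independent.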
