Dependence Among Sites in Protein and RNA Evolution

Date

2005-11-13

Abstract

Widely used models of molecular evolution assume that sequence sites change independently. This assumption simplifies computation but is biologically unrealistic. RNA secondary structure and protein tertiary structure both change more slowly over time than do the encoding DNA sequences. The constraints on sequence evolution that maintain structure induce dependent change among sequence positions. The objective of this thesis is to characterize the impact of structure on sequence evolution.
Dependence among sites in protein evolution is studied first. Two simple hypothesis tests that make few parametric assumptions are introduced to study the spatial clustering of amino acid replacements within protein tertiary structure. Results of applying these tests to 273 protein families support the expectation that such spatial clustering is ubiquitous. More importantly, the patterns of amino acid replacement do not appear to be attributable solely to spatial clustering of positions that evolve independently at high rates. Instead, the tests yield evidence for dependent change among spatially clustered protein positions. This portion of the thesis thereby casts doubt on widely used methods for phylogeny inference.
The second focus of this thesis is the impact of RNA secondary structure on RNA evolution. A model of RNA evolution that incorporates secondary structure is developed. The model introduces dependence among sites through the effects of sequence changes on the approximate free energy of the resulting RNA secondary structure. This approximate free energy can be thought of as a surrogate for fitness that links genotype to phenotype in the model. Analysis of eukaryotic 5S ribosomal RNA sequences with this model shows the importance of RNA secondary structure in evolution. The analysis also confirms the value of the new model for studying adaptive evolution and for inferring ancestral sequences.

Keywords

evolutionary models, protein tertiary structure, dependence among sites, RNA secondary structure

Degree

PhD

Discipline

Bioinformatics
