Detecting coevolution

September 28, 2007

Yeang and Haussler have developed an interesting model of coevolution: the selective constraints on a molecular apparatus that require coordinated changes in its components. The best-studied example is the compensatory mutations required to maintain RNA secondary structure. Yeang uses a general continuous-time Markov process to model substitution at two sites. Under the null hypothesis, the sites evolve neutrally. Under the alternative model, changes at the two sites co-occur, with double changes favored over single ones. The resulting probabilistic graphical model is relatively general, as demonstrated by their two recent publications.
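
To make the two-site idea concrete, here is a minimal sketch of how a joint rate matrix over pairs of nucleotides could be built. It is not the authors' parameterization: the Jukes-Cantor-style single-site rates and the names `single_scale` and `double_rate` are assumptions made purely for illustration. With the defaults, only one site can change at a time; the alternative setting down-weights single changes and gives simultaneous changes at both sites a nonzero rate.

```python
from itertools import product

NUCS = "ACGU"


def single_site_rates(mu=1.0):
    """Jukes-Cantor-style rate matrix: every substitution has rate mu (assumed)."""
    q = {a: {b: (mu if a != b else 0.0) for b in NUCS} for a in NUCS}
    for a in NUCS:
        # Diagonal entries make each row sum to zero, as in any CTMC generator.
        q[a][a] = -sum(q[a][b] for b in NUCS if b != a)
    return q


def joint_rates(q, single_scale=1.0, double_rate=0.0):
    """Rate matrix over the 16 joint states of two sites.

    single_scale and double_rate are hypothetical knobs, not parameters
    taken from the papers discussed here.
    """
    states = [a + b for a, b in product(NUCS, repeat=2)]
    Q = {s: {t: 0.0 for t in states} for s in states}
    for s in states:
        for t in states:
            if s == t:
                continue
            changed = [i for i in (0, 1) if s[i] != t[i]]
            if len(changed) == 1:
                # Exactly one site changes: use the (scaled) single-site rate.
                i = changed[0]
                Q[s][t] = single_scale * q[s[i]][t[i]]
            else:
                # Both sites change at once.
                Q[s][t] = double_rate
        Q[s][s] = -sum(Q[s][t] for t in states if t != s)
    return Q


q = single_site_rates()
null_Q = joint_rates(q)                                      # no double changes
coev_Q = joint_rates(q, single_scale=0.5, double_rate=0.2)   # doubles allowed
```

In this toy version, `null_Q["AU"]["GC"]` is zero while `coev_Q["AU"]["GC"]` is not, which is the qualitative difference between the two hypotheses. A real analysis would fit such rates along a phylogeny and compare the likelihoods of the two models.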

In their Mol. Biol. Evol. paper [24(9):2119-2131 (2007)], they apply this model to predicting interactions with ribosomal RNA. The model (rather impressively) identifies not only secondary structure interactions but also a number of pairs constrained by tertiary structure. The limitation seems to be that the model depends heavily on high-quality, relatively deep sequence alignments, which are available for only a small number of RNA families.

In their more recent PLoS Comp. Biol. paper [currently available as early release], they apply the model to protein domains as defined by Pfam. The majority of inferred coevolving positions are functionally and spatially coupled: they appear within the same protein, in interacting proteins, and often at functionally important sites. Their results imply the existence of selective pressures to maintain coordinated responses at key residues. Unlike RNA interactions, which are typically local in the 3D space of the structure, many of the protein interactions are between residues that are spatially more distant within the protein.

Together, these papers demonstrate the power of detecting weak (but real) evolutionary signatures. In contrast to mutual information, their model takes into account how likely the observed covariation is to have arisen from neutral evolution. In theory, it is equally applicable to intermolecular signatures such as RNA-DNA, protein-DNA, and protein-RNA interactions. In each of those cases, however, a large training data set for parameter estimation is not readily available.
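
For comparison, the mutual information baseline mentioned above can be sketched in a few lines. This is the plain textbook estimate from two alignment columns, not code from either paper; the function name and the toy columns are made up for illustration. Note that it treats every sequence as an independent observation and knows nothing about the phylogeny, which is the gap an explicit evolutionary model is meant to close.

```python
from collections import Counter
from math import log2


def mutual_information(col_x, col_y):
    """Mutual information (in bits) between two alignment columns.

    Each position i pairs the residue of sequence i in column x with the
    residue of the same sequence in column y.
    """
    if len(col_x) != len(col_y):
        raise ValueError("columns must contain the same number of sequences")
    n = len(col_x)
    count_x = Counter(col_x)
    count_y = Counter(col_y)
    count_xy = Counter(zip(col_x, col_y))
    mi = 0.0
    for (x, y), c in count_xy.items():
        p_joint = c / n
        p_indep = (count_x[x] / n) * (count_y[y] / n)
        mi += p_joint * log2(p_joint / p_indep)
    return mi


# Toy columns (hypothetical): perfectly covarying Watson-Crick pairs.
paired = mutual_information("AAGGCCUU", "UUCCGGAA")
# Toy columns (hypothetical): one column is invariant, so it carries no signal.
invariant = mutual_information("AAAA", "ACGU")
```

Here `paired` is high because each residue in one column fully determines the residue in the other, while `invariant` is zero. The same high score would also result if the sequences were close relatives that inherited both residues from a common ancestor.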

Dean and Thornton [Nat Rev Genetics 8:675-688 (2007)] said it best: “Virtually everything that living cells do is regulated by specific interactions between molecules — enzymes and substrates, ligands and receptors, transcription factors and DNA binding sites. How these tight partnerships evolve is a crucial question for both molecular and evolutionary biology.”

Yeang, C., Haussler, D. (2007). Detecting Coevolution in and among Protein Domains. PLoS Computational Biology, 3(11), e211. DOI: 10.1371/journal.pcbi.0030211

Yeang, C., Darot, J.F., Noller, H.F., Haussler, D. (2007). Detecting the Coevolution of Biosequences: An Example of RNA Interaction Prediction. Molecular Biology and Evolution, 24(9), 2119-2131. DOI: 10.1093/molbev/msm142

