Predicting Drosophila Segmentation

February 28, 2008

Segal et al. recently published a paper in Nature describing a computational framework that models transcriptional regulation in order to predict expression patterns. They apply the framework to the well-characterized problem of segmentation in the Drosophila embryo.

Specifying precise spatio-temporal patterns of expression is critical for proper organism development. One well-characterized model system for studying this problem is segmentation of the Drosophila embryo. The system begins with three maternal factors that specify an initial anterior-posterior pre-segmentation pattern. These maternal factors activate gap genes, which encode sequence-specific transcriptional repressors. Regulation within this network is highly combinatorial and, in the initial steps, almost entirely transcriptional. While detailed descriptions of parts of the process have been available for a long time, Segal’s work is the first to attempt to model the complete system.

The Segal model takes as input the expression levels and DNA-binding specificities (PSSMs: position-specific scoring matrices) of a set of transcription factors, and predicts the expression level that these factors produce for an arbitrary DNA sequence. The model first computes the occupancy distribution of factors on the given sequence using a generalized hidden Markov model framework. A simple steric constraint is enforced: bound factors may not overlap within the binding sites defined by their PSSMs. All possible binding configurations are considered, without predefined PSSM score cutoffs, by summing over them with the Forward algorithm. The model also includes a term for self-cooperativity between bound molecules of the same factor, which decays with increasing distance between them. Each configuration is then translated into an expression level by a logistic function. This step assumes that each factor is either an activator or a repressor. In their procedure each transcription factor has four parameters:

  1. the absolute concentration of the factor (will vary over the AP axis)
  2. the transcription rate of the factor (a signed weight per factor; positives are activators)
  3. the strength of binding cooperativity for the factor
  4. the PSSM of the factor (allowed to vary in a constrained fashion during training)

These parameters are learned by alternating between conjugate gradient ascent and the simplex method, using standard optimization packages.
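To make the occupancy step concrete, here is a minimal sketch of how a Forward-style sum over all non-overlapping binding configurations could be written. It is not Segal's implementation: it omits the cooperativity term and reverse-strand binding, and it applies the logistic function to the expected number of bound molecules rather than to each configuration separately. The function names, the dictionary layout, and the `bias` parameter are all assumptions made for illustration.

```python
import math

def site_weight(pssm, background, seq, start, conc):
    """Weight of one factor bound at seq[start:start+len(pssm)]:
    concentration times the PSSM-to-background likelihood ratio."""
    w = conc
    for offset, column in enumerate(pssm):
        base = seq[start + offset]
        w *= column[base] / background[base]
    return w

def forward(seq, factors, background):
    """F[i] = summed weight of all non-overlapping configurations on seq[:i]."""
    F = [0.0] * (len(seq) + 1)
    F[0] = 1.0
    for i in range(1, len(seq) + 1):
        F[i] = F[i - 1]  # position i-1 is left unbound
        for f in factors.values():
            L = len(f["pssm"])
            if i >= L:  # a site of this factor ends exactly at position i
                F[i] += F[i - L] * site_weight(f["pssm"], background, seq, i - L, f["conc"])
    return F

def backward(seq, factors, background):
    """B[i] = summed weight of all non-overlapping configurations on seq[i:]."""
    n = len(seq)
    B = [0.0] * (n + 1)
    B[n] = 1.0
    for i in range(n - 1, -1, -1):
        B[i] = B[i + 1]  # position i is left unbound
        for f in factors.values():
            L = len(f["pssm"])
            if i + L <= n:  # a site of this factor starts exactly at position i
                B[i] += site_weight(f["pssm"], background, seq, i, f["conc"]) * B[i + L]
    return B

def expected_bound(seq, factors, background):
    """Expected number of bound molecules per factor over all configurations."""
    F = forward(seq, factors, background)
    B = backward(seq, factors, background)
    Z = F[len(seq)]  # partition function: total weight of every configuration
    counts = {}
    for name, f in factors.items():
        L = len(f["pssm"])
        total = 0.0
        for s in range(len(seq) - L + 1):
            total += F[s] * site_weight(f["pssm"], background, seq, s, f["conc"]) * B[s + L]
        counts[name] = total / Z
    return counts

def expression(counts, factors, bias):
    """Logistic map from weighted occupancy to an expression level in (0, 1).
    Positive weights act as activators, negative weights as repressors."""
    z = bias + sum(factors[name]["weight"] * c for name, c in counts.items())
    return 1.0 / (1.0 + math.exp(-z))

# Toy, made-up inputs: one hypothetical activator that prefers the site "AC".
background = {base: 0.25 for base in "ACGT"}
prefers_a = {"A": 0.97, "C": 0.01, "G": 0.01, "T": 0.01}
prefers_c = {"A": 0.01, "C": 0.97, "G": 0.01, "T": 0.01}
toy_factors = {"act": {"pssm": [prefers_a, prefers_c], "conc": 0.1, "weight": 2.0}}
toy_counts = expected_bound("GGACTTACGG", toy_factors, background)
print(toy_counts, expression(toy_counts, toy_factors, bias=-2.0))
```

Because the recursion sums over every placement instead of thresholding PSSM scores, weak sites still contribute in proportion to their weight, which is the point of avoiding predefined cutoffs. In this sketch, each factor's `conc` would be varied along the AP axis to produce a predicted expression profile for a given sequence.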

They apply this model to the Drosophila segmentation network, using eight key transcription factors to predict the expression of 44 gap and pair-rule gene modules with known patterns. They model a single developmental time point at which both the input factor patterns and the output module expression patterns are mature. The predictions are reasonably good for the patterns within the training set, and the authors show that some of the failures within this set result from missing regulators or from the lack of positive synergy terms. They then predict expression for 11 held-out modules and for 15 modules from D. pseudoobscura measured in D. melanogaster. The rest of the paper discusses the general observations that arise from the model.

But the real question is: how well do they predict expression? For this I defer to a Drosophila expert and quote Mike Levine’s minireview of Segal’s paper:

“the model is in general agreement with the gene-expression patterns attributable to the enhancers in the gap genes, but produces only variable agreement with the stripes of gene expression produced by enhancers in the genes known as pair-rule genes.”

He further states:

“a more accurate picture means invoking ‘nonlinear’ mechanisms of transcriptional regulation such as heterotypic cooperative DNA-binding, repression by quenching and significant contributions of low-affinity binding sites to the control of gene expression.”

Despite these criticisms, Levine does praise Segal for demonstrating that well-studied mechanisms generalize to the entire segmentation process.

The Segal model, as Levine points out, fails to consider all the mechanisms known from the pre-systems literature. However, it is a good start toward a mechanistic model of the regulation observed in a well-studied system. As Segal states, “Overall, the failures of our model are as instructive as its successes”.

Segal, E., Raveh-Sadka, T., Schroeder, M., Unnerstall, U., Gaul, U. (2008). Predicting expression patterns from regulatory sequence in Drosophila segmentation. Nature, 451(7178), 535-540. DOI: 10.1038/nature06496

Levine, M. (2008). A systems view of Drosophila segmentation. Genome Biology, 9(2), 207. DOI: 10.1186/gb-2008-9-2-207

