Snail

Promoter Structure

snail expression during neurogenesis evolves from segmentally repeated neuroectodermal domains forming a pan-neural pattern. A 2.8 kb regulatory region of the sna promoter drives LacZ expression in a faithful neuronal pattern. Deletion analysis of this region indicates that the pan-neural element is composed of separable CNS and PNS components. This finding is unexpected since all known genes controlling early neurogenesis, including the proneural genes (i.e. da and AS-C), are expressed in both the CNS and PNS. Expression of sna during neurogenesis is largely independent of the proneural genes da and AS-C. The separate control of CNS and PNS sna expression and independence of proneural gene regulation add to a growing body of evidence that current genetic models of neurogenesis are substantially incomplete (Ip, 1994b).

Shadow enhancers foster robustness of Drosophila gastrulation

Critical developmental control genes sometimes contain 'shadow' enhancers that can be located in remote positions, including the introns of neighboring genes. They nonetheless produce patterns of gene expression that are the same as or similar to those produced by more proximal primary enhancers. It was suggested that shadow enhancers help foster robustness in gene expression in response to environmental or genetic perturbations. This hypothesis was critically tested by employing a combination of bacterial artificial chromosome (BAC) recombineering and quantitative confocal imaging methods. Evidence is presented that the snail gene is regulated by a distal shadow enhancer located within a neighboring locus. Removal of the proximal primary enhancer does not significantly perturb snail function, including the repression of neurogenic genes and formation of the ventral furrow during gastrulation at normal temperatures. However, at elevated temperatures, there is sporadic loss of snail expression and coincident disruptions in gastrulation. Similar defects are observed at normal temperatures upon reductions in the levels of Dorsal, a key activator of snail expression. These results suggest that shadow enhancers represent a novel mechanism of canalization whereby complex developmental processes 'bring about one definite end-result regardless of minor variations in conditions' (Perry, 2010).

Despite both intrinsic and environmental sources of noise, which introduce variability in complex developmental processes, the patterning of the Drosophila embryo unfolds with high fidelity. It has been postulated that genetic interactions in developmental regulatory networks can channel these variable inputs into faithful outcomes, as a ball bouncing inside of a funnel is channeled to the center, a process termed canalization. This paper presents evidence that shadow enhancers are important mediators of canalization, ensuring reliable and robust expression of critical patterning genes (Perry, 2010).

snail is a key determinant of dorsal-ventral patterning. It encodes a zinc finger repressor that establishes a sharp boundary between the presumptive mesoderm and neurogenic ectoderm and is essential for the formation of the ventral furrow and the invagination of the mesoderm. Whole-genome ChIP-chip assays identified a cluster of Dorsal and Twist (key activators of snail expression) binding sites in the immediate 5' flanking region of the snail transcription unit that coincide with the known enhancer. Unexpectedly, these studies also identified a second cluster of binding sites within the neighboring Tim17b2 locus, located ~7 kb upstream of snail. A small genomic DNA fragment (~1 kb) encompassing this second cluster of binding sites was attached to a lacZ reporter gene and was expressed in transgenic embryos. The fusion gene exhibits localized expression in the presumptive mesoderm, similar to that seen for the endogenous gene or obtained with the proximal enhancer (the first 2.8 kb of the 5' flanking region). The newly identified distal enhancer is arbitrarily referred to as the shadow enhancer and the original, proximal enhancer is referred to as the primary enhancer (Perry, 2010).

A snail fusion gene containing only the primary enhancer rescues the gastrulation of at least some snail mutants in a population of mutant embryos. Because snail is essential for the coordinated invagination of the mesoderm during early gastrulation, variability in expression could lead to occasional disruptions in morphogenesis. Perhaps the additional enhancer provides a mechanism for suppressing such variability, thereby ensuring robust expression in large populations of embryos. This hypothesis was motivated in part by previous preliminary evidence that neurogenic genes with shadow enhancers show less sensitivity to changes in activator concentration than similar genes lacking shadows (Perry, 2010).

An alternative view is that the proximal and shadow enhancers are primarily responsible for controlling distinct dynamic aspects of the snail expression pattern rather than functioning in an overlapping manner during mesoderm invagination. An expectation of the former robustness hypothesis is that transgenes containing either enhancer alone should be sufficient to induce gastrulation in snail mutant embryos. This possibility was tested by creating a series of recombineered bacterial artificial chromosomes (BACs) containing an ~25 kb genomic interval encompassing the snail and Tim17b2 loci. Comparable BACs were prepared that either contain or lack the proximal enhancer. This enhancer was not simply deleted, but an ~1 kb segment containing critical Dorsal activator elements was replaced with a spacer DNA sequence in order to retain normal spacing of the regulatory region (Perry, 2010).

To measure the effect that different enhancers have on transcriptional activity, a reporter system was developed for detecting nascent transcripts. The endogenous yellow gene is not transcribed until late in development and contains a large intron, making it an ideal reporter for the detection of de novo transcripts by in situ hybridization. In contrast, the snail transcription unit lacks introns and is therefore not amenable to quantitative in situ hybridization methods that rely on intronic probes. Consequently, a series of BACs was created that contains yellow in place of snail. These BACs contain both enhancers or have either the primary or shadow enhancer replaced with random DNA. All of the aforementioned BACs were inserted in the same chromosomal location on 2L using phiC31 targeted integration (Perry, 2010).

BACs containing the snail gene were crossed into a mutant background with a deletion spanning the entire snail transcription unit (Df(2L)osp²⁹), along with a marked balancer to identify homozygous snail null mutants. Mutant embryos homozygous for the snail deficiency chromosome (osp²⁹) are easily recognized by the absence of snail expression and by ectopic expression of single-minded (sim), a key regulator of midline formation within the central nervous system that is normally excluded from the mesoderm by the Snail repressor (Perry, 2010).

There is neither a ventral furrow nor subsequent ingression of the mesoderm in these mutants. BACs containing both enhancers or just the shadow enhancer alone rescue gastrulation of mutant embryos. In both cases, a complete ventral furrow is formed, followed by invagination of the mesoderm indistinguishable from that seen in wild-type embryos. Both BACs restore snail expression in the presumptive mesoderm, and sim transcripts are restricted to lateral regions that form the ventral midline of the central nervous system after gastrulation. As noted earlier, the reciprocal situation, proximal enhancer without shadow, can sometimes rescue gastrulation. These observations, along with previous studies, indicate that neither the primary nor shadow enhancer is necessary for the gastrulation of embryos raised under optimal, permissive conditions (Perry, 2010).

Although the shadow enhancer is sufficient for generating a qualitatively normal pattern of snail expression, additional assays were done to determine whether there might be subtle changes in expression. Quantitative confocal imaging methods were used to investigate this possibility. As mentioned earlier, BAC transgenes were prepared that contain the yellow reporter gene in place of the snail transcription unit. In situ hybridization assays with intronic probes permit direct detection of yellow de novo transcripts, and, hence, precise measurements of snail transcription with single cell (nucleus) resolution. At normal culturing temperatures (22°C), there is no discernible difference in the initial de novo transcription patterns of BAC transgenes containing both enhancers or containing just a single enhancer, either the primary enhancer or shadow enhancer. In the majority of cases, more than 90% of the nuclei in the presumptive mesoderm express yellow nascent transcripts (Perry, 2010).

Less-reliable expression is observed for BAC transgenes containing a single enhancer at elevated temperatures (30°C). More than 20% of the nuclei in the presumptive mesoderm lack yellow nascent transcripts in over half of the embryos expressing the BAC transgene without the shadow enhancer. This effect is even more pronounced upon removal of the primary enhancer. The same cutoff value, absence of yellow nascent transcripts in at least 20% of all mesodermal nuclei, occurs in over three-fourths of these embryos. In contrast, the BAC transgene containing both the primary and shadow enhancers continues to display nearly complete patterns of de novo transcription at the elevated temperature (Perry, 2010).
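The scoring rule used above (an embryo is flagged when at least 20% of its mesodermal nuclei lack nascent yellow transcripts) can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function names and the nucleus counts are hypothetical and are not data from Perry (2010).

```python
def inactive_fraction(active_nuclei, total_nuclei):
    """Fraction of mesodermal nuclei lacking nascent transcripts."""
    if total_nuclei <= 0:
        raise ValueError("total_nuclei must be positive")
    return (total_nuclei - active_nuclei) / total_nuclei


def fraction_of_embryos_over_cutoff(embryos, cutoff=0.20):
    """Share of embryos whose inactive fraction is at least the cutoff.

    embryos: list of (active_nuclei, total_nuclei) pairs, one per embryo.
    """
    flagged = [inactive_fraction(a, t) >= cutoff for a, t in embryos]
    return sum(flagged) / len(flagged)


# Hypothetical counts for four embryos (illustration only).
example_embryos = [(95, 100), (70, 100), (85, 100), (60, 100)]
print(fraction_of_embryos_over_cutoff(example_embryos))
```

Under this rule, "over half of the embryos" corresponds to a returned value above 0.5, and "over three-fourths" to a value above 0.75.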

Similar results were obtained in response to genetic perturbations. For example, the yellow transgene BAC containing both enhancers exhibits a normal pattern of expression in embryos derived from dl/+ mothers containing half the normal dose of the Dorsal gradient. The distribution of nuclei failing to maintain active expression is similar to that seen for wild-type embryos. However, the comparable BAC transgene containing only the shadow enhancer exhibits erratic patterns of activation in these embryos, particularly in lateral regions. These results, along with the preceding analysis of embryos grown at elevated temperatures, suggest that the snail shadow enhancer helps ensure accurate and reproducible patterns of gene expression in large populations of embryos subject to genetic and environmental perturbations (Perry, 2010).

The preceding results document quantitative changes in the variability and reliability of snail expression upon removal of the primary or shadow enhancer. Next it was asked whether such variation causes changes in cellular morphogenesis, particularly the formation of the ventral furrow and subsequent invagination of the mesoderm. To address this, gastrulation was examined at elevated temperature (30°C) in snail mutant embryos carrying BACs with both enhancers or just the shadow enhancer. Embryos carrying the transgene with both enhancers exhibit normal patterns of gastrulation. In contrast, comparable embryos lacking the primary enhancer display erratic patterns of gastrulation, including the formation of incomplete ventral furrows that do not extend along the entire germband and disruptions in the symmetry of the involuted mesodermal tube. As shown earlier, such defects are not observed at normal temperatures (22°C) (Perry, 2010).

This paper has presented evidence that the snail shadow enhancer located within the Tim17b2 locus helps ensure reliable and reproducible patterns of snail expression in the presumptive mesoderm during gastrulation. BAC transgenes lacking either the primary enhancer or the shadow enhancer display erratic patterns of de novo transcription at elevated temperatures. It is proposed that shadow enhancers come to be fixed in populations by ensuring robustness in the activities of key patterning genes such as snail. Increases in temperature should cause less-stable occupancy of critical binding sites, but an additional enhancer could suppress this noise by increasing the probability of gene activation. This increased time of active transcription per cell might augment the overall levels of expression, which could be an important function of shadow enhancers (Perry, 2010).

Other critical dorsal-ventral determinants also contain shadow enhancers, including brinker, vnd, and sog. The recent analysis of shavenbaby suggests that shadow enhancers are essential for the reliable morphogenesis of embryonic bristles in older embryos. There is also evidence that shadow enhancers might be a common feature of vertebrate systems such as zebrafish (Perry, 2010).

Shadow enhancers appear to represent a novel mechanism of canalization, whereby complex developmental processes lead to a fixed outcome despite genetic and environmental perturbations. Other mechanisms of canalization have been suggested, including recursive wiring of gene regulatory networks and 'capacitors' such as hsp90 that suppress both altered folding of mutant proteins and transpositioning of mobile elements (Perry, 2010).

It is conceivable that primary and shadow enhancers mediate overlapping patterns of activity only during early embryogenesis. They might come to possess distinctive regulatory activities at later stages of development. Nonetheless, during the time when their activities coincide during gastrulation, they maintain reliable patterns of snail expression in response to environmental and genetic variability. Although either enhancer might be sufficient, both enhancers are required for accurate and reliable patterns of expression in response to variability. This precise patterning enables rapid development, without delays arising from corrective feedback mechanisms. It is easy to imagine that delays in embryogenesis would result in selective disadvantages to the resulting larvae, which must compete for limiting sources of food. Regardless of the specific mechanisms that select for shadow enhancers, the occurrence of such enhancers provides an opportunity for the evolution of novel patterns of gene expression. As long as the two enhancers maintain overlapping activities during developmental hot spots such as gastrulation, they can drift or be selected to produce novel patterns of gene expression (Perry, 2010).

Enhancer control of transcriptional bursting

Transcription is episodic, consisting of a series of discontinuous bursts. Using live-imaging methods and quantitative analysis, transcriptional bursting was examined in living Drosophila embryos. Different developmental enhancers positioned downstream of synthetic reporter genes produce transcriptional bursts with similar amplitudes and duration but generate very different bursting frequencies, with strong enhancers producing more bursts than weak enhancers. Insertion of an insulator reduces the number of bursts and the corresponding level of gene expression, suggesting that enhancer regulation of bursting frequency is a key parameter of gene control in development. It was also shown that linked reporter genes exhibit coordinated bursting profiles when regulated by a shared enhancer, challenging conventional models of enhancer-promoter looping (Fukaya, 2016).

To explore the relationship between enhancers and bursts, the activities of several well-defined enhancers were visualized and measured in living Drosophila embryos. These enhancers were placed upstream or downstream of reporter genes containing a series of MS2 stem loops, permitting detection of nascent RNAs using MCP-GFP fusion proteins. A correlation was observed between enhancer strength and the frequency of transcriptional bursting. For example, the snail (sna) distal shadow enhancer generates more bursts than the proximal primary enhancer, although the amplitude and duration of the bursts are similar for the two enhancers. A variety of additional evidence, including insertion of the gypsy insulator, leads to the conclusion that the regulation of bursting frequencies is a key parameter of gene control in the Drosophila embryo (Fukaya, 2016).

To determine the relationship between enhancer-promoter interactions and bursting frequencies, two different reporter genes containing MS2 or PP7 stem loops were simultaneously visualized under the control of a shared enhancer. Surprisingly, a high frequency of coordinate bursts was observed, suggesting co-activation of the two reporter genes. These observations challenge classical models of stable enhancer-promoter looping and raise the possibility that chromosome topology is a critical feature of gene control in development (Fukaya, 2016).

This paper has presented several lines of evidence that the regulation of bursting frequencies is a key parameter of gene control in the Drosophila embryo. Strong enhancers produce more bursts than weak enhancers. Moreover, differential bursting frequencies correlate with nonuniform expression profiles of rho, Abd-B, and Kr. To determine the relationship between enhancer-promoter interactions and bursts, the regulation of linked MS2 and PP7 reporter genes by shared enhancers was examined. Surprisingly, a high incidence of coordinate bursts was observed, suggesting co-activation of linked reporter genes. These observations challenge traditional models of enhancer-promoter looping (Fukaya, 2016).

Previous live-imaging studies revealed bursts of eve stripe 2 expression, but not hb. It is proposed that the high levels of gene activity mediated by the proximal hb enhancer obscure individual bursts. Given the rates of Pol II elongation (~2.0 kb/min), it should take ~3 min for all of the Pol II complexes from a burst to clear the reporter gene template. Consequently, refractory periods lasting <3 min would be expected to result in the conflation of consecutive bursts. It appears that this problem was circumvented by placing test enhancers downstream of the reporter gene, thereby diminishing the levels of expression and permitting the resolution of discrete bursts (Fukaya, 2016).
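The timing argument above is simple arithmetic: the time needed for the last Pol II complex of a burst to clear the template is roughly the transcribed length divided by the elongation rate. The sketch below assumes a template of about 6 kb, which is the length implied by the quoted figures (~2.0 kb/min and ~3 min); the source does not state the reporter length, and the function names are hypothetical.

```python
def clearance_time_min(template_kb, elongation_kb_per_min=2.0):
    """Minutes for a polymerase starting at the promoter to finish the template."""
    return template_kb / elongation_kb_per_min


def bursts_resolvable(refractory_min, template_kb, elongation_kb_per_min=2.0):
    """True when the gap between bursts is at least the clearance time.

    Shorter gaps would let signal from consecutive bursts overlap on the template.
    """
    return refractory_min >= clearance_time_min(template_kb, elongation_kb_per_min)


ASSUMED_TEMPLATE_KB = 6.0  # assumption: inferred from 2.0 kb/min x 3 min
print(clearance_time_min(ASSUMED_TEMPLATE_KB))
print(bursts_resolvable(2.0, ASSUMED_TEMPLATE_KB))
print(bursts_resolvable(4.0, ASSUMED_TEMPLATE_KB))
```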

In principle, the levels of gene activity can be regulated by modulating the duration, amplitude, or frequency of individual bursts. This study consistently observed that bursting frequency is the major parameter underlying differences in gene expression. Most notably, the insertion of the gypsy insulator DNA between the sna shadow enhancer and PP7 reporter gene results in a marked reduction in the levels of gene expression and a corresponding diminishment in the number of bursts. However, when bursts are detected, they exhibit nearly the same amplitude as those seen for the unimpeded enhancer lacking the insulator DNA. There are minor reductions in the amplitude (e.g., rate of release of Pol II complexes from the promoter) and duration (how long the promoter remains active), but bursting frequency is the most consistent variable observed (Fukaya, 2016).
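The three burst parameters named above can be read off a fluorescence trace once bursts have been called. The following sketch uses plain thresholding as a stand-in for burst calling, which is an assumption and not necessarily the method used by Fukaya (2016); the traces, threshold, and frame interval are hypothetical and chosen so that the two traces differ only in burst number.

```python
def call_bursts(trace, threshold, dt_min):
    """Return (duration_min, peak_amplitude) for each run of frames above threshold."""
    bursts, run = [], []
    for value in trace:
        if value > threshold:
            run.append(value)
        elif run:
            bursts.append((len(run) * dt_min, max(run)))
            run = []
    if run:  # trace ended while a burst was still in progress
        bursts.append((len(run) * dt_min, max(run)))
    return bursts


def summarize(trace, threshold, dt_min):
    """Burst count, frequency, mean duration, and mean amplitude for one trace."""
    bursts = call_bursts(trace, threshold, dt_min)
    n = len(bursts)
    total_min = len(trace) * dt_min
    return {
        "count": n,
        "frequency_per_min": n / total_min,
        "mean_duration_min": sum(d for d, _ in bursts) / n if n else 0.0,
        "mean_amplitude": sum(a for _, a in bursts) / n if n else 0.0,
    }


# Hypothetical traces: same burst shape, different number of bursts.
strong = [0, 5, 5, 0, 0, 5, 5, 0, 0, 5, 5, 0]
weak = [0, 5, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0]
print(summarize(strong, threshold=1, dt_min=0.5))
print(summarize(weak, threshold=1, dt_min=0.5))
```

In this toy case the mean duration and amplitude are identical for the two traces, so the difference in total output comes entirely from burst frequency, which is the pattern the study reports.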

It is proposed that transcriptional bursting renders gene expression more immediately responsive to dynamic and transient on and off signals during development. For example, the eve stripe 2 enhancer is regulated by localized repressors that delineate the anterior and posterior stripe borders. This repression tends to occur during the refractory period between bursts. Perhaps it is easier to repress transcription during these 'down' phases as compared with periods of peak activity. Bursting may be an essential property of gene activity, and it is easy to imagine that the regulation of bursting frequencies is a prime determinant of gene control in a variety of developmental processes (Fukaya, 2016).

According to conventional looping models, the shared enhancer is expected to randomly select one of the two promoters, generate a burst, dissociate from the target promoter, and again randomly select a reporter gene for another burst. This would lead to alternating red (PP7-yellow) and green (MS2-yellow) bursts. Instead, a high incidence of coordinate bursting is seen, raising the possibility that an enhancer can simultaneously activate linked promoters separated by a distance of >15 kb. Co-activation is seen even when an insulator DNA is inserted between the shared enhancer and one of the reporter genes. These observations suggest a far more dynamic view of enhancer-promoter interactions than the stable landscapes implied by 4C assays (Fukaya, 2016).
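The distinction between alternating and coordinate bursting can be made concrete by asking what fraction of bursts in one channel overlap in time with a burst in the other. The sketch below assumes each channel has already been reduced to a per-frame on/off sequence; the function names and the example sequences are hypothetical and are not taken from the study.

```python
def burst_intervals(states):
    """Convert a per-frame on/off sequence into half-open (start, end) frame intervals."""
    intervals, start = [], None
    for i, on in enumerate(states):
        if on and start is None:
            start = i
        elif not on and start is not None:
            intervals.append((start, i))
            start = None
    if start is not None:
        intervals.append((start, len(states)))
    return intervals


def coordinated_fraction(ms2_states, pp7_states):
    """Fraction of MS2 bursts that overlap at least one PP7 burst."""
    ms2 = burst_intervals(ms2_states)
    pp7 = burst_intervals(pp7_states)
    if not ms2:
        return 0.0
    overlapping = sum(
        any(s1 < e2 and s2 < e1 for s2, e2 in pp7) for s1, e1 in ms2
    )
    return overlapping / len(ms2)


# Hypothetical on/off sequences for the two reporters.
alternating = ([1, 1, 0, 0, 1, 1, 0, 0], [0, 0, 1, 1, 0, 0, 1, 1])
coordinate = ([1, 1, 0, 0, 1, 1, 0, 0], [1, 1, 0, 0, 0, 1, 1, 0])
print(coordinated_fraction(*alternating))
print(coordinated_fraction(*coordinate))
```

A strictly alternating pattern, as expected from a one-promoter-at-a-time looping model, gives a fraction near zero, whereas the coordinate bursting described above gives a fraction near one.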

Most enhancer-promoter interactions are thought to occur in the context of topological association domains (TADs). In vertebrates, a typical TAD is ~1 Mb in length and contains 10 genes and a few hundred enhancers. TADs are thought to be static structures that are invariant in different tissues. Despite this invariance, enhancer-promoter interactions within TADs appear to be highly dynamic. The co-activation of linked MS2 and PP7 reporter genes might result from local loop domains that bring different target promoters into close proximity with shared enhancers (see A Model for Dynamic Transitions in Chromosome Topology). The alternating inhibition and coordinate bursting observed for transgenes containing the gypsy insulator is consistent with the occurrence of dynamic transitions in loop domains. Such transitions might also be due to unstable assembly of insulator protein complexes, which consist of multiple subunits, including Su(Hw), Mod(mdg4)2.2, and CP190. Just as live-imaging methods provide a far more dynamic view of gene activity in the Drosophila embryo as compared with fixed tissues, these methods are likely to provide a more vibrant glimpse into the nature of enhancer-promoter communication (Fukaya, 2016).

Enhancer additivity and non-additivity are determined by enhancer strength in the Drosophila embryo

Metazoan genes are embedded in a rich milieu of regulatory information that often includes multiple enhancers possessing overlapping activities. This study employed quantitative live imaging methods to assess the function of pairs of primary and shadow enhancers in the regulation of key patterning genes (knirps, hunchback, and snail) in developing Drosophila embryos. The knirps enhancers exhibit additive, sometimes even super-additive activities, consistent with classical gene fusion studies. In contrast, the hunchback enhancers function sub-additively in anterior regions containing saturating levels of the Bicoid activator, but function additively in regions where there are diminishing levels of the Bicoid gradient. Strikingly sub-additive behavior is also observed for snail, whereby removal of the proximal enhancer causes a significant increase in gene expression. Quantitative modeling of enhancer-promoter interactions suggests that weakly active enhancers function additively while strong enhancers behave sub-additively due to competition with the target promoter (Bothma, 2015).

This quantitative analysis of hb and kni expression provides seemingly opposing results. For kni, additive, sometimes even super-additive, action of the two enhancers was observed within the presumptive abdomen. In contrast, the two hb enhancers do not function in an additive fashion in anterior regions but are additive only in central regions where expression abruptly switches from 'on' to 'off'. It is proposed that 'weak' enhancers function additively or even super-additively, whereas 'strong' enhancers can impede one another (Bothma, 2015).

Additional support for this view is provided by the analysis of sna. The removal of the proximal enhancer significantly augments expression, consistent with the occurrence of enhancer interference within the native locus. It is also conceivable that a single strong enhancer (e.g., hb proximal or sna distal) already mediates maximum binding and release of Pol II at the promoter, and additional enhancers are therefore unable to increase the levels of expression. However, the increase in the levels of sna expression upon removal of the primary enhancer is inconsistent with this explanation. Perhaps the proximity of the proximal enhancer to the sna promoter gives it a 'topological advantage' in blocking access of the distal enhancer. The proximal enhancer might mediate less efficient transcription than the distal enhancer and thereby reduce the overall levels of expression. It is not believed that this proposed difference is due to differential rates of Pol II elongation since published and preliminary studies suggest that different enhancers and promoters lead to similar elongation rates (~2 kb/min; T Fukaya and M Levine, unpublished results). A nonexclusive alternative possibility is that deletion of the proximal enhancer removes associated sna repression elements, thereby augmenting the efficiency of the distal enhancer (Bothma, 2015).

A minimal model of enhancer–promoter associations provides insights into potential mechanisms. In the parameter regime where such interactions are infrequent, the two enhancers display additive behavior. However, in the regime of frequent interactions, enhancers compete for access to the promoter, resulting in sub-additive behavior. Enhancer–promoter interaction parameters are likely to vary not only between different enhancers but also as the input patterns are modulated in time and space during development (Bothma, 2015).
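The two regimes can be illustrated with a small steady-state calculation. The sketch below is not the model from Bothma (2015); it assumes, for illustration only, that the promoter is either free or engaged by exactly one enhancer, that each enhancer has a dimensionless association strength K (on-rate divided by off-rate), and that transcription proceeds at a fixed rate while an enhancer is engaged. All names and parameter values are hypothetical.

```python
def expression(enhancers):
    """Steady-state output when enhancers engage one promoter exclusively.

    enhancers: list of (K, rate) pairs. The promoter is free with weight 1
    and engaged by enhancer i with weight K_i, so the occupancy of
    enhancer i is K_i / (1 + sum of all K).
    """
    denom = 1.0 + sum(K for K, _ in enhancers)
    return sum(K * rate for K, rate in enhancers) / denom


def additivity_ratio(e1, e2):
    """Output with both enhancers divided by the sum of the single-enhancer outputs."""
    both = expression([e1, e2])
    separate = expression([e1]) + expression([e2])
    return both / separate


weak = (0.01, 1.0)    # infrequent interactions
strong = (10.0, 1.0)  # frequent interactions
print(additivity_ratio(weak, weak))      # close to 1: additive
print(additivity_ratio(strong, strong))  # well below 1: sub-additive

# If a strongly engaging enhancer drives transcription less efficiently,
# removing it can raise total output in this toy model.
inefficient = (10.0, 0.3)
print(expression([inefficient, strong]) < expression([strong]))
```

With two identical enhancers the ratio reduces to (1 + K) / (1 + 2K), which approaches 1 as K becomes small and 1/2 as K becomes large.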

This simple model explains the switch from sub-additive to additive enhancer activities for hb and sna. However, in order to explain the super-additive behavior of the kni enhancers, it would be necessary to incorporate an additional state in the model, whereby both enhancers form an active complex with the same target promoter. Such a complex would have a more potent ability to initiate transcription than individual enhancer–promoter interactions (Bothma, 2015).
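The additional state described above can be added to a toy steady-state calculation to see how super-additivity might arise. This is a sketch under stated assumptions rather than the published model: the promoter is free (weight 1), engaged by one enhancer (weight K), or engaged by both at once (weight w * K1 * K2), and the joint state transcribes at its own, higher rate. The cooperativity factor w, the rates, and the function names are all hypothetical.

```python
def expression_with_joint_state(e1, e2, w=0.0, joint_rate=0.0):
    """Steady-state output for two enhancers, optionally with a joint active state.

    e1, e2: (K, rate) pairs for each enhancer acting alone on the promoter.
    w: weight factor for the state in which both enhancers engage the promoter.
    joint_rate: transcription rate in that joint state.
    """
    (k1, r1), (k2, r2) = e1, e2
    joint_weight = w * k1 * k2
    numerator = k1 * r1 + k2 * r2 + joint_weight * joint_rate
    return numerator / (1.0 + k1 + k2 + joint_weight)


def single(e):
    """Output of one enhancer acting alone."""
    K, rate = e
    return K * rate / (1.0 + K)


weak = (0.05, 1.0)
sum_of_singles = single(weak) + single(weak)

# Without the joint state the pair is roughly additive.
print(expression_with_joint_state(weak, weak) / sum_of_singles)
# With a potent joint state the pair exceeds the sum of its parts.
print(expression_with_joint_state(weak, weak, w=20.0, joint_rate=3.0) / sum_of_singles)
```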

In summary, it is proposed that enhancers operating at reduced activities ('weak enhancers') can function in an additive manner due to relatively infrequent interactions with their target promoters. In contrast, 'strong' enhancers might function sub-additively due to competition for the promoter. For hb, this switch between competitive and additive behavior occurs as the levels of Bicoid activator diminish in central regions where the posterior border of the anterior Hb domain is formed. Similarly, stress might reduce the performance of the sna enhancers to foster additive behavior under unfavorable conditions such as increases in temperature. This study highlights the complexity of multiple enhancers in the regulation of gene expression. They need not function in a simple additive manner, and consequently, their value may be revealed only when their activities are compromised (Bothma, 2015).

cis-Decoder discovers constellations of conserved DNA sequences shared among tissue-specific enhancers

A systematic approach is described for analysis of evolutionarily conserved cis-regulatory DNA using cis-Decoder, a tool for discovery of conserved sequence elements that are shared between similarly regulated enhancers. Analysis of 2,086 conserved sequence blocks (CSBs), identified from 135 characterized enhancers, reveals most CSBs consist of shorter overlapping/adjacent elements that are either enhancer type-specific or common to enhancers with divergent regulatory behaviors. These findings suggest that enhancers employ overlapping repertoires of highly conserved core elements (Brody, 2007).

Analysis of mammalian cis-regulatory sequences included 14 neural and 21 mesodermal enhancers whose regulatory behaviors have been characterized in developing mouse embryos. EvoPrints of these enhancers included orthologs from placental mammals (human, chimp, rhesus monkey, cow, dog, mouse, rat) or also included the opossum; these species afford enough additive divergence (~200 My) to resolve most enhancer Multi-Species Conserved Sequences (MCSs). When possible, chicken and frog orthologs were also included in the EvoPrints. Except when EvoDifference profiles revealed sequencing gaps or genomic rearrangements in one or more species that were not present in the majority of the different orthologous DNAs, pair-wise reference species versus test species readouts from all of the above BLAT formatted genomes were used to generate the EvoPrints (Brody, 2007).

Using the EvoPrint-Parser program, both forward and reverse-complement sequences of each enhancer CSB of 6 bp or greater were extracted, named and consecutively numbered. Based on their enhancer regulatory expression pattern, CSBs were grouped into two different CSB-libraries, neural and mesodermal. Although there exists a distinction between expression in either neural or mesodermal tissues, each of the CSB-libraries represents a heterogeneous population of enhancers that drive gene expression in different cells and/or different developmental times in these tissues. For this study, CSBs of 5 bp or less were not included in the analysis. Although these shorter CSBs, particularly the 5 and 4 bp CSBs, are most likely important for enhancer function, the use of CSBs of 6 bp or larger (representing greater than 80% of the conserved MCS sequences) is sufficient to resolve sequence element differences between enhancers that regulate divergent expression patterns. A total of 286 neural CSBs and 289 mesodermal CSBs were extracted from the mammalian enhancers (Brody, 2007).
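The extraction step can be sketched as follows. This is not the EvoPrint-Parser program itself; it assumes, for illustration, an input string in which conserved bases are written in uppercase and non-conserved bases in lowercase, and it keeps every uppercase run of at least 6 bp together with its reverse complement. The function names and the example sequence are hypothetical.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def extract_csbs(evoprint, min_len=6):
    """Return (name, forward, reverse_complement) for each conserved block.

    Blocks are uppercase runs of at least min_len bases, numbered in order
    of appearance; shorter conserved runs are skipped.
    """
    blocks = re.findall(r"[ACGT]{%d,}" % min_len, evoprint)
    return [
        ("CSB#%d" % n, block, reverse_complement(block))
        for n, block in enumerate(blocks, start=1)
    ]


# Hypothetical sequence: uppercase = conserved, lowercase = not conserved.
example = "acgtCAGCTGGAttgcaTTGACtttACAGCTGCCag"
for name, forward, reverse in extract_csbs(example):
    print(name, forward, reverse)
```

In this example the 5 bp conserved run TTGAC is dropped, mirroring the 6 bp minimum used in the study.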

For Drosophila, three CSB-libraries, neural, segmental and mesodermal, were generated from CSBs identified by EvoPrinting: neural enhancers included those regulating both CNS and peripheral nervous system (PNS) determinants; segmental enhancers included those regulating both pair-rule and gap gene expression; and mesodermal enhancers included those regulating both presumptive and late expression. Many of the D. melanogaster reference sequences used to initiate the EvoPrints were curated from the regulatory element database REDfly, while others were identified from their primary reference. The collection of neural enhancers includes both those that direct expression during early development, such as the snail, scratch, and deadpan CNS and PNS enhancers, and late nervous system regulators, such as the eyeless enhancer ey12, which confers expression in the adult brain. The early embryonic segmental enhancers represent pair-rule regulators such as the hairy stripe 1 and even-skipped stripe 1 enhancers, and gap expression regulators, such as the hunchback enhancers. The mesodermal enhancers include those directing mesodermal anlage expression of snail and tinman, and late expressing enhancers, such as those directing serpent fat body expression and mesodermal expression of Sex combs reduced. The collective evolutionary divergence of all of the EvoPrints was greater than 100 My and in most cases EvoPrints represented over approximately 160 My of additive divergence. The average CSB length for both the Drosophila and mammalian CSBs is 13 bp; the longest identified CSBs were 99 bp from the giant (-10) segmental enhancer and 95 bp from the Paired-like homeobox-2b mammalian neural enhancer (Brody, 2007).

As an initial step toward understanding the nature of the CSB substructure, a set of DNA sequence alignment tools, known collectively as cis-Decoder, was developed that allows identification of 6 bp or greater perfect match identities, called cis-Decoder Tags (cDTs), within two or more CSBs from either similar or divergent enhancers. The cDTs, which range in size from 6 to 14 bp with an average of 7 or 8 bp, are organized into cDT-libraries that identify sequence elements within CSBs of the same CSB-library. In addition, common cDT-libraries that represent sequence elements aligning to CSBs of two or more different CSB-libraries were also organized (Brody, 2007).
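The idea of a cDT, a perfect match of 6 bp or more shared by two or more CSBs, can be sketched with a brute-force substring search. This is not the cis-Decoder implementation: it ignores reverse-complement matches for brevity, keeps only the longest tag for any given set of CSBs, and uses hypothetical CSB names and sequences.

```python
from collections import defaultdict


def shared_tags(csbs, min_len=6):
    """Map each shared substring (>= min_len) to the set of CSB names containing it.

    csbs: dict of name -> sequence. A substring is dropped when a longer
    shared substring contains it and occurs in exactly the same CSBs.
    """
    owners = defaultdict(set)
    for name, seq in csbs.items():
        for i in range(len(seq)):
            for j in range(i + min_len, len(seq) + 1):
                owners[seq[i:j]].add(name)
    shared = {s: names for s, names in owners.items() if len(names) >= 2}
    return {
        s: names
        for s, names in shared.items()
        if not any(s != t and s in t and names <= shared[t] for t in shared)
    }


# Hypothetical CSBs (illustration only).
example_csbs = {
    "csb_a": "TTCAGCTGGAA",
    "csb_b": "GGCAGCTGGTT",
    "csb_c": "AACAGCTGAT",
}
for tag, names in sorted(shared_tags(example_csbs).items()):
    print(tag, sorted(names))
```

Here the 6 bp core is shared by all three sequences while the 7 bp extension is shared by only two, which is the kind of nested structure that lets a tag be sorted into a specific or a common library depending on which CSBs carry it.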

Mammalian CSB alignments, using the CSB-aligner program, yielded 336 neural specific and 60 neural-enriched cDTs and analysis of the mammalian mesodermal CSBs yielded 258 mesodermal specific and 55 mesodermal enriched cDTs. The CSB alignments also produced 137 cDTs that are common to both neural and mesodermal CSBs. Alignments of the Drosophila enhancer CSBs yielded 444 neural specific cDTs (showing no hits on mesodermal or segmental enhancer CSBs), 284 segmental enhancer specific cDTs and an additional 451 cDTs found in neural and segmental enhancers but not part of mesodermal CSBs. Also 451 cDTs were identified that were enriched in neural and/or segmental CSBs but were also found at a lower frequency in mesodermal enhancer CSBs. From the mesodermal CSBs analyzed, 169 mesodermal specific cDTs (not in neural or segmental enhancer CSBs) were identified along with 104 additional cDTs enriched in mesodermal enhancers but also found at a lower frequency among neural and/or segmental enhancer CSBs. A common cDT-library was also generated that contains 993 cDTs that represent common sequence elements found in CSBs of both neural and mesodermal enhancers (Brody, 2007).

The constituent sequence elements of the different cDT-libraries are dependent on the enhancers used to identify them. As additional CSBs are included in the cDT-library construction, certain cDTs may be re-designated. For example, some that are currently considered neural specific will be discovered to be neural enriched, and others that are part of enriched libraries may be reassigned to common cDT-libraries (Brody, 2007).
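The re-designation described above can be illustrated with a small classifier. This is a sketch under stated assumptions: the source does not give the numerical criterion that separates "enriched" from "common", so the `enrich_ratio` parameter is hypothetical, as are the function name and the toy CSB sequences. Only one strand is scanned.

```python
def classify_cdt(tag, csb_libraries, enrich_ratio=3.0):
    """Label a tag as '<type> specific', '<type> enriched', 'common' or 'absent'.

    enrich_ratio is a hypothetical cut-off: a tag counts as enriched when
    its top library has at least this many times the hits of the runner-up.
    """
    counts = {
        lib: sum(tag in csb for csb in csbs)
        for lib, csbs in csb_libraries.items()
    }
    hit = {lib: n for lib, n in counts.items() if n > 0}
    if not hit:
        return "absent"
    if len(hit) == 1:
        return f"{next(iter(hit))} specific"
    ranked = sorted(hit.items(), key=lambda kv: kv[1], reverse=True)
    (top, n_top), (_, n_next) = ranked[0], ranked[1]
    if n_top >= enrich_ratio * n_next:
        return f"{top} enriched"
    return "common"


# Hypothetical toy libraries of CSB sequences.
libs = {
    "neural": ["AACAGCTGGTT", "GGCAGCTGGAA", "TCAGCTGGC"],
    "mesodermal": ["TTTAATTGGA"],
}
before = classify_cdt("CAGCTGG", libs)

# Adding one more mesodermal CSB that happens to contain the tag.
libs_more = {**libs, "mesodermal": libs["mesodermal"] + ["ACCAGCTGGT"]}
after = classify_cdt("CAGCTGG", libs_more)
```

With the first library set the tag is labeled neural specific; after the extra mesodermal CSB is added it is labeled neural enriched, which mirrors the kind of reassignment the text anticipates as more enhancers are analyzed.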

Although each mammalian and fly cDT is present in two or more enhancers, most are not found as repeated sequences in any of the enhancers. In addition, one of the principal observations of this analysis is that enhancers of similarly regulated genes share different combinatorial sets of elements that are enhancer-type specific (Brody, 2007).

Cross-library CSB alignments revealed that nearly all CSBs contain cDTs that are either shared by CSBs from divergent enhancer types or found only in CSBs from enhancers with related regulatory functions. For example, the 37 bp neural mastermind #10 CSB (TATTATTACTATATACAATATGGCATATTATTATTAC) contains a 9 bp sequence (first underlined sequence) also found in the 20 bp #8 CSB from the dpp mesodermal enhancer and it also contains a 14 bp sequence (second underlined sequence) that constitutes the entire 14 bp #33 CSB from the neural enhancer region of nerfin-1 (Brody, 2007).

The analysis of both the mammalian and fly common cDT-libraries reveals that many cDTs contain core recognition sequences for known transcription factors. However, when additional flanking CSB sequences are considered, many common transcription factor binding sites become tissue specific cDTs. For example, the DNA-binding site for basic helix-loop-helix (bHLH) transcription factors, the E-box motif CAGCTG is present 22 times in different neural CSBs, and 2 and 4 times within the CSBs of segmental and mesodermal enhancers, respectively. However, when flanking sequences are included in the analysis, such as the sequences CAGCTGG, CAGCTGAT, CAGCTGTG, CAGCTGCA, CAGCTGCT and ACAGCTGCC, all are neural specific cDTs (E-box underlined). It has been previously shown that different E-boxes bind different bHLH transcription factors to regulate different neural target genes. Although transcription factor consensus DNA-binding sites are well represented in the cDT-libraries, greater than 50% of the cDTs in all of the libraries, both mammalian and fly, represent novel sequences whose function(s) are currently unknown. The fact that there exists such a high percentage of novel sequences within these highly conserved sequences indicates that the identity, function and/or the combinatorial events that regulate enhancer behavior are as yet unknown (Brody, 2007).
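The effect of flanking bases on an otherwise common core motif can be made concrete by tallying each E-box occurrence together with its neighbors. The sketch below is illustrative only: the function name, the choice of two right-hand flanking bases and the toy CSB sequences are assumptions, and the counts it produces are not the counts reported in the study.

```python
import re
from collections import Counter


def motif_contexts(csbs, core="CAGCTG", left=0, right=1):
    """Count each core occurrence together with its flanking bases."""
    contexts = Counter()
    # a lookahead is used so that overlapping occurrences are all reported
    pattern = re.compile(f"(?=({'.' * left}{core}{'.' * right}))")
    for csb in csbs:
        for match in pattern.finditer(csb):
            contexts[match.group(1)] += 1
    return contexts


# Hypothetical toy CSBs from two enhancer types.
neural = ["AACAGCTGGTT", "TTCAGCTGATA"]
mesodermal = ["TTCAGCTGTTA"]

core_neural = motif_contexts(neural, right=0)
core_meso = motif_contexts(mesodermal, right=0)
flanked_neural = motif_contexts(neural, right=2)
flanked_meso = motif_contexts(mesodermal, right=2)
```

With no flanks the core CAGCTG is counted in both toy libraries, whereas with two flanking bases the extended sequences found in the neural set no longer appear in the mesodermal set.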

Although the resolution of cis-Decoder analysis increases as more enhancers and/or enhancer types are included in the CSB and cDT alignments, analysis of mammalian enhancers found that many shared sequence elements can be identified among related enhancers when as few as two different enhancer groups are used to generate specific cDT-libraries. This is a particularly useful feature of cis-Decoder, especially when studying a biological process or developmental event where relatively little is known about the participating genes and their controlling enhancers. To demonstrate the ability of cis-Decoder to analyze relatively small subsets of enhancers, this study showed how cDT-libraries generated from 14 neural and 21 mesodermal mammalian enhancers can be used to distinguish between the neural and mesodermal enhancers that regulate embryonic expression of Dll1 (Brody, 2007).

Dll1 encodes a Notch ligand that is essential for cell-cell signaling events that regulate multiple developmental events. Studies in the mouse reveal that Dll1 is dynamically expressed in specific regions of the developing brain, spinal cord and also in a complex pattern within the embryonic mesoderm. The 1.6 kb Dll1 cis-regulatory region, located 5' to its transcribed sequence, has been shown to contain distinct enhancers that direct gene expression in these different tissues. These studies have identified two highly conserved neural enhancers, designated Homology I (H-I) and Homology II (H-II), and two mesodermal enhancers termed msd and msd-II. The H-I enhancer directs expression to the ventral neural tube, while the H-II enhancer primarily drives Dll1 expression in the marginal zone of the dorsal region of the neural tube. The msd enhancer drives expression in paraxial mesoderm, and msd-II directs Dll1 expression to the presomitic and somitic mesoderm (Brody, 2007).

An EvoPrint of the Dll1 cis-regulatory region reveals clustered CSBs in each of the enhancer regions. The EvoPrint analysis used mouse (reference DNA), human, rhesus monkey, cow, rat, opossum and Xenopus tropicalis orthologs, representing over approximately 240 My of collective evolutionary divergence. EvoPrint-parser CSB extraction of the EvoPrint generated a total of 35 CSBs of 6 bp or longer, representing 83% of the total MCS. A cDT-scan of the four Dll1 enhancer regions using the mammalian neural and mesodermal specific cDT-libraries accurately differentiates between the neural and mesodermal enhancers. The cDT-library scan identified 77 type-specific sequence elements within the Dll1 CSBs and over half (52%) align with three or more CSBs from different enhancers, indicating that, even if Dll1 had been excluded from the analysis that generated the specific cDT-libraries, there would still be extensive coverage of the Dll1 CSBs by type-specific cDTs. All but eight of the CSBs contain elements that align with one or more neural or mesodermal specific cDTs. The H-I and H-II early CNS enhancers exhibited 64% and 43% coverage, respectively, by neural specific cDTs. The CSBs of the two mesodermal enhancers, msd and msd-II, exhibited 48% and 56% coverage, respectively, by one or more mesodermal specific cDTs. When common cDTs, shared by mesodermal and neural enhancers, were taken into account, coverage of all four enhancers was 81% (Brody, 2007).
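Coverage figures of the kind quoted above can be computed by marking every CSB base that lies under at least one cDT match and dividing by the total CSB length. The following is a minimal sketch that assumes coverage is measured in bases; the function name and the toy sequences are hypothetical, and only one strand is scanned.

```python
def coverage(csbs, cdts):
    """Fraction of CSB bases overlapped by at least one cDT match."""
    covered = total = 0
    for csb in csbs:
        mask = [False] * len(csb)
        for tag in cdts:
            start = csb.find(tag)
            while start != -1:
                for k in range(start, start + len(tag)):
                    mask[k] = True
                # continue one base further so overlapping hits are found
                start = csb.find(tag, start + 1)
        covered += sum(mask)
        total += len(csb)
    return covered / total if total else 0.0


# Hypothetical toy CSBs and tags.
toy_csbs = ["TTGCAATGCAGG", "GGATTGATTGTCC"]
toy_cdts = ["TGCAATGCA", "ATTGATTGT"]
fraction = coverage(toy_csbs, toy_cdts)
```

Because overlapping hits mark the same positions only once, the result is not inflated when several cDTs align to the same stretch of a CSB.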

cDT-cataloger analysis of aligning cDTs with H-I and H-II early CNS enhancers revealed that the H-I enhancer shares a remarkable 9 different sequence elements with the Wnt-1 early CNS neural plate enhancer CSBs, representing 62 bp (32%) of the H-I CSB coverage, 7 elements with the Paired-like homeobox-2b (Phox2b) hindbrain-sensory ganglia enhancer CSBs (23% coverage) and 6 sequence elements (20% coverage) with the Sox9 hindbrain-spinal cord enhancer CSBs as well as numerous other neural specific elements in common with CSBs of other neural enhancers. Comparisons of Dll1 H-I, Wnt-1, Phox2b and Sox9 enhancer CSBs reveal that the orientation and order of the shared cDTs are unique for each of the enhancers. The H-I and H-II enhancer CSBs also share the 7 bp sequence element GCTCCCC, and H-I has a repeat sequence element (AGTTAAA) that is present in two of its CSBs. The conserved AGTTAAA repeat is also part of a CSB in the Phox2b enhancer. cDT-cataloger analysis of the mesodermal enhancer cDT hits reveals that, together, msd and msd-II share 7 elements in common with the mesodermal enhancer of Nkx2.5 as well as numerous elements in common with CSBs of other mesodermal enhancers (Brody, 2007).

To demonstrate the ability of cis-Decoder to differentiate between Drosophila neural and mesodermal enhancers, an analysis was performed of the snail upstream cis-regulatory region. The enhancers that regulate snail's dynamic embryonic expression have been mapped to a 2,974 bp upstream DNA fragment. An EvoPrint of this sequence reveals that each of the restriction fragments that contain the different enhancer activities (CNS, mesodermal and PNS) harbor clusters of highly conserved CSBs. The combined evolutionary divergence of the snail upstream EvoPrint (generated from Drosophila melanogaster, D. sechellia, D. yakuba, D. erecta, D. ananassae, D. pseudoobscura, D. mojavensis, D. virilis and D. grimshawi orthologous sequences) is approximately 160 My, suggesting that many, if not all, of the identified CSBs are likely to be genus invariant and that each base-pair within a CSB has been evolutionarily challenged (Brody, 2007).

To identify sequence elements within the snail upstream CSBs that are present in CSBs of other functionally related or unrelated enhancers, a cDT-scan of the snail EvoPrint was carried out using the neural, segmental and mesodermal specific cDTs and the enriched cDT-libraries. Within the snail early CNS neuroblast enhancer region, the cDT-library scan identified 22 different neural and neural/segmental cDT hits, distributed among all but one of the CSBs, covering 73% of the CSBs. Interestingly, 10 of the 22 cDTs that align with the early CNS enhancer CSBs are found in CSBs of both neural and segmentation enhancers. The high percentage of neural/segmental cDT hits most likely reflects the fact that this enhancer initially drives snail expression in the neuroectoderm in a pair-rule pattern and then in a segmental pattern corresponding to the first wave of delaminating neuroblasts. cDT-cataloger analysis of the aligning cDTs reveals that many of the identified sequence elements are also part of other early neuroblast enhancer CSBs. For example, the 9 bp cDTs ATTCCTTTC, ATTGATTGT, ATTGTGCAA, TGCAATGCA and GATTTATGG are also present, respectively, in CSBs from the nerfin-1, biparous, string, scratch and worniu neuroblast enhancers (Brody, 2007).

Within the presumptive mesodermal enhancer CSBs, 11 mesodermal specific cDTs aligned with 5 of the 12 CSBs, covering 40% of the CSBs. Like the neural cDTs, some of the mesodermal cDTs contain putative DNA-binding sites for classes of known transcription factor families. For example, the seventh cDT (TAATTGGA) contains a consensus core DNA-binding sequence (underlined) for Antennapedia class homeodomain factors (Brody, 2007).

In the snail early PNS enhancer region, 5 of the 7 CSBs aligned with a total of 15 different cDTs that cover 69% of the total PNS CSB sequence. Similar to the CNS enhancer CSB cDT alignments, close to half of the PNS cDT hits represent sequence elements within both neural and segmental enhancer CSBs, again most likely a reflection of the segmental structure of the PNS. The significant overlap in cDTs found in both CNS and PNS enhancer CSBs may reflect the likelihood that many early neural specific transcriptional regulatory factors are pan-neural (Brody, 2007).

Many of the snail enhancer CSB-cDT hits represent sequences found only in two CSBs, snail itself and one other. In these instances it appears that these elements, although specific for neural or mesodermal CSBs, are relatively rare when compared to others. Only through analysis of additional enhancers will it be clear whether these rare elements are indeed type-specific or only enriched in the type-specific CSBs. Nevertheless, the fact that the sequence elements identified by these rare cDTs are conserved in two distinct enhancer CSBs that have both been under positive selection for over 160 My of collective divergence merits their inclusion in the analysis (Brody, 2007).

As part of this study of Drosophila enhancers, cis-Decoder analysis was carried out of 38 segmentation enhancers responsible for both gap and pair-rule gene expression during Drosophila embryogenesis. Although the segmentation enhancer specific library consisted of only 284 cDTs, these cDTs aligned with over 70% of bases of the CSBs of segmentation enhancers. As an example of alignment of these cDTs with a segmental enhancer, an alignment of segmentation specific cDTs with the hairy stripe 1 enhancer is presented. cis-Decoder recognizes highly conserved Abdominal-B, HOX, Hunchback, Kruppel and Tramtrack binding sites, as well as additional uncharacterized sites, as being shared by hairy stripe 1 enhancer and other segmentation enhancers (Brody, 2007).

Although cDT-libraries were initially generated from general classes of different enhancer types, this approach should be applicable to the analysis of gene co-regulation in any cell type involved in any biological event. As the variety and depth of the different cDT-libraries increase, it is thought that cDT-library scans of EvoPrinted putative enhancer regions will have great utility for the identification and initial characterization of cis-regulatory sequences. Future efforts that address the role of individual enhancer CSBs and the dissection of their modular elements will undoubtedly yield new insights into the function of these 'evolutionarily hardened' sequences and ultimately produce a better understanding of the regulatory code underlying coordinate gene expression (Brody, 2007).

Complex interactions between cis-regulatory modules in native conformation are critical for Drosophila snail expression

It has been shown in several organisms that multiple cis-regulatory modules (CRMs) of a gene locus can be active concurrently to support similar spatiotemporal expression. To understand the functional importance of such seemingly redundant CRMs, two CRMs were examined from the Drosophila snail gene locus that are both active in the ventral region of pre-gastrulation embryos. By performing a deletion series in a ~25 kb DNA rescue construct using BAC recombineering and site-directed transgenesis, it was demonstrated that the two CRMs are not redundant. The distal CRM is absolutely required for viability, whereas the proximal CRM is required only under extreme conditions such as high temperature. Consistent with their distinct requirements, the CRMs support distinct expression patterns: the proximal CRM exhibits an expanded expression domain relative to endogenous snail, whereas the distal CRM exhibits almost complete overlap with snail except at the anterior-most pole. It was further shown that the distal CRM normally limits the increased expression domain of the proximal CRM and that the proximal CRM serves as a 'damper' for the expression levels driven by the distal CRM. Thus, the two CRMs interact in cis in a non-additive fashion and these interactions may be important for fine-tuning the domains and levels of gene expression (Dunipace, 2011).

This study provides evidence that early snail expression is regulated by two concurrently acting CRMs that support gene expression patterns that are spatially and functionally different. The distally located CRM is necessary to support gastrulation as well as viability of snail mutants, whereas the proximal CRM is dispensable for viability except at high temperature. Furthermore, the data show that these CRMs support distinct expression patterns. Although they probably share many transcription factors, the distal CRM alone is responsive to the repressor Huckebein and the unknown laterally acting 'repressor X', whereas the proximal CRM alone responds to an anterior activator (Dunipace, 2011).

These data suggest that the proximal CRM functions as a 'damper' to reduce the high levels of expression normally supported by the distal CRM. Multiple CRMs associated with a single gene may support spatiotemporally similar expression patterns, but the mean levels of gene expression supported by each can be very different. In the case of the snail locus, the data show that the distal and proximal CRMs drive high or low levels of expression, respectively, within a similar domain in ventral regions of the embryo. The results support a model in which these two CRMs provide dual control of expression levels, high versus low, to provide flexibility in terms of levels of snail expression. The requirement for the proximal CRM at high temperatures could indicate a need to more closely regulate the expression levels of snail in stressful environments. Such flexibility is probably advantageous and may explain why two CRMs that support similar expression patterns may be evolutionarily constrained (Dunipace, 2011).

Both the proximal and distal CRMs support expression not only during gastrulation in ventral regions of the embryo but also in other domains at later stages of development. The distal CRM also supports expression within Malpighian tubule precursors, and the proximal CRM supports expression later within neuroblasts. Therefore, these elements can be reused during the course of development, and may be evolutionarily retained for reasons beyond a role in canalization (Dunipace, 2011).

The results show that transcription factors associated with the distal CRM can dominantly affect the other proximally located CRM to support expression of sna that is refined and excluded from the posterior pole. The data support the view that non-autonomous CRM function is responsible for the resulting pattern, which is effectively non-additive, i.e. it is not simply the summed equivalent of the domains of expression supported by the two CRMs. Non-autonomous CRM function may be advantageous, providing additional flexibility by allowing individual and combined activities of CRMs based on circumstances, to support canalization. It has been demonstrated that non-additive CRM interactions also play a role in defining the expression domain of another Drosophila early patterning gene, sloppy-paired 1 (Prazak, 2010). The current data support the view that this is a more common cis-regulatory mechanism than currently appreciated. For example, even in the case of the even-skipped gene locus, which has received considerable focus, questions remain about why particular CRM behaviors are not equivalent to the behaviors of the eve gene itself. The expansion of an eve stripe 3/7 reporter gene in knirps mutants, but not the eve gene itself, suggests that another repressor is required to drive proper eve stripe 3/7 expression and that this activity is supported through another DNA fragment. It is proposed that another CRM associated with the eve locus may aid in definition of eve stripes 3/7 by serving as a vehicle for additional repressor(s), similar in mechanism to the regulation of snail gene expression shown in this study (Dunipace, 2011).

This study also supports the view that CRMs are organized in the context of the gene locus to support proper patterning and to minimize cross-repressive interactions. It is believed that the loss of Ect1 expression that is seen in the 'squish' construct, in which the proximal and distal CRMs are brought into close proximity, is the result of dominant repression, owing to the fact that the distal enhancer is moved in proximity to the proximal enhancer. This would suggest that the native context of CRMs within a locus can limit interactions between elements, and may go towards explaining why enhancers in diverged species/animals tend to be found in the same general location. Similarly, the dampening of all snail expression patterns observed in the 'D to P' construct, in which the distal CRM is moved to the proximal position in a double-delete background, may be due to the repressive activity of the distal CRM being moved near the promoter (Dunipace, 2011).

Placing binding sites for repressors near the promoter potentially limits the range of activity of a gene. Many genes involved in early development, such as snail, take on different roles later in development and are subject to different molecular inputs during the life of the animal. Like snail, the intermediate neuroblasts defective (ind) gene also has a distally located enhancer and another that is located in the proximal position. Similar to what is seen at the snail locus, the distal CRM has documented repression associated with it, whereas the proximally located element functions through positive autoregulatory feedback. It is suggested that keeping repressors located at a distance from the promoter supports flexibility in reiterative reactivation of genes throughout the course of development. However, in addition to buffering repressive crosstalk through distance, it is proposed that linking repression function to the presence of an activator (i.e., between CRMs concurrently active in the same cells) may also serve as an alternate mechanism to moderate non-autonomous CRM interactions; other studies in the past have suggested that repressors may require activators to bind DNA (i.e., 'hot chromatin' model) (Dunipace, 2011).

The current data show that expression of the Drosophila snail gene in embryos is established through integrated activity of multiple CRMs that function concurrently and, in part, through non-additive interactions. Non-additive activity of CRMs, through sharing of repressors for example, is likely more commonplace than currently appreciated. It is possible that concurrently acting CRMs function coordinately to regulate spatial domain and levels of expression in general, and may provide one explanation why genes in Drosophila and other animals often have multiple CRMs that support similar spatiotemporal patterns of expression (Dunipace, 2011).

Transcriptional Regulation

The first step in the differentiation of the Drosophila mesoderm is the activation of two regulatory genes, twist and snail, in ventral regions of early embryos. sna is a target of the Dorsal (DL) morphogen. DL and TWI directly activate sna expression. Site-directed mutagenesis of DL- and TWI-binding sites within defined regions of the sna promoter suggests that the two proteins (containing the Rel and helix-loop-helix domains, respectively) function multiplicatively to ensure strong, uniform expression of sna, particularly in ventral-lateral regions where there are diminishing amounts of DL. These results are consistent with the possibility that the sharp sna transcription borders are formed by multiplying the shallow DL gradient and the steeper TWI gradient (Ip, 1992b).
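The idea that multiplying a shallow gradient by a steeper one yields a sharper border can be checked numerically. The sketch below is purely illustrative: the exponential gradient shapes, the decay lengths (0.5 and 0.2 in arbitrary units of distance from the ventral midline) and the 90%-to-10% definition of border width are assumptions made for the example, not measurements from the study.

```python
import math


def dl(x, length=0.5):
    """Hypothetical shallow DL gradient (exponential decay, arbitrary units)."""
    return math.exp(-x / length)


def twi(x, length=0.2):
    """Hypothetical steeper TWI gradient (exponential decay, arbitrary units)."""
    return math.exp(-x / length)


def product(x):
    """Multiplicative input of DL and TWI."""
    return dl(x) * twi(x)


def border_width(profile, hi=0.9, lo=0.1, x_max=5.0, steps=50_000):
    """Distance over which a profile falls from hi to lo of its value at x=0."""
    peak = profile(0.0)
    x_hi = x_lo = None
    for i in range(steps + 1):
        x = x_max * i / steps
        level = profile(x) / peak
        if x_hi is None and level <= hi:
            x_hi = x
        if level <= lo:
            x_lo = x
            break
    if x_hi is None or x_lo is None:
        raise ValueError("profile does not fall to the requested level")
    return x_lo - x_hi


widths = {
    "DL": border_width(dl),
    "TWI": border_width(twi),
    "DL x TWI": border_width(product),
}
```

Under these assumptions the product decays with an effective length of 1/(1/0.5 + 1/0.2) = 1/7, so its border is narrower than that of either input alone.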

decapentaplegic (dpp), zerknullt (zen), twist (twi), and snail, four zygotic patterning genes, are initially expressed either dorsally or ventrally in the segmented region of the embryo, and at the poles. In the segmented region of the embryo, correct expression of these genes depends on cues from the maternal morphogen Dorsal (DL). The DL gradient appears to be interpreted on three levels: dorsal cells express dpp and zen, but not twi and sna; lateral cells lack expression of all four genes; and ventral cells express twi and sna, but not dpp and zen. DL appears to activate the expression of twi and sna and repress the expression of dpp and zen. Polar expression of dpp and zen requires the terminal system to override repression by DL, while that of twi and sna requires the terminal system to augment activation by DL (Ray, 1991).

dpp expression by the dorsal ectoderm is directly involved in functioning of snail expressing mesodermal cells. Normally, the sna expression pattern encompasses 18-20 cells in ventral and ventrolateral regions. Narrowing the sna pattern results in fewer invaginated cells. As a result, the mesoderm fails to extend into lateral regions so that fewer cells come into contact with dpp-expressing regions of the dorsal ectoderm. This leads to a substantial reduction in visceral and cardiac tissues, consistent with recent studies suggesting that DPP induces lateral mesoderm. These results also suggest that the dorsal regulatory gradient defines the limits of inductive interactions between germ layers after gastrulation (Maggert, 1995).

During Drosophila gastrulation, morphogenesis occurs as a series of cell shape changes and cell movements that probably involve adhesive interactions between cells. The dynamic aspects of cadherin-based cell-cell adhesion were examined in the morphogenetic events to assess the contribution of such activity to morphogenesis. Shotgun and Cadherin-N show complementary expression patterns in the presumptive ectoderm and mesoderm at the mRNA level. Switching of cadherin expression from the Shotgun to the CadN type in the mesodermal germ layer occurs downstream of the mesoderm-determination genes twist and snail. These dynamic aspects of cadherin-based cell-cell adhesion appear to be associated with the following: (1) initial establishment of the blastoderm epithelium; (2) acquisition of cell motility in the neuroectoderm; (3) cell sheet folding, and (4) epithelial to mesenchymal conversion of the mesoderm. These observations suggest that the behavior of the Shotgun-catenin adhesion system may be regulated in a stepwise manner during gastrulation to perform successive cell-morphology conversions. Also discussed are the processes responsible for loss of epithelial cell polarity and elimination of preexisting Shotgun-based epithelial junctions during early mesodermal morphogenesis (Oda, 1998).

Huckebein (hkb) sets the anterior and the posterior borders of the ventral furrow, but acts by different modes of regulation. In the posterior part of the blastoderm, HKB represses the expression of sna in the endodermal primordium. In the anterior part, HKB antagonizes the activation of target genes by TWI and SNA. Here, Bicoid permits the co-expression of hkb, sna and twi, all of which are required for the development of the anterior digestive tract. Mesodermal fate is determined where sna and twi are expressed, but not hkb. Anteriorly hkb together with sna determines endodermal fate, while hkb together with sna and twi are required for foregut development (Reuter, 1994).

escargot and snail have similar DNA binding specificity and both are required in the wing imaginal disc. Each is required for its own expression (autoregulation) and for regulating the expression of the other (Fuse, 1996).

Dorsoventral (DV) patterning of the Drosophila embryo is initiated by a broad Dorsal (Dl) nuclear gradient, which is regulated by a conserved signaling pathway that includes the Toll receptor and Pelle kinase. What are the consequences of expressing a constitutively activated form of the Toll receptor, Toll(10b), in anterior regions of the early embryo? Using the bicoid 3' UTR, localized Toll(10b) products result in the formation of an ectopic, anteroposterior (AP) Dl nuclear gradient along the length of the embryo. The analysis of both authentic Dorsal target genes and defined synthetic promoters suggests that the ectopic gradient is sufficient to generate the full repertory of DV patterning responses along the AP axis of the embryo. For example, mesoderm determinants are activated in the anterior third of the embryo, whereas neurogenic genes are expressed in central regions. These results raise the possibility that Toll signaling components diffuse in the plasma membrane or syncytial cytoplasm of the early embryo (Huang, 1997).

The Huang (1997) paper also clearly summarizes what is known about the regulation of genes involved in dorsal/ventral patterning. There are five distinct thresholds of gene activity in response to the Dorsal nuclear gradient in early embryos. The type I target gene folded gastrulation is activated only in response to peak levels of the Dl gradient, so that expression is restricted to a subdomain of the presumptive mesoderm. The PE enhancer from the twist promoter region exhibits a similar pattern of expression. This enhancer contains a cluster of low-affinity Dl binding sites that restrict expression to the ventral-most regions of early embryos. The type II target gene snail contains a series of low-affinity Dl-binding sites, as well as binding sites for the bHLH activator, Twist. The Dl and Twist proteins appear to make synergistic contact with the basal transcription complex, so that snail is activated throughout the presumptive mesoderm in response to both peak and high levels of the Dl gradient. The ventral midline arises from the mesectoderm, which is derived from the ventral-most regions of the neuroectoderm. Mesectoderm differentiation is controlled by the bHLH-PAS gene, sim. Some genes of the E(spl) complex also exhibit early expression in the presumptive mesectoderm. A synthetic enhancer containing high-affinity Dl-binding sites and Twist binding sites exhibits expression in this region. The type IV target gene rhomboid is expressed in lateral stripes that encompass the ventral half of the presumptive neuroectoderm. These stripes are regulated by a 300-bp enhancer (NEE) that contains high-affinity Dl-binding sites, Twist-binding sites, and "generic" E-box sequences that appear to bind ubiquitously distributed bHLH activators (Daughterless and Scute), which are present in the unfertilized egg. The fifth and final threshold response is defined by the lowest levels of the Dl gradient. The zerknullt target gene is repressed by high and low levels of the gradient, so that expression is restricted to the presumptive dorsal ectoderm. The zen promoter region contains high-affinity Dl-binding sites and closely linked "corepressor" sites. Efficient occupancy of the Dl sites appears to depend on strong, cooperative DNA-binding interactions between Dl and the corepressors. The same low levels of Dl that repress zen also repress sog. The sim, E(spl), rho and sog expression patterns are restricted to the neurogenic ectoderm and excluded from the ventral mesoderm by Snail, which encodes a zinc finger repressor (Huang, 1997).

This study also provides evidence that neurogenic repressors may be important for the establishment of the sharp mesoderm/neuroectoderm boundary in the early embryo. About half of the embryos carrying the anteriorly expressed Toll transgene exhibit a ventral gap in the endogenous ventral expression pattern of snail behind the ectopic anterior staining pattern. Although the identity of the repressor creating this gap is unknown, it is conceivable that members of the E(spl) complex encode putative snail repressors because previous studies have shown that the m7 and m8 genes are expressed in the lateral neuroectoderm of early embryos. Proteins encoded by these genes are known to be repressors. These proteins might be regulated by the gene hierarchy responsible for D/V polarity (Huang, 1997).

The Groucho corepressor mediates negative transcriptional regulation in association with various DNA-binding proteins in diverse developmental contexts. Groucho has been implicated in Drosophila embryonic terminal patterning: it is required to confine tailless and huckebein terminal gap gene expression to the pole regions of the embryo. An additional requirement for Groucho in this developmental process has been revealed by establishing that Groucho mediates repressor activity of the Huckebein protein. Putative Huckebein target genes are derepressed in embryos lacking maternal groucho activity and biochemical experiments demonstrate that Huckebein physically interacts with Groucho. Using an in vivo repression assay, a functional repressor domain has been identified in Huckebein that contains an FRPW tetrapeptide, similar to the WRPW Groucho-recruitment domain found in Hairy-related repressor proteins. Mutations in Huckebein's FRPW motif abolish Groucho binding and in vivo repression activity, indicating that binding of Groucho through the FRPW motif is required for the repressor function of Huckebein. Thus Groucho-mediated repression regulates sequential aspects of terminal patterning in Drosophila (Goldstein, 1999).

One proposed Hkb target is the snail (sna) gene, which is transcribed in the ventral-most portion of the embryo. sna expression is thought to be excluded from the posterior pole by hkb activity. Accordingly, sna and hkb expression domains abut in cellularizing wild-type embryos, whereas sna expression extends to, and includes, the posterior pole of hkb mutant embryos. In tor^D embryos, hkb expression expands towards the center of the embryo and the sna domain correspondingly retracts. By contrast, in gro^mat- embryos, the expression of sna does not respect the sna posterior border and spreads to the pole, overlapping extensively with the hkb expression domain. The expression of the T-related gene brachyenteron (byn; also called Trg) also seems to be repressed by Hkb. byn is not expressed at the most posterior region of wild-type (or tor^D) embryos, whereas it extends throughout the posterior cap of hkb mutant embryos, consistent with hkb setting the posterior limit of byn expression. However, it is found that byn is ectopically expressed at the posterior tip of gro^mat- embryos. Together, these results suggest that gro is, directly or indirectly, necessary for hkb repressor functions (Goldstein, 1999).

Differential activation of the Toll receptor leads to the formation of a broad Dorsal nuclear gradient that specifies at least three patterning thresholds of gene activity along the dorsoventral axis of precellular embryos. The activities of the Pelle kinase and Twist basic helix-loop-helix (bHLH) transcription factor in transducing Toll signaling have been investigated. Pelle functions downstream of Toll to release Dorsal from the Cactus inhibitor. Twist is an immediate-early gene that is activated upon entry of Dorsal into nuclei. Transgenes misexpressing Pelle and Twist were introduced into different mutant backgrounds and the patterning activities were visualized using various target genes that respond to different thresholds of Toll-Dorsal signaling. These studies suggest that an anteroposterior gradient of Pelle kinase activity is sufficient to generate all known Toll-Dorsal patterning thresholds and that Twist can function as a gradient morphogen to establish at least two distinct dorsoventral patterning thresholds. How the Dorsal gradient system can be modified during metazoan evolution is discussed and it is concluded that Dorsal-Twist interactions are distinct from the interplay between Bicoid and Hunchback, which pattern the anteroposterior axis (Stathopoulos, 2002).

The snail, sim, vnd and sog expression patterns represent three different Toll-Dorsal signaling thresholds. snail is activated only by peak levels of the Dorsal gradient; sim and vnd are activated by intermediate levels, and sog is activated by the lowest levels of the gradient. These expression patterns were visualized in mutant and transgenic embryos via in situ hybridization using digoxigenin-labeled antisense RNA probes (Stathopoulos, 2002).
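
The threshold readout described above can be illustrated with a short Python sketch. The Gaussian gradient shape, its width, and the numeric threshold values are assumptions chosen for illustration; only the ordering (snail highest, sim and vnd intermediate, sog lowest) comes from the text. The sketch reports activation only and deliberately ignores repression of vnd and sog by Snail in the mesoderm.

```python
import math

# Hypothetical activation thresholds, as a fraction of peak nuclear Dorsal.
# Only their ordering reflects the text; the numbers are illustrative.
THRESHOLDS = {"snail": 0.8, "sim": 0.5, "vnd": 0.5, "sog": 0.1}


def dorsal_level(position, width=0.25):
    """Assumed Gaussian nuclear Dorsal profile.

    position 0 is the ventral midline and 1 is the dorsal midline.
    """
    return math.exp(-(position ** 2) / (2 * width ** 2))


def genes_activated(position):
    """Return, sorted by name, the genes whose threshold is met here."""
    level = dorsal_level(position)
    return sorted(gene for gene, t in THRESHOLDS.items() if level >= t)


for pos in (0.0, 0.25, 0.5, 1.0):
    print(pos, genes_activated(pos))
```

Moving from the ventral midline outward, the set of activated genes shrinks step by step, which is the sense in which a single gradient can specify several thresholds.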

Dorsal target genes are essentially silent in mutant embryos that lack an endogenous dorsoventral Dorsal nuclear gradient. Mutant embryos were collected from females that are homozygous for a null mutation in the gastrulation defective (gd) gene, which blocks the processing of the Spätzle ligand and the activation of the Toll receptor. These mutants permit the analysis of ectopic, anteroposterior Dorsal and Twist gradients in 'apolar' embryos that lack dorsoventral polarity. snail, vnd, and sog are sequentially expressed along the anteroposterior axis of mutant embryos that contain a constitutively activated form of the Toll receptor (Toll^10b) misexpressed at the anterior pole using the bicoid (bcd) promoter and 3' UTR. These expression patterns depend on an ectopic anteroposterior Dorsal nuclear gradient. The repression of the vnd and sog patterns at the anterior pole is probably mediated by Snail, which normally excludes expression of these genes in the ventral mesoderm of wild-type embryos (Stathopoulos, 2002).

The activated Pelle-Tor⁴⁰²¹ kinase also directs sequential anteroposterior patterns of snail, vnd, and sog expression in gd^–/gd^– mutant embryos. As in the case of Toll^10b, the activated Pelle kinase was misexpressed at the anterior pole using the bcd 3' UTR. The snail, vnd and sog expression patterns are similar to those obtained with the Toll^10b transgene. The vnd and sog expression patterns are probably repressed at the anterior pole by Snail. These results suggest that the levels of Pelle kinase activity are sufficient to determine different Dorsal transcription thresholds (Stathopoulos, 2002).

Similar experiments were carried out with a Pelle-Tor fusion gene that contains the Tor signal peptide, extracellular domain and transmembrane peptide, but lacks the amino acid substitution (Y327C) in the Tor⁴⁰²¹ protein that induces dimerization. The Pelle-Tor fusion protein fails to induce snail expression, but succeeds in activating vnd and sog (Stathopoulos, 2002).

The twist-bcd transgene was introduced into mutant embryos that completely lack Dorsal. Without the transgene these mutants do not express twist, snail, sim, vnd or sog. Introduction of the twist-bcd transgene causes intense expression of twist in the anterior 40% of the embryo. This broad Twist gradient fails to activate snail, but succeeds in inducing weak expression of sim and somewhat stronger staining of vnd at the anterior pole. The activation of vnd in mutant embryos is comparable with the expression seen in wild-type and Toll^rm9/Toll^rm10 embryos. However, in both wild-type and mutant embryos the vnd pattern is transient, and lost after the completion of cellularization. These results indicate that Twist can activate dorsoventral patterning genes in the absence of Dorsal (Stathopoulos, 2002).

An anteroposterior Twist gradient generates at least two thresholds of gene activity in mutant embryos that contain decreased levels of Dorsal. High levels of Twist activate snail at the anterior pole, whereas lower levels are sufficient to induce the expression of sim in more posterior regions of embryos containing low, uniform levels of the Dorsal protein. These results demonstrate that twist gene activity is not dedicated to mesoderm formation. Instead, Twist supports expression of two regulatory genes, sim and vnd, which pattern ventral regions of the neurogenic ectoderm. The twist-bcd transgene was shown to induce weak expression of both genes even in mutant embryos that completely lack Dorsal (Stathopoulos, 2002).

snail is activated by Dorsal and Twist in cellularizing embryos. The sharp lateral limits of the snail expression pattern establish the boundary between the presumptive mesoderm and neurogenic ectoderm. It has been suggested that the crude Dorsal gradient triggers a somewhat steeper Twist gradient, and the two activators function synergistically within the snail 5' cis-regulatory DNA to establish the sharp, on/off expression pattern. Dorsal-Twist transcription synergy may provide a means for 'multiplying' the Dorsal and Twist gradients to produce the sharp snail pattern. This model suggests that both proteins must be present in a gradient to generate the sharp snail border. However, while both Dorsal and Twist are required for the activation of snail, a Twist gradient is sufficient to generate a reasonably sharp pattern of snail expression in embryos containing low, uniform levels of Dorsal. It is proposed that cooperative binding of Twist might act as a switch to regulate snail expression when the snail 5' cis-regulatory region is rendered responsive by the Dorsal activator (whether present at uniform levels or in a gradient). Therefore, the ratio of Dorsal to Twist may be important to produce the sharp lateral limits of snail expression (Stathopoulos, 2002).
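
The idea that 'multiplying' two gradients sharpens a border can be illustrated numerically. The following Python sketch reads the model literally, as the product of two graded inputs; the exponential shapes and the length scales (1.0 for Dorsal, 0.5 for Twist, arbitrary units) are assumptions made for illustration and are not taken from the study. It measures how far each profile must travel to fall from 90% to 10% of its peak.

```python
import math


def exp_profile(length):
    """Return an exponential gradient f(x) = exp(-x / length) (assumed shape)."""
    return lambda x: math.exp(-x / length)


def transition_width(f, hi=0.9, lo=0.1, x_max=5.0, steps=50000):
    """Distance over which a decreasing profile f falls from hi to lo."""
    x_hi = x_lo = None
    for i in range(steps + 1):
        x = x_max * i / steps
        y = f(x)
        if x_hi is None and y <= hi:
            x_hi = x
        if y <= lo:
            x_lo = x
            break
    if x_hi is None or x_lo is None:
        raise ValueError("profile does not cross both levels within x_max")
    return x_lo - x_hi


dorsal = exp_profile(1.0)  # broad, 'crude' gradient (assumed length scale)
twist = exp_profile(0.5)   # somewhat steeper gradient (assumed length scale)


def product(x):
    """Literal 'multiplication' of the two gradients."""
    return dorsal(x) * twist(x)


widths = {
    name: transition_width(f)
    for name, f in [("dorsal", dorsal), ("twist", twist), ("product", product)]
}
print(widths)
```

Under these assumptions the product falls off over a shorter distance than either input, which is one way to picture how synergy between two graded activators could yield a sharper on/off border.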

This study raises some questions about the role of operator binding affinities in the specification of different transcription thresholds. The Dorsal binding sites present in the snail 5' regulatory region bind with lower affinity than the sites present in the rho lateral stripe enhancer (NEE). The analysis of a number of synthetic enhancers prompted the proposal that the activation of Dorsal target genes in the ventral mesoderm versus lateral neurogenic ectoderm depends on the affinity of Dorsal operator sites. However, the demonstration that the twist-bcd transgene can activate snail expression in Toll^rm9/Toll^rm10 embryos suggests that occupancy of the distal Dorsal-binding sites may not be crucial for determining whether the gene is on or off. It is conceivable that Dorsal occupies one or more sites in mutant embryos, but is unable to trigger expression in the absence of Twist. In general, 'promoter context' (combinations of regulatory factors) might be more critical for defining Dorsal transcription thresholds than the affinities of Dorsal operator sites (Stathopoulos, 2002).

Dorsal interacting protein 3 potentiates activation by Drosophila Rel homology domain proteins

Dorsal interacting protein 3 (Dip3) contains a MADF DNA-binding domain and a BESS protein interaction domain. The Dip3 BESS domain was previously shown to bind to the Dorsal (DL) Rel homology domain. This study shows that Dip3 also binds to the Relish Rel homology domain and enhances Rel family transcription factor function in both dorsoventral patterning and the immune response. While Dip3 is not essential, Dip3 mutations enhance the embryonic patterning defects that result from dorsal haplo-insufficiency, indicating that Dip3 may render dorsoventral patterning more robust. Dip3 is also required for optimal resistance to immune challenge since Dip3 mutant adults and larvae infected with bacteria have shortened lifetimes relative to infected wild-type flies. Furthermore, the mutant larvae exhibit significantly reduced expression of antimicrobial defense genes. Chromatin immunoprecipitation experiments in S2 cells indicate the presence of Dip3 at the promoters of these genes, and this binding requires the presence of Rel proteins at these promoters (Ratnaparkhi, 2009).

The Drosophila genome encodes three rel homology domain (RHD) containing proteins, Dorsal (Dl), Dorsal-related immunity factor (Dif), and Relish (Rel). The RHD, which is also found in the human NFκB family of transcriptional activators, mediates dimerization and sequence-specific DNA binding. Rel/NFκB family proteins in vertebrates and invertebrates play central roles in the innate immune response by triggering the expression of antimicrobial defense genes in response to signals transduced by Toll and the Immune deficiency (Imd) signal transduction pathways. In Drosophila, Dl also directs dorsoventral (D/V) patterning of the embryo. Specifically, the regulated nuclear localization of maternally expressed Dl in response to Toll signaling in the embryo leads to the formation of a ventral-to-dorsal nuclear concentration gradient of Dl and to the spatially restricted regulation of a large number of genes, including twist (twi), snail (sna), and rhomboid (rho), which are activated by Dl, and zerknullt and decapentaplegic, which are repressed by Dl. This serves to subdivide the embryo into multiple developmental domains along its D/V axis (Ratnaparkhi, 2009).

Unlike Dl, Dif and Rel are not required for D/V patterning. Instead, these two rel-family proteins function along with Dl in the innate immune response. Toll signaling in the immune system leads to the translocation of Dl and Dif to the nucleus and the consequent activation of a subset of anti-microbial defense genes, including drosomycin (drs) and Immune induced molecule 1. Dl and Dif are believed to have redundant roles in this process and thus either one alone is sufficient for the induction of drs. Activation of the Imd signal transduction pathway leads to proteolytic cleavage of Rel. The N-terminal region of Rel, which contains the RHD, then translocates into the nucleus where it activates expression of anti-bacterial genes, such as diptericin (dipt), cecropin-A1 (cec-A), and attacin-A. Dl, Dif, and Rel homo- and hetero-dimerize to activate different subsets of the anti-microbial defense genes in response to signals from the Toll and Imd pathways (Ratnaparkhi, 2009).

Very little is known about the identity of factors that assist the RHD proteins in the activation of the anti-microbial defense genes. Proteins that modulate expression of these genes include transcription factors such as the GATA factor Serpent (Srp), Hox factors, Helicase89B, and an unknown protein that binds region 1 (R1), a regulatory module in cec-A and other anti-microbial defense genes. In addition, a recent screen identified several POU domain proteins as potential regulators of anti-microbial defense genes (Ratnaparkhi, 2009).

To date, about a dozen proteins that interact directly with Dl and modulate its regulatory functions have been identified by genetic and biochemical means. For example, an interaction between Dl and Twist (Twi) enhances the activation of Dl target genes, while an interaction between Dl and Groucho (Gro) is essential for Dl-mediated repression. A yeast two-hybrid screen to identify Dl interacting proteins yielded, in addition to the well characterized Dl-interactors Twi and Cactus, four novel Dl-interactors (Dip1, Dip2, Dip3, and Dip4/Ubc9). Conjugation of SUMO to Dl by Ubc9 was subsequently shown to result in more potent activation by Dl (Ratnaparkhi, 2009).

Dip3 belongs to a family of proteins that contain both MADF (for Myb/SANT-like in ADF) and BESS (for BEAF, Stonewall, SuVar(3)7-like) domains. While MADF-BESS domain proteins are found in both insects and vertebrates, only a few have been characterized and their functions are largely unknown. The Drosophila genome encodes 14 MADF-BESS domain factors. In addition to Dip3, these include Adf-1, which was initially found as an activator of Alcohol dehydrogenase, and Stonewall, which is required for oogenesis. The Dip3 MADF domain mediates sequence specific binding to DNA, while the Dip3 BESS domain mediates binding to a subset of TATA binding protein associated factors as well as to the Dl RHD and to Twi. In addition to functioning as an activator, Dip3 can function as a coactivator to stimulate synergistic activation by Dl and Twi in S2 cells (Ratnaparkhi, 2009).

This study shows that Dip3 assists RHD proteins during both embryonic development and the innate immune response. By stimulating the expression of antimicrobial defense genes, Dip3 improves survival of both larvae and adults following septic injury. The presence of Dip3 near the promoters of antimicrobial defense genes depends upon Rel family proteins suggesting that Dip3 functions as a coactivator at these promoters (Ratnaparkhi, 2009).

It has been shown that Dip3, which binds both Dl and Twi via its BESS domain, synergistically enhances the activation of a luciferase reporter with multiple Dl and Twi binding sites upstream of the promoter. In addition, Dip3 has been implicated as the 'mystery protein' which binds to sites adjacent to Dl and Twi binding sites in a subset of Dl target genes. Therefore the ability of Dip3 to enhance the expression of the Dl target promoters twi, sna, and rho in S2 cell transient transfection assays was examined. All three promoters require both Dl and Twi for full activity. Dip3 was found to synergize with Dl and Twi in the activation of the sna and twi promoters, but not in the activation of the rho promoter (Ratnaparkhi, 2009).

A polyclonal antibody against recombinant Dip3 was generated, and used to determine where and when Dip3 is present in the embryo. Maternally expressed Dip3 is observed in all nuclei as early as nuclear cycle 7. It was detected in subsequent nuclear cycles during formation of the Dl nuclear concentration gradient. In interphase embryonic as well as S2 cell nuclei, Dip3 localizes to nuclear speckles of unknown identity. During mitosis Dip3 is enriched on chromosomes. It associates with the centrosome proximal portion of the anaphase chromatids and the inside ring of the polar body rosette suggesting a predominant pericentromeric location at this stage of the cell cycle and hinting at a possible role of Dip3 in centromeric function. Confirming the specificity of the antibodies, the immunoreactivity is absent from Dip3¹ embryos in which the Dip3 transcriptional and translational start sites as well as a large segment of the Dip3 coding region have been deleted. Weak Dip3 expression is also detected in the fat body (Ratnaparkhi, 2009).

Homozygous Dip3¹ flies are viable and fertile, indicating that Dip3 cannot have an essential role in embryonic D/V pattern formation. However, a small proportion (7±4%) of the embryos fail to hatch and exhibit D/V patterning defects. Embryos produced by females transheterozygous for Dip3¹ and a deficiency that removes a portion of the second chromosome containing the Dip3 gene (Df(PC4)) exhibit similar embryonic lethality (10%) and D/V patterning defects. Also, maternal overexpression of Dip3 using the Gal4-UAS system leads to 54±9% embryonic lethality, with cuticles of the dead embryos showing both anteroposterior and D/V patterning defects, indicating that Dip3 may have a role in embryonic pattern formation (Ratnaparkhi, 2009).

Consistent with a non-essential role for Dip3 in D/V patterning, a Dip3 mutation enhances the temperature-sensitive dl haploinsufficiency phenotype. The degree of dorsalization is often quantified by categorizing embryos on a scale from D0 (completely dorsalized, lacking all dorsoventral pattern elements other than dorsal epidermis) to D3 (inviable, but with little or no apparent defect in the cuticular pattern). At 29°, about half the dead embryos produced by dl¹/+ females exhibit detectable D/V patterning defects and the majority of these fall into the D2 category (moderately dorsalized, exhibiting mildly expanded ventral denticle belts and a twisted germ band). Removal of maternal Dip3 increases the proportion of dorsalized embryos to about 75%, with most of the increase being due to an increase in the number of D2 embryos. The effect seems to be strictly maternal as the paternal genotype does not modulate the dl haploinsufficiency phenotype (Ratnaparkhi, 2009).

Dip3 is present in the fat body, the organ in which RHD factors activate antimicrobial defense genes in response to infection. Since Dip3 binds the Dl RHD, the role of Dip3 in the innate immune response was examined by assessing the sensitivity of Dip3¹ flies to bacterial and fungal infection. Wild-type and Dip3¹ adults and larvae were injected with gram-positive bacteria (M. luteus), gram-negative bacteria (E. coli), and fungi (B. bassiana). For comparison, flies were infected that contained mutations in known components of the Toll (spz^rm7) and Imd (Rel^E20) pathways. Wild-type, Rel^E20, spz^rm7, and Dip3¹ adults showed little lethality (<15%) 30 days after mock infection. However, the Dip3¹ adult flies exhibited 55% lethality one month after injection with a 1:1 mixture of M. luteus and E. coli, compared to 10% lethality after 30 days for wild-type flies and 98% after 30 days for Rel^E20 flies. In contrast, wild-type and Dip3¹ adults were equally sensitive to fungal infection, both showing 55-70% lethality after 30 days compared to 100% lethality after 22 days for Rel^E20 adults and 100% lethality after 7 days for spz^rm7 adults. Similar results were seen in larvae, in which Dip3¹, Rel^E20 and spz^rm7 mutations resulted in reduced rates of eclosion following septic injury compared to wild-type. The effectiveness of the immune challenge was further verified by an experiment showing that septic injury leads to translocation of Dl into the nucleus (Ratnaparkhi, 2009).

To determine if the sensitivity of Dip3¹ flies to infection results from reduced induction of antimicrobial peptides, the expression of dipt, drs and cec-A was monitored as a function of time following septic injury. Relative to uninfected flies, the levels of expression of drs and dipt were reduced by the Dip3¹ mutation, especially at the 2 and 4 hr time points, while the levels of cec-A expression were not significantly altered. Thus, some, but not all, antimicrobial defense genes that are regulated by RHD family proteins exhibit dependence on Dip3. At the 4 hr time point, relative to infected wild-type flies, the spz^rm7 mutation reduced drs expression to basal levels while the Rel^E20 mutation reduced dipt expression tenfold (Ratnaparkhi, 2009).

Dip3 was overexpressed in larvae using the Cg-Gal4 driver to examine the effect of increasing levels of Dip3 on the expression of antimicrobial defense genes in the fat body. cec-A and drs levels were unaffected, while dipt levels increased two-fold in infected flies. Thus, both loss-of-function and overexpression data are consistent with the conclusion that Dip3 makes the immune response more robust by elevating the expression of a subset of antimicrobial defense genes (Ratnaparkhi, 2009).

Radiolabeled Dip3 interacts with FLAG-tagged Dl and Rel immobilized on anti-FLAG beads. Similarly, immobilized FLAG-Dip3 binds Dl (Bhaskar, 2002) and Rel (residues 1-600). Dip3 binds to DNA via its MADF domain and to the RHD via its BESS domain, and can thus function either as an activator or as a coactivator (Bhaskar, 2002). To determine if Dip3 is present at the promoters of antimicrobial defense genes, ChIP assays were carried out in S2 cells transfected with FLAG-Dip3. FLAG antibody was used to immunoprecipitate Dip3 crosslinked to chromatin. Compared both to mock-transfected cells and to the transcribed region of a ribosomal protein-encoding gene (rp49), Dip3 was highly enriched at the drs, dipt and cec-A promoters. As expected, dsRNA directed against Dip3 eliminated the ChIP signal, verifying antibody specificity. The association of Dip3 with the promoters of the anti-microbial defense genes depended on Rel family proteins, since knockdown of these proteins by dsRNAi significantly reduced association of Dip3 with the promoters. Similar results were observed with an anti-GFP antibody and cells expressing a Dip3-GFP fusion protein (Ratnaparkhi, 2009).

These results suggest that Dip3 may synergize with RHD proteins in multiple developmental contexts, possibly through contact with the Dl rel homology domain. Dip3 is expressed maternally and present in cleavage stage nuclei at the time that Dl is functioning to pattern the D/V axis. Furthermore, Dip3 can potentiate Dl-mediated activation of the twist and snail promoters in S2 cells. These observations suggest that Dip3 might have a role in D/V patterning. Consistent with this possibility, it was found that removal of maternal Dip3 results in occasional D/V patterning defects and significantly enhances the dl haploinsufficiency phenotype, suggesting that Dip3 renders D/V patterning more robust, perhaps by assisting in Dl-mediated activation (Ratnaparkhi, 2009).

An important aspect of the immune response is activation in the fat body of genes encoding antimicrobial peptides by the Rel family transcription factors Dl, Dif, and Rel. This study found that synergistic killing of flies by a mixture of E. coli and M. luteus is enhanced in Dip3¹ flies. This suggests roles for Dip3 in the Imd and/or Toll pathways, which mediate the response to microbial infection. In accord with this idea, it was found that activation of the Imd pathway target dipt and the Toll pathway target drs is compromised in Dip3 mutant larvae (Ratnaparkhi, 2009).

To determine if the role of Dip3 at antimicrobial defense gene promoters is direct, ChIP assays were carried out demonstrating that this factor associates directly with the drs, dipt, and cec-A promoters in S2 cells. Since Dip3 contains a DNA binding domain, it is possible that it binds to these promoters through a direct interaction with DNA. However, with one exception in the drs promoter, these promoters lack matches for the consensus Dip3 binding sites. Thus, Dip3 may be acting as a coactivator at these promoters consistent with its ability to bind the rel homology domain. In support of this idea, it was found that simultaneous knockdown of all three rel family proteins significantly reduced recruitment of Dip3 to the promoters (Ratnaparkhi, 2009).

The mechanism of Dip3 co-activation remains unclear. The finding that the Dip3 BESS domain binds TAFs (Bhaskar, 2002) suggests a role for Dip3 in the recruitment of the basal machinery. In addition, the MADF domain is closely related to the SANT domain, which binds histone tails and may have a role in interpreting the histone code. While analysis of RHD targets suggests roles for Dip3 in activation, Dip3 also associates with pericentromeric heterochromatin during mitosis, consistent with a possible role in silencing. Other heterochromatic proteins including a suppressor of position effect variegation (Su(Var)3-7) also contain BESS domains. However, the loss of Dip3 does not appear to modify position effect variegation (Ratnaparkhi, 2009).

In flies, additional roles for RHD-mediated activation have been demonstrated in haematopoiesis, neural fate specification, and glutamate receptor expression. Antimicrobial defense genes are also expressed constitutively in barrier epithelia and in the male and female reproductive tracts. It will be interesting to determine if Dip3 is involved in rel protein-dependent and independent gene activation in some or all of these tissues. One tissue in which Dip3 appears to have clear rel-independent functions is the developing compound eye, where Dip3 overexpression results in conversion of eye to antenna, while Dip3 loss-of-function leads to mispatterning of the retina (Ratnaparkhi, 2009 and references therein).

Neural stem cell transcriptional networks highlight genes essential for nervous system development

Neural stem cells must strike a balance between self-renewal and multipotency, and differentiation. Identification of the transcriptional networks regulating stem cell division is an essential step in understanding how this balance is achieved. It has been shown that the homeodomain transcription factor Prospero acts to repress self-renewal and promote differentiation. Among its targets are three neural stem cell transcription factors, Asense, Deadpan and Snail, of which Asense and Deadpan are repressed by Prospero. This study identifies the targets of these three factors throughout the genome. A large overlap in their target genes was found, and indeed with the targets of Prospero, with 245 genomic loci bound by all factors. Many of the genes have been implicated in vertebrate stem cell self-renewal, suggesting that this core set of genes is crucial in the switch between self-renewal and differentiation. It was also found that multiply bound loci are enriched for genes previously linked to nervous system phenotypes, thereby providing a shortcut to identifying genes important for nervous system development (Southall, 2009).

Recent work on Drosophila neural stem cells (or neuroblasts) has provided important insights into stem cell biology and tumour formation. Neuroblasts divide in an asymmetric, self-renewing manner producing another neuroblast and a daughter cell that divides only once to give post-mitotic neurons or glial cells. During these asymmetric divisions the atypical homeodomain transcription factor, Prospero, is asymmetrically segregated to the smaller daughter cell, the ganglion mother cell (GMC), where it can enter the nucleus and regulate transcription. Neuroblasts lacking Prospero form tumours in both the embryonic nervous system and the larval brain. Using the chromatin profiling technique DamID, together with expression profiling, it has been shown that Prospero represses neuroblast genes and is required to activate neuronal differentiation genes. Therefore, Prospero acts as a binary switch to repress the genetic programs driving self-renewal (by directly repressing neuroblast transcription factors) and to promote differentiation. It was found that Prospero represses the neuroblast transcription factors, Asense, Deadpan and Snail, suggesting that these transcription factors may control genes involved in neural stem cell self-renewal and multipotency (Southall, 2009).

To identify the transcriptional networks promoting neural stem cell fate, the binding sites of Asense, Deadpan and Snail were profiled on a whole-genome scale. These three proteins are members of a small group of transcription factors that are expressed in all embryonic neuroblasts. The first, Asense, is a basic-helix-loop-helix protein, a member of the achaete-scute complex, and a homologue of the vertebrate neural stem cell factor, Ascl1 (Mash1). Unlike the other members of the achaete-scute complex, Asense is not expressed in proneural clusters in the embryo. Asense expression is initiated in the neuroblast and is maintained in at least a subset of GMC daughter cells. Asense is also expressed in most larval brain neuroblasts but is markedly absent from the DM/PAN neuroblasts (Bello, 2008; Bowman, 2008). In these lineages, Asense expression is delayed and the daughter cells (secondary neuroblasts) of the Asense-negative DM/PAN neuroblasts undergo multiple cell divisions, expanding the stem cell pool before producing GMCs (Bello, 2008; Boone, 2008; Bowman, 2008). Ectopic expression of Asense limits the division potential of DM/PAN neuroblast progeny (Bowman, 2008). A study in the optic lobe showed that Asense expression coincides with the upregulation of dacapo and cell-cycle exit. In combination, these results suggest that Asense may also have a pro-differentiation role (Southall, 2009).

The second transcription factor, Deadpan, is a basic-helix-loop-helix protein related to the vertebrate Hes family of transcription factors. Deadpan is expressed in all neuroblasts and has been shown to promote the proliferation of optic lobe neural stem cells. Unlike Asense, Deadpan is also expressed in the DM/PAN neuroblasts of the larval brain (Southall, 2009).

The third factor, Snail, is a zinc-finger transcription factor whose vertebrate homologues have roles in the epithelial to mesenchymal transition and in cancer metastasis. The Snail family members (Snail, Worniu and Escargot) are known to regulate neuroblast spindle orientation and cell-cycle progression (Southall, 2009).

To further understand the role of these pan neural stem cell transcription factors, their targets were mapped throughout the genome. This, combined with expression profiling, allows building of the gene regulatory networks governing neural stem cell self-renewal, and enhancement of knowledge of the function and mode of action of these transcription factors in neural stem cells (Southall, 2009).

To identify the genes regulated by Asense, Deadpan and Snail in the embryo, their binding sites in vivo were mapped by DamID, as has been done previously for Prospero (Choksi, 2006). In brief, DamID involves tagging a DNA or chromatin-associated protein with an Escherichia coli DNA adenine methyltransferase (Dam). Wherever the fusion protein binds, surrounding DNA sequences are methylated. Methylated DNA fragments can then be isolated, labelled and hybridised on a microarray. This study expressed Dam fusion proteins in vivo, in transgenic Drosophila embryos. Methylated DNA fragments from transgenic embryos expressing Dam alone serve as a reference. Target sites identified by DamID have been shown to match targets identified by chromatin immunoprecipitation, by mapping to polytene chromosomes and by 3D microscopy data (Southall, 2009).

In comparing the results for Asense, Deadpan, Snail and Prospero, a high degree of overlap was seen between their targets. The average overlap for the four factors in pairwise comparisons is 40%, with the highest overlap between Deadpan and Snail (66%). The similarity in binding is illustrated by the binding of all four factors to the intronic regions of the cell-cycle regulation gene CycE. 245 genes are bound by all four proteins, including genes involved in neuroblast cell fate determination, cell-cycle control and differentiation. These loci are unlikely to represent regions of chromatin accessible to all transcription factors; only 17/245 (7%) were also bound by another neural transcription factor, Pdm1. The large overlap in the targets of Asense, Deadpan, Snail and Prospero implies that these may be a core set of genes involved in neuroblast self-renewal and differentiation (Southall, 2009).
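
Pairwise overlap and the set of loci bound by all factors can be computed with ordinary set operations, as in the Python sketch below. The text does not state how the overlap percentage was defined, so the definition used here (shared targets as a fraction of the smaller target set) is an assumption. The target lists are toy placeholders: apart from CycE, which the text names as bound by all four factors, the identifiers g1 to g6 are hypothetical.

```python
from itertools import combinations

# Toy target sets; g1..g6 are hypothetical placeholders, not real data.
toy_targets = {
    "Asense": {"CycE", "g1", "g2", "g3"},
    "Deadpan": {"CycE", "g1", "g4"},
    "Snail": {"CycE", "g1", "g4", "g5"},
    "Prospero": {"CycE", "g2", "g5", "g6"},
}


def pairwise_overlap(targets):
    """Map each factor pair to shared targets / size of the smaller set.

    The choice of denominator is an assumption made for this sketch.
    """
    result = {}
    for a, b in combinations(sorted(targets), 2):
        shared = targets[a] & targets[b]
        result[(a, b)] = len(shared) / min(len(targets[a]), len(targets[b]))
    return result


def bound_by_all(targets):
    """Return the targets present in every factor's set."""
    return set.intersection(*targets.values())


overlaps = pairwise_overlap(toy_targets)
best_pair = max(overlaps, key=overlaps.get)
print(best_pair, overlaps[best_pair])
print(bound_by_all(toy_targets))
```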

Genome-wide analysis of Asense DamID peaks shows that Asense binding is associated with increased levels of DNA conservation (determined by the alignment of eight insect species). A representation of Asense binding around a generic gene shows enrichment ~2 kb upstream of the transcriptional start site, as well as binding within intronic regions (32%) and downstream of the gene (20%). This distribution is consistent with transcription factor-binding analysis and regulatory sequence studies in mice and humans (Southall, 2009).

The resolution of DamID is ~1 kb and there are currently no motif discovery tools available that can analyse the large amount of sequence data generated by full genome DamID. Therefore, a motif discovery tool called MICRA (Motif Identification using Conservation and Relative Abundance) was developed to identify overrepresented motifs in low-resolution data. In brief, 1 kb of sequence from each binding site is extracted and filtered for conserved sequences. The relative frequency of each 6- to 10-mer is then calculated and compared with background frequency. Using MICRA, the E-box CAGCTG was identified as the most overrepresented 6-mer in the regions of Asense binding (131% overrepresented using a conservation threshold of 0.6). In support of the in vivo binding data, in vitro studies had previously shown that Asense binds to CAGCTG, which is also the binding site of the vertebrate Asense homologue Ascl1 (Mash1) (Southall, 2009).
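
The relative-abundance step can be sketched as a simple k-mer comparison. The Python below is not MICRA itself: it omits the conservation filter, does not merge reverse complements, and defines percent overrepresentation as 100 x (bound frequency / background frequency - 1), which is an assumption because the source does not give the formula. The sequences are toy strings, not genomic data.

```python
from collections import Counter


def kmer_frequencies(sequences, k):
    """Return the relative frequency of every k-mer across the sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


def overrepresentation(bound, background, k=6):
    """Percent overrepresentation of each k-mer in bound vs background.

    k-mers absent from the background are skipped to avoid dividing by zero.
    """
    fg = kmer_frequencies(bound, k)
    bg = kmer_frequencies(background, k)
    return {
        kmer: 100.0 * (freq / bg[kmer] - 1.0)
        for kmer, freq in fg.items()
        if kmer in bg
    }


# Toy example: the E-box CAGCTG occurs twice in the 'bound' sequence
# and once in the 'background' sequence.
scores = overrepresentation(["CAGCTGCAGCTG"], ["CAGCTGAAAAAA"], k=6)
top = max(scores, key=scores.get)
print(top, round(scores[top], 1))
```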

A GO annotation analysis of the genes bound by Asense shows a highly significant overrepresentation of genes involved in nervous system development and cell fate determination. Similar analyses were performed for Deadpan and Snail; for both transcription factors, DNA conservation was enriched surrounding their binding sites. Deadpan and Snail targets fall broadly into the same gene ontology classes as Asense and Prospero, and the binding peaks show a similar distribution relative to gene structure as for Asense. Motif discovery using MICRA identifies sites consistent with previously published in vitro studies for Deadpan (CACGCG and CACGTG) and Snail (CAGGTA). These analyses provide unbiased support for the Deadpan and Snail DamID experiments (Southall, 2009).

When comparing the data sets for Asense, Deadpan, Snail and Prospero, genomic loci in which multiple transcription factors bind were found. This phenomenon has been described previously in a Drosophila cell line and, more recently, in mouse embryonic stem (ES) cells in which these loci are termed 'multiple transcription factor-binding loci' (MTL). The ES cell MTLs are associated with ES-cell-specific gene expression and are thought to identify genes important for stem cell self-renewal. The data provide an independent and direct, in vivo demonstration of the phenomenon described in these two earlier studies. Analysis of neural MTLs (as determined by binding of Asense, Deadpan, Prospero and Snail within a 2 kb window) shows increased sequence constraint, correlating with the number of transcription factors bound. The increase in conservation is higher than expected solely based on the combined binding sites of the factors studied. This suggests that further factors may bind to these loci. The loci associated with MTLs are enriched for genes required for proper neural development and for viability (Southall, 2009).
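One way to picture the 2 kb window criterion is a simple sweep over sorted peak positions. The sketch below is an assumption about how such loci could be called, not the published method: a locus starts at a peak and absorbs every later peak on the same chromosome that lies within the window of that starting position. The function name and the coordinates in the example are invented for illustration.

```python
def find_mtls(peaks, window=2000, min_factors=2):
    """Group peaks into loci and keep those bound by several factors.

    `peaks` is an iterable of (chromosome, position, factor) tuples.
    A peak joins the current locus if it is on the same chromosome and
    within `window` bp of the locus start (assumed grouping rule).
    """
    loci = []
    current = None
    for chrom, pos, factor in sorted(peaks):
        if (
            current is not None
            and chrom == current["chrom"]
            and pos - current["start"] <= window
        ):
            current["end"] = pos
            current["factors"].add(factor)
        else:
            if current is not None:
                loci.append(current)
            current = {"chrom": chrom, "start": pos, "end": pos, "factors": {factor}}
    if current is not None:
        loci.append(current)
    return [locus for locus in loci if len(locus["factors"]) >= min_factors]


# Invented coordinates for illustration only; not real DamID peaks.
toy_peaks = [
    ("2L", 1000, "Asense"),
    ("2L", 1500, "Deadpan"),
    ("2L", 2800, "Snail"),
    ("2L", 9000, "Prospero"),
    ("3R", 500, "Asense"),
    ("3R", 700, "Asense"),
]
mtls = find_mtls(toy_peaks)
```

The number of factors per locus, `len(locus["factors"])`, is the quantity that the text relates to sequence constraint.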

To investigate further the relationship between the number of transcription factors bound at a locus and the importance of the associated target gene in neural development, a database (www.neuroBLAST.org) was assembled comprising DamID data, expression profiling of neural transcription factors, and data on Drosophila nervous system development collated from genetic screens, expression screens, gene homology and text mining screens. Using a random permutation algorithm and training sets of known nervous system development genes, weighted scores were assigned to each screen. A total score is calculated for each gene, providing an indication of the gene's involvement in nervous system development. Multiple gene lists can be searched in the database, which is a useful method to pinpoint key genes in user-generated gene lists (e.g. expression array results) (Southall, 2009).
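The scoring idea can be illustrated with a small sketch. The source does not describe the permutation algorithm, so everything here is an assumption: a screen's weight is taken to be its observed overlap with the training set divided by the mean overlap of random gene sets of the same size, and a gene's total score is the sum of the weights of the screens in which it appears. The function names are hypothetical and do not reflect the neuroBLAST code.

```python
import random


def permutation_weight(screen_hits, training_genes, all_genes, n_perm=200, seed=0):
    """Weight a screen by how strongly it recovers known genes.

    Assumed rule: observed overlap with the training set divided by the
    mean overlap of random gene sets of the same size.
    """
    rng = random.Random(seed)
    universe = sorted(all_genes)
    observed = len(screen_hits & training_genes)
    random_total = 0
    for _ in range(n_perm):
        sample = set(rng.sample(universe, len(screen_hits)))
        random_total += len(sample & training_genes)
    expected = random_total / n_perm
    return observed / max(expected, 1e-9)  # guard against a zero mean


def total_scores(screens, weights):
    """Sum, for each gene, the weights of the screens that contain it."""
    scores = {}
    for name, hits in screens.items():
        for gene in hits:
            scores[gene] = scores.get(gene, 0.0) + weights[name]
    return scores


def rank_gene_list(genes, scores):
    """Order a user-supplied gene list by descending total score."""
    return sorted(genes, key=lambda g: (-scores.get(g, 0.0), g))
```

Ranking a user-supplied list with `rank_gene_list` mirrors the database use case mentioned above, where an expression array result is searched to pinpoint likely key genes.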

Using the data collected for the database, a correlation was consistently found between gene sets bound by increasing numbers of transcription factors and genes in Drosophila genetic screens for defects in nervous system development, eye development and cell-cycle progression or in text mining screens (occurrence of the gene or its homologue with neural or stem cell terms; r=0.98) (Southall, 2009).

This study has shown that Asense, Deadpan, Prospero and Snail bind to genes essential for neural development. This finding enables highlighting of novel genes that may be involved in neural development. The neuroBLAST database ranks genes based on the number of transcription factors bound, together with their appearance in external screens. In this way it identifies known key players in neural development such as prospero, brain tumour, miranda, seven up and glial cells missing. The majority of these genes are identified by multiple binding information (DamID data), independent of external screens and weighted scores (Southall, 2009).

Interestingly, there are many high-scoring genes that have not previously been characterised for a role in Drosophila neural development. These include CG32158, an adenylate cyclase known to be expressed in the CNS, two putative transcription factors (CG2052 and CG33291), an NADH dehydrogenase (CG2014) and an F-box protein (CG9772). There is also cenG1A, an ARF GTPase activator, which is bound by all four transcription factors and is expressed in neuroblasts. CG9650 is bound by Prospero and Deadpan, and is a homologue of the BCL11b oncogene, which is essential for proper corticospinal neuron development in vertebrates. Another high-scoring gene identified by this method is canoe (bound by all four transcription factors, neuroBLAST score of 33.7), which has recently been shown to regulate neuroblast asymmetric divisions (Southall, 2009).

Using the binding data for these four transcription factors as a foundation, attempts were made to construct the transcriptional networks governing neural stem cell self-renewal and differentiation. Although DamID reports protein-binding sites, it cannot show how individual target genes are regulated in response to binding. Expression profiling of neuroblasts and GMCs from wild type and mutant embryos can provide this information, and provide greater insight into the biological function of each of the transcriptional regulators (Southall, 2009).

Expression profiling of asense mutants was performed on 50-100 neuroblasts and GMCs microdissected from the ventral nerve cord of stage 11 wild type and mutant embryos. Genes that are bound by Asense exhibited a significant change in expression level in asense mutant neuroblasts and GMCs. In many cases, neuronal differentiation and Notch pathway genes (enhancer of split complex [E(spl)-C] and bearded complex) are upregulated in the mutant, suggesting that Asense normally represses them, whereas neuroblast genes are downregulated, suggesting they require Asense for expression. This contrasts with the data for Prospero, which represses neuroblast genes and is required for the activation of differentiation genes. Combined with the fact that Prospero represses expression of Asense, these data support an antagonistic relationship between Prospero and Asense. For example, the neuroblast genes miranda and grainy head are activated by Asense and repressed by Prospero, whereas transcription of the differentiation gene Fasciclin I is promoted by Prospero but inhibited by Asense. Interestingly, however, there are also examples of differentiation and cell-cycle exit genes activated by Asense, such as commissureless, hikaru genki and dacapo. Furthermore, when the full expression array data from prospero mutants and asense mutants are compared by cluster analysis, two clusters are found in which genes are regulated antagonistically, but also two clusters in which genes are similarly regulated. These data suggest a dual role for Asense: activating the expression of neuroblast genes and repressing differentiation genes in the neuroblast, while promoting differentiation when present in the GMC (Southall, 2009).

This study has combined in vivo chromatin profiling and cell-specific expression profiling to identify the gene regulatory networks directing neural stem cell fate and promoting differentiation in the Drosophila embryo. Asense, Deadpan, Snail and Prospero were found to bind to many of the same target genes. The targets of Asense, Deadpan and Snail include neuroblast genes but also many differentiation genes. The binding of these neural stem cell factors to differentiation genes is not entirely unexpected. In vertebrates, stem cell transcription factors bind to and repress differentiation genes to maintain the stem cell state. Additionally, it is becoming apparent that transcription factors can have roles in both activation and repression, in Drosophila and in vertebrate stem cell transcriptional networks. The ability to either repress or activate is likely to be due to interaction with co-factors, and the ability to recruit chromatin remodelling complexes to specific loci (Southall, 2009).

It was shown previously that Prospero represses the expression of Asense and Deadpan in GMCs, supporting a model whereby a core set of genes involved in neuroblast self-renewal and multipotency is activated by the neuroblast transcription factors and repressed by Prospero. This study has shown that, in part, Asense acts oppositely to Prospero, promoting the expression of neuroblast genes and repressing certain differentiation genes. However, the data also indicate that Asense can promote the expression of some genes required for differentiation, including the cell-cycle inhibitor dacapo, which is a member of the p21/p27 family of cdk inhibitors. dacapo expression initiates in the GMC; a reduction was observed in levels of dacapo mRNA in the asense mutant neuroblasts and GMCs, similar to what has been reported in the developing optic lobe. asense mRNA is known to be expressed in at least a subset of GMCs, and Asense protein is present in larval GMCs. This suggests that Asense has a secondary role, to promote GMC cell-cycle exit and differentiation. Asense is absent in larval PAN neuroblasts whose progeny, unlike GMCs, divide in a stem cell-like manner. Ectopic expression of Asense prevents formation of these daughter cells, which can undergo extra divisions (Bowman, 2008), possibly by the upregulation of dacapo and other differentiation genes (Southall, 2009).

The expression pattern, function and binding site specificity of Asense all correlate strongly with its vertebrate counterpart, Ascl1 (Mash1). Mash1 is expressed in neural precursors in vertebrates, is known to regulate genes involved in Notch signalling (Delta, Jag2, Lfng and Magi1), cell-cycle control (Cdc25b) and neuronal differentiation (Insm1), and recognises the E-box sequence, CAGCTG. Furthermore, Mash1 is consistently found to promote neuronal differentiation, consistent with a pro-differentiation role for Asense. Conversely, it was shown that Asense activates the expression of certain neuroblast genes, such as miranda, which is expressed in all neuroblasts and repressed by Prospero. Deadpan and Snail bind to many neuroblast genes. Given that the expression of deadpan and snail is restricted to pan-neural neuroblasts, it is likely that they can also activate the expression of neuroblast genes. However, confirmation of this awaits expression profiling of deadpan and snail mutant neuroblasts and GMCs (Southall, 2009).

Finally, this study has shown that multiple transcription factor binding is associated with genes that have critical functions in neural development. This relationship can be used to identify novel genes involved in neural development, including those with vertebrate counterparts. A similar gene network and data mining study, using two pair-rule genes in Drosophila, has recently been used to identify a new marker for kidney cancer. Therefore, large-scale analysis of gene regulatory networks, as used here, provides a powerful approach to identifying key genes involved in development and disease (Southall, 2009).

RTK signaling modulates the Dorsal gradient

The dorsoventral (DV) axis of the Drosophila embryo is patterned by a nuclear gradient of the Rel family transcription factor, Dorsal (Dl), that activates or represses numerous target genes in a region-specific manner. This study demonstrates that signaling by receptor tyrosine kinases (RTK) reduces nuclear levels and transcriptional activity of Dl, both at the poles and in the mid-body of the embryo. These effects depend on wntD, which encodes a Dl antagonist belonging to the Wingless/Wnt family of secreted factors. Specifically, it was shown that, via relief of Groucho- and Capicua-mediated repression, the Torso and EGFR RTK pathways induce expression of WntD, which in turn limits Dl nuclear localization at the poles and along the DV axis. Furthermore, this RTK-dependent control of Dl is important for restricting expression of its targets in both contexts. Thus, the results reveal a new mechanism of crosstalk, whereby RTK signals modulate the spatial distribution and activity of a developmental morphogen in vivo (Helman, 2012).

Specification of body axes in all metazoans is initiated by a small number of inductive signals that must be integrated in time and space to control complex and unique patterns of gene expression. It is therefore of utmost importance to unravel the mechanisms underlying crosstalk between different signaling cues that concur during early development. This study has elucidated a novel signal integration mechanism that coordinates RTK signaling pathways with the Dl nuclear gradient, and thus with terminal and DV patterning of the Drosophila embryo (Helman, 2012).

Previous work had identified an input by Torso signaling into specific transcriptional effects of Dl. The current results establish a general mechanism, which involves RTK-dependent control of the nuclear Dl gradient itself, and thus affects a large group of Dl targets. This regulatory input is based on RTK-dependent derepression of wntD, a Dl target that encodes a feedback inhibitor of the Dl gradient. Thus, Dl activates wntD effectively only when accompanied by RTK signaling, enabling region-specific negative-feedback control of the nuclear Dl gradient. In the absence of RTK signaling, wntD is not expressed and the levels of nuclear Dl are elevated. Consequently, Dl target genes are ectopically expressed, both at the poles and along the DV axis (Helman, 2012).

Torso RTK signaling depends on maternal cues and is independent of the Dl gradient. Thus, it can be viewed as a gating signal that operates only at the embryonic poles, where it controls Dl-dependent gene regulation. However, the activity of the EGFR RTK pathway later on in development crucially depends on Dl, which induces the neuroectodermal expression of rhomboid, a gene encoding a serine protease required for processing of the EGFR ligand Spitz. In this case, EGFR-dependent induction of WntD represents a negative feedback loop that reduces nuclear levels of Dl laterally and, consequently, limits the expression of multiple Dl targets along the DV axis (Helman, 2012).

It should be noted that the regulatory interactions that have been characterized do not preclude the existence of other mechanisms modulating nuclear Dl concentration or activity. For example, the progressive dilution or degradation of maternal components involved in Toll receptor activation upstream of Dl should cause reduced Dl nuclear accumulation and retraction of its targets as development proceeds. It is also possible that Torso- or EGFR-induced repressors block transcription of Dl target genes directly. Accordingly, the ectopic sna expression observed in embryos mutant for components of the Torso pathway such as DSor and trunk probably reflects both loss of WntD activity on Dl and loss of Hkb-mediated repression of sna. In this context, it is interesting to note that sna expression expands and colocalizes with Hkb at the poles of wntD mutants; perhaps repression of sna by Hkb is not sufficient to override increased Dl activation in this genetic background. Thus, the Torso pathway probably employs more than one mechanism to exclude Dl target expression from the termini. Furthermore, the existence of such additional regulatory mechanisms could explain why wntD mutants do not have a clear developmental phenotype, despite the broad effects on Dl-dependent gene expression patterns caused by the genetic removal of wntD. It is proposed that corrective mechanisms, such as RTK-induced repressors, are present and make the terminal and DV systems robust with respect to removal of the WntD-based feedback. Understanding the basis of this robustness will require additional studies (Helman, 2012).

This work shows that RTK-dependent relief of Gro- and Cic-mediated repression is essential for transcriptional activation of wntD by Dl. Correspondingly, in the absence of cic or gro, the early expression of wntD expands ventrally throughout the domain of nuclear Dl. The early onset of this derepression, and the presence of at least one conserved Cic-binding site in the proximal upstream region of wntD, indicate that repression of wntD may be direct. Interestingly, it is thought that Gro and Cic are also involved in assisting Dl-mediated repression of other targets such as dpp and zen, as gro and cic mutant embryos show derepression of those targets in ventral regions. However, as ectopic wntD expression in these mutants leads to reduced nuclear localization of Dl along the ventral region, it is conceivable that decreased Dl activity also contributes to the derepression of dpp and zen (Helman, 2012).

In conclusion, the data presented in this study demonstrate RTK-dependent control of nuclear Dl via wntD, based on multiple regulatory inputs, including negative gating, feed-forward loops and negative feedback control. Together, these mechanisms provide additional combinatorial tiers of spatiotemporal regulation to Dl target gene expression. Future studies will show whether other signal transduction cascades and/or additional developmental cues also impinge on the Dl morphogen gradient (Helman, 2012).

The Interactive Fly resides on the Society for Developmental Biology's Web server.