Promoter Structure

caudal contains a TATA-box deficient (TATA-less) promoter. Such promoters have a conserved sequence motif, A/GGA/TCGTG, termed the downstream promoter element (DPE), which is located about 30 nucleotides downstream of the RNA start site of many TATA-less promoters, including caudal. DNase I footprinting of the binding of epitope-tagged TFIID to TATA-less promoters reveals that the factor protects a region that extends from the initiation site sequence (about +1) to about 35 nucleotides downstream of the RNA start site. There is no such downstream DNase I protection induced by TFIID in promoters with TATA motifs. This suggests that the DPE acts in conjunction with the initiation site sequence to provide a binding site for TFIID in the absence of a TATA box to mediate transcription of TATA-less promoters (Burke, 1996).
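The positional logic described above (a consensus motif, A/GGA/TCGTG, about 30 nucleotides downstream of the start site) can be sketched as a simple sequence scan. This is a minimal illustration, not the method used by Burke (1996): the function name, the +20 to +40 search window, and the toy sequence are all assumptions introduced here.

```python
import re

# Consensus A/G-G-A/T-C-G-T-G from the text, written as a regular expression.
# The lookahead lets overlapping matches be reported as well.
DPE_PATTERN = re.compile(r"(?=([AG]G[AT]CGTG))")

def find_dpe(sequence, tss_index, window=(20, 40)):
    """Return (position relative to +1, matched motif) for DPE-like matches.

    sequence  : promoter DNA, 5'->3'
    tss_index : 0-based index of the +1 nucleotide within `sequence`
    window    : hypothetical search range downstream of +1 (illustrative only)
    """
    seq = sequence.upper()
    hits = []
    for match in DPE_PATTERN.finditer(seq):
        rel = match.start(1) - tss_index + 1  # the +1 nucleotide maps to +1
        if window[0] <= rel <= window[1]:
            hits.append((rel, match.group(1)))
    return hits

# Synthetic toy sequence (NOT the caudal promoter): +1 at index 10,
# with a consensus match placed so that it begins at +30.
toy = "T" * 10 + "A" + "T" * 28 + "AGACGTG" + "TTT"
print(find_dpe(toy, tss_index=10))
```

A real analysis would need the annotated start site of each promoter and a tolerance for the "about 30 nucleotides" spacing; the window used here is only a placeholder.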

Transcriptional Regulation

Different maternal and zygotic transcripts suggest the presence of two stage-specific promoters (Mlodzik, 1987a).

The caudal gene is regulated by hunchback. caudal is a maternally and zygotically expressed gene. The two phases of expression can functionally replace each other. The zygotic expression forms an abdominal and a posterior domain. The abdominal cad domain is regulated by the Hunchback gradient through repression at high concentrations in the anterior region of the embryo and activation at low concentrations in the posterior region of the embryo (Schulz, 1995).

Hunchback is involved directly in establishing the caudal gradient through repression of cad RNA translation in anterior regions of the embryo, and activation at low concentrations of HB in the posterior region of the embryo (Schulz, 1995).

Krüppel is required for cad expression in Malpighian tubules (Liu, 1992).

T-related gene expression is activated by tailless, but Trg does not regulate itself. Trg expression in the hindgut and anal pad primordia is required for the regulation of genes encoding transcription factors (even-skipped, engrailed, caudal, AbdominalB and orthopedia) and cell signaling molecules (wingless and decapentaplegic). In Trg mutant embryos, the defective program of gene activity in these primordia is followed by apoptosis (initiated by reaper expression and completed by macrophage engulfment), resulting in severely reduced hindgut and anal pads. Only anal pad expression of cad requires Trg; expression in Malpighian tubules and hindgut is not affected. dpp expression is absent in the hindgut. Females mutant for Trg are sterile, and no egg chambers in their ovaries progress beyond stage 7 of ovarian development (Singer, 1996).

Transcriptional regulation of the caudal gene by DRE/DREF

The caudal-related homeobox transcription factors are required for the normal development and differentiation of intestinal cells. Recent reports indicate that misregulation of homeotic gene expression is associated with gastrointestinal cancer in mammals. However, the molecular mechanisms that regulate expression of the caudal-related homeobox genes are poorly understood. A DNA replication-related element (DRE) has been identified in the 5' flanking region of the Drosophila caudal gene. Gel-mobility shift analysis reveals that three of the four DRE-related sequences in the caudal 5'-flanking region are recognized by the DRE-binding factor (DREF). Deletion and site-directed mutagenesis of these DRE sites results in a considerable reduction in caudal gene promoter activity. Analyses with transgenic flies carrying a caudal-lacZ fusion gene bearing wild-type or mutant DRE sites indicate that the DRE sites are required for caudal expression in vivo. These findings indicate that DRE/DREF is a key regulator of Drosophila caudal homeobox gene expression and suggest that DREs and DREF contribute to intestinal development by regulating caudal gene expression (Choi, 2004).

Dynamical analysis of regulatory interactions in the gap gene system of Drosophila

Genetic studies have revealed that segment determination in Drosophila melanogaster is based on hierarchical regulatory interactions among maternal coordinate and zygotic segmentation genes. The gap gene system constitutes the most upstream zygotic layer of this regulatory hierarchy, responsible for the initial interpretation of positional information encoded by maternal gradients. A detailed analysis of regulatory interactions involved in gap gene regulation is presented based on gap gene circuits, which are mathematical gene network models used to infer regulatory interactions from quantitative gene expression data. The models reproduce gap gene expression at high accuracy and temporal resolution. Regulatory interactions found in gap gene circuits provide consistent and sufficient mechanisms for gap gene expression, which largely agree with mechanisms previously inferred from qualitative studies of mutant gene expression patterns. These models predict activation of Kr by Cad and clarify several other regulatory interactions. This analysis suggests a central role for repressive feedback loops between complementary gap genes. Repressive interactions among overlapping gap genes show anteroposterior asymmetry with posterior dominance. Finally, these models suggest a correlation between timing of gap domain boundary formation and regulatory contributions from the terminal maternal system (Jaeger, 2004b).

Although activating contributions from Bcd and Cad show some degree of localization, positioning of gap gene boundaries during cycle 14A is largely under the control of repressive gap-gap cross-regulatory interactions. Thereby, activation is a prerequisite for repressive boundary control, which counteracts broad activation of gap genes in a spatially specific manner. In addition, gap genes show a tendency toward autoactivation, which increasingly potentiates activation by Bcd and Cad during cycle 14A. Autoactivation is involved in maintenance of gap gene expression within given domains and sharpening of gap domain boundaries during cycle 14A (Jaeger, 2004b).

Regulatory loops of mutual repression create positive regulatory feedback between complementary gap genes, providing a straightforward mechanism for their mutually exclusive expression patterns. Such a mechanism of 'alternating cushions' of gap domains has been proposed previously. The results suggest that this mechanism is complemented by repression among overlapping gap genes. Overlap in expression patterns of two repressors imposes a limit on the strength of repressive interactions between them. Accordingly, repression between neighboring gap genes is generally weaker than that between complementary ones. Moreover, repression among overlapping gap genes is asymmetric, centered on the Kr domain. Posterior to this domain, only posterior neighbors contribute functional repressive inputs to gap gene expression, while anterior neighbors do not. This asymmetry is responsible for anterior shifts of posterior gap gene domains during cycle 14A (Jaeger, 2004b).
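How mutual repression yields mutually exclusive expression can be illustrated with a generic two-gene toggle. This is a minimal sketch and not the gap gene circuit model of Jaeger (2004b): the Hill-type repression function, the parameters `k` and `n`, and the starting levels are arbitrary assumptions chosen only to show the positive-feedback behaviour.

```python
def hill_repression(repressor, k=0.5, n=4):
    """Synthesis rate (between 0 and 1) that falls as the repressor level rises."""
    return 1.0 / (1.0 + (repressor / k) ** n)

def simulate_toggle(a0, b0, dt=0.01, steps=5000):
    """Euler integration of two genes that repress each other.

    da/dt = hill_repression(b) - a
    db/dt = hill_repression(a) - b
    """
    a, b = a0, b0
    for _ in range(steps):
        da = hill_repression(b) - a
        db = hill_repression(a) - b
        a, b = a + dt * da, b + dt * db
    return a, b

# A small initial advantage for one gene is amplified until the other is shut off.
print(simulate_toggle(0.6, 0.4))  # gene a ends high, gene b ends low
print(simulate_toggle(0.4, 0.6))  # the mirror-image outcome
```

In this toy system, whichever gene starts slightly ahead ends up dominating, which is the sense in which two repressive links form a positive feedback loop.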

Repression by Tll mediates regulatory input to gap gene expression by the terminal maternal system. Tll provides the main repressive input to early regulation of the posterior boundary of posterior gt, and activation by Tll is required for posterior hb expression. Note that these two features form only during cycle 13 and early cycle 14A, while other gap domain boundaries are already present at the transcript level during cycles 10-12 and largely depend on the anterior and posterior maternal systems for their initial establishment. The delayed formation of posterior patterning features and their distinct mode of regulation are reminiscent of segment determination in primitive dipterans and intermediate germ-band insects, supporting a conserved dynamical mechanism across different insect taxa (Jaeger, 2004b).

The set of regulatory interactions presented here provides a consistent and sufficient dynamical mechanism for gap gene expression. In summary, this set of interactions consists of the following five basic regulatory mechanisms: (1) broad activation by Bcd and/or Cad, (2) autoactivation, (3) strong repressive feedback between mutually exclusive gap genes, (4) asymmetric repression between overlapping gap genes, and (5) feed-forward repression of posterior domain boundaries by the terminal gap gene tll. In the following subsections, evidence is discussed concerning specific regulatory interactions involved in each of these basic mechanisms in some detail (Jaeger, 2004b).

Activation by Bcd and Cad: Activation of gap gene expression by Bcd and Cad is supported by the following. Bcd binds to the regulatory regions of hb, Kr, and kni. The kni regulatory region also contains binding sites for Cad. The anterior domains of gt and hb are absent in embryos from bcd mutant mothers. The posterior domain of gt is missing in embryos mutant for both maternal and zygotic cad, while the posterior domain of kni is absent in embryos mutant for maternal bcd plus maternal and zygotic cad. These results suggest partial redundancy of activation of kni by Bcd, consistent with evidence from zygotic cad mutant embryos from bcd mutant mothers, where maternally provided Cad is sufficient to activate kni (Jaeger, 2004b).

Kr expression expands anteriorly in embryos from bcd mutant mothers, which is due to the absence of the anterior gt and hb domains. Bcd has been shown to activate expression of Kr reporter constructs. The fact that Kr is still expressed in embryos from bcd mutant mothers has been attributed to activation by general transcription factors or low levels of Hb. In contrast, the models predict that this activation is provided by Cad. Although Kr expression is normal in embryos overexpressing cad, repressive control of Kr boundaries could account for the lack of expansion of the Kr domain in such embryos (Jaeger, 2004b).

The activating effect of Cad on hb found in gap gene circuits is likely to be spurious. The anterior hb domain is absent in embryos from bcd mutant mothers, which show uniformly high levels of Cad. Moreover, the complete absence of the posterior hb domain in tll mutants suggests activation of posterior hb by Tll rather than by Cad. It is believed that this spurious activation of hb by Cad is due to the absence of hkb in gap gene circuits. The posterior hb domain fails to retract from the posterior pole in hkb mutants, suggesting a repressive role of Hkb in regulation of the posterior hb border. Consistent with this, the posterior boundary of the posterior hb domain never fully forms in any of the circuits. Moreover, Tll is constrained to a very small or no interaction with hb due to the absence of the posterior repressor Hkb, since activation of hb by Tll would lead to increasing hb expression extending to the posterior pole (Jaeger, 2004b).

Autoactivation: A role for autoactivation in the late phase of hb regulation is supported by the fact that the posterior border of anterior hb is shifted anteriorly in a concentration-dependent manner in embryos with decreasing doses of zygotic Hb. Weakened and narrowed expression of Kr in mutants encoding a functionally defective Kr protein suggests Kr autoactivation. Similarly, a delay in the expression of gt in mutants encoding a defective Gt protein indicates gt autoactivation. However, the results suggest that gt autoactivation is not essential. It is generally weaker than autoactivation of other gap genes, and circuits lacking gt autoactivation show no specific defects in gt expression. Finally, in the case of kni, there is no experimental evidence for autoactivation, while some authors have even suggested kni autorepression. No such autorepression has been detected in any gap gene circuit (Jaeger, 2004b).

Repression between complementary gap genes: Mutual repression of gt and Kr is supported by the following. gt expression expands into the region of the central Kr domain in Kr mutant embryos. In contrast, Kr expression is not altered in gt mutants before germ-band extension. However, Gt binds to the Kr regulatory region, and the central domain of Kr is absent in embryos overexpressing gt. Moreover, Kr expression extends further anterior in hb gt double mutants than in hb mutants alone. The above is consistent with this analysis, which shows no significant derepression of Kr in the absence of Gt even though repression of Kr by Gt is quite strong (Jaeger, 2004b).

Hb binds to the kni regulatory region, and the posterior kni domain expands anteriorly in hb mutants. Embryos overexpressing hb show no kni expression at all, and embryos misexpressing hb show spatially specific repression of kni expression. There is no clear posterior expansion of kni in hb mutants. This could be due to the relatively weak and late repressive contribution of Hb on the posterior kni boundary or due to partial redundancy with repression by Gt and Tll. The posterior hb domain expands anteriorly in kni mutants, but anterior hb expression is not altered in these embryos. Nevertheless, a role of Kni in positioning the anterior hb domain is suggested by the fact that misexpression of kni leads to spatially specific repression of both anterior and posterior hb domains. Moreover, only slight posterior expansion of anterior hb is observed in Kr mutants, while hb is completely derepressed between its anterior and posterior domains in Kr kni double mutants (Jaeger, 2004b).

Repression between overlapping gap genes: gt, kni, and Kr show repression by their immediate posterior neighbors hb, gt, and kni, respectively. Retraction of posterior Gt from the posterior pole during midcycle 14A fails to occur in hb mutants, and no gt expression is observed in embryos overexpressing hb. The posterior kni boundary is shifted posteriorly in gt mutant embryos, and kni expression is reduced in embryos overexpressing gt. Note that these effects are very subtle and were not reported in similar studies by different authors. A weak but functional interaction of Gt with kni is consistent with these results. This interaction was found to be essential even in a circuit where it was deemed below significance level. Finally, Kni has been shown to bind to the Kr regulatory region, and the central Kr domain expands posteriorly in kni mutants (Jaeger, 2004b).

In contrast, no effect of Kr on hb was detected. However, hb expression expands posteriorly in Kr mutants. This effect is likely to involve repression of hb by Kni. Kni levels are reduced in Kr mutant embryos. hb is completely derepressed between its anterior and posterior domains in Kr kni double mutants, whereas anterior hb does not expand at all in kni mutants alone. Taken together, these results suggest that there is direct repression of hb by Kr in the embryo, but it is at least partially redundant with repression of hb by Kni (Jaeger, 2004b).

Unlike repression by posterior neighbors, no or only weak repression was found of posterior kni, gt, and hb by their anterior neighbors Kr, kni, and gt, respectively. Most gap gene circuits show weak activation of hb by Gt. Graphical analysis failed to reveal any functional role for such activation. Moreover, no functional interaction was found between gt and Kni. Although relatively weak repression of kni by Kr was found in 6 out of 10 circuits, no specific patterning defects could be detected in the other 4. Consistent with the above, expression of posterior hb is normal in gt mutants, and both the anterior boundaries of posterior gt and kni are positioned correctly in kni and Kr mutant embryos, respectively (Jaeger, 2004b).

Note that activation of kni by Kr, which has been proposed to explain decreased expression levels of kni in Kr mutants, was never found. The results strongly support the view that this interaction is indirect through Gt, which is further corroborated by the fact that kni expression is completely restored in Kr gt double mutants compared to that in Kr mutants alone (Jaeger, 2004b).

A significant repressive effect of Hb on Kr was found. Consistent with this, Hb has been shown to bind to the Kr regulatory region, and the central Kr domain expands anteriorly in hb mutants. However, partial redundancy of this interaction is suggested by correct positioning and shape of the anterior Kr domain in a circuit that does not show repression of Kr by Hb (Jaeger, 2004b).

It has been proposed that Hb plays a dual role as both activator and repressor of Kr. In the framework of the gene circuit model, concentration-dependent switching of regulative action could be implemented by allowing genetic interconnection parameters to switch sign at certain regulator concentration thresholds. The current model explicitly does not include such a possibility. Nevertheless, circuits have been obtained that reproduce Kr expression faithfully, suggesting that a dual role of Hb is not required for proper Kr expression. Moreover, activation of Kr by Hb was never observed in any of the circuits. Therefore, the results support a mechanism in which the activation of Kr by Hb is indirect through derepression of kni (Jaeger, 2004b).

Repression by Tll: Only a few earlier theoretical approaches have considered terminal gap genes. Gap gene circuits accurately reproduce tll expression. However, in gene circuits, tll is subject to regulation by other gap genes, which is inconsistent with experimental evidence. In contrast, the correct expression pattern of tll in gap gene circuits allows its effect on other gap genes to be studied in great detail. Strong repressive effects of Tll on Kr, kni, and gt have been found. Tll binding sites have been found in the regulatory regions of Kr and kni. In tll mutants, Kr expression is normal, whereas expression of kni expands posteriorly, and the posterior gt domain fails to retract from the posterior pole. No expression of Kr, kni, or gt can be detected in embryos overexpressing tll under a heat-shock promoter (Jaeger, 2004b).

Targets of Activity

A major challenge in interpreting genome sequences is understanding how the genome encodes the information that specifies when and where a gene will be expressed. The first step in this process is the identification of regions of the genome that contain regulatory information. In higher eukaryotes, this cis-regulatory information is organized into modular units [cis-regulatory modules (CRMs)] of a few hundred base pairs. A common feature of these cis-regulatory modules is the presence of multiple binding sites for multiple transcription factors. Transcription factor binding sites have a tendency to cluster; the extent to which they do can be used as the basis for the computational identification of cis-regulatory modules. By using published DNA binding specificity data for five transcription factors active in the early Drosophila embryo, genomic regions containing unusually high concentrations of predicted binding sites were identified for these factors. A significant fraction of these binding site clusters overlap known CRMs that are regulated by these factors. In addition, many of the remaining clusters are adjacent to genes expressed in a pattern characteristic of genes regulated by these factors. One of the newly identified clusters, mapping upstream of the gap gene giant (gt) was tested; it acts as an enhancer that recapitulates the posterior expression pattern of gt (Berman, 2002).

The transcription factors Bicoid (Bcd), Caudal (Cad), Hunchback (Hb), Krüppel (Kr), and Knirps (Kni) act at very early stages of Drosophila development to define the anterior-posterior axis of the embryo. Bcd and Cad are maternal activators broadly distributed in the anterior and posterior portions of the embryo, respectively. Hb, Kr, and Kni are zinc-finger gap proteins that act primarily as repressors in specific embryonic domains. Sequences of previously described binding sites were collected for these five factors present in the cis-regulatory regions of known target genes. The binding sequences for each factor were aligned by using the motif-assembly program, and the binding specificities of each factor were modeled with position weight matrices (PWMs). PWMs are a useful way to represent binding specificities and provide a statistical framework for searching for novel instances of the motif in genome sequences (Berman, 2002).
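The idea of modeling binding specificity with a position weight matrix can be sketched in a few lines. This is a generic log-odds PWM built from aligned sites, offered as an illustration only: it is not the motif-assembly program or the exact scoring used by Berman (2002), and the pseudocount, the uniform background, and the toy sites are assumptions made here.

```python
import math

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Log-odds position weight matrix from equal-length aligned sites.

    Each column maps a base to log2(observed frequency / background),
    with a pseudocount so that unseen bases get a finite penalty.
    """
    length = len(sites[0])
    total = len(sites) + 4 * pseudocount
    pwm = []
    for i in range(length):
        column = [site[i] for site in sites]
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm

def score(pwm, candidate):
    """Sum of per-position log-odds for a candidate site of matching length."""
    return sum(column[base] for column, base in zip(pwm, candidate))

# Made-up aligned sites, used only to exercise the functions.
toy_sites = ["ACGT", "ACGT", "ACGA", "ACTT"]
pwm = build_pwm(toy_sites)
print(score(pwm, "ACGT"))  # consensus-like site, positive score
print(score(pwm, "TGCA"))  # poor match, negative score
```

Scanning a genome then amounts to sliding a window of the motif length along the sequence and keeping positions whose score passes a chosen cutoff.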

A freely available program, PATSER, was used to search the genome for sequences that match these PWMs, and a web-based visualization tool, CIS-ANALYST, was devised to display the location of predicted binding sites along with genome annotations in selected genomic regions. PATSER assigns a score to each potential site that reflects the agreement between the site and the corresponding PWM. These scores approximate the free energy of binding between the factor and site, and CIS-ANALYST uses a user-defined cutoff parameter to eliminate predicted low-affinity sites (Berman, 2002).

Using CIS-ANALYST, the distribution of Bcd, Cad, Hb, Kr, and Kni binding sites was examined in a 1-Mb genomic region surrounding the well-characterized eve locus at a site_p value of 0.0003. At this relatively high-stringency value, most experimentally verified binding sites are retained; at more restrictive values, many of these sites would be lost (Berman, 2002).

To investigate whether binding site clustering could help to explain the specificity of these factors for eve, a simple notion of binding site clustering was incorporated into CIS-ANALYST, allowing searches for segments of a specified length containing a minimum number of predicted binding sites. When the 1-Mb region surrounding eve was searched for dense clusters of predicted high-affinity sites (at least 13 Bcd, Cad, Hb, Kr, or Kni sites in a 700-bp window), three discrete regions were identified. Strikingly, these three clusters are all adjacent to eve, and overlap the previously characterized stripe 2, stripe 3 + 7, and stripe 4 + 6 enhancers (Berman, 2002).
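The clustering criterion described above (a minimum number of predicted sites within a window of fixed length) can be sketched as a sliding-window count. This is an illustrative reimplementation under stated assumptions, not the CIS-ANALYST code: the function name, the choice to anchor each window at a site, and the merging of overlapping hits are all choices made here.

```python
def find_clusters(positions, window=700, min_sites=13):
    """Return merged (start, end) intervals of dense binding-site clusters.

    positions : genomic coordinates of predicted sites (any factor)
    window    : window length in bp (700 in the text)
    min_sites : minimum number of sites per window (13 in the text)
    """
    pos = sorted(positions)
    clusters = []
    right = 0
    for left in range(len(pos)):
        # Advance `right` to the first site outside [pos[left], pos[left] + window).
        while right < len(pos) and pos[right] < pos[left] + window:
            right += 1
        if right - left >= min_sites:
            start, end = pos[left], pos[right - 1]
            if clusters and start <= clusters[-1][1]:
                # Overlaps the previous hit: extend it instead of adding a new one.
                clusters[-1] = (clusters[-1][0], max(clusters[-1][1], end))
            else:
                clusters.append((start, end))
    return clusters

# Toy coordinates: 13 sites packed into 600 bp, plus three scattered sites.
dense = list(range(1000, 1650, 50))
sparse = [5000, 9000, 20000]
print(find_clusters(dense + sparse))
```

Raising `min_sites` (for example to 15, as in the more stringent search) shrinks the number of reported clusters, which mirrors the trade-off between recovered known CRMs and novel predictions.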

To generalize and quantify these promising results, a broader collection of 19 well-defined CRMs from 9 Drosophila genes known to be required for proper embryonic development was compiled. Each of these CRMs is sufficient to direct the expression of a distinct anterior-posterior pattern in early embryos; genetic evidence suggests that each CRM is regulated by at least one of the following: Bcd, Cad, Hb, Kr, and Kni. Mutation and in vitro DNA binding studies completed on a subset of the CRMs provide evidence for a direct regulatory relationship. The same clustering criteria that were successful for identifying CRMs in eve (700-bp regions with at least 13 predicted binding sites) identified clusters overlapping 14 of these 19 known CRMs (Berman, 2002).

A search of the entire genome for 700-bp windows containing at least 13 predicted binding sites identified 133 clusters in addition to the 19 described above, or ~1 per 700 kb of noncoding sequence. As expected, when more stringent clustering criteria are used, both the number of known CRMs recovered and the number of novel clusters identified decrease. The novel clusters identified with a density of at least 15 binding sites per 700 bp, a level at which half of the known CRMs are still recovered, were further examined. Binding site plots for the 22 novel clusters identified at this high stringency condition, and 6 additional novel clusters identified with an equally stringent search by using only Bcd, Hb, Kr, and Kni, have been published as supporting information on the PNAS web site. Twenty-three of these 28 clusters fall in regions between genes, whereas the remaining 5 fall in introns. There are therefore 49 genes that either contain a novel cluster of binding sites or flank an intergenic region that does. The expression patterns of these 49 genes in early embryos were examined by whole-mount RNA in situ hybridization and DNA microarray hybridization. At least 10 of the 28 clusters were adjacent to a gene that showed localized anterior-posterior expression in the syncytial or cellular blastoderm stages, consistent with early regulation by maternal effect or gap transcription factors. Although the numbers are small, this is significantly more than the 1 or 2 expected if the positions of clusters had been chosen at random (Berman, 2002).

One of these clusters is located ~2 kb upstream of the gap gene giant (gt). During cellularization, gt is expressed in two broad domains, one in the anterior and one in the posterior portion of the embryo. The pattern of expression of the posterior expression domain is determined by the activities of Cad, Hb, and Kr. However, the cis-regulatory sequence controlling this posterior expression pattern has not been precisely identified. Whether this cluster of binding sites might be the gt posterior enhancer was evaluated. A 1.1-kb fragment containing this cluster was placed in a reporter construct containing the eve minimal promoter fused to a lacZ reporter gene. The expression pattern of this construct largely recapitulates the early expression pattern of the gt posterior expression domain. In the absence of Kr function, the anterior border of the gt posterior domain shifts anteriorly, indicating repression by Kr. The construct containing the gt posterior enhancer exhibits a similar shift in the absence of Kr (Berman, 2002).

Organization of developmental enhancers in the Drosophila embryo

Most cell-specific enhancers are thought to lack an inherent organization, with critical binding sites distributed in a more or less random fashion. However, there are examples of fixed arrangements of binding sites, such as helical phasing, that promote the formation of higher-order protein complexes on the enhancer DNA template. This study investigated the regulatory 'grammar' of nearly 100 characterized enhancers for developmental control genes active in the early Drosophila embryo. The conservation of grammar is examined in seven divergent Drosophila genomes. Linked binding sites are observed for particular combinations of binding motifs, including Bicoid-Bicoid, Hunchback-Hunchback, Bicoid-Dorsal, Bicoid-Caudal and Dorsal-Twist. Direct evidence is presented for the importance of Bicoid-Dorsal linkage in the integration of the anterior-posterior and dorsal-ventral patterning systems. Hunchback-Hunchback interactions help explain unresolved aspects of segmentation, including the differential regulation of the eve stripe 3 + 7 and stripe 4 + 6 enhancers. Evidence is presented that there is an under-representation of nucleosome positioning sequences in many enhancers, raising the possibility for a subtle higher-order structure extending across certain enhancers. It is concluded that grammar of gene control regions is pervasively used in the patterning of the Drosophila embryo (Papatsenko, 2009).

Nearly 100 characterized enhancers and ~30 associated binding motifs control the patterning of the early Drosophila embryo, probably the best understood developmental process. These enhancers and sequence-specific TFs regulate the expression of ~50 genes controlling AP and DV patterning, including segmentation and gastrulation. The known TFs controlling embryogenesis represent less than ~10% of all TFs in the Drosophila genome. Thus, this analysis of regulatory grammar was restricted to the ~100 AP and DV enhancers and their ~30 TF inputs (Papatsenko, 2009).

The recent completion of whole-genome sequence assemblies for 12 divergent Drosophila species has created an unprecedented opportunity for analyzing enhancer evolution. In this study 96 selected enhancer sequences from D. melanogaster were mapped to all 12 Drosophila genomes, using the UCSC Browser. The resulting collection combined 1420 kb of genomic sequence data in 1127 sequences, representing 60 enhancers in 23 AP genes and 36 enhancers in 31 DV genes. The entire collection of sequences and binding motifs is available at the Berkeley on-line resource (Papatsenko, 2009).

Inspection of aligned enhancer sequences among all 12 Drosophila species revealed strong conservation within the D. melanogaster subgroup (D. melanogaster, D. simulans, D. sechellia, D. yakuba and D. erecta) and also within the D. obscura group (D. pseudoobscura and D. persimilis). In order to focus on evolutionary changes in these enhancers, the seven most divergent Drosophilids were analyzed: D. melanogaster, D. ananassae, D. pseudoobscura, D. willistoni, D. mojavensis, D. virilis and D. grimshawi. The remaining five species contain conservation patterns that are similar to those present in D. melanogaster or D. pseudoobscura (Papatsenko, 2009).

Short-range TF-binding linkages (0-80 bp) were examined in the collection of 96 enhancers from seven species for homo- and heterotypic pairs of binding motifs. Binding sites for the 30 most reliable TF motifs (see the Berkeley online resource) were mapped in enhancers using position weight matrices with match probability cutoff values set to ~2E-04. Distance histograms were generated for distances smaller than 80 bp, measured between the putative centers of each pair of neighboring site matches. Periodic signals were identified in the distance histograms using Fourier analysis, and statistical significance was estimated by bootstrapping positions of site matches in each enhancer sequence (Papatsenko, 2009).
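The two analysis steps described above, a histogram of distances between neighboring site centers and a search for periodic signal in that histogram, can be sketched with the standard library alone. This is a simplified illustration and not the authors' pipeline: the function names are hypothetical, a single Fourier component is evaluated directly rather than a full spectrum, and the bootstrapping step used to assess significance is omitted.

```python
import cmath

def distance_histogram(centers, max_dist=80):
    """Counts of distances (bp) between neighboring site centers, below max_dist."""
    hist = [0] * max_dist
    ordered = sorted(centers)
    for a, b in zip(ordered, ordered[1:]):
        d = b - a
        if 0 < d < max_dist:
            hist[d] += 1
    return hist

def power_at_period(hist, period):
    """Squared magnitude of the Fourier component with the given period (bp)."""
    mean = sum(hist) / len(hist)
    component = sum((h - mean) * cmath.exp(-2j * cmath.pi * d / period)
                    for d, h in enumerate(hist))
    return abs(component) ** 2

# Toy site centers whose neighbor distances are all multiples of 11 bp.
toy_centers = [0, 11, 33, 66, 110, 165, 231, 308]
hist = distance_histogram(toy_centers)
print(power_at_period(hist, 11) > power_at_period(hist, 15))
```

With real data the histogram would pool distances across all enhancers and species, and the observed power would be compared against the distribution obtained from shuffled site positions.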

Fourier analysis identified helical phasing (~11 bp spacing) for several different homotypic activator-activator motif pairs. Such periodic signals were found in the distributions of Bcd-binding sites. Weaker helical-phasing signals were also identified for Caudal (Cad) and Dl-binding sites. Periodic signals close to two DNA turns (~20-22 bp) were found for Twi, Hb and Krüppel. Such helical phasing raises the possibility of direct protein-protein interactions (Papatsenko, 2009).

A weaker, ~11.4-bp periodic signal was detected in the distribution of heterotypic activator-activator site pairs, including Dl-Twi and Bcd-Cad. In contrast, there is a significant reduction in helical phasing signatures for activator-repressor motif pairs, and in fact, an over-representation of site pairs with 'anti-helical' spacing (15.2 bp). A similar 15.2 bp anti-helical signal was detected in distributions of all possible pair-wise combinations of the 30 binding motifs examined in this study. Thus, it would appear that any two randomly chosen binding sites are more likely to occupy the opposite sides of the DNA duplex as compared with helical phasing. This observation raises the possibility that most TFs function either additively or antagonistically to one another and just a special subset of TFs function in a synergistic fashion as reflected by helical phasing of the associated binding sites (Papatsenko, 2009).

The preceding analysis considered 'short-range' organizational constraints, involving linked binding sites separated by <25-30 bp. The possibility of 'long-range' constraints was also considered. The 96 enhancers under study possess characteristic 'unit lengths' of ~500 bp to 1.5 kb (300 bp minimum). The minimal/maximal sizes of the functional enhancers and the 'optimal' site densities can be determined by the amount of encoded information (pattern complexity), mechanisms of TF-DNA recognition such as lateral diffusion, or structural chromatin features like nucleosome positioning (Papatsenko, 2009).

Differential distance histograms reveal an over-representation of short-range linkages (<50 bp), but a depletion in mid-range distances (100-500 bp). These observations raise the possibility that TFs are distributed in a non-uniform manner across the length of the enhancer. That is, there may be sub-clusters, or 'hotspots', of binding sites within a typical enhancer. Such hotspots are observed in the prototypic eve stripe 2 enhancer, whereby 8 of the 12 critical binding sites are observed within two ~50-bp fragments located at either end of the minimal 480 bp enhancer. Homotypic motifs display the greatest propensity for such sub-clustering. Homotypic clusters usually contain 3-5 binding sites distributed over 50-100 bp. Heterotypic activator-activator motif pairs also demonstrate sub-clustering, but these clusters are smaller (<25-30 bp) and usually contain just a pair of heterotypic sites. Heterotypic activator-repressor pairs show moderate enrichment over a distance of 50-70 bp, which is in agreement with the well-documented phenomenon of 'short-range repression'. Depletion of mid-range spacing constraints (around ~200 bp) is especially striking in the case of heterotypic motif pairs. Thus, activator synergy is like short-range repression: it appears to depend on closely linked binding sites (Papatsenko, 2009).

A possible explanation for this depletion of mid-range spacing is the occurrence of positioned nucleosomes, which might separate functionally distinct regions within an enhancer, and also separate neighboring enhancers. To test this hypothesis, nucleosome formation potential was compared with the distributions of TF-binding motifs in enhancers using the 'Recon' program. Three of the four eve enhancers that were examined (eve 1+5, eve 2 and eve 4+6) display a clear negative correlation between potential nucleosome formation and the distribution of TF-binding sites. This observation is consistent with the depletion of nucleosomes near TF-binding sites in vertebrates. This anti-correlation is especially striking in the case of the bipartite eve stripe 1+5 enhancer, where two enhancer regions (stripe 1 and stripe 5) are separated by a 400 bp 'spacer' DNA (in positions 600-1000), which might promote positioning of two nucleosomes and associated linker sequences (Papatsenko, 2009).

To investigate nucleosome positioning further, nucleosome-forming potential was measured in two sets of sequences, previously identified based on clustering of Dl sites and tested in vivo for enhancer activity. One set of sequences functioned as bona fide enhancers and produced localized patterns of gene expression across the DV axis of early embryos. The other set produced no expression in transgenic embryos, despite the presence of the same quality Dl-binding site clusters. The nucleosome-forming potential of the enhancers (true positives) was lower than that of the non-functional sequences (false-positives). These observations raise the possibility that the false Dl-binding clusters fail to function due to the formation of inactive nucleosomal structures (Papatsenko, 2009).

All 465 possible pairwise motif combinations for the 30 relevant binding motifs were tested for conservation in divergent drosophilids. Only linked binding sites, separated by a distance with small variations (max. distance bin = five bases) were considered. In the case of motif pairs, statistical significance was evaluated by bootstrapping columns in the binding motif alignments, thus preserving patterns of conservation. Pairs of homotypic motifs strongly prevailed in this type of analysis (28% of total pairs versus 6.5% expected), suggesting that homotypic interactions are important and pervasive in embryonic patterning. The strongest linkages were found for Bcd, Cad and Hb homotypic pairs. Each of these pairs was shared by five to six different enhancers and conserved in four to seven species. Among the identified heterotypic motif pairs, the most interesting were Bcd-Dl, Bcd-Cad and Dl-Twi (Papatsenko, 2009).
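The count of 465 combinations is what one obtains when the 30 motifs are paired without regard to order and self-pairs are allowed. The short sketch below checks this arithmetic. The motif names are placeholders. Reading the 6.5% expectation as the share of homotypic pairs among all pairs is an assumption that happens to match the reported figure; the study itself evaluated significance by bootstrapping.

```python
from itertools import combinations_with_replacement

N_MOTIFS = 30
# Placeholder names; the actual 30 motifs are not listed in this passage.
motifs = [f"motif_{i}" for i in range(1, N_MOTIFS + 1)]

# Unordered pairs, including each motif paired with itself (homotypic).
pairs = list(combinations_with_replacement(motifs, 2))
homotypic = [p for p in pairs if p[0] == p[1]]
heterotypic = [p for p in pairs if p[0] != p[1]]

# Share of homotypic pairs if every pair were equally likely to be conserved.
expected_homotypic_pct = 100.0 * len(homotypic) / len(pairs)

print(len(pairs), len(homotypic), len(heterotypic))
print(round(expected_homotypic_pct, 1))
```

Under this reading, the observed 28% of homotypic pairs is roughly a fourfold enrichment over the 6.5% expectation.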

To identify cases of binding site pairs organized in a more flexible fashion, significant motif combinations were extracted using large distance bins or large distance variations. Along with the previously identified motif pairs, this analysis revealed several additional combinations, mainly involving the 'TAG-team' sequence motif, which is recognized by Zelda, a ubiquitous zinc finger TF. Zelda participates in the activation of the early zygotic genome and regulates a wide range of critical patterning genes. Indeed, significant combinations were identified for the TAG motif and Bcd, Dl and Hb. However, all of these TAG-X combinations exhibit spacing variability in different Drosophilids (Papatsenko, 2009).

It is conceivable that these results represent an underestimate of significantly linked motif combinations since very conservative cutoff values were used for statistical evaluation. A database of shared and/or conserved motif pairs, including those below the selected significance cutoff P = 0.03 is available from the Berkeley online resource (Papatsenko, 2009).

Conserved Bcd-Dl-binding site pairs were identified in the enhancers of several AP- and DV-patterning genes, including sal (AP), brk and sog (DV). The sites were found at similar distances, in the same orientation and were conserved in all seven species. It was suggested that the Bcd sites in the brk enhancer might augment gene expression in anterior regions, but this possibility was not directly tested. In wild-type embryos, both brk and sog exhibit significantly broader patterns of gene expression in anterior regions. This expanded pattern is lost in bcd mutants (Papatsenko, 2009).

Highly conserved Hb tandem repeats were detected in the regulatory regions of pair-rule genes, in the gap gene Kruppel, and in the Notch-signaling gene nubbin. Most of the homotypic Hb-Hb site pairs fall into two major groups, separated by either 6-8 or 13-15 bases. Some of the pair-rule enhancers selectively conserve either the 'short' or 'long' arrangement. For example, the eve stripe 4 + 6 enhancer contains two short Hb elements, while the stripe 3 + 7 enhancer contains a single long element. The odd 3 + 6 enhancer contains both short and long elements with various degrees of conservation. The hairy stripe 2,6,7 enhancer contains a single short element. Among the known gap genes, the long and short Hb elements were widely present in the enhancers of Kruppel, and in the blastoderm enhancer of nubbin, but not in any of the known knirps enhancers. It is conceivable that the distinct Hb site arrangements are important for the differential regulation of pair-rule genes by the Hb gradient (Papatsenko, 2009).
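The two spacing classes of Hb-Hb site pairs can be expressed as a simple classification. The sketch below is illustrative only: the function name and toy spacings are hypothetical, and how the spacing between two sites is measured is not specified in this passage, so the input is assumed to be already expressed in bases.

```python
from collections import Counter

SHORT_RANGE = range(6, 9)    # 6-8 bases, the 'short' arrangement
LONG_RANGE = range(13, 16)   # 13-15 bases, the 'long' arrangement


def classify_hb_spacing(spacing):
    """Assign an Hb-Hb site-pair spacing (in bases) to a spacing class."""
    if spacing in SHORT_RANGE:
        return "short"
    if spacing in LONG_RANGE:
        return "long"
    return "other"


# Toy spacings (not data from the study).
toy_spacings = [7, 14, 6, 10, 15]
tally = Counter(classify_hb_spacing(s) for s in toy_spacings)
```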

In conclusion, the systematic analysis of TF-binding sites in AP and DV patterning enhancers suggests a much higher degree of grammar, or fixed arrangements of binding sites, than is commonly believed. Developmental enhancers are thought to be highly flexible, with randomly distributed binding sites sufficing for the integration of multiple TFs. The results suggest that a large number of enhancers contain conserved short-range arrangements of pairs of binding sites. For instance, virtually all of the enhancers that respond to intermediate and low levels of the Dl gradient contain conserved arrangements of Dl-binding sites along with recognition sequences for other critical DV determinants, such as Twist and Zelda. Cooperating pairs of Bcd sites are found in enhancers responding to low Bcd concentrations, such as Knirps. Finally, distinctive arrangements of Hb-binding sites might influence whether the associated target genes are activated or repressed by high or low levels of the Hb gradient (Papatsenko, 2009).

Caudal and the anterior determinant Bicoid cooperate to form a partly redundant activator system in the posterior region of the embryo, serving to activate abdominal gap genes giant and knirps. Caudal does not bind the knirps promoter cooperatively (Rivera-Pomar, 1995 and Schulz, 1995).

Multiple transcriptional activators and repressors interact with the fushi tarazu 'zebra-stripe' promoter unit to bring about the positional specificity of ftz transcription. Caudal is one such regulator. The cad gene product can increase the level of ftz transcription in the posterior half of the embryo by interacting with multiple copies of a TTTATG consensus sequence located in the zebra-stripe unit. This is one path by which the product of a maternally expressed segmentation gene, expressed in an antero-posterior concentration gradient, can directly regulate the expression of a pair-rule gene (Dearolf, 1989).

Ectopic expression of CAD at the anterior end of cellular blastoderm embryos has been found to disrupt head development and segmentation, due to alteration of the expression of segmentation genes such as fushi tarazu and engrailed, as well as repression of head-determining genes such as Deformed. While CAD is probably required to activate transcription of fushi tarazu in the posterior half of the embryo, it should not be expressed in the anterior half prior to gastrulation. If so, this suggests a role for the CAD gradient (Macdonald, 1986).

The expression of the pair-rule gene hairy (h) in seven evenly spaced stripes along the longitudinal axis of the Drosophila blastoderm embryo is mediated by a modular array of separate stripe enhancer elements. The minimal enhancer element, which generates reporter gene expression in place of the most posterior h stripe 7 (h7-element), contains a dense array of binding sites for factors providing the trans-acting control of h stripe 7 expression as revealed by genetic analyses. The stripe seven enhancer is found in a minimal 932 bp region from a 1.5 kb DNA fragment of the h upstream region. The h7-element mediates position-dependent gene expression by sensing region-specific combinations and concentrations of both the maternal homeodomain transcriptional activators, Caudal and Bicoid, and of transcriptional repressors encoded by locally expressed zygotic gap genes. Zygotic caudal expression is not required for activation. Caudal and Bicoid, which form complementing concentration gradients along the longitudinal axis of the embryo, function as redundant activators, indicating that the anterior determinant Bicoid is able to activate gene expression in the most posterior region of the embryo. The spatial limits of the h stripe-7 domain are brought about by the local activities of repressors that prevent activation. The spatial limit of h7 is significantly altered in the gap mutants tailless, knirps and kruppel, but not in embryos lacking either hunchback, giant or huckebein. There are seven binding sites for Bcd, twenty-three for caudal, five for Kruppel, fourteen for Knirps, eight for Hunchback and five for Tailless. In the absence of both cad and bcd, activation still occurs. Thus, a third activator, likely to be Kr, must function in such embryos. It is thought that Kr acts as both a repressor and an activator within the h7 element depending on its concentration. 
The posterior border is set in response to Tll activity under the control of the terminal maternal organizer system. The anterior border of the expression domain is due to repression in response to Kni. The results suggest that the gradients of Bicoid and Caudal combine their activities to activate segmentation genes along the entire axis of the embryo (La Rosee, 1997).

Ectopic expression of CAD at later stages of development has no obvious effects on embryogenesis or imaginal disc development, suggesting that the homeotic genes of the Antennapedia and bithorax complexes are almost completely epistatic to caudal (Mlodzik, 1990). Nevertheless, there is now reason to believe that cad is required for Abd-B function in suppressing some aspects of the tail (Kuhn, 1995).

The homeobox gene Caudal regulates constitutive local expression of antimicrobial peptide genes in Drosophila epithelia

In Drosophila, although the NF-kappaB transcription factors play a pivotal role in the inducible expression of innate immune genes, such as antimicrobial peptide genes, the exact regulatory mechanism of the tissue-specific constitutive expression of these genes in barrier epithelia is largely unknown. This study shows that the Drosophila homeobox gene product Caudal functions as the innate immune transcription modulator that is responsible for the constitutive local expression of antimicrobial peptides cecropin and Drosomycin in a tissue-specific manner. These results suggest that certain epithelial tissues have evolved a unique constitutive innate immune strategy by recruiting a developmental 'master control' gene (Ryu, 2004).

In silico identification of putative genomic binding sites of AMP genes and their transcription factors was performed by using the MatInd and MatInspector systems. In this analysis, several cis elements (such as the kappaB motif, the GATA motif, and Cad binding motifs) were commonly found in the promoter regions of all known AMPs. Transcription factors resulting from this analysis were systematically tested for their capacity to induce AMP genes in the immunocompetent Schneider cell line SL2. In Schneider cells stably expressing Cad, the expression of all seven AMP genes was greatly enhanced, suggesting that Cad is a potential transcription regulator. This result prompted an in-depth investigation into the in vivo role of Cad using two representative AMP genes (IMD pathway-controlled Cec and Toll pathway-controlled Drs). Because the Cad gene product contains a homeodomain, which indicates that the protein has a DNA-binding capability, the cis elements responsible for Cad-induced Cec and Drs expression were examined. To identify the cis elements responsible for Cad-induced Cec and Drs expression, a luciferase reporter assay of various mutant constructs having deletions in the Cec promoter region was performed in Drosophila Schneider cells. Cad-induced luciferase activity in cells transfected with the plasmid with a deletion from -751 to -484 bp was found to be almost invariant compared with that in cells transfected with the wild-type construct. However, luciferase activity remained at the basal level in cells transfected with the plasmid having a deletion from bp -751 to -377. These results suggest that the region from bp -484 to -377 of the Cec promoter is a candidate region for Cad-protein DNA recognition elements (CDREs). For Drs, the region covering bp -1082 to -1008 was identified as a candidate region for CDREs of Drs. 
Based on these results, six putative binding sites (S1 to S6) with the consensus Cad binding motif (T>C/A>G)TTT(A>G>C)(T>G/A/C)(G>T/C/A)(A>G/T/C) were identified in the promoter region of Cec and Drs. To determine whether Cad possesses a DNA binding capability with these putative binding sites of the Cec promoter, DNA-binding experiments were performed with the recombinant Cad protein using wild-type probes and various mutant probes. The results showed that GST-Cad is able to bind to two Cad binding motifs, at the S2 and S5 sites. Luciferase reporter analysis with a plasmid carrying double mutations in the putative binding sites (S2 and S5) revealed that these sites are essential for Cad-mediated Cec promoter regulation. Similar methods were employed to identify the CDREs for Cad-mediated Drs promoter regulation. The luciferase reporter assay with plasmids carrying point mutations in the putative CDREs together with the EMSA and supershift assay revealed that Cad is capable of directly regulating the expression of Drs via four CDREs (S1 to S4) found in its promoter. These results demonstrate the involvement of Cad in the regulation of AMP genes, providing yet another function for this homeotic transcriptional regulator, well known for its key role in anteroposterior patterning of the embryo (Ryu, 2004).
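A sequence scan for sites matching the consensus Cad binding motif can be sketched as follows. This is not the MatInd/MatInspector procedure used in the study. It assumes that the preference ordering in the consensus is ignored and only the set of permitted bases at each position is kept, so that the motif reduces to NTTT[A/G/C]NNN; the function names and the toy sequence are hypothetical.

```python
import re

# Consensus (T>C/A>G)TTT(A>G>C)(T>G/A/C)(G>T/C/A)(A>G/T/C) with base
# preferences dropped: any base, TTT, then A/G/C, then any three bases.
# The lookahead lets overlapping sites be reported.
CAD_PATTERN = re.compile(r"(?=([ACGT]TTT[ACG][ACGT]{3}))")
MOTIF_LEN = 8
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_cad_like_sites(seq):
    """Return sorted (start, strand, site) tuples for both strands.

    Start coordinates are 0-based positions on the forward strand.
    """
    seq = seq.upper()
    hits = [(m.start(), "+", m.group(1)) for m in CAD_PATTERN.finditer(seq)]
    rc = reverse_complement(seq)
    for m in CAD_PATTERN.finditer(rc):
        # Convert the position on the reverse strand to forward coordinates.
        start = len(seq) - (m.start() + MOTIF_LEN)
        hits.append((start, "-", m.group(1)))
    return sorted(hits)


toy_hits = find_cad_like_sites("GGCATTTATGGCC")  # toy sequence, not from Cec or Drs
```

Because a pattern this permissive matches frequently by chance, such hits are only candidates; in the study, binding was confirmed experimentally with recombinant Cad protein and mutant probes.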

This study shows that a Cad-LacZ reporter and endogenous Cad mRNA are expressed in the salivary glands and ejaculatory duct, where AMP expression is constitutive. Vertebrate Cad homologues are well known to participate in early embryogenesis, the development of the intestine, and colon tumorigenesis. However, apart from their developmental roles, the physiological functions and target genes of the Cad homeobox gene family are unknown. The observation that Cad regulates AMP gene expression in a subset of epithelia indicates a new function for this trans-activator in the local defense against microbial infection and/or maintenance of microbial flora. At present, the real in vivo function of AMP gene expression in local epithelia in Drosophila is not known. In the local-infection experiment, enhanced mortality in the Cad-RNAi-expressing flies was not observed following short-term (1 h) bacterial feeding. However, although local AMP expression is not directly related to the rate of survival of infection, the locally secreted AMPs may help to prevent the onset of infections (Ryu, 2004).

It is well known that the Toll/NF-kappaB signaling pathway for dorsoventral body axis formation mainly regulates the inducible expression of the Drs gene during the systemic immune response. Interestingly, this pathway has been well conserved during evolution and assists NF-kappaB activation via Toll-like receptors in the human innate immune system. The results show that, in the local epithelial immune system, NF-kappaB-independent, constitutive expression of Drs and Cec in the barrier epithelial tissues is mainly controlled by the homeobox gene Cad, a master controller of anteroposterior body axis formation. The developmental genes involved in specification of the fly body plan (dorsoventral and anteroposterior body axes) have been recruited for this evolutionally ancient first line of defense (Ryu, 2004).

The involvement of Cad in the constitutive local innate immunity illustrates the complexity of the tissue-specific regulation of AMP expression in Drosophila. To better visualize the complexity and dynamic of the innate immune response in Drosophila, a comprehensive scheme was constructed. Experimental infection such as septic injury rapidly induces various AMPs, mainly in the fat body (known as systemic immunity), via two different NF-kappaB pathways (Toll and IMD pathways), whereas natural infection, such as local bacterial infection, activates the expression of AMPs via the IMD pathway only in a subset of epithelial tissues (known as inducible local innate immunity). These two inducible innate immune systems in Drosophila are rather distinct because septic injury cannot activate the inducible local immune system. The third type of AMP regulation is the constitutive local expression of AMPs in an NF-kappaB-independent manner in several epithelia. This type of strategy is believed to be very ancient in evolution and may be very efficient in certain epithelia by avoiding chronic NF-kappaB activation where the contact with microbes is continuous (Ryu, 2004).

This study showed that Cad is capable of directly regulating Cec and Drs via CDREs found in their promoters in Drosophila Schneider cells. Furthermore, Cad binds in vitro to the CDREs found upstream of AMP genes in a gel shift assay. These results demonstrate that Cad is a direct trans-activator of AMP genes. The in vivo reporter analysis demonstrates that mutations affecting CDREs do not abolish the inducible systemic Drs expression in the fat body. These results clearly indicate that the CDREs, in contrast to kappaB sites, are not required for inducible Drs expression in the fat body during a systemic immune response. In addition to the fat body, the trachea is involved in inducible Drs expression. This tissue, in which Drs expression is normally absent but rapidly induced in response to local infection by Erwinia carotovora, is known to be involved in inducible local immunity. Surprisingly, even though there is no appreciable role for CDREs in the fat body, it was found that all 12 independent Drsmut-GFP-expressing fly lines (larvae and adults) exhibited spontaneous constitutive expression of Drs reporter activity in the trachea in the absence of local infection. One may speculate that CDREs can also act as negative cis elements in some epithelial tissues such as the trachea, where they can maintain the silencing of Drs expression, and that this depends on the specific cell type. Further studies will be needed to understand the complete tissue-specific Cad signaling pathway for AMP regulation in all epithelial tissues. In contrast to kappaB-dependent inducible AMP expression, the constitutive local innate immunity employs Cad for the expression of AMPs through CDRE motifs rather than kappaB motifs. Interestingly, for salivary glands, overexpression of the Cad-RNAi construct is sufficient to severely reduce Drs expression, indicating that constitutive local expression of Drs in salivary glands is greatly dependent on Cad. 
For the ejaculatory duct, although partial reduction of Cad modestly reduces Cec expression, only minor expression (~20%) of the Cec reporter was detected in flies carrying Cecmut-GFP, as well as flies overexpressing the dominant-negative form of Cad. This also indicates that Cec expression in this tissue is largely dependent on Cad. Interestingly, this study showed that not all constitutive local expression is dependent on Cad. The constitutive local expression in the female reproductive organs is completely CDRE independent, suggesting the existence of yet another unknown signaling pathway(s). Recently, Drosophila Toll-9, one of the Toll-related receptors, was found to trigger the constitutive expression of Drs in cultured cells (Ooi, 2002). It is possible that Toll-9 may control constitutive Drs expression in certain epithelia. The studies on the in vivo function of Toll-9 should elucidate this issue (Ryu, 2004).

The expression of various AMPs in analogous human epithelial tissues suggests that epithelial innate immunity is well conserved and that the careful regulation of AMP levels may be needed to maintain homeostasis in these tissues from Drosophila to humans. The presence of human Cad homologues, CDXs, raises interesting questions concerning their putative role(s) in human epithelial innate immune gene regulation (Ryu, 2004).

Reverse engineering the gap gene network of Drosophila

A fundamental problem in functional genomics is to determine the structure and dynamics of genetic networks based on expression data. A new strategy is described for solving this problem and it is applied to recently published data on early Drosophila development. The method is orders of magnitude faster than current fitting methods and allows fitting of different types of rules for expressing regulatory relationships. Specifically, this approach is used to fit models using a smooth nonlinear formalism for modeling gene regulation (gene circuits) as well as models using logical rules based on activation and repression thresholds for transcription factors. The technique also allows inference of regulatory relationships de novo or testing network structures suggested by the literature. A series of models is fitted to test several outstanding questions about gap gene regulation, including regulation of and by hunchback and the role of autoactivation. Based on the modeling results and validation against the experimental literature, a revised network structure is proposed for the gap gene system. Interestingly, some relationships in standard textbook models of gap gene regulation appear to be unnecessary for, or even inconsistent with, the details of gap gene expression during wild-type development (Perkins, 2006).
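The smooth nonlinear formalism can be sketched as a production-rate function. The passage does not give the equations, so the sketch assumes a form commonly used for gene circuits: a weighted sum of regulator concentrations passed through a sigmoid and scaled by a maximum rate. All weights, the bias and the concentrations are hypothetical values chosen for illustration.

```python
import math


def sigmoid(u):
    """Smooth saturating function rising from 0 to 1, with sigmoid(0) = 0.5."""
    return 0.5 * (u / math.sqrt(u * u + 1.0) + 1.0)


def production_rate(levels, weights, bias, max_rate):
    """Production rate of one target gene in a gene-circuit style model.

    levels:  regulator name -> concentration
    weights: regulator name -> regulatory weight
             (positive = activation, negative = repression)
    """
    total_input = bias + sum(
        w * levels.get(name, 0.0) for name, w in weights.items()
    )
    return max_rate * sigmoid(total_input)


# Hypothetical parameters for a target activated by Bcd and repressed by Hb.
weights = {"Bcd": 0.05, "Hb": -0.10}
rate_no_repressor = production_rate({"Bcd": 100.0, "Hb": 0.0}, weights, -2.0, 1.0)
rate_with_repressor = production_rate({"Bcd": 100.0, "Hb": 80.0}, weights, -2.0, 1.0)
```

In such a formulation each regulator contributes a single signed weight, so a given factor acts on a given target as either an activator or a repressor, but not both.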

The regulatory structure of the Combined model is itself sufficient to reproduce all six gap gene domains using either the gene circuit or logical formalisms for production rate functions. Support for the Combined model is cited first, and the results of the individual models are then discussed in light of several outstanding questions about gap gene regulation (Perkins, 2006).

The maternal proteins Bcd and Cad are largely responsible for activating the trunk gap genes, with Bcd being more important for the anterior domains and Cad more important for the posterior domains. Bcd is a primary activator of the anterior hb domain, the anterior gt domain, and the Kr domain. Cad activates posterior gt. The kni domain is present in bcd mutants and in cad mutants, but not in bcd;cad double mutants. This suggests redundant activation by the two maternal factors. Such redundant activation of kni is present in the Unc-GC model. For the other models, the optimization selected one or the other as activators, but not both. Tll is crucial for activating the posterior hb domain, while it represses Kr, kni, and gt, preventing their expression in the extreme posterior. All the regulatory relationships between the gap genes in the Combined model are repressive. The complementary gap gene pairs, hb-kni and Kr-gt are known to be strongly mutually repressive, as was found in nearly all the models. [Repression of hb by Kni is not part of the Rivera-Pomar and Jäckle (RPJ) regulatory relationships (Rivera-Pomar, 1996b), but the unconstrained gene circuit (Unc-GC) model and Unc-Logic model (that employs the regulatory structure discovered by the unconstrained gene circuit fit, except that Gt activation of hb and Kni activation of gt were removed) included the link.] The models also suggest that mutual repression between hb and Kr helps to set the boundary between those two domains. A chain of repressive relationships, hb-gt-kni-Kr, causes the shifts in the Kr, kni, and posterior gt domains. Autoactivation by hb is well-established, and there is also some evidence for autoactivation by Kr and gt (Perkins, 2006).
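The redundant activation of kni described above can be written as a logical rule and checked against the mutant phenotypes. This is a deliberately reduced sketch, not one of the fitted models: regulator levels are collapsed to present/absent, other repressors of kni (such as Hb) are omitted, and the function and variable names are hypothetical.

```python
def kni_active(bcd_present, cad_present, tll_present=False):
    """Logical rule: kni is on if Bcd OR Cad activates it and Tll is absent."""
    activated = bcd_present or cad_present
    return activated and not tll_present


# Maternal activator status in each genotype as (Bcd, Cad), evaluated at a
# trunk position where Tll is absent.
genotypes = {
    "wild type": (True, True),
    "bcd mutant": (False, True),
    "cad mutant": (True, False),
    "bcd;cad double mutant": (False, False),
}
predicted = {name: kni_active(bcd, cad) for name, (bcd, cad) in genotypes.items()}
```

The rule reproduces the observations cited in the text: the kni domain persists in either single mutant and is lost only in the double mutant, and Tll prevents expression at the extreme posterior.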

Does Hb have a dual regulatory effect on Kr? There is a long-running debate about whether or not low levels of Hb activate Kr. In hb mutants, the Kr domain expands anteriorly, suggesting that Hb represses Kr. However, Kr expression in these mutants is lower than in wild-type and expands posteriorly in embryos overexpressing Hb. Further, in embryos lacking Bcd and Hb, the Kr domain is absent, but can be restored in a dosage-dependent manner by reintroducing Hb. These observations suggest that Hb activates Kr. It has been suggested, therefore, that low levels of Hb activate Kr while high levels repress it. An alternative explanation, however, is that the apparently activating effects of Hb are indirect, via Hb's repression of kni and Kni's repression of Kr. Optimization of the Unc-GC model, which could have resulted in activation or repression of Kr by Hb, but not both, resulted in repression. The RPJ models allow for a dual effect, but activation by Hb was eliminated during optimization of the RPJ-Logic model. The RPJ-GC model retained functional activation and repression of Kr by Hb. However, Kr expression in this model is defective. Kr is not properly repressed in the anterior. Further, Kr is ectopically expressed in a small domain in the posterior of the embryo. Thus, the current models provide no support for activation of Kr by Hb. The only support found, which is crucial in all models except Unc-Logic and also consistent with the mutant and overexpression studies, is for repression of Kr by Hb (Perkins, 2006).

What represses hb between the anterior and posterior domains? Another point of disagreement in the literature is what prevents the expression of hb between its two domains. In the model of Rivera-Pomar and Jäckle (1996b), repression by Kr is the explanation. The RPJ models confirm that this mechanism is sufficient. Specifically, in these models Kr repression prevents hb expression just to the posterior of the anterior hb domain. Between the Kr and posterior hb domains, there is no explicit repression of hb. Rather, Hb is not produced simply because of a lack of activating factors. In contrast, the models of Jaeger (2004a and b) detected no effect of Kr and attributed repression solely to Kni. The Unc-GC and Unc-Logic models found repression by Kni, but in addition to repression by Kr, not instead of it. Kr is more responsible for repression near the anterior hb domain and Kni is more responsible for repression near the posterior hb domain. This is consistent with observations of expression in mutant embryos. Embryos mutant for Kr show slight expansion of the anterior hb domain, while kni embryos show expansion of the posterior hb domain. In Kr;kni double mutants, hb is completely derepressed between its two usual domains. This suggests, as seen in the Unc-GC and Unc-Logic models, that Kr and Kni are both repressors of hb, that their activity is redundant in the center of the trunk, and that Kr and Kni are the dominant repressors for setting the boundaries of the anterior and posterior domains, respectively. This interpretation was also favored by Jaeger (2004a and b), on the basis of the mutant data, even though Jaeger's models did not find repression by Kr (Perkins, 2006).

The posterior hb domain. In all of the current models, the posterior hb domain is activated by Tll and sustained by Tll and hb autoactivation. Rivera-Pomar (1996b) did not consider the posterior hb domain, and did not include activation by Tll in his model. That link was added to the RPJ network structure because otherwise it was not possible to capture the posterior hb domain. The model of Jaeger (2004a and b) captured the domain without Tll activation by substituting activation from cad. However, there is no confirming evidence for such an interaction. The absence of posterior hb in tll mutants and the inability of the models to explain posterior hb by other means lead to the straightforward hypothesis that Tll activates posterior hb. Posterior hb is unique in that the domain begins to form later than the other five domains modeled. In the RPJ models, this happens simply because high levels of Tll are needed to activate hb -- levels that are reached only at about t = 30 min. The Unc-GC and Unc-Logic models also employ repression by Cad to slightly delay Hb production in the posterior. However, there is no confirming evidence for such repression, and it is omitted from the Combined model (Perkins, 2006).

Shifting of the Kr, kni, and posterior gt domains. Domain shifting was first observed by Jaeger (2004a and b) and attributed to a chain of repressive regulatory relationships, hb-gt-kni-Kr. The current models largely support the importance of this regulatory chain, particularly the final two links. Repression of Kr by Kni was significant in all of the current models. Repression of kni by Gt was present in all models except RPJ-Logic, where it would be of little impact anyway, since RPJ-Logic has a defective posterior gt domain. Consistent with these findings, Kni binds to the regulatory region of Kr, and the Kr domain expands towards the posterior in kni mutants. Similarly, the kni domain expands posteriorly in gt mutants, while embryos overexpressing gt show reduced kni expression (Perkins, 2006).

Repression of gt by Hb is not as well supported by the current models. The Unc-GC model included the link, though the regulatory weight was the smallest of all those in the model. The link was eliminated from Unc-Logic and, of course, not present in the RPJ network structure. Instead, the models utilized decreasing activation by Cad (Unc-GC, Unc-Logic) and repression by Tll (Unc-GC, RPJ-GC) to shift the posterior gt domain. Even with these links, however, shifting of the domain is not well-captured. RPJ-GC appears to capture the posterior gt shift best (Figure 3E). However, it relies on its small ectopic Kr domain to repress gt, a completely incorrect mechanism. Interestingly, a gene circuit fit using the network structure of Sanchez and Thieffry (2001), captured the shift of posterior gt better than any of the other current models, and it did so using repression of gt by Hb, providing additional modeling support for the relationship. There also is strong mutant evidence in favor of the relationship. In hb mutants, the posterior gt domain does not retract from the posterior pole. Further, Gt is absent in embryos that have ubiquitous Hb, such as maternal oskar or nanos mutants or embryos expressing Hb ubiquitously under a heat-shock promoter. Thus, sufficient evidence was found to include a repressive link from hb to gt in the Combined model (Perkins, 2006).

Activating or repressing links that oppose the direction of the repressive chain were eliminated by optimization of the Unc-Logic, RPJ-GC, and RPJ-Logic models. In agreement with this result, the boundaries of the kni and posterior gt domains are correctly positioned in Kr and kni mutants, respectively. Thus, the simplest picture supported by the current models and consistent with the mutant studies is that there is no regulation from Kr, kni, or posterior gt to any of their immediate posterior neighbors, and that the repressive chain highlighted by Jaeger (2004a and b) is indeed responsible for domain shifting (Perkins, 2006).

Do gap genes autoregulate? All four of the current models include autoactivation by hb. This is supported by the observation that late anterior hb expression is absent in embryos lacking maternal and early zygotic Hb. The models suggest hb autoactivation also plays a crucial role in sustaining the posterior domain, once it has been initiated by Tll, a role not previously emphasized. Autoactivation for the other genes was found by the Unc-GC model, but is not part of the RPJ network structure. Autoactivation was included only for Kr and gt in the Combined model, on the basis of a weakened and narrowed Kr domain in embryos producing defective Kr protein and a delay in gt expression in embryos producing defective gt protein. Interestingly, the gene circuit models of Jaeger (2004a and b) also found autoactivation for all four gap genes, but they considered autoactivation by gt to be the weakest and least certain. In contrast, the Unc-Logic model retained gt autoactivation while eliminating autoactivation for Kr and kni. The RPJ-Logic model was unable to reproduce the posterior gt domain. However, it was found that by adding gt autoactivation to the model, it was able to create and sustain posterior gt correctly, bringing the error of the model down to 15.34. This suggests that, after hb, gt is the most likely candidate for autoactivation. However, even this is not strictly necessary. The RPJ-GC model is able to reproduce and sustain the posterior gt domain without autoactivation by relying on cooperative activation from Bcd and Cad (Perkins, 2006).

Comparison of regulatory architectures. The regulatory relationships proposed by Rivera-Pomar and Jäckle (1996b) are not fully consistent with the data and require amending. Repression of gt by Kni, which contradicts the mechanism of domain shifts described by Jaeger (2004a and b), was eliminated by the optimization in both of the current models based on the RPJ regulators. Activation of kni by Kr was never observed. No support was found for a dual regulatory effect of Hb on Kr. Activation of Kr at low levels of Hb was eliminated in the RPJ-Logic model. It was retained in the RPJ-GC model, but resulted in serious patterning defects. Inclusion of Tll as an activator of hb was sufficient to produce the posterior hb domain. Based on the current fits and the primary experimental literature, there are likely other regulatory links missing from the model of Rivera-Pomar and Jäckle, though they are not strictly required to reproduce the wild-type gap gene patterns. Foremost is repression of hb by Kni, which appears important for eliminating hb expression anterior of the posterior domain. Fits based on the Sanchez and Thieffry (2001) regulatory relationships also support these conclusions (Perkins, 2006).

In contrast, the regulatory relationships in the Combined model and both the Unc-GC and Unc-Logic models are able to capture the wild-type gap patterns without gross defects. The relationships in the Unc-GC model are very similar to those obtained by Jaeger (2004a and b). For example, the regulation of Kr and kni is qualitatively equivalent in both models, and there is a single minor difference in the regulation of gt. The optimizations correctly identified activation of hb by Tll, which was missed by Jaeger (2004a and b), though the current models did less well at capturing shifting of the posterior gt domain. These regulatory relationships are also similar to those found by Gursky (2004), though that study was based on gap gene expression data with much lower accuracy and temporal resolution than the data used in this study. These similarities show that differences in the mathematical formulations of these models (ordinary versus partial differential equations, how diffusion and nuclei doubling are modeled, and the choice of boundary conditions and other simulation parameters) are not important for the reproduction of the gap gene patterns nor for the inference of regulatory relationships from the data (Perkins, 2006).

Caudal, a key developmental regulator, is a DPE-specific transcriptional factor

The regulation of gene transcription is critical for the proper development and growth of an organism. The transcription of protein-coding genes initiates at the RNA polymerase II core promoter, which is a diverse module that can be controlled by many different elements such as the TATA box and downstream core promoter element (DPE). To understand the basis for core promoter diversity, potential biological functions of the DPE were explored. It was found that nearly all of the Drosophila homeotic (Hox) gene promoters, which lack TATA-box elements, contain functionally important DPE motifs that are conserved from Drosophila melanogaster to Drosophila virilis. It was then discovered that Caudal, a sequence-specific transcription factor and key regulator of the Hox gene network, activates transcription with a distinct preference for the DPE relative to the TATA box. The specificity of Caudal activation for the DPE is particularly striking when a BREu core promoter motif is associated with the TATA box. These findings show that Caudal is a DPE-specific activator and exemplify how core promoter diversity can be used to establish complex regulatory networks (Juven-Gershon, 2008).

This study found that the DPE is used extensively in the network of genes that are involved in the development of the early Drosophila embryo. Nearly all of the Drosophila Hox gene promoters, which have been known to be TATA-less, contain functionally essential DPE motifs that are conserved from D. melanogaster to D. virilis. The two Hox genes lacking DPE motifs are also the most recent Hox genes from an evolutionary standpoint. Thus, the DPE is a critical yet previously unrecognized component of the Hox genes (Juven-Gershon, 2008).
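As a rough illustration of how a DPE motif might be looked for in a promoter sequence, the sketch below scans for the DPE consensus given earlier in this document (A/GGA/TCGTG, located about 30 nucleotides downstream of the RNA start site). The function name, the search window (expected_offset and tolerance), and the example sequence are hypothetical choices made for illustration; they are not taken from Juven-Gershon (2008), and a consensus match alone says nothing about whether a motif is functional.

```python
import re

# DPE consensus A/G-G-A/T-C-G-T-G as a regular expression.
# The lookahead lets overlapping candidate sites be reported as well.
DPE_PATTERN = re.compile(r"(?=([AG]G[AT]CGTG))")

def find_dpe(seq, tss_index, expected_offset=30, tolerance=5):
    """Return positions (relative to the start site, numbered +1) of DPE-like matches.

    seq: DNA string, 5'->3', same strand as the transcript.
    tss_index: 0-based index in seq of the +1 nucleotide.
    expected_offset, tolerance: hypothetical window settings (assumptions).
    """
    hits = []
    for match in DPE_PATTERN.finditer(seq.upper()):
        position = match.start(1) - tss_index + 1  # convert to +1-based numbering
        if abs(position - expected_offset) <= tolerance:
            hits.append(position)
    return hits

# Made-up sequence: 10 nt upstream, the +1 nucleotide, filler, then a DPE-like site.
example = "T" * 10 + "A" + "C" * 26 + "AGACGTG" + "T" * 10
print(find_dpe(example, tss_index=10))
```

A stricter or looser window can be set through expected_offset and tolerance, depending on how precisely the start site is known.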

The DPE is not only in the Hox genes, but is also present in ftz, gt, h, fkh, cad (zygotic promoter), and en, which are involved in early embryonic development. This finding suggests that the DPE is used broadly throughout the network of genes that mediate the development of the embryo. This hypothesis was tested by analyzing the transcriptional properties of Caudal, a ParaHox protein and sequence-specific DNA-binding factor that regulates ftz, gt, h, and fkh. These studies revealed that Caudal is a DPE-specific activator. The preference of Caudal for activating transcription from DPE- versus TATA-dependent core promoters is seen most distinctly either with the natural ftz enhancer-promoter region or with a core promoter containing a BREu motif associated with the TATA box. The effect of Caudal on transcription of two Hox genes, Antp and Scr, was examined and it was found that Caudal activates the TATA-less, DPE-dependent Antp P2 and Scr promoters. These findings thus provide a direct link between Caudal, a DPE-specific activator, and the DPE-containing Hox genes (Juven-Gershon, 2008).

The discovery that Caudal is a DPE-specific activator provides new insight into the basic mechanisms of transcriptional regulation. Previous enhancer-trapping experiments have shown that there are enhancers that activate DPE-dependent promoters but not TATA-dependent promoters; however, neither the cis-acting elements nor the trans-acting factors that are responsible for the DPE-specific activation had been identified. Therefore, these studies demonstrate the existence of a DPE-specific enhancer-binding factor. Moreover, it is likely that other core-promoter-specific enhancer-binding factors, such as TATA-specific activators, will be discovered (Juven-Gershon, 2008).

These experiments uncovered a novel activity of the BREu core promoter motif. The BREu is a 5' extension of the TATA box that is bound by the TFIIB basal/general transcription factor. Depending on the context, the BREu has been found to have either a positive or a negative effect on transcription. In this study, it was found that the BREu motif has little effect on basal/unactivated transcription, but potently suppresses the ability of Caudal to activate transcription via the TATA box. In contrast, the BREu in its normal upstream location has no effect on Caudal-mediated activation via the DPE. These findings indicate that the BREu can contribute to core-promoter-element-mediated transcriptional regulation. Hence, there is a positive linkage between Caudal and the DPE as well as a negative interaction between Caudal and the BREu-TATA element. The combination of both positive (DPE) and negative (BREu-TATA) interactions yields maximal specificity of Caudal function (Juven-Gershon, 2008).

The new findings lead to the model that transcriptional regulation involves the combined action of sequence motifs in both the core promoter and the enhancer. The ability of Caudal to discriminate among DPE, TATA, and BREu-TATA motifs regulates the flow of information from the enhancer-bound activator to the core promoter, the site of transcription initiation. In this manner, core promoter elements can be viewed as a component of transcriptional circuits. In these transcriptional circuits, connections between enhancers and core promoters are established and modulated according to the properties of the activators and the sequence motifs in the core promoter. Thus, the discovery that Caudal is a core-promoter-specific activator reveals a new strategy with which complex transcriptional networks can be established. Hence, in a broader sense, these findings show how diversity in core promoter structure can contribute to organismal diversity (Juven-Gershon, 2008).

The Drosophila gap gene network is composed of two parallel toggle switches

Drosophila gap genes provide the first response to maternal gradients in the early fly embryo. Gap genes are expressed in a series of broad bands across the embryo during the first hours of development. The gene network controlling the gap gene expression patterns includes inputs from maternal gradients and mutual repression between the gap genes themselves. In this study a modular design is proposed for the gap gene network, involving two relatively independent network domains. The core of each network domain includes a toggle switch corresponding to a pair of mutually repressive gap genes, operated in space by maternal inputs. The toggle switches present in the gap network are evocative of the phage lambda switch, but they are operated positionally (in space) by the maternal gradients, so the synthesis rates for the competing components change along the embryo anterior-posterior axis. A dynamic model, constructed on the proposed principle and including elements of fractional site occupancy, required 5-7 parameters to fit quantitative spatial expression data for gap gradients. The identified model solutions (parameter combinations) reproduced major dynamic features of the gap gradient system and explained gap expression in a variety of segmentation mutants (Papatsenko, 2011).

Fertilized eggs of Drosophila contain several spatially distributed maternal determinants (morphogen gradients) that initiate spatial patterning of the embryo. One of the first steps of Drosophila embryogenesis is the formation of several broad gap gene expression patterns within the first 2 hrs of development. Gap genes are regulated by the maternal gradients, so their expression appears to be hardwired to the spatial (positional) cues provided by the maternal gradients; in addition, gap genes are involved in mutual repression. How the maternal positional cues and the mutual repression contribute to the formation of the gap stripes has been a subject of active discussion (Papatsenko, 2011).

Accumulated genetic evidence and results of quantitative modeling suggest the occurrence of maternal positional cues (position-specific activation potentials) that contribute to the spatial expression of four trunk gap genes: knirps (kni), Kruppel (Kr), hunchback (hb) and giant (gt). Existing data suggest that the central Knirps domain stripe is largely the result of activation by Bicoid (Bcd) and repression by Hunchback. The central-domain Kruppel stripe is the result of both activation and repression by Hunchback, which acts as a dual transcriptional regulator on Kr. Hunchback is one of the most intriguing among the segmentation genes. Maternal hb mRNA is deposited uniformly, but its translation is limited to the anterior; zygotic anterior expression of hb is under the control of Bcd and Hb itself. Zygotic posterior expression of Hunchback (not included in the current model) is under the control of the terminal torso signaling system. Giant is activated by opposing gradients of Bicoid and Caudal and initially expressed in a broad domain, which refines later into anterior and posterior stripes. This late pattern appears to be the consequence of Kruppel repression (Papatsenko, 2011).

Predicting functional properties of a gene network combining even a dozen genes may be a difficult task. To facilitate the functional exploration, gene regulatory networks are often split into network domains or smaller units, network motifs with known or predictable properties. The network motif based models can explain dynamics of developmental gradients and even evolution of gradient systems and underlying gene regulatory networks. The gene network leading to the formation of spatial gap gene expression patterns is an example, where simple logic appeared to be far behind the system's complexity. Gap genes provide the first response to maternal gradients in the early fly embryo and form a series of broad stripes of gene expression in the first hours of embryonic development. While the system has been extensively studied in the past two decades both in vivo and in silico, a simple and comprehensive model explaining the function of the entire network has been missing (Papatsenko, 2011).

In the current study, a modular design has been proposed for the gap gene network; the network has been represented as two similar parallel modules (or two sub networks). Each module involved three network motifs, two for maternal inputs (one for each gap gene) and a toggle switch describing mutual repression in the pair of the gap genes. Formally, the toggle switches present in the gap gene network are evocative of the bistable phage lambda switch; however, they are operated by maternal inputs and their steady state solutions depend on spatial position in the embryo, not environmental variables. The proposed modular design accommodated 5-7 realistic parameters and reproduced major known features of the gap gene network (Papatsenko, 2011).
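The idea of a toggle switch operated by position can be made concrete with a minimal sketch. The code below is not the Papatsenko (2011) model: it is a generic two-gene mutual-repression switch in which the synthesis rates of the two genes are set by two opposing linear "maternal" inputs. The linear input shapes, the Hill-type repression term, all parameter values, and the gene names A and B are assumptions chosen only to show how the steady state can depend on position along the anterior-posterior axis.

```python
def toggle_steady_state(x, a_max=4.0, K=1.0, n=2, dt=0.01, steps=5000):
    """Integrate a two-gene mutual-repression switch at position x.

    x: relative position along the axis (0 = anterior, 1 = posterior).
    a_max, K, n: hypothetical synthesis maximum, repression threshold, Hill exponent.
    Returns the levels (a, b) reached after simple Euler integration from zero.
    """
    synth_a = a_max * (1.0 - x)  # assumed anterior-high maternal input for gene A
    synth_b = a_max * x          # assumed posterior-high maternal input for gene B
    a = b = 0.0
    for _ in range(steps):
        da = synth_a / (1.0 + (b / K) ** n) - a  # B represses A; first-order decay
        db = synth_b / (1.0 + (a / K) ** n) - b  # A represses B; first-order decay
        a += dt * da
        b += dt * db
    return a, b

def dominant_gene(x):
    """Name the gene that wins the switch at position x."""
    a, b = toggle_steady_state(x)
    return "A" if a > b else "B"

for x in (0.1, 0.9):
    print(x, dominant_gene(x))
```

In this toy setting the same circuit settles into opposite states at the two ends of the axis purely because the synthesis rates differ with position, which is the qualitative behavior the text describes.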

Posttranscriptional Regulation

Bicoid, a DNA binding homeodomain protein and the primary determinant of anterior pattern in the fly, binds RNA, and acts as a translational repressor of Caudal mRNA. The Bicoid response element maps in the Caudal mRNA 3' untranslated region, and has been localized to a discrete 342-nucleotide segment. The Bicoid response element of Caudal mRNA contains at least two distinct BCD binding sites. An in-frame deletion of 11 amino acids within the BCD homeodomain results in a protein that is unable to regulate Caudal mRNA. Substitutions in the recognition helix of the Bicoid homeodomain result in a nonfunctional translational repressor. Thus the BCD homeodomain is implicated in repression of Caudal translation. Bicoid bound to the 3' UTR of CAD mRNA blocks translational initiation at the 5' end, possibly by interfering with a step that depends on the 5' cap structure (Dubnau, 1996).

In embryos lacking BCD activity as a result of mutation, the CAD gradient fails to form and CAD becomes evenly distributed throughout the embryo. This suggests that BCD may act in the region-specific control of CAD mRNA translation. BCD binds through its homeodomain to CAD mRNA in vitro, and exerts translational control through a BCD-binding region of CAD mRNA (Rivera-Pomar, 1996a).
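The relationship between the Bcd gradient and the Cad gradient can be sketched numerically. The toy calculation below assumes uniformly distributed cad mRNA, an exponentially decaying anterior Bcd gradient, and a simple saturating repression of translation by Bcd. The functional forms and the parameters (decay_length, K) are illustrative assumptions, not measurements from Rivera-Pomar (1996a).

```python
import math

def cad_profile(bcd_amplitude=1.0, n_points=11, decay_length=0.2, K=0.1):
    """Relative Cad protein level at evenly spaced positions (0 = anterior, 1 = posterior).

    bcd_amplitude: anterior Bcd level; 0.0 mimics an embryo lacking Bcd activity.
    decay_length, K: hypothetical gradient length scale and repression threshold.
    """
    profile = []
    for i in range(n_points):
        x = i / (n_points - 1)
        bcd = bcd_amplitude * math.exp(-x / decay_length)  # assumed exponential Bcd gradient
        profile.append(1.0 / (1.0 + bcd / K))              # Bcd-repressed translation of uniform mRNA
    return profile

wild_type = cad_profile()
no_bcd = cad_profile(bcd_amplitude=0.0)
print([round(v, 2) for v in wild_type])
print(no_bcd)
```

With Bcd present, the computed Cad level rises from anterior to posterior; with Bcd set to zero, the profile is flat, mirroring the even distribution of CAD described for embryos lacking BCD activity.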

The Drosophila body organizer Bicoid (Bcd) is a maternal homeodomain protein. It forms a concentration gradient along the longitudinal axis of the preblastoderm embryo and activates early zygotic segmentation genes in a threshold-dependent fashion. In addition, Bcd acts as a translational repressor of maternal Caudal (CAD) mRNA in the anterior region of the embryo. This process involves a distinct Bcd-binding region in the 3' untranslated region of CAD mRNA. Using cotransfection assays, Bcd is found to repress translation in a cap-dependent manner. Bcd-dependent translational repression involves a portion of the PEST motif of Bcd, a conserved protein motif best known for its function in protein degradation. Rescue experiments with Bcd-deficient embryos expressing transgene-derived Bcd mutants indicate that amino acid replacements within the C-terminal portion of the PEST motif prevent translational repression of CAD mRNA but allow for Bcd-dependent transcriptional activation. Thus, Bcd contains separable protein domains for transcriptional and translational regulation of target genes. Maternally-derived Cad protein in the anterior region of embryos interferes with head morphogenesis, showing that CAD mRNA suppression by Bcd is an important control event during early Drosophila embryogenesis (Niessing, 1999).

Bicoid (Bcd), the anterior determinant of Drosophila, controls embryonic gene expression by transcriptional activation and translational repression. Both functions require the homeodomain (HD), which recognizes DNA motifs at target gene enhancers and a specific sequence interval in the 3' untranslated region of Caudal (CAD) mRNA. The Bcd HD has been shown to be a nucleic acid-binding unit. Its helix III contains an arginine-rich motif (ARM), similar to the RNA-binding domain of the HIV-1 protein REV, needed for both RNA and DNA recognition. Replacement of arginine 54, within this motif, alters the RNA but not the DNA binding properties of the HD. Corresponding BCD mutants fail to repress CAD mRNA translation, whereas the transcriptional target genes are still activated (Niessing, 2000).

In order to characterize portions and individual amino acid residues of the Bcd HD that are specifically required for one or both Bcd regulatory functions, transgenes expressing wild-type or mutant bcd cDNAs were placed into the genome of homozygous bcd mutant females and their ability to rescue wild-type zygotic hb activation and cad mRNA translation in their embryos was assayed. Such embryos, referred to as 'bcd embryos,' fail to exert Bcd-dependent transcriptional activation of the zygotic target gene hb in their anterior half. Instead, the embryos show a duplication of the posterior Bcd-independent stripe of hb expression in the anterior region (Niessing, 2000).

Expressed Bcd mutant proteins that lack the helices I and II of the HD (BcdDeltaH1-2) or the amino acid interval between positions 42 and 51 in helix III (BcdT42-N51) fail to restore Bcd-dependent hb transcriptional activation and translational repression of CAD mRNA in the anterior region of bcd embryos. This indicates that the integrity of the Bcd HD is necessary for the control of transcription and translation. Transgene-dependent expression of BcdhIIIAntp, in which the C-terminal half of the Bcd HD is exchanged for the corresponding sequence of the Antennapedia (Antp) HD, rescues Bcd-dependent hb expression in the anterior region of bcd embryos, but no Cad gradient is formed. Bcd mutations in which two adjacent arginines at positions 53-54 and 54-55 of the HD, respectively, were replaced, fail to control Bcd-dependent transcription and translation. Thus, helix III of the Bcd HD is necessary for both transcriptional activation and translational repression, and amino acids within helix III are essential for specifying not only DNA binding but also RNA recognition by the HD. This proposal is consistent with the observation that part of the helix III of the Bcd HD has characteristics of an arginine-rich motif (ARM) (Niessing, 2000).

To test whether the conserved amino acids of Bcd's ARM are indeed required for RNA target recognition and whether single amino acid replacements may allow the DNA and RNA binding properties to be separated, alanine replacement mutants of the Bcd HD were generated and their in vitro binding properties assayed. The Bcd HD (HDwt) binds both DNA and RNA, whereas HDK50A, HDN51A, HDR53A, and HDR55A failed to bind either target. Bcd HDR54A, which contains alanine in place of arginine in position 54 of the HD, bound DNA properly, but its RNA binding was reduced by more than one order of magnitude. The binding properties of HDK57A were indistinguishable from HDwt. In summary, arginine at position 54 of the HD is critical for specifying RNA versus DNA binding, and its replacement shifts the binding property of the HD to prefer DNA over RNA recognition (Niessing, 2000).

In order to test the in vivo relevance of these binding studies, the corresponding Bcd HD mutants were examined by transgene-dependent expression in bcd embryos. The Bcd mutants were generated in the context of an 8.7 kb genomic DNA fragment spanning the entire bcd locus, which fully rescues bcd embryos after P element-mediated transformation. The transgene-expressed BcdK57A protein, which contains an HD with normal DNA and RNA binding properties, causes Bcd-dependent hb expression and Cad gradient formation, and the embryos developed into normal-looking larvae and fertile adults. BcdN51A, BcdR53A, and BcdR55A, which contain HD mutations that cause the loss of DNA and RNA binding properties in vitro, fail to activate Bcd-dependent hb transcription and to repress translation of CAD mRNA; such embryos develop a bcd mutant phenotype. The BcdR54A mutant, which contains an HD with DNA, but no RNA, binding properties, was able to activate the transcription of hb but not to repress the translation of CAD mRNA. This observation is consistent with the result obtained using the transgene bearing the BcdR54S mutation, which contains a serine residue in place of arginine at position 54. Thus, both Bcd mutants that contain a replacement of arginine at position 54 of the HD fail to control CAD mRNA translation but do activate transcription of hb (Niessing, 2000).

Mutations of bcd that interfere with the control of CAD mRNA translation but not with the activation of transcription cause temperature-dependent head involution defects. The corresponding larvae develop the normal number and identity of head segments, which, however, fail to be properly assembled. The same phenotype would be expected for the BcdR54A mutant embryos, provided that the replacement affects only CAD mRNA translational control. bcd embryos expressing the BcdR54A mutant develop a normal segment pattern at 18°C and give rise to normal-looking and fertile adults. At 29°C, however, the majority of the embryos (more than 90%) die as unhatched larvae, and all of them show a strong head defect. The embryos show a normal expression pattern of the segment polarity gene engrailed (en) at stages 9-11, indicating that segments are generated normally. Furthermore, all discernible head markers can be observed in larval cuticle preparations, but, as observed with mutations affecting the translational repressor region of Bcd, the assembly of the head elements is strongly perturbed. The same temperature-dependent phenotype is observed when cad cDNA lacking the Bcd-responsive BBR in the 3'UTR is expressed in the preblastoderm embryo using the GAL4/UAS system. Taken together, the in vivo transgene studies and the in vitro binding results establish that a single amino acid replacement in the ARM of the Bcd HD specifically interferes with Bcd-dependent RNA binding and translational repression of CAD mRNA, without affecting DNA binding and transcriptional activation. The finding is consistent with the observation that an arginine residue at this position is conserved in ARMs but rare in HDs (Niessing, 2000).

The results provide strong evidence that the Bcd HD functions as a nucleic acid-binding unit that enables Bcd to function in transcriptional and translational control. In addition, the findings establish that the direct interaction of Bcd with the BBR of CAD mRNA shown in vitro is necessary to prevent Cad activity from interfering with head morphogenesis. Helix III of the Bcd HD has been identified as a region in which a single amino acid replacement shifts the in vitro binding property of the HD to prefer DNA over RNA recognition and abolishes CAD mRNA translational repression without affecting transcriptional activation by Bcd in vivo. The alpha-helical structure and sequence comparison between HIV-1 REV and the third helix of the Bcd HD indicate that Bcd formally fits as a member of the ARM family of RNA-binding proteins that show a low degree of amino acid sequence identity. The sequence similarity between the ARMs of HIV-1 REV and the Bcd HD is therefore remarkable. However, there is no corresponding sequence similarity observed between the RNA target sequences to which they bind. Furthermore, REV fails to bind the BBR, and Bcd-HD does not recognize the REV response element. Thus, the high degree of amino acid identity and conservation of the critical arginine residue in the ARMs of the Bcd HD and HIV-1 REV is not correlated with similarity at the level of the targets (Niessing, 2000).

Asparagine is absolutely conserved at position 51 of HDs and is also found in the corresponding position in ARM family members. It has been shown to provide base contacts in DNA/HD complexes and RNA target recognition by ARM proteins, respectively. Consistently, mutation of asparagine in position 51 of the Bcd HD abolished DNA binding as well as RNA binding. In contrast, the 52-57 region of HDs interacts with DNA electrostatically, whereas some of the corresponding REV arginine residues are hydrogen bonded to bases. Mutating arginine at position 54, which is rare in other HDs, affects RNA binding without altering the DNA binding. In summary, these and earlier findings with respect to the DNA binding properties of HDs support the proposal that the ARM within the helix III of the Bcd HD is necessary for both RNA and DNA target recognition, and that individual amino acids within this portion of the HD specify RNA versus DNA binding (Niessing, 2000).

Although the Bcd HD is so far the only known HD with RNA binding properties, it has been noted that the ARM-containing RNA-binding domain of EIAV-TAT and the ribosomal protein L11 can fold into HD-like structures with the RNA-binding domain exposed as a helix III equivalent. The recently solved crystal structure of L11 bound to a ribosomal RNA fragment shows binding to the minor groove of RNA that is similar in width to a DNA major groove. The results also indicate that L11 uses the same surface as the HD does in binding DNA. The structural similarities and the fact that helix III regions of HDs are generally rich in basic amino acids suggest that HDs hold a high potential to either exert or to adopt RNA binding properties during evolution. The possibility that other HDs also bind RNAs and thereby provide HD proteins with dual regulatory functions is a challenging proposal (Niessing, 2000 and references therein).

Translational control plays a key role in many biological processes including pattern formation during early Drosophila embryogenesis. In this process, the anterior determinant Bicoid (BCD) acts not only as a transcriptional activator of segmentation genes but also causes specific translational repression of ubiquitously distributed caudal (cad) mRNA in the anterior region of the embryo. Translational repression of cad mRNA is dependent on a functional eIF4E-binding motif. The results suggest a novel mode of translational repression, which combines the strategy of target-specific binding to 3'-untranslated sequences and interference with 5'-cap-dependent translation initiation in one protein. The results suggest that 3'-UTR-bound BCD interferes with the assembly of the initiation complex and thereby causes repression of cad mRNA translation (Niessing, 2002).

The cap-dependent mode of translation depends on the assembly of an evolutionarily conserved protein complex that is initiated by the binding of the translation initiation factor 4E (eIF4E) to the m7GpppN-cap structure. Subsequently, the adapter protein eIF4G binds to eIF4E and allows additional factors (including eIF4A, eIF4B, eIF1, eIF1A, eIF2, eIF3, and the ribosomal subunits) to assemble into a complex that initiates translation. The cap-dependent translation initiation process can be regulated by eIF4E-binding proteins such as BP1, BP2, and Maskin. They block the eIF4E::eIF4G association through outcompeting binding to eIF4E, involving a small eIF4E-binding motif of the minimal consensus sequence YxxxxL (Niessing, 2002 and references therein).
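The minimal consensus YxxxxL is simple enough to search for directly. The sketch below scans a protein sequence for every occurrence of a tyrosine followed by any four residues and then a leucine. The function name and the example peptide are hypothetical (the peptide is made up and is not the Bcd sequence), and a consensus match by itself does not demonstrate eIF4E binding.

```python
import re

# Minimal eIF4E-binding consensus from the text: Y, any four residues, then L.
# The lookahead allows overlapping candidate motifs to be reported.
EIF4E_MOTIF = re.compile(r"(?=(Y.{4}L))")

def find_eif4e_motifs(protein_seq):
    """Return (1-based position, matched residues) for every YxxxxL occurrence."""
    return [
        (match.start(1) + 1, match.group(1))
        for match in EIF4E_MOTIF.finditer(protein_seq.upper())
    ]

# Made-up peptide used only to exercise the function.
print(find_eif4e_motifs("MAYQRSTLGG"))
```

Candidate sites found this way would still need the kind of mutational test described in the text to establish that the motif is functional.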

The results show that Bcd can associate with cap-associated eIF4E in vitro and that the eIF4E-binding motif of Bcd is necessary for Bcd-dependent translational repression of cad mRNA in the embryo. These findings suggest a repression mechanism in which Bcd blocks the eIF4G::eIF4E interaction necessary for the initiation of cap-dependent cad mRNA translation. Because no interaction between recombinant eIF4E and Bcd could be detected in the absence of cad mRNA, it is concluded that the binding of Bcd to the cad 3'-UTR is most likely a prerequisite for their interaction. This interpretation is consistent with findings where a mutant Bcd, which lacks the ability to bind cad mRNA, is also unable to repress translation (Niessing, 2002).

Bcd-dependent control of translation of cad mRNA is likely to function in a manner similar to BP1, BP2, and Maskin. However, despite the intriguing similarities among BP1/BP2, Maskin, and Bcd, the modes of how they exert translational repression are distinct. BP1 and BP2 are part of a general mRNA repression system, which blocks eIF4E::eIF4G interaction in a reversible, cell-growth-dependent manner in response to insulin receptor signaling. In contrast, Maskin represses translation in an mRNA-specific manner. It binds to the cytoplasmic polyadenylation element-binding protein (CPEB), a factor that interacts with a short uridine-rich cytoplasmic polyadenylation element (CPE) of cyclin B mRNA. CPEB-tethered Maskin acts from the 3'-end of specific mRNAs by binding to eIF4E and blocking an association of eIF4E and eIF4G. In this mode of repression, target specificity of repression is provided by the interaction of CPEB with the CPE, whereas the repression of translation at the 5'-end is executed by Maskin. Bcd uses a strategy that combines these two features of CPEB and Maskin. Its homeodomain directly binds to the Bcd response element (BRE) in the 3'-UTR of cad mRNA and provides also a direct link to the 5'-cap-bound complex involving the eIF4E-interaction motif (Niessing, 2002).

The simplest model to account for Bcd-dependent repression of translation therefore involves three essential steps, which are (1) target recognition by binding to the specific target site within the 3'-UTR, a process mediated by Bcd's arginine-rich RNA-binding motif in the homeodomain, (2) looping of cad mRNA to allow for interaction of the 3'-UTR-bound Bcd with 5'-cap-bound eIF4E, which (3) causes a BP1/BP2-like blocking of the eIF4G-binding site on eIF4E to prevent the assembly of a functional translation initiation complex. The mode of Bcd-dependent repression of translation, therefore, combines the strategy of target-specific binding to 3'-UTRs as shown for a number of other translational repressors with a repression mechanism known from growth regulation and cyclin B-dependent cell cycle regulation (Niessing, 2002).

caudal RNA is localized to the posterior pole during embryogenesis. In bicaudal D embryos, CAD localizes to the two posterior poles arranged in mirror-image symmetry. This is consistent with the notion that cad has a functional role in specification of the posterior abdominal segments (Mlodzik 1987a).

Krüppel, caudal and cut are expressed in the Malpighian tubules before and during differentiation. Two of the genes, Krüppel and cut, are known to be required for development of the tubules. The absence of maternal and zygotic caudal function reduces their normal growth and elongation. Normal Krüppel function, which is known to be required for caudal expression, is also required for cut expression, while cut and caudal are expressed independently of each other. Loss of Krüppel activity confers hindgut characteristics on those cells that normally form the Malpighian tubules. Loss of cut function alters the expression of some markers but not others. The pathway of tissue specific gene regulation, apparently, branches beyond Krüppel to form at least a cut and a caudal branch (Liu, 1992).

While the Bcd gradient has served as a model system in understanding pattern formation in Drosophila, it is suspected that this is not the case in more ancestral insects. The long-germ mode of development as found in Drosophila is probably an adaptation to its particularly rapid embryogenesis. The ancestral type of embryogenesis in insects and arthropods is the short germ type. In these embryos, the germ rudiment forms at the posterior ventral side of the egg. In extreme cases like the grasshopper, it may be restricted to only a few percent of the total egg length, which makes it difficult to imagine how an anteriorly localized BCD mRNA could determine pattern formation at the posterior end of the egg. Moreover, classical experiments have only yielded evidence for a posteriorly localized organizing activity. Therefore, bcd could be considered a late addition during insect evolution and its pivotal function during embryogenesis could be restricted to higher dipterans. This paper is concerned with early pattern formation of the flour beetle Tribolium castaneum. Tribolium is a typical example of short germ embryogenesis, representing the ancestral type of embryogenesis in insects, albeit not in its extreme form, like the grasshopper. In contrast to Drosophila, only cephalic and thoracic segments, but not abdominal segments, are determined during the blastoderm stage. Furthermore, the most anterior 20% of the Tribolium blastoderm cells form an extra-embryonic membrane, the serosa. This structure is not found in this form in higher dipterans like Drosophila, but is again an ancestral feature of insect embryogenesis. Prior to gastrulation, most blastoderm cells move from anterior and dorsal positions towards the posterior ventral region where they form the embryo proper. This germ rudiment then continues to grow from its posterior end to form a germ band which eventually encompasses all abdominal segments (Wolff, 1998).

Thus, in short germ embryos, the germ rudiment forms at the posterior ventral side of the egg, while the anterior-dorsal region becomes the extra-embryonic serosa. It is difficult to see how in these embryos an anterior gradient like that of Bicoid protein in Drosophila could be directly involved in patterning of the germ rudiment. Moreover, since it has not yet been possible to recover a bicoid homolog from any species outside the Diptera, it has been speculated that the anterior Bicoid gradient could be a late addition during insect evolution. This question was addressed by analyzing the regulation of potential target genes of bicoid in the short germ embryo of Tribolium castaneum. Homologs of caudal and hunchback from Tribolium are regulated by Drosophila bicoid. In Drosophila, maternal Caudal mRNA is translationally repressed by Bicoid. Tribolium Caudal RNA is also translationally repressed by Bicoid, when it is transferred into Drosophila embryos under a maternal promoter. This strongly suggests that a functional bicoid homolog must exist in Tribolium. The second target gene, hunchback, is transcriptionally activated by Bicoid in Drosophila. Transfer of the regulatory region of Tribolium hunchback into Drosophila also results in regulation by early maternal factors, including Bicoid, but in a pattern that is more reminiscent of Tribolium hunchback expression, namely in two early blastoderm domains. Using enhancer mapping constructs and footprinting, it has been shown that Caudal activates the posterior of these domains via a specific promoter. These experiments suggest that a major event in the evolutionary transition from short to long germ embryogenesis was the switch from activation of the hunchback gap domain by Caudal to direct activation by Bicoid. This regulatory switch can explain how this domain shifted from a posterior location in short germ embryos to its anterior position in long germ insects, and it also suggests how an anterior gradient can pattern the germ rudiment in short germ embryos, i.e. by regulating the expression of caudal (Wolff, 1998).

The key to understanding the qualitative switch that took place in insect evolution is believed to lie in the more anterior serosa expression domain of Tribolium hb. Reporter gene data suggest that this domain may already be activated by Bcd in Tribolium. To explain the switch in the regulation of the more posterior gap domain of hb expression, one can envision an intermediate state, where the serosa domain and the embryonic (gap) domain have fused into a single domain. To achieve this, the evolution of a few additional Bcd binding sites in the hb upstream region would have been sufficient. In this intermediate stage both Bcd and Cad could have acted as activators on the gap domain of hb. Subsequent loss of Cad regulation would then have moved the posterior boundary of this combined domain towards the anterior. It is noted that the Tribolium hb gene has three known promoters, one of which appears to be specialized for mediating Cad regulation. In Drosophila, only two promoters are present, neither of which has a known responsiveness to Cad. Thus, in all likelihood, the Cad-dependent promoter and its associated enhancer were lost. Since no other enhancer activity has been found for later expression patterns of hb in the Cad-dependent fragment, the loss of this region could have been a single step. Intriguingly, a combined serosa and gap domain is still evident in the lower dipteran Clogmia. In this fly, hb is expressed in a large anterior domain, from which the serosa is also recruited at later stages (Rohr, personal communication to Wolff, 1998). This mechanism, the modification of the way gap genes sense maternal positional information while this information itself remains constant, can explain how the blastoderm fate map changed during evolution of short germ insects to insects with long germ embryos. Moreover, it represents an intriguing example of the importance of regulatory adaptation during the evolution of developmental processes (Wolff, 1998).

A new paradigm for translational control: inhibition via 5'-3' mRNA tethering by Bicoid and the eIF4E cognate 4EHP

Translational control is a key genetic regulatory mechanism implicated in regulation of cell and organismal growth and early embryonic development. Initiation at the mRNA 5' cap structure recognition step is frequently targeted by translational control mechanisms. In the Drosophila embryo, cap-dependent translation of the uniformly distributed caudal (cad) mRNA is inhibited in the anterior by Bicoid (Bcd) to create an asymmetric distribution of Cad protein. d4EHP, an eIF4E-related cap binding protein, specifically interacts with Bcd to suppress cad translation. Translational inhibition depends on the Bcd binding region (BBR) present in the cad 3' untranslated region. Thus, simultaneous interactions of d4EHP with the cap structure and of Bcd with the BBR render cad mRNA translationally inactive. This example of cap-dependent translational control that is not mediated by canonical eIF4E defines a new paradigm for translational inhibition involving tethering of the mRNA 5' and 3' ends (Cho, 2005).

This study describes a new mode of mRNA-specific translational inhibition, which acts by tethering the mRNA 5′ and 3′ ends via d4EHP, an eIF4E-related protein, and Bcd. d4EHP binds to the cad mRNA 5′ cap structure, while Bcd binds to the BBR in its 3′ UTR. The interaction between d4EHP and Bcd is mediated through a sequence motif in Bcd that resembles, but is distinct from, the consensus eIF4E binding domain present in classical eIF4E binding proteins such as 4E-BPs and eIF4G. Inhibition of cad mRNA translation by the d4EHP:Bcd complex demonstrates for the first time the involvement of a cellular cap binding protein other than eIF4E in cap-dependent translational control. Furthermore, it provides a new molecular mechanism governing the formation of morphogenetic gradients during early Drosophila embryo development (Cho, 2005).

It was previously reported that Bcd inhibits anterior Cad synthesis through a direct interaction with eIF4E (Niessing, 2002). This conclusion was based largely on an in vitro demonstration that Bcd could be recovered from Drosophila extracts using a cap-affinity resin, which was prebound to an excess amount of recombinant eIF4E. However, under these conditions, only a small fraction of Bcd was recovered from the extracts. It is therefore a distinct possibility that Bcd actually bound to the cap-affinity resin through endogenous d4EHP that was also present in the extracts. This possibility is consistent with both the previous data and the present study. Further supporting this conclusion, endogenous deIF4E and Bcd were not shown to interact in the previous study. The data also indicate that the L73R mutation alone is sufficient to explain the previously reported bcdY68A/L73R double mutant phenotype (Cho, 2005).

The role of 4E-BPs in regulating cap-dependent translation is well documented. 4E-BPs inhibit translation by competing with eIF4G for binding to eIF4E and are therefore general inhibitors of cap-dependent translation, although the degree of inhibition varies among different mRNAs. Cup and Maskin are eIF4E binding proteins that regulate translation during oogenesis and embryonic development. They inhibit the translation of specific mRNAs by a simultaneous interaction with eIF4E at the mRNA 5′ end and proteins bound to sequence elements in the 3′ UTR. Thus, Cup and Maskin have to compete with eIF4G for binding to eIF4E. While the exact binding affinities of these proteins for eIF4E have not been determined, it is known that Maskin interacts rather weakly with eIF4E (Cho, 2005).

In contrast to 4E-BP, Cup, and Maskin, Bcd does not need to compete with eIF4G to interact with d4EHP. Rather, it is d4EHP that competes with eIF4E for cap binding, which results in translation being inhibited at the level of cap recognition. As a result of bypassing the need to disrupt the very stable eIF4E:eIF4G interaction, d4EHP should interdict translation more efficiently than 4E-BPs or other eIF4E binding proteins. 4EHP-mediated translational regulation may have a particularly important role in germline development, based on these results and on a recent report that a mutant allele of C. elegans 4EHP (ife-4) shows a severe egg-laying defect (Cho, 2005).

The delineation of a d4EHP-recognition sequence in Bcd (YxxxxxxL; x denotes any amino acid) that interacts with d4EHP via its Trp85 residue highlights the similarities between the d4EHP:Bcd interaction and that of eIF4G with eIF4E (YxxxxLphi in eIF4G; Trp73 in eIF4E; phi denotes any hydrophobic amino acid). Despite these parallels, the inability of Bcd to bind to eIF4E must be explained by structural differences. The presence of two proline residues at positions +3 and +6 of the Bcd d4EHP binding motif is predicted to significantly alter the α-helical structure assumed by the YxxxxLphi peptide upon binding to eIF4E and thus prevent Bcd association with deIF4E. Furthermore, the eIF4E interaction surface of eIF4G is not limited to the YxxxxLphi motif but extends over a larger interface; the N-terminal domain of eIF4E is also required for folding and tight binding to eIF4G. Indeed, the ability of d4EHP to bind specifically to Bcd, and not to deIF4G and d4E-BP, can be explained by the importance of the N-terminal KHPL sequence of eIF4E in the interaction with eIF4G and 4E-BP, since this sequence is not conserved in d4EHP (Cho, 2005).
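
The two motifs named above can be written as simple sequence patterns. The following is a minimal Python sketch, not a method from Cho (2005): the function names, the toy peptide strings, and the residue set standing in for "phi" are all assumptions made for illustration. It scans a protein sequence for YxxxxxxL and YxxxxLphi and checks whether a YxxxxxxL match carries prolines at positions +3 and +6 relative to the tyrosine.

```python
import re

# Assumption: "phi" (any hydrophobic residue) is approximated by this set.
HYDROPHOBIC = "AVILMFWC"

# Patterns transcribed from the text; "." stands for x (any amino acid).
# A lookahead is used so that overlapping matches are also reported.
D4EHP_MOTIF = re.compile(r"(?=(Y.{6}L))")                      # YxxxxxxL
EIF4E_MOTIF = re.compile(rf"(?=(Y.{{4}}L[{HYDROPHOBIC}]))")    # YxxxxLphi


def find_motifs(seq, pattern):
    """Return (0-based start, matched text) for every match in seq."""
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq.upper())]


def prolines_at(match_text, offsets=(3, 6)):
    """True if the match has proline at each offset from the leading Tyr."""
    return all(match_text[i] == "P" for i in offsets)


# Hypothetical toy peptides, not real Bcd or eIF4G sequences.
toy_d4ehp_like = "MAYGGPGGPLKK"
toy_eif4e_like = "GSYAAAALFGS"

print(find_motifs(toy_d4ehp_like, D4EHP_MOTIF))
print(find_motifs(toy_eif4e_like, EIF4E_MOTIF))
```

A pattern match of this kind only flags candidate sites; as the paragraph above notes, binding specificity also depends on helical structure and on a larger interaction surface, which a regular expression cannot capture.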

The demonstration that cad translation is repressed through a d4EHP- and Bcd-dependent tethering mechanism adds to the diversity of translational control mechanisms operating in the early Drosophila embryo. Why are so many translational repression pathways necessary? If an individual mechanism alone can reduce translation of a specific mRNA, but not completely abrogate it, a combination of inhibitory interactions may be needed in order to accomplish strict translational control. This can be advantageous if the diversity of factors (like Bcd, which can confer mRNA specificity for a given mechanism) is relatively limited. Multiple mRNAs also have to be translationally repressed in overlapping spatial and temporal domains. Controlling these mRNAs through mechanisms that target different components of the general translational machinery, rather than through a common mechanism, might allow more precise regulation of their individual expression patterns (Cho, 2005).

It is noteworthy that although 4EHP is conserved through evolution, Bcd exists only in higher dipterans. Thus, in other organisms, 4EHP must function during development through proteins that are analogous to Bcd. In summary, this study describes a novel mode of translational control in Drosophila development. Because cap-dependent translation regulation plays such an important role in gene expression, and since 4EHP is also expressed in somatic cells, it is predicted that examples of d4EHP-mediated translational repression other than cad are most likely to exist (Cho, 2005).

The Bin3 RNA methyltransferase is required for repression of caudal translation in the Drosophila embryo

Bin3 was first identified as a Bicoid-interacting protein in a yeast two-hybrid screen. In human cells, a Bin3 ortholog (BCDIN3) methylates the 5' end of 7SK RNA, but its role in vivo is unknown. This study shows that in Drosophila, Bin3 is important for dorso-ventral patterning in oogenesis and for anterior-posterior pattern formation during embryogenesis. Embryos that lack Bin3 fail to repress the translation of caudal mRNA and exhibit head involution defects. bin3 mutants also show (1) a severe reduction in the level of 7SK RNA, (2) reduced binding of Bicoid to the caudal 3' UTR, and (3) genetic interactions with bicoid, and with genes encoding eIF4E, Larp1, polyA binding protein (PABP), and Ago2. 7SK RNA coimmunoprecipitated with Bin3 and is present in Bicoid complexes. These data suggest a model in which Bicoid recruits Bin3 to the caudal 3' UTR. Bin3's role is to bind and stabilize 7SK RNA, thereby promoting formation of a repressive RNA-protein complex that includes the RNA-binding proteins Larp1, PABP, and Ago2. This complex would prevent translation by blocking eIF4E interactions required for initiation. These results, together with prior network analysis in human cells, suggest that Bin3 interacts with multiple partner proteins, methylates small non-coding RNAs, and plays diverse roles in development (Singh, 2011).

The human homolog of Bin3, also called BCDIN3 or methylphosphate capping enzyme (MePCE), was shown to methylate the 5' γ-phosphate on 7SK RNA and to stabilize 7SK RNA in cell culture. This study found that Bin3 associates with and stabilizes 7SK RNA in ovaries and embryos. And, as in human cells, Bin3 activity was specific for 7SK RNA and did not affect U3 RNA or another RNA pol III product, U6 RNA, both of which are methylated by distinct mechanisms. It seems likely, therefore, that Drosophila Bin3 has a similar biochemical activity to its human counterpart despite differing in size and sequence outside the AdoMet binding domain and the highly conserved Bin3-homology domain. Prior attempts to demonstrate protein-arginine methyltransferase activity of Bin3 were negative, consistent with Bin3 methylating RNA rather than protein. In Drosophila, there are two other Bin3-like genes, CG11342 and CG1239, but each is more divergent from the human BCDIN3 within the conserved motif architecture (24% and 39% identity, respectively) than Bin3. It is possible that CG1239, which is expressed in early embryos, could have partially overlapping functions with Bin3 that might contribute to the incomplete penetrance of the bin3 mutations (Singh, 2011).

Putative Bin3 orthologs containing the two conserved motifs are found in at least 70 eukaryotic organisms ranging from the yeast, Schizosaccharomyces pombe to humans, and including Caenorhabditis elegans, Arabidopsis thaliana, Xenopus laevis, and Danio rerio. It is not known what any of these genes do, with the possible exception of the zebrafish bin3 gene which was shown by morpholino knockdown to be important for anterior development and to display RNA splicing defects. Similar defects were sought in splicing of bicoid, caudal, eIF4E, d4EHP, and a control gene, taf1, known to show alternative splicing. No splicing defects were found using a sensitive qRT-PCR approach. It is possible that the splicing defects in zebrafish result from aberrant 5' capping of non-coding RNAs important for splicing (Singh, 2011).

Mammalian 7SK RNA has been studied extensively, but Drosophila 7SK RNA has only been annotated, and prior to this study had not been characterized. This study shows that 7SK RNA is highly expressed in ovaries and embryos and is regulated by Bin3 in a manner similar to that in humans (by BCDIN3). 7SK RNA can be coimmunoprecipitated with Bin3 and Bicoid and may work as a scaffold in translation repression. This is the first indication that 7SK RNA has a function apart from its role in the regulation of the P-TEFb transcription elongation factor. While this study focused on Bicoid-dependent regulation, it is likely that 7SK RNA also functions in transcription elongation in other stages of development. Indeed, it was found that Drosophila 7SK RNA mutants showed larval lethality at later stages of development (Singh, 2011).

Bin3 seems to play no role in Bicoid's gene activation function, but instead is crucial for Bicoid-dependent repression of caudal mRNA. Bin3 seems to stabilize Bicoid at the caudal BRE via a mechanism that involves 7SK RNA. As suggested by genetic interaction data, the Bicoid/Bin3/7SK RNA complex may include Larp1, PABP, and Ago2, and target the eIF4E initiation factor (Singh, 2011).

La-related proteins are not restricted to control of transcription elongation. In C. elegans, a Larp1 homolog was shown to be important for downregulation of translation of mRNAs in the Ras-MAPK pathway and to localize to P-bodies, known sites of mRNA degradation, while in mammalian cells, LARP4B plays a stimulatory role in translation initiation. In Drosophila, it has been shown that Larp1 associates directly with PABP independent of RNA and double mutants show enhanced lethality, suggesting that Larp1 facilitates mRNA translation. It is not surprising, therefore, that genetic interactions were observed between bin3 and larp1, as well as with pAbp in the context of caudal translation regulation. Note that PABP (and Larp1) is thought to play a negative role in translation initiation, as does PABP in the repression of msl-2 mRNA by Sex-lethal (Singh, 2011).

In human cells, BCDIN3 and LARP7 interact cooperatively with 7SK RNA forming a stable core complex that associates transiently with HEXIMS, hnRNPs and the P-TEFb elongation complex (see Drosophila Hexim). An emerging theme is that 7SK RNA serves as a scaffold for stable association of protein partners. In fact, there is evidence that 5' γ-methylation of 7SK RNA by BCDIN3 may occur co-transcriptionally, but that the modified RNA remains associated with both BCDIN3 and LARP7, which bind 7SK RNA cooperatively. It is proposed, therefore, that Bin3 and Larp1 are associated with 7SK RNA at the caudal BRE, but that 5'-methylation does not necessarily occur there. Consistent with the idea of cooperative binding to 7SK RNA, it was found that larp1 mutation enhanced the bin3 mutant phenotype (Singh, 2011).

Some of the phenotypes observed for bin3 mutants were also observed in mutants of the microRNA miR-184, including oogenesis defects and a cellularization defect. This was the rationale behind including ago2 in the genetic analysis. Although no effect of bin3 mutation was found on levels of several miRNAs, including miR-184, it was surprising to observe a genetic enhancement (albeit mild) of the bin3 phenotype when combined with an ago2 mutation. Ago2 has been shown to bind eIF4E and interfere with mRNA circularization mediated by PABP. However, this occurs in the context of the miRNA/RISC complex, so whether and how Ago2 participates in Bicoid-Bin3 repression is not clear, but it could potentially involve the 7SK RNA component (Singh, 2011).

Finally, no interaction was detected between bin3 and D4EHP, which encodes a previously identified partner of Bicoid important for repressing caudal translation. D4EHP interacts with Bicoid and is thought to directly bind the m7G cap of caudal mRNA, thereby displacing eIF4E and blocking all subsequent steps of initiation. Perhaps the Bin3 mechanism works redundantly with the D4EHP mechanism, or perhaps Bin3 helps recruit D4EHP, so that mutation of bin3 would preclude binding of D4EHP to the initiation complex. In the latter case, bin3 mutation would be epistatic to the D4EHPCP53 mutation. Further investigation will be needed to determine the relationship between these two pathways (Singh, 2011).

Bin3 is unlikely to be a dedicated Bicoid interactor and probably has roles as an RNA methyltransferase in many distinct pathways throughout development. In adults, quantitative trait transcript analysis linked bin3 with sleep-wake cycling. While studying Bin3's role in embryonic patterning, strong oogenesis defects were observed, particularly in bin3 null mothers, although other allelic combinations also revealed similar defects, especially at 29°C. Specifically, bin3 loss-of-function mutants showed dorsalized egg shell phenotypes. Conversely, bin3 overexpressing lines showed strong ventralized egg shell patterns that appear to result from a failure of the dorsal appendage primordium to resolve into two domains along the dorsal midline. These defects are similar to those of early D-V patterning mutations in the grk pathway, and probably do not result from defects that occur later, during morphogenesis (Singh, 2011).

bin3 loss-of-function mutants resembled mutations in capicua, squid, cup and fs(K10), among others, while bin3 overexpressing lines resembled grk and pAbp mutations. Interestingly, mechanisms for translation repression of unlocalized grk mRNA feature prominently in the D-V patterning pathway, with squid and cup playing a critical role in repression via interaction with eIF4E, and PABP55 being important for release of that repression. Staining of bin3 mutant ovaries revealed a delocalized signal for Gurken protein but not for grk mRNA. Given the role of Bin3 in translation regulation, and the egg shell phenotypes of bin3 mutations, it seems plausible that Bin3 plays a role in negative regulation of grk translation (Singh, 2011).

Results presented in this study show that Bin3 plays a critical role during both oogenesis and embryonic development. In embryos, Bin3 is required for Bicoid to establish the Caudal protein gradient. Bin3 binds 7SK RNA and likely works by methylating 7SK RNA and stabilizing a repressive complex that assembles on the Bicoid-response element in the 3' UTR of caudal mRNA. Bin3's role during oogenesis is less clear, but based on the observed eggshell phenotypes in bin3 mutants, and gurken expression, Bin3 could play a similar role to help ensure that grk mRNA is translated only in the anterior-dorsal region of the oocyte (Singh, 2011).

Estimating binding properties of transcription factors from genome-wide binding profiles

The binding of transcription factors (TFs) is essential for gene expression. One important characteristic is the actual occupancy of a putative binding site in the genome. In this study, an analytical model is proposed to predict genomic occupancy that incorporates the preferred target sequence of a TF in the form of a position weight matrix (PWM), DNA accessibility data (in the case of eukaryotes), the number of TF molecules expected to be bound specifically to the DNA and a parameter that modulates the specificity of the TF. Given actual occupancy data in the form of ChIP-seq profiles, copy number and specificity are backwards inferred for five Drosophila TFs during early embryonic development: Bicoid, Caudal, Giant, Hunchback and Kruppel. The results suggest that these TFs display thousands of molecules that are specifically bound to the DNA and that whilst Bicoid and Caudal display a higher specificity, the other three TFs (Giant, Hunchback and Kruppel) display lower specificity in their binding (despite having PWMs with higher information content). This study gives further weight to earlier investigations into TF copy numbers that suggest a significant proportion of molecules are not bound specifically to the DNA (Zabet, 2014: 25432957).
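
The four ingredients listed above (PWM score, DNA accessibility, number of specifically bound molecules, and a specificity parameter) can be combined in a toy occupancy calculation. The Python sketch below is an illustration only: the saturating formula, the function and parameter names (`occupancy`, `n_molecules`, `lam`), and the two-column PWM are assumptions chosen for this example and are not the published equation of Zabet (2014). Each candidate site gets a weight of accessibility times exp(score / lam), and occupancy is that weight scaled by copy number in competition with all other sites.

```python
import math


def pwm_score(pwm, site):
    """Sum per-position weights; pwm is a list of {base: weight} dicts."""
    return sum(column[base] for column, base in zip(pwm, site))


def occupancy(seq, pwm, accessibility, n_molecules, lam):
    """Toy occupancy per site start (assumed form, not the paper's formula).

    accessibility: one value in [0, 1] per site start.
    n_molecules:   TF copies assumed to be specifically bound.
    lam:           specificity parameter; smaller means more specific.
    """
    k = len(pwm)
    weights = [
        accessibility[i] * math.exp(pwm_score(pwm, seq[i:i + k]) / lam)
        for i in range(len(seq) - k + 1)
    ]
    total = sum(weights)
    if total == 0:  # every site inaccessible
        return [0.0] * len(weights)
    return [n_molecules * w / (n_molecules * w + total) for w in weights]


# Hypothetical two-position PWM that prefers the dinucleotide "AC".
toy_pwm = [
    {"A": 1.0, "C": -1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": 1.0, "G": -1.0, "T": -1.0},
]
# Five candidate sites in "ACGTAC"; the last one is marked inaccessible.
profile = occupancy("ACGTAC", toy_pwm, [1, 1, 1, 1, 0], n_molecules=100, lam=1.0)
print([round(p, 3) for p in profile])
```

In this sketch the accessible "AC" site is favoured over the other sites, the inaccessible "AC" site gets zero occupancy, and raising `n_molecules` or lowering `lam` increases occupancy at the preferred site. Fitting such parameters to ChIP-seq profiles, as the study describes, would run this calculation in reverse.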

Protein Interactions

The moleskin gene product is essential for Caudal-mediated constitutive antifungal Drosomycin gene expression in Drosophila epithelia

The homeobox gene Caudal encodes a DNA-binding nuclear transcription factor that plays a crucial role during development and the innate immune response. The Drosophila homologue of importin-7 (DIM-7), encoded by moleskin, was identified as a Caudal-interacting molecule during yeast two-hybrid screening. Both mutation of the minimal region of Caudal responsible for Moleskin binding and RNA interference (RNAi) of moleskin dramatically inhibited Caudal nuclear localization. Furthermore, Caudal-mediated constitutive expression of the antifungal Drosomycin gene was severely affected in the moleskin-RNAi flies, showing a local Drosomycin expression pattern indistinguishable from that of the Caudal-RNAi flies. These in vivo data suggest that DIM-7 mediates Caudal nuclear localization, which is important for the proper Caudal function necessary for regulating innate immune genes in Drosophila (Han, 2004).


Home page: The Interactive Fly © 1995, 1996 Thomas B. Brody, Ph.D.

The Interactive Fly resides on the Society for Developmental Biology's Web server.