ventral nervous system defective


REGULATION

Promoter

DNA fragments -0.57, -2.2, -2.9, -5.3, and -8.4 kb in length from the upstream regulatory region of the ventral nervous system defective/NK-2 gene were cloned in the 5'-flanking region of a beta-galactosidase (beta-gal) reporter gene in the P-element pCaSpeR-AUG-beta-gal, and the effects of the DNA on the pattern and time of expression of beta-gal were determined in transgenic embryos. Embryos from 11 lines transformed with -8.4 kb of vnd/NK-2 regulatory DNA express beta-gal patterns that closely resemble those of vnd/NK-2. In embryos from four lines transformed with -5.3 kb of vnd/NK-2 DNA, beta-gal is found in the normal vnd/NK-2 pattern in the nerve cord but not in part of the cephalic region. beta-Gal patterns in embryos from transgenic lines containing -0.57, -2.2, or -2.9 kb of vnd/NK-2 DNA do not resemble vnd/NK-2. Null vnd/NK-2 mutant embryos containing the homozygous P-element p(-8.4 to +0.34 beta-gal) expressed little beta-gal in contrast to siblings with a wild-type vnd/NK-2 gene. It is concluded that (1) the 8.4-kb DNA fragment from the vnd/NK-2 gene contains the nucleotide sequences required to generate the normal pattern of vnd/NK-2 gene expression, sequences that may be involved in the switch between neuroblast vs. epidermoblast pathways of development; (2) the 5'-flanking region of the vnd/NK-2 gene between -5.3 and -8.4 kb is required for vnd/NK-2 gene expression in the most dorsoanterior part of the cephalic region; and (3) Vnd/NK-2 protein is required, directly or indirectly, for maintenance of vnd/NK-2 gene expression (Saunders, 1998).

The Drosophila single-minded gene controls CNS midline cell development by both activating midline gene expression and repressing lateral CNS gene expression in the midline cells. The mechanism by which Single-minded represses transcription was examined using the ventral nervous system defective gene as a target gene. Transgenic-lacZ analysis of constructs containing fragments of the ventral nervous system defective regulatory region has identified sequences required for lateral CNS transcription and midline repression. Elimination of Single-minded:Tango binding sites within the ventral nervous system defective gene does not affect midline repression. Mutants of Single-minded that remove the DNA binding and transcriptional activation regions abolish ventral nervous system defective repression, as well as transcriptional activation of other genes. The replacement of the Single-minded transcriptional activation region with a heterologous VP16 transcriptional activation region restores the ability of Single-minded to both activate and repress transcription. These results indicate that Single-minded indirectly represses transcription by activating the expression of repressive factors. Single-minded provides a model system for how regulatory proteins that act only as transcriptional activators can control lineage-specific transcription in both positive and negative modes (Estes, 2001).

The relationship between Sim and Vnd in the CNS midline cells was examined by immunostaining embryos with both anti-Sim and anti-Vnd. Vnd protein is first seen at embryonic stage 5 in the presumptive mesectoderm and ventral neuroectoderm, preceding the appearance of Sim protein. Sim protein appears during gastrulation (stage 6) in the mesectoderm and overlaps with Vnd protein. During stages 6-9, Sim and Vnd are colocalized in the CNS midline cells, while Vnd protein continues to be present in cells of the ventral neuroectoderm. By the end of stage 9 and during stage 10, Vnd protein is absent in the CNS midline cells, while Sim protein remains. The absence of Vnd protein is preceded by the reduction of vnd RNA at stage 8. These results show that Sim and Vnd proteins overlap within the mesectoderm during several stages and that a considerable lag (~2 h) exists between the appearance of Sim protein and the loss of Vnd protein. There is also a substantial lag (~1 h) between the appearance of Sim protein and the loss of midline vnd RNA. The delay in vnd repression after initial Sim appearance is consistent with an indirect mechanism of repression (Estes, 2001).

Three general models of Sim-mediated repression were tested: (1) Sim directly represses target genes by binding their DNA and repressing transcription in association with a corepressor(s); (2) Sim does not bind DNA of target genes but interacts with positively acting factors, preventing their action; and (3) Sim represses indirectly by activating transcription of genes encoding repressive factors. Several complementary experiments demonstrate that midline repression requires activation of repressive gene expression by Sim (Model 3). Ectopic expression experiments utilizing mutant forms of Sim demonstrate that the basic region, PAS domain, and C-terminal regions are all required for both transcriptional activation and repression. Removal of the PAS domain also abolishes the ability of Sim to form dimers with Tgo, suggesting that Tgo is necessary for repression. More informative is Db-Sim. This mutant protein is able to dimerize with Tgo, and the protein complex accumulates in the nucleus. However, neither midline transcription nor repression occurs, presumably due to the inability of the Sim:Tgo dimer to bind DNA. This argues against a model in which Sim interacts with an activator protein in a non-DNA-binding mode (Model 2) and instead suggests that DNA binding is required for Sim repression (Model 1 or Model 3). However, analysis of the vnd gene using lacZ transgenes indicates that Sim:Tgo binding sites are not required for midline repression, which argues against Model 1: mutation of the single CNS midline element (CME; ACGTG) in fragment 2.5RB or mutation of three CMEs in 5.3RS does not affect lacZ expression. Transient transfection experiments have shown that CMEs are relevant targets of Sim:Tgo binding, and in vivo analyses of five different genes have shown that the CME functions in vivo as a Sim:Tgo binding site. However, it remains possible that Sim:Tgo could bind a variant sequence within the vnd gene. Arguing against this are the results indicating that Sim represses indirectly by activating transcription (Estes, 2001).

The C-terminal region of Sim that follows the PAS domain contains multiple transcriptional activation domains. Removal of the C-terminal 211 aa eliminates those activation domains and additional residues. The DeltaC-Sim protein is unable to activate midline transcription or repress vnd expression, even though it dimerizes with Tgo and the complex accumulates in nuclei. This is consistent with Sim repressing vnd expression by activating the transcription of repressive factors. However, it is also possible that there is a domain within the C-terminal region that could directly mediate repression. This possibility was tested by fusing the VP16 activation domain onto DeltaC-Sim and functionally assaying the fusion protein in vivo. The results show that addition of the VP16 activation domain restores the ability of DeltaC-Sim to activate transcription and repress vnd. These experiments demonstrate that vnd repression correlates with the ability of Sim to activate transcription (Model 3). Another construct removed the Sim AAQ repeat region (a stretch of 10 Ala-Ala-Gln repeats followed by several imperfect repeats). Its deletion does not affect the ability of Sim to dimerize with Tgo, accumulate in nuclei, activate transcription, or repress vnd. Although the repeat region is striking in sequence, its function remains a mystery. The combination of the vnd-lacZ and ectopic Sim-mutant experiments demonstrates that Sim does not directly repress or inhibit vnd gene expression but, instead, activates transcription of genes that encode repressive factors, consistent with the third model of repression. This model is also consistent with the delayed timing of vnd repression seen in early embryonic development (Estes, 2001).

A 0.5-kb region necessary for repression maps between -3.6 and -3.1 in the vnd regulatory region. This repression of vnd occurs variably throughout the midline but is seen consistently between embryos. The lack of uniform repression suggests that other elements residing in the vnd gene may help control the maintenance of repression (Estes, 2001).

Midline repression by Sim functions by activating transcription of one or more genes that, in turn, repress transcription of genes normally expressed in the lateral CNS. The nature of these repressive factor genes and how they function are unknown, although plausible candidate genes exist. Since E(spl) proteins repress lateral CNS expression, members of this family are candidates for midline repressors, and several are expressed in the CNS midline cells early in development (m5, m7, and m8). In this scheme, Sim:Tgo would activate factors that would modify or interact with E(spl) proteins to repress proneural gene activity in the midline. The vnd upstream regulatory region contains numerous E(spl) consensus binding sites, although none of the sites lie within the 0.5-kb fragment shown to be important for repression. Since the E(spl) proteins reside in midline cells well before repression occurs, it is unlikely that Sim is required for initial E(spl) transcription, although maintenance of their expression is a possibility. Other potential repressors have not yet been identified (Estes, 2001).

It is unclear what role the CMEs have within the vnd regulatory region. They are not involved in vnd repression nor do they seem to play a role in vnd embryonic CNS expression. The sequences immediately flanking the four CMEs within the vnd regulatory region were compared to CMEs found within genes known to be important for midline activation by Sim. The consensus for sites within genes positively activated by Sim is (A/T)ACGTG, while the consensus for the CMEs within the vnd regulatory region is GACGTG (three of four sites have a G at the first residue). Otherwise, the sequences vary widely, and no consensus was found among the sequences flanking the CMEs of Toll, sim, slit, rhomboid, and breathless, nor among the sequences flanking the CMEs found within the vnd regulatory region. It may be the larger context of the vnd regulatory region that prevents Sim from interacting with these sites to affect transcription. It is also possible that the CMEs are bound by bHLH-PAS proteins and utilized for postembryonic expression of vnd (Estes, 2001).
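The flanking-base comparison described above can be illustrated with a minimal sketch. It scans both strands of a sequence for the ACGTG core and reports the base immediately 5' of each core, so that (A/T)ACGTG sites can be told apart from GACGTG sites. The function names and category labels are hypothetical and used only for illustration; this is not the procedure used in the study.

```python
import re

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def find_cmes(seq):
    """List (index, strand, flanking_base) for every ACGTG core.

    The index is 0-based on the strand being read (the minus strand is
    read as the reverse complement). The flanking base is the nucleotide
    immediately 5' of the core, or None at the sequence edge.
    """
    hits = []
    for strand, text in (("+", seq.upper()), ("-", reverse_complement(seq))):
        # A lookahead reports every occurrence, including overlapping ones.
        for match in re.finditer("(?=ACGTG)", text):
            i = match.start()
            flank = text[i - 1] if i > 0 else None
            hits.append((i, strand, flank))
    return hits


def classify_flank(flank):
    """Hypothetical labels for the two consensus classes named in the text."""
    if flank in ("A", "T"):
        return "(A/T)ACGTG, as in Sim-activated genes"
    if flank == "G":
        return "GACGTG, as in most vnd sites"
    return "other"
```

For example, `find_cmes("TTGACGTGCCTACGTGAA")` reports one site preceded by G and one preceded by T, both on the plus strand.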

Three discrete regions (-5.3 to -4.2, -4.2 to -3.1, and -3.1 to -2.8) within the 2.5RB domain of the vnd upstream regulatory sequences are necessary for vnd-like expression. 2.5RB was examined for sequences related to the consensus binding sites of known transcription regulators of vnd. Genetic analysis has shown that Dorsal and Twi are required for vnd activation and Sna for mesodermal repression, and vnd is positively autoregulated. Four putative Dorsal binding sites are located between -5.3 and -4.2 and none are observed between -4.2 and -2.8. Seven putative Sna sites are observed: two between -5.3 and -4.2, four between -4.2 and -3.1, and one between -3.1 and -2.8. Two of the Sna sites possess embedded E boxes (CANNTG sequences) that can enhance gene expression. Twi E-box sites show a weak and short consensus sequence and are difficult to identify by sequence alone. Nonetheless, seven putative Twi E-box sites lie between -5.3 and -4.2. These sites have been shown to bind Twi protein when present in other genes (Estes, 2001).

The sites lie close to the Dorsal binding sites, suggesting cooperative binding of Dorsal and Twi. Sequences required for vnd autoregulation are localized within 8.1 kb upstream of the vnd transcription unit. Although not rigorously tested for autoregulation, the 2.5RB transgene shows a similar pattern of expression compared to 8.1HV, suggesting that autoregulatory sequences may be present. At least 15 potential Vnd binding sites are scattered throughout the entire 2.5RB region, consistent with a direct autoregulatory role for Vnd. In summary, sequence analysis of the vnd regulatory region as defined by deletional analysis suggests that Dorsal, Twi, and Sna directly initiate vnd expression and that Vnd directly autoregulates. However, biochemical experiments to test transcription factor binding, coupled with transgenic analysis of DNA containing mutated binding sites, are required to test the functional significance of these sites (Estes, 2001).

It is concluded that midline cell formation occurs by the concerted activation of genes required for midline cell development and repression of genes normally expressed in the lateral CNS. While there are examples of transcription factors that can directly activate and repress (e.g., Dorsal, Kruppel, and the glucocorticoid receptor), the sim mode of activating directly and repressing indirectly may be a common mechanism of lineage-specific gene control (Estes, 2001).

The Drosophila embryonic CNS arises from the neuroectoderm, which is divided along the dorsal-ventral axis into two halves by specialized mesectodermal cells at the ventral midline. The neuroectoderm is in turn divided into three longitudinal stripes -- ventral, intermediate, and lateral. The vnd gene is expressed from cellularization throughout early neural development in ventral neuroectodermal cells, neuroblasts, and ganglion mother cells, and later in an unrelated pattern in neurons. In the context of the dorsal-ventral location of precursor cells, the vnd loss- and gain-of-function CNS phenotypes have been reassessed using cell-specific markers. Overexpression of vnd causes significantly more profound effects on CNS cell specification than vnd loss. The CNS defects seen in vnd mutants are partly caused by loss of progeny of ventral neuroblasts -- the commissures are fused and the longitudinal connectives are aberrantly positioned close to the ventral midline. The commissural vnd phenotype is associated with defects in cells that arise from the mesectoderm, where the VUM neurons have pathfinding defects, the MP1 neurons are mis-specified, and the midline glia are reduced in number. vnd overexpression results in the mis-specification of progeny arising from all regions of the neuroectoderm, including the ventral neuroblasts that normally express the gene. The CNS of embryos that overexpress vnd is highly disrupted, with weak longitudinal connectives that are placed too far from the ventral midline and severely reduced commissural formation. The commissural defects seen in vnd gain-of-function mutants correlate with midline glial defects, whereas the mislocalization of interneurons coincides with longitudinal glial mis-specification. Thus, Drosophila neural and glial specification requires that vnd expression be tightly regulated (Mellerick, 2002).

Vnd/NK-2 protein was detected in 11 neuroblasts per hemisegment in Drosophila embryos: 9 medial and 2 intermediate neuroblasts. Fragments of DNA from the 5'-flanking region of the vnd/NK-2 gene were inserted upstream of an enhancerless β-galactosidase gene in a P-element and used to generate transgenic fly lines. Antibodies directed against Vnd/NK-2 and β-galactosidase proteins then were used in double-label experiments to correlate the expression of β-galactosidase and Vnd/NK-2 proteins in identified neuroblasts. DNA region A, which corresponds to the -4.0 to -2.8-kb fragment of DNA from the 5'-flanking region of the vnd/NK-2 gene, was shown to contain one or more strong enhancers required for expression of the vnd/NK-2 gene in ten neuroblasts. DNA region B (-5.3 to -4.0 kb) contains moderately strong enhancers for vnd/NK-2 gene expression in four neuroblasts. Hypothesized DNA region C, whose location was not identified, contains one or more enhancers that activate vnd/NK-2 gene expression only in one neuroblast. These results show that nucleotide sequences in at least three regions of DNA regulate the expression of the vnd/NK-2 gene, that the vnd/NK-2 gene can be activated in different ways in different neuroblasts, and that the pattern of vnd/NK-2 gene expression in neuroblasts of the ventral nerve cord is the sum of partial patterns (Shao, 2002).

The vnd/NK-2 gene is expressed initially in the Drosophila embryo during the 12th nuclear doubling (stage 4) in two longitudinal stripes of nuclei, each stripe about five nuclei in width. Each Vnd/NK-2 positive stripe of nuclei identifies the ventral (i.e., medial) column of neuroectoderm. At late stage 8, the first NBs delaminate from the neuroectoderm layer. The vnd/NK-2 gene is expressed in all medial NBs. The expression of Vnd/NK-2 in neuroectoderm is narrowed progressively, and by completion of neuroblast formation (late stage 11), expression is restricted to the ventral column of neuroectodermal cells. Transgenic fly lines that express β-Gal protein in identified NBs, such as en-lacZ, wg-lacZ, and hkb5953, were used with an antibody directed against β-Gal, whereas an antibody against full-length Vnd/NK-2 protein was used to detect Vnd/NK-2 protein. Vnd/NK-2 protein was detected in all medial NBs: NBs 1-1, MP-2, 5-2, and 7-1 in late stage 8 and early stage 9 embryos. Vnd/NK-2 expression in NB 1-1 is weaker than in other NBs and disappears by stage 11. In addition, NB MP-2 disappears by stage 11. NBs that express both Vnd/NK-2 and β-Gal proteins were identified by their position and by comparison with previous studies as 1-2, 6-1, 7-1, and 7-2. Incubation of wg-lacZ embryos with anti-Vnd/NK-2 and anti-β-Gal antibodies reveals two NBs per hemisegment that contain both Vnd/NK-2 and β-Gal proteins: a small cell that will become NB 5-1, and NB 5-2. Hkb-lacZ has been shown to be expressed in NBs 2-1, 2-2, 2-4, 4-2, and 5-4 during stage 11. Only one NB, 2-1, contains both Vnd/NK-2 and β-Gal proteins per hemisegment in an hkb5953 embryo. Neuroblasts that express vnd/NK-2 mRNA were identified previously by in situ hybridization. Few intermediate and no dorsal NBs containing Vnd/NK-2 were found. NBs 2-2, 3-2, 4-2, 6-2, 7-3, and 7-4 previously were found to contain Vnd/NK-2 mRNA, but using an antibody to Vnd/NK-2 this study did not detect Vnd/NK-2 protein in these NBs (Shao, 2002).

A regulatory code involving Dorsal, Twist and Su(H) regulates vnd

Bioinformatics methods have identified enhancers that mediate restricted expression in the Drosophila embryo. However, only a small fraction of the predicted enhancers actually work when tested in vivo. In the present study, co-regulated neurogenic enhancers that are activated by intermediate levels of the Dorsal regulatory gradient are shown to contain several shared sequence motifs. These motifs permit the identification of new neurogenic enhancers with high precision: five out of seven predicted enhancers direct restricted expression within ventral regions of the neurogenic ectoderm. Mutations in some of the shared motifs disrupt enhancer function, and evidence is presented that the Twist and Su(H) regulatory proteins are essential for the specification of the ventral neurogenic ectoderm prior to gastrulation. The regulatory model of neurogenic gene expression defined in this study permitted the identification of a neurogenic enhancer in the distant Anopheles genome. The prospects for deciphering regulatory codes that link primary DNA sequence information with predicted patterns of gene expression are discussed (Markstein, 2004).

Previous studies identified two enhancers, from the rho and vnd genes, that are activated by intermediate levels of the Dorsal gradient in ventral regions of the neurogenic ectoderm. The present study identified a third such enhancer from the brk gene. This newly identified brk enhancer corresponds to one of the 15 optimal Dorsal-binding clusters described in a previous survey of the Drosophila genome. Although one of these 15 clusters has been shown to define an intronic enhancer in the short gastrulation (sog) gene, the activities of the remaining 14 clusters were not tested. Genomic DNA fragments corresponding to these 14 clusters were placed 5' of a minimal eve-lacZ reporter gene, and separately expressed in transgenic embryos using P-element germline transformation. Four of the 14 genomic DNA fragments were found to direct restricted patterns of lacZ expression across the dorsoventral axis that are similar to the expression patterns seen for the associated endogenous genes (Markstein, 2004).

The four enhancers respond to different levels of the Dorsal nuclear gradient. Two direct expression within the presumptive mesoderm where there are high levels of the gradient. These are associated with the Phm and Ady43A genes. The third enhancer maps ~10 kb 5' of brk, and is activated by intermediate levels of the Dorsal gradient, similar to the vnd and rho enhancers. Finally, the fourth enhancer maps over 15 kb 5' of the predicted start site of the CG12443 gene, and directs broad lateral stripes throughout the neurogenic ectoderm in response to low levels of the Dorsal gradient. In terms of the dorsoventral limits, this staining pattern is similar to that produced by the sog intronic enhancer (Markstein, 2004).

The remaining ten clusters failed to direct robust patterns of expression and are thus referred to as 'false-positives'. Since analysis of spacing and orientation of the Dorsal sites alone did not reveal features that could discriminate between the false positives and the enhancers, whether additional sequence motifs could aid in this distinction was examined. A program called MERmaid was developed that identifies motifs over-represented in specified sets of sequences. MERmaid analysis identified a group of motifs, which was largely specific to the brk, vnd and rho enhancers, suggesting that the regulation of these coordinately expressed genes is distinct from the regulation of genes that respond to different levels of nuclear Dorsal (Markstein, 2004).
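The idea of searching for motifs over-represented in one set of sequences relative to another can be sketched as follows. This is not the MERmaid algorithm, whose details are not given here; it is a deliberately simple stand-in that reports k-mers present in every "positive" sequence (on either strand) and in at most a chosen number of "negative" sequences. All function and parameter names are hypothetical.

```python
def revcomp(seq):
    """Reverse complement of an unambiguous DNA string."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def canonical_kmers(seq, k):
    """Set of k-mers in seq, each stored as the alphabetically smaller of
    the k-mer and its reverse complement so both strands count as one."""
    seq = seq.upper()
    found = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):  # skip windows with ambiguous bases
            found.add(min(kmer, revcomp(kmer)))
    return found


def shared_specific_kmers(positives, negatives, k=7, max_negative_hits=0):
    """k-mers found in all positives and in few enough negatives."""
    if not positives:
        return []
    positive_sets = [canonical_kmers(s, k) for s in positives]
    negative_sets = [canonical_kmers(s, k) for s in negatives]
    shared = set.intersection(*positive_sets)
    return sorted(
        kmer for kmer in shared
        if sum(kmer in neg for neg in negative_sets) <= max_negative_hits
    )
```

Because k-mers are stored in canonical form, a shared CACATGT site is reported as ACATGTG, its reverse complement. A real analysis would also need a statistical background model, which this sketch omits.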

The rho, vnd and brk enhancers direct similar patterns of gene expression. The rho and vnd enhancers were previously shown to contain multiple copies of two different sequence motifs: CTGNCCY and CACATGT. A three-way comparison of minimal rho, vnd and brk enhancers permitted a more refined definition of the CTGNCCY motif (CTGWCCY), and also allowed for the identification of a third motif, YGTGDGAA. The CACATGT and YGTGDGAA motifs bind the known transcription factors, Twist and Suppressor of Hairless [Su(H)], respectively. All three motifs are over-represented in authentic Dorsal target enhancers directing expression in the ventral neurogenic ectoderm, as compared with the 10 false-positive Dorsal-binding clusters. Some of the false-positive clusters contain motifs matching either Twist or CTGWCCY; however, none of the false-positive clusters contain representatives of both of these motifs. The rho enhancer is repressed in the ventral mesoderm by the zinc-finger Snail protein. The four Snail-binding sites contained in the rho enhancer share the consensus sequence, MMMCWTGY; the vnd and brk enhancers contain multiple copies of this motif and are probably repressed by Snail as well (Markstein, 2004).
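The degenerate motifs quoted above are written in IUPAC nucleotide codes (W = A or T, Y = C or T, D = A, G or T, M = A or C). As a minimal sketch, the code below expands such motifs into regular expressions and counts sites on both strands of a sequence. The helper names are hypothetical, and the counting rule (one count per start position, either strand) is an assumption made for illustration.

```python
import re

# Standard IUPAC nucleotide codes expanded to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "[AT]", "S": "[CG]", "M": "[AC]", "K": "[GT]",
    "R": "[AG]", "Y": "[CT]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
# Complement of each IUPAC code, used to build the opposite-strand motif.
COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A", "W": "W", "S": "S",
    "M": "K", "K": "M", "R": "Y", "Y": "R", "B": "V", "V": "B",
    "D": "H", "H": "D", "N": "N",
}

# Motifs as quoted in the text.
MOTIFS = {
    "Twist": "CACATGT",
    "Su(H)": "YGTGDGAA",
    "CTGWCCY": "CTGWCCY",
    "Snail": "MMMCWTGY",
}


def iupac_to_regex(motif):
    """Expand an IUPAC motif into a regular expression."""
    return "".join(IUPAC[code] for code in motif)


def motif_revcomp(motif):
    """Reverse complement of an IUPAC motif."""
    return "".join(COMPLEMENT[code] for code in reversed(motif))


def site_starts(seq, motif):
    """Plus-strand start positions of motif matches on either strand."""
    starts = set()
    for pattern in (iupac_to_regex(motif), iupac_to_regex(motif_revcomp(motif))):
        # Lookahead so that overlapping sites are all reported.
        for match in re.finditer("(?=" + pattern + ")", seq.upper()):
            starts.add(match.start())
    return starts


def count_sites(seq, motifs=MOTIFS):
    """Number of sites per motif name in seq."""
    return {name: len(site_starts(seq, motif)) for name, motif in motifs.items()}
```

Counts of this kind could then be compared between candidate clusters, for instance to check whether a cluster carries both a Twist site and a CTGWCCY site.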

The functional significance of the shared sequence motifs was assessed by mutagenizing the sites in the context of otherwise normal lacZ transgenes. Previous studies have suggested that bHLH activators are important for the activation of rho expression, since rho-lacZ fusion genes containing point mutations in several different E-box motifs (CANNTG) exhibited severely impaired expression in transgenic embryos. However, it was not obvious that the CACATGT motif was particularly significant since it represents only one of five E-boxes contained in the rho enhancer. Yet, only this particular E-box motif is significantly over-represented in the rho, vnd and brk enhancers. vnd-lacZ and brk-lacZ fusion genes were mutagenized to eliminate each CACATGT motif, and analyzed in transgenic embryos. The loss of these sites causes a narrowing in the expression pattern of an otherwise normal vnd-lacZ fusion gene. By contrast, the brk pattern is narrower in central and posterior regions, but relatively unaffected in anterior regions. The brk enhancer contains two copies of an optimal Bicoid-binding site, and it is possible that the Bicoid activator can compensate for the loss of the CACATGT motifs in anterior regions (Markstein, 2004).

Similar experiments were performed to assess the activities of the Su(H)-binding sites (YGTGDGAA) and the CTGWCCY motif. Mutations in the latter sequence cause only a slight reduction and irregularity in the activity of the vnd enhancer, whereas similar mutations nearly abolish expression from the brk enhancer. Thus, CTGWCCY appears to be an essential regulatory element in the brk enhancer, but not in the vnd enhancer. Mutations in both Su(H) sites in the brk enhancer caused reduced staining of the lacZ reporter gene, suggesting that Su(H) normally activates expression. Further evidence that Su(H) mediates transcriptional activation was obtained by analyzing the endogenous rho expression pattern in transgenic embryos carrying an eve stripe 2 transgene with a constitutively activated form of the Notch receptor (NotchIC). rho expression is augmented and slightly expanded in the vicinity of the stripe2-NotchIC transgene. A similar expansion is observed for the sim expression pattern (Markstein, 2004).

To determine whether the shared motifs would help identify additional ventral neurogenic enhancers, the genome was surveyed for 250 bp regions containing an average density of one site per 50 bp and at least one occurrence of each of the four motifs for Dorsal, Twist, Su(H) and CTGWCCY. In total, only seven clusters were identified. Three of the seven clusters correspond to the rho, vnd and brk enhancers. Two of the remaining clusters are associated with genes that are known to be expressed in ventral regions of the neurogenic ectoderm: vein and sim. Both clusters were tested for enhancer activity by attaching appropriate genomic DNA fragments to a lacZ reporter gene and then analyzing lacZ expression in transgenic embryos. The cluster associated with vein is located in the first intron, about 7 kb downstream of the transcription start site. The vein cluster (497 bp) directs robust expression in the neurogenic ectoderm, similar to the pattern of the endogenous gene. The cluster located in the 5' flanking region of the sim gene (631 bp) directs expression in single lines of cells in the mesectoderm (the ventral-most region of the neurogenic ectoderm), just like the endogenous expression pattern. These results indicate that the computational methods define an accurate regulatory model for gene expression in ventral regions of the neurogenic ectoderm of D. melanogaster (Markstein, 2004).
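The survey criteria described above (a 250 bp window, an average of one site per 50 bp, and at least one site for each of the four motifs) can be sketched as a simple window scan. Several points are assumptions rather than details from the study: the density rule is read as "at least 250 // 50 = 5 sites per window", windows are anchored at each site start, a site counts if its start lies in the window, and the Dorsal pattern `GGGWWWWCCM` is a placeholder, since no Dorsal consensus is given in the text. All names are hypothetical.

```python
import re

IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "[AT]", "S": "[CG]", "M": "[AC]", "K": "[GT]",
    "R": "[AG]", "Y": "[CT]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A", "W": "W", "S": "S",
    "M": "K", "K": "M", "R": "Y", "Y": "R", "B": "V", "V": "B",
    "D": "H", "H": "D", "N": "N",
}

MOTIFS = {
    "Dorsal": "GGGWWWWCCM",  # placeholder pattern, an assumption
    "Twist": "CACATGT",
    "Su(H)": "YGTGDGAA",
    "CTGWCCY": "CTGWCCY",
}


def to_regex(motif):
    return "".join(IUPAC[code] for code in motif)


def site_starts(seq, motif):
    """Plus-strand start positions of motif matches on either strand."""
    opposite = "".join(COMPLEMENT[code] for code in reversed(motif))
    starts = set()
    for pattern in (to_regex(motif), to_regex(opposite)):
        for match in re.finditer("(?=" + pattern + ")", seq.upper()):
            starts.add(match.start())
    return starts


def find_clusters(seq, motifs=MOTIFS, window=250, bp_per_site=50):
    """Return (window_start, site_count) for each qualifying window.

    A window qualifies when it holds at least window // bp_per_site sites
    and at least one site of every motif in `motifs`.
    """
    min_sites = window // bp_per_site
    sites = sorted(
        (pos, name)
        for name, motif in motifs.items()
        for pos in site_starts(seq, motif)
    )
    clusters = []
    for start in sorted({pos for pos, _ in sites}):
        inside = [(p, n) for p, n in sites if start <= p < start + window]
        if len(inside) >= min_sites and {n for _, n in inside} == set(motifs):
            clusters.append((start, len(inside)))
    return clusters
```

Overlapping qualifying windows are reported separately here; a fuller version would merge them into a single cluster before testing any candidate in vivo.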

To assay the generality of these findings, genomic regions encompassing putative sim orthologs from the distantly related dipteran Anopheles gambiae were scanned for clustering of Dorsal, Twist, Su(H), CTGWCCY and Snail motifs. One cluster located 865 bp 5' of a putative sim ortholog contains one putative Dorsal binding site, two Su(H) sites, three CTGWCCY motifs (or close matches to this motif), a CACATG E-box and several copies of the Snail repressor sequence MMMCWTGY. A genomic DNA fragment encompassing these sites (976 bp) was attached to a minimal eve-lacZ reporter gene and expressed in transgenic Drosophila embryos. The Anopheles enhancer directs weak lateral lines of lacZ expression that are similar to those obtained with the Drosophila sim enhancer. These results suggest that the clustering of Dorsal, Twist, Su(H) and CTGWCCY motifs constitutes an ancient and conserved code for neurogenic gene expression (Markstein, 2004).

This study defines a specific and predictive model for the activation of gene expression by intermediate levels of the Dorsal gradient in ventral regions of the neurogenic ectoderm. The model identified new enhancers for sim and vein in the Drosophila genome, as well as a sim enhancer in the distant Anopheles genome. Five of the seven composite Dorsal-Twist-Su(H)-CTGWCCY clusters in the Drosophila genome correspond to authentic enhancers that direct similar patterns of gene expression. This hit rate represents the highest precision so far obtained for the computational identification of Drosophila enhancers based on the clustering of regulatory elements. Nevertheless, it is still not a perfect code (Markstein, 2004).

Two of the seven composite clusters are likely to be false-positives: they are associated with genes that are not known to exhibit localized expression across the dorsoventral axis. It is possible that the order, spacing and/or orientation of the identified binding sites accounts for the distinction between authentic enhancers and false-positive clusters. For example, there is tight linkage of Dorsal and Twist sites in each of the five neurogenic enhancers. This linkage might reflect Dorsal-Twist protein-protein interactions that promote their cooperative binding and synergistic activities. Previous studies identified particularly strong interactions between Dorsal and Twist-Daughterless (Da) heterodimers. Da is ubiquitously expressed in the early embryo and is related to the E12/E47 bHLH proteins in mammals. Dorsal-Twist linkage is not seen in one of the two false-positive binding clusters (Markstein, 2004).

The regulatory model defined by this study probably fails to identify all enhancers responsive to intermediate levels of the Dorsal gradient. There are at least 30 Dorsal target enhancers in the Drosophila genome, and it is possible that 10 respond to intermediate levels of the Dorsal gradient. Thus, half of all such target enhancers might have been missed. Perhaps the present study defined just one of several 'codes' for neurogenic gene expression (Markstein, 2004).

The possibility of multiple codes is suggested by the different contributions of the same regulatory elements to the activities of the vnd and brk enhancers. Mutations in the CTGWCCY motifs nearly abolish the activity of the brk enhancer, but have virtually no effect on the vnd enhancer. Future studies will determine whether there are distinct codes for Dorsal target enhancers that respond to either high or low levels of the Dorsal gradient. Indeed, it is somewhat surprising that the sog and CG12443 enhancers essentially lack Twist, Su(H) and CTGWCCY motifs, even though they direct lateral stripes of gene expression that are quite similar (albeit broader) to those seen for the rho, vnd and brk enhancers (Markstein, 2004).

This study provides direct evidence that Twist and Su(H) are essential for the specification of the neurogenic ectoderm in early embryos. The Twist protein is transiently expressed at low levels in ventral regions of the neurogenic ectoderm. SELEX assays indicate that Twist binds the CACATGT motif quite well. The presence of this motif in the vnd, brk and sim enhancers, and the fact that it functions as an essential element in the vnd and brk enhancers, strongly suggests that Twist is not a dedicated mesoderm determinant, but that it is also required for the differentiation of the neurogenic ectoderm. However, it is currently unclear whether the CACATGT motif binds Twist-Twist homodimers, Twist-Da heterodimers or additional bHLH complexes in vivo. Su(H) is the sequence-specific transcriptional effector of Notch signaling. The restricted activation of sim expression within the mesectoderm depends on Notch signaling; however, the rho, vnd and brk enhancers direct expression in more lateral regions where Notch signaling has not been demonstrated. Nonetheless, mutations in the two Su(H) sites contained in the brk enhancer cause a severe impairment in its activity. This observation raises the possibility that Su(H) can function as an activator, at least in certain contexts, in the absence of an obvious Notch signal (Markstein, 2004).

The Dorsal gradient produces three distinct patterns of gene expression within the presumptive neurogenic ectoderm. It is proposed that these patterns arise from the differential usage of the Su(H) and Dorsal activators. Enhancers that direct progressively broader patterns of expression become increasingly more dependent on Dorsal and less dependent on Su(H). The sog and CG12443 enhancers mediate expression in both ventral and dorsal regions of the neurogenic ectoderm, and contain several optimal Dorsal sites but no Su(H) sites. By contrast, the sim enhancer is active only in the ventral-most regions of the neurogenic ectoderm, and contains just one high-affinity Dorsal site but five optimal Su(H) sites. The reliance of sim on Dorsal might be atypical for genes expressed in the mesectoderm. For example, the m8 gene within the Enhancer of split complex may be regulated solely by Su(H). The Anopheles sim enhancer might represent an intermediate between the Drosophila sim and m8 enhancers, since it contains optimal Su(H) sites but only one weak Dorsal site. This trend may reflect an evolutionary conversion of Su(H) sites to Dorsal sites, and the concomitant use of the Dorsal gradient to specify different neurogenic cell types. A testable prediction of this model is that basal arthropods use Dorsal solely for the specification of the mesoderm and Su(H) for the patterning of the ventral neurogenic ectoderm (Markstein, 2004).

Use of a Drosophila genome-wide conserved sequence database to identify functionally related cis-regulatory enhancers

Phylogenetic footprinting has revealed that cis-regulatory enhancers consist of conserved DNA sequence clusters (CSCs). Currently, there is no systematic approach for enhancer discovery and analysis that takes full advantage of the sequence information within enhancer CSCs. A Drosophila genome-wide database of conserved DNA has been developed consisting of >100,000 CSCs derived from EvoPrints spanning over 90% of the genome. cis-Decoder database search and alignment algorithms enable the discovery of functionally related enhancers. The program first identifies conserved repeat elements within an input enhancer and then searches the database for CSCs that score highly against the input CSC. Scoring is based on shared repeats as well as uniquely shared matches, and includes measures of the balance of shared elements, a diagnostic that has proven to be useful in predicting cis-regulatory function. To demonstrate the utility of these tools, a temporally-restricted CNS neuroblast enhancer was used to identify other functionally related enhancers and analyze their structural organization. It is concluded that co-regulating enhancers consist of combinations of overlapping shared sequence elements, providing insights into the mode of integration of multiple regulating transcription factors. The database and accompanying algorithms should prove useful in the discovery and analysis of enhancers involved in any developmental process (Brody, 2012).

DNA sequence conservation histograms of the Drosophila genome reveal that its non-coding DNA is made up of CSCs that are flanked by less-conserved ICR DNA. For example, a conservation histogram of the Drosophila melanogaster vvl gene transcribed region and 60 kb of 3′ flanking DNA (located on the 3L chromosome) identifies multiple peaks of conserved DNA that are flanked by less conserved DNA sequences. EvoPrint analysis reveals that the CSCs can be further resolved into multiple smaller conserved sequence blocks (CSBs). Most regions of chromosomes 2 and 3 gave a similar pattern of CSC density and distribution, while in general CSCs on the X and the 4th chromosomes exhibited less conservation among the twelve species. cis-Decoder alignment of CSBs constituting a CSC identifies both repeat and palindromic sequence (RPS) elements, of ≥ 6 bp in length, and reveals that these account for more than half of the CSC's conserved sequences. A 6.4-kb genomic region was selected because two of its CSCs (vvl-41 and vvl-43) were tested for their regulatory behavior in this study. A previous analysis of enhancer sequence conservation has shown that individual enhancers can be identified by the maintenance of their CSB cluster integrity across Drosophila species, while ICR regions show greater sequence length variability (Kuzin, 2009; Brody, 2012 and references therein).

As a first step in the identification of structurally related CSCs, a genome-wide database of Drosophila CSCs was created by EvoPrinting most of the euchromatic genome of Drosophila melanogaster and nearly all of the previously in vivo characterized enhancers that are included in the REDfly database. Database CSCs were extracted from more than 4,000 author-generated EvoPrints that generally spanned 15–30 kb of genomic DNA. EvoPrints of fewer bases were used depending on genomic context and availability of gap-free sequence data in the orthologous regions of the different species. Most EvoPrints included all of the available melanogaster group drosophilids (D. melanogaster, D. simulans, D. sechellia, D. yakuba, D. erecta, and D. ananassae), one of the obscura group (D. pseudoobscura or D. persimilis), and two to four orthologous regions selected from the more evolutionarily distant species: D. willistoni, D. virilis, D. mojavensis, and/or D. grimshawi. Most of the EvoPrints represented a combined evolutionary divergence of >150 My. Under these conditions, open reading frames that encode conserved protein domains do not show conservation in most of the codon wobble positions, indicating that the additive evolutionary divergence represented in each EvoPrint is sufficient to reveal with near base-pair resolution those sequences that are essential for gene function. EvoPrints of open reading frames, using different combinations of species, reveal that the lack of sequence conservation in the amino acid codon wobble position is not the result of different codon preferences between species (Brody, 2012).

To enhance the detection of conserved DNA and avoid alignment inaccuracies triggered by DNA sequencing errors, sequencing gaps, rearrangements, or genome assembly problems that were unique to any one of the species used in the analysis, relaxed EvoPrint readouts were employed to identify CSCs. A relaxed EvoPrint highlights sequences that are present in all or all but one of the orthologous DNAs used to generate the print. Species with sequencing gaps (identified as blocks of species-specific differences in the color-coded relaxed EvoPrint readouts or identified as gaps in the EvoPrinter scorecard) were avoided in generating EvoPrints, and second and third scoring pair-wise alignments were included in the analysis when rearrangements were detected (Brody, 2012).
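The "all or all but one" rule of a relaxed EvoPrint can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the orthologous sequences are already aligned base-for-base to the reference; the function name, the `max_missing` parameter, and the toy sequences are hypothetical and are not part of EvoPrinter.

```python
def relaxed_conserved(reference, orthologs, max_missing=1):
    """Mark conserved reference bases in uppercase, the rest in lowercase.

    Illustrative assumption: every ortholog is pre-aligned to the reference
    and has the same length. A base counts as conserved when at most
    `max_missing` orthologs disagree with the reference at that position.
    """
    out = []
    for i, base in enumerate(reference):
        mismatches = sum(1 for seq in orthologs if seq[i].upper() != base.upper())
        out.append(base.upper() if mismatches <= max_missing else base.lower())
    return "".join(out)


# Toy data (invented): position 1 differs in one species, position 4 in two.
ref = "ATGCAAAT"
orthologs = ["ATGCAAAT", "ATGCAAAT", "ATGCTAAT", "ACGCTAAT"]
print(relaxed_conserved(ref, orthologs))                  # relaxed readout
print(relaxed_conserved(ref, orthologs, max_missing=0))   # strict readout
```

Setting `max_missing=0` gives the strict readout, in which a base must be present in every species; the relaxed setting tolerates a sequencing gap or error that is unique to one genome.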

To catalogue CSCs, EvoPrints were entered into the EvoPrint CSC cutter algorithm to isolate and annotate individual CSCs separated by at least 150 bp of less-conserved DNA. This program also assigns a file name and consecutive numbers to each CSC in an EvoPrint. To ensure that enhancers that contain CSB separation gaps of 150 bases or more were not truncated, CSCs were also parsed independently two additional times using ICR cutoffs of 200 and 250 bp. Duplicates are given the same name but an additional notation to distinguish them. Therefore, clusters that were parsed multiple times (∼20% of the database CSCs), due to their having non-conserved intervals >150 or >200 but <250 bases, are present two or three times in the database. The database contains >100,000 non-redundant clusters. To expedite database searches, in addition to cataloging individual CSCs and their CSBs, RPS elements of 6 bp or longer were pre-identified by intra-CSC CSB alignments and stored in the database. Most CSCs that contain more than 150 bp of conserved DNA have RPS elements that account for >50% of their sequences (Brody, 2012).
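The cutting step can be sketched as follows. This is a simplified illustration and not the EvoPrint CSC cutter itself: it assumes conserved blocks are supplied as half-open (start, end) coordinates, and the function name and toy coordinates are invented for the example.

```python
def cut_cscs(blocks, min_gap=150):
    """Group conserved blocks into clusters separated by >= min_gap bp.

    `blocks` is a list of non-overlapping (start, end) coordinates, end
    exclusive. A new cluster begins whenever the less-conserved interval
    since the previous block is at least `min_gap` bases long.
    """
    clusters = []
    for start, end in sorted(blocks):
        if clusters and start - clusters[-1][-1][1] < min_gap:
            clusters[-1].append((start, end))   # gap too short: same cluster
        else:
            clusters.append([(start, end)])     # gap long enough: new cluster
    return clusters


# Toy coordinates (invented); the gaps between blocks are 40, 210, 160 and 260 bp.
blocks = [(0, 20), (60, 90), (300, 320), (480, 500), (760, 790)]
for cutoff in (150, 200, 250):
    print(cutoff, cut_cscs(blocks, min_gap=cutoff))
```

Running the same blocks through the three cutoffs shows why some clusters appear more than once in the database: a gap of 160 bp splits two blocks at the 150-bp cutoff but merges them at 200 and 250 bp.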

CSCs from all previously in vivo characterized enhancers were also included by EvoPrinting all entries in the REDfly database; these are identified in the CSC-database by their REDfly designations. Although most of these CSCs duplicate database entries, CSCs that represent the same region can be identified by their similar cis-Decoder scores and/or their similar identifying names. It should be noted that many REDfly entries were made from data that often did not delimit the exact boundaries of the enhancer. In addition many REDfly entries included multiple CSCs or truncated CSCs whose ends were restriction enzyme sites used for cloning purposes and were not within less-conserved ICRs. To reduce the number of truncated entries, EvoPrinted regions were expanded to include flanking ICRs. Also, since many REDfly entries are redundant, care was taken to eliminate this redundancy by eliminating repeated and overlapping entries (Brody, 2012).

The first step in a CSC database search is to enter into the cis-Decoder input window an EvoPrinted enhancer that spans a single CSC. cis-Decoder then parses and annotates constituent CSBs in forward and reverse/complement directions. By alignment of the CSBs to one another, the program next identifies multi-copy and palindromic elements that are ≥6 bp. A table is generated that shows the copy-number of each repeat, the element frequency in the database, and the number of database CSCs that contain two or more of each element. Based on earlier analysis of known enhancers, matches of less than 6 bp in length were not considered, because searches with 5 bases or less yielded results that were not informative (Brody, 2012).
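A minimal sketch of this repeat and palindrome search is given below. It is not the cis-Decoder implementation: for simplicity it examines only fixed-length 6-mers (the program reports elements of 6 bp or longer), and it treats a sequence and its reverse complement as the same element. The function names and the toy CSBs are assumptions made for illustration.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_rps(csbs, k=6):
    """Return (repeats, palindromes) among the k-mers of a list of CSBs.

    Each k-mer is stored under a canonical key, min(kmer, revcomp(kmer)),
    so copies in forward and reverse/complement orientation are pooled.
    """
    counts = Counter()
    palindromes = set()
    for csb in csbs:
        for i in range(len(csb) - k + 1):
            kmer = csb[i:i + k]
            rc = revcomp(kmer)
            if kmer == rc:
                palindromes.add(kmer)
            counts[min(kmer, rc)] += 1
    repeats = {kmer: n for kmer, n in counts.items() if n >= 2}
    return repeats, palindromes


# Toy CSBs (invented): the second block carries the first block's core in reverse orientation.
csbs = ["TTATGCAAAT", "GGATTTGCATCC", "CAGCTG"]
repeats, palindromes = find_rps(csbs)
print(repeats)
print(palindromes)
```

Raising `k` mimics the search for longer elements; a full implementation would extend each match to its maximal length rather than fix the window size.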

After identifying RPS elements, the cis-Decoder algorithm searches the CSC database to discover CSCs containing these repeats. The search algorithm also allows for user-supplied mandatory sequences, to identify enhancers that are regulated by sequence-specific DNA-binding factors or families of transcription factors. Once database CSCs are identified, the program carries out individual CSB alignments between the input CSC and the database CSCs. Another set of algorithms then rates the individual database CSCs using the following similarity indices when compared to the input CSC: (1) A repeat balance profile, that assesses relative shared repeat copy numbers and weighs them according to the RPS length (shown as a pie chart and as a repeat balance map, which are accessible from the one-on-one alignment page); (2) A correlation coefficient, which reflects the relative frequency of shared sequence elements between the input and database CSCs; (3) The number of shared repeats (full-length RPS elements and shorter elements contained within longer input repeats); (4) Total number of shared elements including RPS and uniquely shared sequences; (5) Percent coverage of aligning input sequences, which reflects the number of conserved bases in the database CSC that align with the input enhancer CSBs, normalized to the total number of conserved sequences in the database cluster; (6) The number of user-specified required elements present in the database CSC; (7) The longest shared sequence between the input and database CSCs (viewed at the cis-Decoder scorecard by placing the cursor on the sequence length number); and (8) The total number of conserved bases within the database CSC. To allow the user to focus attention on any one of the rating criteria, the CSCs can be sorted by any of the similarity indices in addition to sorting by CSC file name. Sorting by file name allows for the rapid identification of closely associated, neighboring CSCs that are structurally related to the input enhancer (Brody, 2012).

To demonstrate the utility of cis-Decoder database search algorithms to identify tissue- and temporal-specific enhancers, one of the late-temporal network NB enhancers (database CSC cas-6) was used that controls the embryonic expression of the gene encoding Cas, a zinc-finger transcription factor expressed during late embryonic CNS NB lineage development. Like endogenous cas mRNA expression, the cas-6 enhancer activates reporter transgene expression in CNS NBs and ventral cord midline cells during embryonic stage 10 and in additional ventral cord and cephalic lobe NBs during stages 11–13. EvoPrint analysis reveals that the cas-6 CSC is made up of 46 CSBs of 6 bp or more and contains 720 conserved base pairs in 1,613 bp of genomic sequence. Mutational analysis of the cas-6 CSC via 5' and 3' deletions revealed that the entire cluster was required for full reporter activity (A. Kuzin, unpublished results cited in Brody, 2012). The cas-6 CSC is located 392 bp 5' to the cas gene predicted transcriptional start site. As described above, one of the first steps in the cis-Decoder analysis is parsing CSBs from the input EvoPrinted enhancer in both forward and reverse directions, and then aligning the CSBs with one another (self-alignment) to discover RPS elements. More than 65% of the conserved bases in the cas-6 CSBs were represented in RPS elements; an alignment revealed that these are either separate, adjacent, or overlapping each other. Core DNA-binding motifs for known transcription factors within CSBs are indicated in the figure (Brody, 2012).

Prominent among the cas-6 RPS elements are three 10mer repeat motifs [TTATGCAAAT], which contain a POU-homeodomain-octamer-binding site [ATGCAAAT]. The highest copy number element [ATGCAAA], containing 7 of the 8 octamer motif sequences, was found 5 times. It is considered a sub-repeat element, since there is only one instance of the heptamer in the CSBs that is independent of longer elements. Also present are multiple elements containing the core ATTA sequence for Antennapedia class homeodomain-containing transcription factors. The RPS elements also include two palindromic E-box sequences, CAATTG and CAGCTG, while three additional E-boxes are present in conserved non-repeated sequences. The cas-6 enhancer CSBs also contain Hunchback and Cas core DNA-binding sequences. Given that many of the cas-6 RPS elements are novel sequences, they most likely contain additional binding sites for as yet uncharacterized transcription factors that modulate enhancer regulatory behavior (Brody, 2012).
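The nesting of these motifs, and the palindromic nature of the two E-boxes, can be checked directly from the sequences quoted above. The short Python sketch below uses only those sequences; the helper names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def is_palindrome(seq):
    """A DNA palindrome reads the same on both strands."""
    return seq == revcomp(seq)


decamer = "TTATGCAAAT"   # 10mer repeat motif
octamer = "ATGCAAAT"     # POU-homeodomain octamer site
heptamer = "ATGCAAA"     # highest copy number element

print(octamer in decamer, heptamer in octamer)            # nested motifs
print(is_palindrome("CAATTG"), is_palindrome("CAGCTG"))   # E-boxes
print(is_palindrome(octamer))                             # octamer is directional
```

Because the octamer is not palindromic, a search for it has to consider both strand orientations, whereas each palindromic E-box matches itself on the opposite strand.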

To identify database CSCs that share repeat and unique elements with the cas-6 CSC, a search was initiated by first identifying CSCs that contained at least three copies of the ATGCAAA element. Although asking for a mandatory sequence is not required, the cas-6 RPS table revealed that the highest copy number element, ATGCAAA, was present 7,208 times in the CSC database and 371 CSCs contained two or more of these elements. The cis-Decoder scorecard for this search revealed that the database contained 104 CSCs with 3 or more of this element. Thus, the search was focused to this limited set of CSCs. Once these CSCs were identified, one-on-one alignments between the input and database CSBs were automatically performed to discover additional shared sequence elements. As expected, the highest scoring database CSC for most of the indices was cas-6 itself. Other high-scoring enhancers were considered as candidate late temporal network NB enhancers and were tested in enhancer-reporter transgenes. For example, while cg7229-5 scored highest for the correlation coefficient, other CSCs scored higher for each of the other metrics (Brody, 2012).
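The mandatory-sequence step of such a search can be sketched in Python as a copy-number filter. This is a toy illustration only: the database is a small invented dictionary, the CSC names are hypothetical, and copies are counted on both strands, which is an assumption about how orientation is handled.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def count_element(seq, element):
    """Count occurrences of `element` in `seq` in either orientation."""
    targets = {element, revcomp(element)}
    k = len(element)
    return sum(seq[i:i + k] in targets for i in range(len(seq) - k + 1))


def filter_cscs(database, element, min_copies=3):
    """Names of database entries carrying at least `min_copies` of `element`."""
    return sorted(name for name, seq in database.items()
                  if count_element(seq, element) >= min_copies)


# Invented toy database; names and sequences are placeholders, not real CSCs.
toy_db = {
    "toy-1": "ATGCAAATGGATGCAAACCTTTGCATAA",   # two forward copies, one reverse
    "toy-2": "ATGCAAATTTTTTTGCAT",             # one forward copy, one reverse
    "toy-3": "CAGCTGCAGCTG",                   # no copies
}
print(filter_cscs(toy_db, "ATGCAAA", min_copies=3))
```

Only the entries that pass this filter would then go on to the one-on-one CSB alignments against the input enhancer.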

Although the search required the heptamer sequence ATGCAAA to be present at least three times in the database CSC, most of the highest-scoring CSCs (both for correlation coefficients and shared RPS elements) contained at least three RPS elements with the full octamer motif [ATGCAAAT], including cg7229-5, grh-15, vvl-41, and tkr-15. In addition, many of the CSCs that contained octamer motifs also shared, with cas-6, single or different combinations of bHLH E-box DNA-binding sites and repeated HOX-binding sites, including shared sequences flanking the core ATTA motif. One-on-one alignments between cas-6 and related database enhancers reveal that different multi-copy repeats are nested within larger unique matches. For example, RPS elements corresponding to a HOX site are seen overlapping a POU-octamer site. This view of overlapping shared motifs represents a map of the substructure of an enhancer in terms of the transcription factor–binding sites that integrate multiple regulatory inputs (Brody, 2012).

cis-Decoder also generates lists of sequence elements that are shared between the input and database CSC. Fifty-seven percent of the cg7229-5 conserved sequences aligned with cas-6 conserved sequences. In addition, cis-Decoder identifies RPS elements within the input and database CSC that are not shared between the two CSCs, and these elements are also listed on the one-on-one alignment page (Brody, 2012).

The relative frequency of appearance of sequences in cg7229-5 that correspond to cas-6 RPS elements is shown by color-coded highlights. This comparison is termed a “repeat balance map,” a visual representation that illustrates the relative frequency of appearance of each of the shared motifs in the comparison between the input and database enhancers. Forty-six percent of the aligning bases within the cg7229-5 CSC are present in the same ratio in the cas-6 CSC. The predominance indicates that many of the shared elements in the two enhancers are present at equal frequency. Another example of a CSC identified in this search that shares balanced RPS elements with the input cas-6 is the grh-15 CSC, also a temporal network NB enhancer (see below) (Brody, 2012).
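The idea of copy-number balance can be made concrete with a simplified Python sketch. The measure below is a stand-in chosen for illustration, not the cis-Decoder repeat balance index, whose formula is not given here: it reports the fraction of shared-element bases in the database CSC whose elements occur at the same copy number in the input. The element counts are invented.

```python
def balanced_fraction(input_counts, db_counts):
    """Fraction of shared-element bases that are balanced in copy number.

    Both arguments map an element sequence to its copy number. Bases are
    weighted by element length times copy number in the database CSC.
    This is an illustrative proxy, not the published index.
    """
    shared = set(input_counts) & set(db_counts)
    total = sum(len(e) * db_counts[e] for e in shared)
    if total == 0:
        return 0.0
    balanced = sum(len(e) * db_counts[e] for e in shared
                   if input_counts[e] == db_counts[e])
    return balanced / total


# Invented copy numbers for two hypothetical enhancers.
input_counts = {"ATGCAAAT": 3, "CAGCTG": 1, "TAATTG": 2}
db_counts = {"ATGCAAAT": 3, "CAGCTG": 2, "GGGAAA": 1}
print(round(balanced_fraction(input_counts, db_counts), 2))
```

In this toy case the octamer is balanced (three copies in each) while the E-box is not, so 24 of the 36 shared-element bases count as balanced.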

To test the in vivo cis-regulatory activity of CSCs, CSCs were selected that contained both repeat and unique sequence elements found in the cas-6 enhancer. The CSCs were selected based on rating criteria described above. Enhancer-reporter transgene transformants for the individual CSCs were generated using the targeted φC31 integration system to ensure that the regulatory behavior for each was assessed in the same genomic environment. Although not an exact match, the expression pattern of the cg7229-5 enhancer transgene shares many of the expression dynamics of the cas-6 enhancer-transgene. As with cas-6, onset of cg7229-5 expression is in a subset of midline cells and a single lateral NB at stage 10, and expression in subsequent stages closely matches, but is not identical to, expression of the cas-6 reporter. The insert shows that cg7229-5 reporter GFP expression overlaps but is not identical to that of cas-6 red fluorescent protein reporter (Brody, 2012).

Many of the tested CSCs yielded detectable CNS expression and function as late temporal network CNS neuroblast enhancers. Eleven were expressed in late temporal network ventral cord NBs and three were expressed in other CNS precursors or neurons. Comparing these expression patterns to the cas-6 reporter expression, it is apparent that each functions as a late temporal network enhancer. An indication of the specificity of the search for cas-6-like enhancers is that the search did not identify early temporal NB enhancers, nor did it identify broadly expressed NB enhancers such as that of deadpan (Brody, 2012).

Although the cas-6-related enhancers are active in overlapping neural precursor cells, each has its own unique cis-regulatory identity. Each has a different pattern of expression in subsets of NBs, GMCs, and/or nascent neurons. For example, three identified enhancers (nab-1, CG6559-28, and tkr-15) exhibit early expression in a subset of ventral cord midline cells, while sqz-11 and vvl-41 (identified using cas-8 as the input CSC) exhibit onset in a larger number of midline cells, and other enhancers do not activate reporter expression in the midline precursor cells. The cas-8 CSC activated reporter expression in many more precursors at stage 11 than any of the other reporter constructs. tkr-15 is expressed in many cells at stage 11. Since these cells are too small to be considered NBs, they are most likely GMCs or nascent neurons. Comparing different transgene reporter expression patterns in lateral ventral cord cells at stage 11 reveals that for certain CSCs, in particular sqz-11 and ct-3 (identified using the pdm-2 NB enhancer as input), fewer lateral cells express the reporter, or they exhibit uniquely different spatial expression patterns. This is also true for ct-14 (identified using combined cas-6 and CG6559-28 as input) and vvl-41 (identified using cas-8 as input). cas-6 and cas-8 enhancers both drive reporter expression in overlapping subsets of cells that represent sub-patterns of endogenous cas expression (Brody, 2012).

These studies also revealed that there is no apparent consistency in the ordering, overlap, or orientation of shared elements between functionally related enhancers. For example, RPS elements shared between cas-6, cg7229-5, and grh-15 appear in unique contexts within each enhancer. This lack of consistency in positioning of shared elements has also been noted in early sub-lineage NB enhancers (Brody, 2012).

During the functional analysis of database CSCs that share RPS elements with cas-6, one of the CSCs, vvl-43, was found to share 92 RPS and unique sequence elements with cas-6. It did not, however, drive transgene reporter expression in NBs but activated expression instead in the embryonic ectoderm. cis-Decoder analysis of the shared RPS elements revealed that the balance of RPS elements was markedly different between cas-6 and vvl-43. Notable is the large number of conserved HOX motifs within vvl-43 in comparison to cas-6. Expression of vvl-43 in the embryonic ectoderm is segmental, and although temporally late, there is no embryonic CNS expression. Previous studies demonstrate that the vvl-encoded protein, a POU homeodomain factor, is expressed in the CNS and in the ectoderm of embryos, suggesting that vvl-43 functions as an ectodermal enhancer for vvl expression. The disparity of shared element frequencies between cas-6 and vvl-43 is in marked contrast to the similarity of frequencies when comparing cas-6 and cg7229-5. That lack of balance in shared element copy numbers between enhancers suggests that they may have different regulatory behaviors (Brody, 2012).

Another example of how unbalanced RPS elements indicate functionally different enhancers can be seen in the comparative analysis of the vvl-41 and vvl-43 CSCs. As in the previous comparisons to cas-6, the vvl-41 and vvl-43 CSCs share similar elements; vvl-41 shares 96 RPS and unique elements with vvl-43, and 68% of the vvl-43 conserved sequences are covered by these shared elements. Although these two CSCs have extensive overlap of shared elements, the repeat balance index and correlation coefficient reveal that their shared elements are not balanced in copy number. Consistent with the imbalance in their shared elements, these enhancers displayed markedly different regulatory behaviors in the embryo. In addition, these two enhancers drive reporter expression in different sets of larval neurons. Whereas most of the cells expressing the vvl-41 reporter transgene are sub-esophageal ganglion interneurons, the vvl-43 enhancer drives reporter expression in a subset of ventral cord motor neurons. Thus the presence of identical elements in different clusters does not necessarily lead to similar regulatory behaviors, and comparing shared element copy-numbers has a better predictive value for determining enhancer behavior (Brody, 2012).

To further test the ability of cis-Decoder database searches to identify different families of functionally related enhancers and to compare cis-Decoder search protocols to other enhancer search algorithms, database searches were initiated with different well-characterized enhancer types. Using the Krüppel gap enhancer Kr_CD1, the giant gt_(−10) enhancer was identified. Besides sharing HOX sites with different flanking bases, the two enhancer CSCs also share a 14-bp sequence, TGAACTAAATCCGG. Remarkably, this 14-bp element within the Krüppel enhancer was identified as a site of competitive binding by the activator Bicoid and the repressor Knirps transcription factors. The conservation of interlocking or overlapping docking sites for Bicoid and Knirps within both of these gap enhancers supports the contention that large CSBs (containing 7 to 10 bp or more) most likely function as the point of integration of multiple transcription factors in the regulation of enhancer behavior (Brody, 2012).

The search using the Kr_CD1 also identified the kni_(+1) intronic gap enhancer. Shared sequence motifs between Kr_CD1 and kni_(+1) include multiple polyA/polyT motifs, presumably targets of Hunchback, that are found in even balance (five copies) between the two enhancers. Other shared sequences include several HOX-binding sequence elements (Brody, 2012).

Previous work has shown that many segmentation genes utilize multiple enhancers that regulate gene expression in nearly identical patterns. These enhancer pairs have been termed (1) primary enhancers, found closely associated with the transcriptional start site, and (2) “shadow” enhancers, found at a distance from the structural gene. Starting with the primary vnd ventral neuroectoderm enhancer CSC, a cis-Decoder search identified its shadow enhancer based on the balanced copy number appearance of its RPS elements and uniquely shared sequences. In addition to other shared elements, both of these enhancers contain 2 copies of the CACATGA bHLH motif, which matches the optimal DNA-binding site for the transcriptional regulator Twist (Brody, 2012).

The cis-Decoder search algorithms were tested to see if it would be possible to detect enhancers regulated by Notch signaling. Previously identified Notch-targeted enhancers include those associated with the E(spl) complex genes. Multiple alternative binding sites within these enhancers have been identified for Suppressor of Hairless [Su(H)], the transcription factor utilized by the Notch pathway. A cis-Decoder search was initiated with one of the CSCs (Espl-1) to discover other similarly structured CSCs, using as required sequences a single Su(H)-binding site (TGGGAA) and a single bHLH-binding site (CAGCTG). This search resulted in 101 database hits, including CSCs from known Su(H) targets m2, m6, and mγ as well as putative enhancers for the neural determinants Dichaete, deadpan, nervy, tailless, castor, Fps85D, Notum, and extra macrochaetae. In addition, searching with the Notch-targeted deadpan NB enhancer (cis-Decoder CSC dpn-3), which contains two alternative Su(H)-binding sites (GTGAGAA), identified other putative Notch pathway targeted enhancers: CG7229-5, cas-8, an HLHmβ-associated CSC (HLHmbeta-2), and the m4 PNS enhancer. Thus, cis-Decoder searches can identify functionally related enhancers that regulate gene expression during different phases of development and in different tissues (Brody, 2012).

Each of the embryonic NB enhancers identified above was also tested for regulatory activity during later stages of development, and many were observed to activate transgene reporter expression in the third-instar larval and/or adult CNS. Three of the tested enhancer transgene reporters, cg6559-28, grh-15, and tkr-15, exhibited expression in a similar pattern within brain neural precursor cells, thoracic neuromeres and posterior neural precursors of the third-instar larval CNS, while the cas-6 and cas-8 enhancers were not active in larvae. The ct-3 and ct-14 CSCs drove expression in small subsets of neurons in the sub-esophageal ganglion and in the ventral cord abdominal neuromeres. Additionally, nab-1 expression was similar to that of the dnabe310 enhancer-trap expression in the third-instar larval CNS. In the adult, many of the enhancers were expressed in a subset of central brain neurons, and in the optic lobe. Specifically, cg6559-28, vvl-14, and nab-1 reporters were expressed in the mushroom body. While cas-6 was not expressed in the adult brain, cas-8 reporter expression was detected in the ellipsoid body in a pattern similar to cas adult expression. Embryonic and adult reporter expression was also tested for another 60 CSCs, chosen by a variety of criteria. Many of these activate transgene reporter expression in both the embryonic and adult CNS. Given the fact that CSC sub-regions of these multiuse enhancers have not been tested for reporter activity, it cannot be ruled out that different regions within the cluster have autonomous functions and represent discrete enhancers. However, functional analysis of the nerfin-1 NB enhancer and the cas-6 enhancer CSCs has revealed that full enhancer function requires the complete cluster. The EvoPrinter algorithm provides a methodology for testing for the close apposition of independent enhancers (Brody, 2012).

Although each of the cis-Decoder scorecard indices provides useful information in judging the relationship of the input enhancer to database CSCs, the repeat balance index and the correlation coefficient are more accurate indices when searching for functionally related enhancers, since they take into account not only the number of shared elements but also the RPS copy number balance between the input enhancer and database CSC. The percent alignment coverage is likewise an important indicator of the relationship between the input and database CSCs. Thus, sorting the scorecard by the repeat balance index or by the correlation coefficient increases the likelihood that functionally related enhancers rank at the top of the list. For example, all of the late temporal NB enhancers identified in this study had repeat balance index scores of greater than 1.0, correlation coefficient rankings of above 0.4, and percent coverage of ≥40% (Brody, 2012).
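The thresholds quoted above can be applied to a scorecard as a simple filter-and-sort step, sketched here in Python. The row layout, field names, hit names, and score values are all invented for illustration; only the three cutoffs (repeat balance index above 1.0, correlation coefficient above 0.4, coverage of at least 40%) come from the text.

```python
def candidate_hits(scorecard, min_balance=1.0, min_corr=0.4, min_coverage=40.0):
    """Keep rows that pass all three cutoffs, sorted by repeat balance.

    Each row is a dict with hypothetical keys: 'name', 'balance'
    (repeat balance index), 'corr' (correlation coefficient) and
    'coverage' (percent coverage).
    """
    keep = [row for row in scorecard
            if row["balance"] > min_balance
            and row["corr"] > min_corr
            and row["coverage"] >= min_coverage]
    return sorted(keep, key=lambda row: row["balance"], reverse=True)


# Invented scorecard rows; these are not values reported in the study.
scorecard = [
    {"name": "hit-A", "balance": 1.8, "corr": 0.62, "coverage": 57.0},
    {"name": "hit-B", "balance": 0.7, "corr": 0.55, "coverage": 68.0},
    {"name": "hit-C", "balance": 1.2, "corr": 0.35, "coverage": 45.0},
    {"name": "hit-D", "balance": 1.1, "corr": 0.41, "coverage": 40.0},
]
print([row["name"] for row in candidate_hits(scorecard)])
```

Note that hit-B is excluded despite its high coverage: extensive shared sequence without balanced copy numbers is exactly the situation the balance-based indices are meant to flag.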

To estimate the number of false-positive predictions and functionally related enhancers that were missed in cis-Decoder searches, cas-6 was used as the input enhancer. The search returned 111 database hits, of which 27 that shared many repeat elements with cas-6 were tested for enhancer activity in flies. Of these, 12 proved to be late temporal network enhancers, with each being expressed in a different subset of midline, brain, and/or ventral cord neuroblasts. Eleven were expressed exclusively either in adult brain, larval precursors, or in embryonic neurons, and four were considered negative, since their reporter expression was undetectable or found in tissues other than the nervous system. As for enhancers that were missed in the search, late temporal network enhancers were identified that do not contain three or more complete or partial octamer sequences, or do not score highly using cas-6 as input. The low-scoring enhancers included sqz-11 and vvl-41, which were discovered using cas-8 as the input CSC (mentioned above). Likewise, ct-3 and ct-14 did not contain three octamer sequences, and they also proved to be late temporal network NB enhancers. Finally, five other late temporal network enhancers were identified that do not contain octamer motifs but do contain other repeated elements found in late temporal network enhancers. It is clear from these results that a search for enhancers using a mandatory sequence, such as the octamer motif, is insufficient to detect the full genomic repertoire of late temporal network enhancers. To identify as many functionally related enhancers as possible, multiple database searches using different search criteria are recommended. Current understanding of the role of octamer motifs in conferring temporal gene expression is incomplete, in that it was not possible to fully distinguish between embryonic late temporal network enhancers and octamer-site-rich larval or adult brain enhancers. Nevertheless, the fact that only four of the 27 clusters tested were not expressed in the CNS speaks to the efficacy of cis-Decoder search algorithms in detecting neural enhancers (Brody, 2012).
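The outcome of this test can be tallied with a few lines of Python. The counts are those reported above (12, 11, and 4 of 27 tested CSCs); the category labels and the function name are illustrative.

```python
def summarize(outcomes):
    """Return (total tested, fraction with CNS expression, fraction late temporal)."""
    total = sum(outcomes.values())
    cns_fraction = (total - outcomes["negative"]) / total
    late_fraction = outcomes["late_temporal_network"] / total
    return total, cns_fraction, late_fraction


# Counts taken from the text; the dictionary keys are hypothetical labels.
outcomes = {"late_temporal_network": 12, "other_cns": 11, "negative": 4}
total, cns_fraction, late_fraction = summarize(outcomes)
print(total, round(cns_fraction, 2), round(late_fraction, 2))
```

On these numbers, 23 of the 27 tested clusters (about 85%) drove expression in the nervous system, while 12 of 27 (about 44%) matched the late temporal network behavior of the input enhancer.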

Ideally, it would be useful to make direct comparisons of the cis-Decoder algorithm with other web-based tools for discovery and analysis of cis-regulatory elements. However, not all search programs use evolutionary comparisons, and those that do use different levels of evolutionary divergence to identify conserved sequences in enhancers. The comparative analysis of enhancer discovery programs nevertheless points to factors present in various computational formats that appear to be important for successful cis-regulatory element prediction. These include sequence conservation between related species, motif clustering, and availability of prior information on the presence of known transcription factor–binding sites. In this context, combined use of cis-Decoder methodology with ChIP-seq data, which show occupancy of cis-regulatory modules by specific transcription factors, will improve identification of functional motifs within enhancers that are bound by specific transcription factors, and resolve additional functionally important flanking sequences. The libraries of repeat and uniquely shared sequences generated by cis-Decoder are useful for sub-structural analysis of enhancers; for example, discovery of the unique element shared by Krüppel and giant gap enhancers demonstrates the ability of cis-Decoder to reveal combinatorial interactions by analysis of blocks of conserved sequences. Other aspects of cis-regulatory biology will also be relevant; for example, the configuration of the chromatin as detected by DNase I hypersensitivity indicates accessibility of enhancer sequences to transcriptional regulators. The knowledge of chromatin state is invaluable for prediction of enhancer activity, and information concerning specific CSCs can be accessed via the UCSC browser (Brody, 2012).

Efficacy of cis-Decoder in predicting enhancers can be compared to a study that used known cis-regulatory modules to develop a training set of computationally predicted transcription factor–binding sites to predict genomic cis-regulatory modules (Rouault, 2010). That study predicted neural expression of the same cg7229 enhancer that was identified using cis-Decoder. Likewise an algorithm known as Ahab, which uses transcription-factor-binding-site information for known regulators of cellular blastoderm enhancers, successfully predicted the gt_(−10) and kni(+1) gap enhancers (Schroeder, 2004) that also scored highly in the search using the Kr_CD1 gap enhancer as the input CSC. It is important to point out that cis-Decoder search protocols make direct use of CSC information for enhancer prediction, while other resources, such as Genome Surveyor, use site conservation as a criterion, but do not provide information to infer enhancer boundaries. Given that multiple enhancer prediction programs that employ different search criteria are available, it would be advisable to employ several discovery programs before settling on a final list of candidate genomic regions for analysis in enhancer-reporter transgenic studies (Brody, 2012).

The comparative analysis of the enhancers described in this report, together with an additional 60 enhancers, has yielded the following observations concerning enhancer structure and behavior: (1) Functionally related enhancers can be identified based on their balanced copy numbers of shared conserved repeat elements. (2) Enhancers that have extensive shared conserved sequence elements (often >60%), but do not have balanced shared repeat copy numbers, may display significantly different regulatory behaviors. (3) Shared repeat and unique elements between functionally related enhancers are not found in any fixed order or orientation. (4) Similarly regulating families of enhancers need not share specific sets of conserved sequence elements, since different enhancers can accomplish the same regulatory behavior with different but overlapping sets of conserved elements. (5) Enhancers that share conserved repeat elements and perform related cis-regulatory functions also contain unique sets of repeat elements that are only partially shared with other related enhancers (Brody, 2012).

These observations have revealed that Drosophila CNS developmental enhancers are highly complex, based on their conserved sequence composition, and many have proven to be multifunctional. The observed complexity of enhancers, specifically with regard to multi-copy repeat motifs, also suggests that enhancer function is realized through a complex process involving combinatorial interactions among many factors and cannot be easily explained by single activator/repressor transcription factor switches. In addition, the fact that functionally diverse enhancers can display such extensive overlap in their conserved sequences underscores the combinatorial complexity of cis-regulation. Because of the lack of fixed order and orientation of shared elements between related enhancers, only the alignment flexibility of the cis-Decoder CSB aligner can rapidly detect the extent and makeup of shared conserved sequences between different enhancers. Until now, enhancer boundaries have, for the most part, been resolved by reporter transgene deletion analysis. The addition of evolutionary clustering of conserved sequences to this identification process will aid in enhancer identification and allow for an assessment of their structure and spatial constraints. cis-Decoder algorithms also allow one to generate libraries of conserved sequence elements that are shared among enhancers; this dataset will be useful for understanding the combinatorial complexity of tissue-specific gene regulation (Brody, 2012).

Transcriptional Regulation

An important question in neurobiology is how different cell fates are established along the dorsoventral (DV) axis of the central nervous system (CNS). The origins of DV patterning within the Drosophila CNS have been investigated. The earliest sign of neural DV patterning is the expression of three homeobox genes in the neuroectoderm -- ventral nervous system defective (vnd), intermediate neuroblasts defective (ind), and muscle segment homeobox (msh) -- which are expressed in ventral, intermediate, and dorsal columns of neuroectoderm, respectively. Previous studies have shown that the Dorsal, Decapentaplegic (Dpp), and EGF receptor (Egfr) signaling pathways regulate embryonic DV patterning, as well as aspects of CNS patterning. This study describes the earliest expression of each DV column gene (vnd, ind, and msh), the regulatory relationships between all three DV column genes, and the role of the Dorsal, Dpp, and Egfr signaling pathways in defining vnd, ind, and msh expression domains. The vnd domain is established by Dorsal and maintained by Egfr, but unlike a previous report vnd is found not to be regulated by Dpp signaling. ind expression requires both Dorsal and Egfr signaling for activation and positioning of its dorsal border, and abnormally high Dpp can repress ind expression. The msh domain is defined by repression: it occurs only where Dpp, Vnd, and Ind activity is low. It is concluded that the initial diversification of cell fates along the DV axis of the CNS is coordinately established by Dorsal, Dpp, and Egfr signaling pathways. Understanding the mechanisms involved in patterning vnd, ind, and msh expression is important, because DV columnar homeobox gene expression in the neuroectoderm is an early, essential, and evolutionarily conserved step in generating neuronal diversity along the DV axis of the CNS (Von Ohlen, 2000).

Early stage 5 embryos express vnd in a narrow domain similar to its final width; ind and msh are not detected. By the end of stage 5, both vnd and ind are expressed with a one to two cell wide gap; again, this expression is seen in domains similar to their final widths. The gap fills in during development resulting in the precise juxtaposition of the vnd and ind domains. Expression of msh in the trunk is not detected until stage 7. Thus, the timing of gene expression progresses from ventral to dorsal: vnd is detected first, ind appears soon after, and msh is observed last (Von Ohlen, 2000).

There is a gap between the initial vnd and ind domains, suggesting that each gene is independently activated at a precise DV position. Subsequently, ind can be expressed in the ventral domain, but this is normally prevented by vnd-mediated repression. Because ind is capable of repressing vnd expression, if ind were to be expressed first in both the ventral and the intermediate columns, it might fully inhibit the expression of vnd. Thus, the temporal pattern of vnd and ind expression is likely to be important for establishing their final spatial pattern of gene expression. The activation and borders of vnd expression appear to be wholly dependent on the Dorsal morphogen gradient. High levels of Dorsal in the mesoderm/mesectoderm anlagen can activate twist, snail, and vnd, but Snail activity represses vnd expression. Intermediate levels of Dorsal are sufficient to activate vnd, but not snail, thus establishing the ventral column of neuroectoderm. It is unclear how the dorsal border of vnd is positioned, but it may be dependent on the concentration of nuclear Dorsal, because if Dorsal levels are increased in dorsal cells, there is a corresponding expansion of the vnd domain. In contrast to a previous report, no evidence has been found that Dpp signaling establishes the dorsal border of the vnd domain. No change was observed in the width of the vnd domain in dpp embryos, and repression of vnd in ectopic Dpp embryos was not observed. In fact, elevated Dpp activity in the neuroectoderm (in sog 4xdpp embryos) gives a slight expansion of the vnd domain, and even higher levels of Dpp (in brk;sog embryos) still fail to repress vnd expression, despite eliminating much of the remaining CNS. The reason the vnd domain is expanded in sog 4xdpp embryos remains unclear; however, it is felt that the combined results clearly demonstrate that Dpp signaling does not repress vnd and therefore cannot position the dorsal border of vnd. 
All existing data are consistent with Dorsal acting as a direct, concentration-dependent activator of vnd expression. In contrast, the Egfr and Dpp signaling pathways have no role in establishing the correct vnd expression pattern, although Egfr is required to maintain vnd expression later in embryogenesis (Von Ohlen, 2000 and references therein).

Initiation and maintenance of ind expression require both Dorsal and Egfr signaling pathways, but not Dpp activity. The ventral border of ind expression is established by the dorsal limit of vnd expression. The dorsal border of ind expression has more complex regulation. Dpp repression does not establish the dorsal border of ind, since the ind domain is normal in dpp embryos. In contrast, both Dorsal and Egfr are required to activate ind and set its dorsal border. In wild-type embryos, the domains of ind and activated Egfr have identical dorsal borders. When Egfr activity is increased throughout the embryo, ind expression shows a partial dorsal expansion, showing that the dorsal border of Egfr activity sets the precise dorsal border of ind expression. Ectopic Dorsal activity can also expand the ind domain (without affecting the Egfr activation domain), showing that sufficiently high levels of nuclear Dorsal protein can independently activate ind expression. As expected, when Egfr activity and nuclear Dorsal levels are simultaneously increased there is a complete dorsal expansion of the ind domain. The data presented here suggest that ind expression is activated by both Dorsal and Egfr pathways, limited ventrally by vnd, and limited dorsally by lack of Dorsal and Egfr activity. The data do not distinguish between a linear pathway in which Egfr signaling activates or potentiates Dorsal to allow ind transcription and a parallel pathway in which Dorsal and Egfr signaling act independently to activate ind expression (Von Ohlen, 2000).

Although Dpp is not required for any aspect of ind expression in wild type embryos, ectopic Dpp signaling in the neuroectoderm can repress ind expression. This shows that Dpp signaling must be kept low in the intermediate column to allow ind transcription and raises the possibility that the loss of ind expression seen in dorsal embryos is an indirect effect, due to the de-repression of Dpp activity within the neuroectoderm. dorsal;dpp double mutants fail to express ind, however, proving that loss of ind expression in dorsal mutants is not due to de-repression of Dpp within the neuroectoderm. It is proposed that Dorsal must both activate ind expression and repress Dpp signaling to allow ind expression (Von Ohlen, 2000).

msh is expressed in a DV domain that has low Vnd, Ind, and Dpp activity. Overexpression of any of these genes will repress msh expression, and dorsal;dpp embryos that lack all vnd, ind, and dpp expression show ectopic msh expression around the DV axis. Thus, the borders of the msh domain are defined by repression: Vnd and Ind ventrally, and Dpp dorsally. What activates msh expression? msh expression could be activated by 'basal' transcription factors present uniformly in the early embryo. Alternatively, msh expression may be induced by a low level of ubiquitous TGFbeta activity, similar to the observed activation of zebrafish msh homologs. The screw gene encodes a TGFbeta-like protein expressed at low levels throughout the embryo, and although it has no striking CNS phenotype, it would be interesting to see if screw;dpp embryos lose dorsal msh expression, or whether screw;dorsal;dpp embryos lose global msh expression (Von Ohlen, 2000).
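The regulatory relationships described above for vnd, ind, and msh can be condensed into a Boolean sketch. This is an illustrative simplification rather than a model from Von Ohlen (2000): the function name, the three discrete Dorsal levels, and the on/off treatment of Egfr and Dpp are assumptions, and msh is assumed to be on by default wherever it is not repressed (one of the two possibilities raised above).

```python
def column_genes(dorsal: int, egfr: bool, dpp: bool) -> dict:
    """Boolean sketch of vnd/ind/msh expression in one neuroectoderm cell.

    dorsal: hypothetical discrete nuclear Dorsal level
            (0 = none, 1 = low, 2 = intermediate).
    egfr:   True where Egfr signaling is active.
    dpp:    True where Dpp activity is high.

    Not modeled: Snail repression of vnd in the mesoderm, later maintenance
    of vnd by Egfr, and activation of ind by very high Dorsal alone.
    """
    # vnd is established by Dorsal and is not regulated by Dpp.
    vnd = dorsal >= 2
    # ind needs Dorsal and Egfr, and is repressed by Vnd and by high Dpp.
    ind = dorsal >= 1 and egfr and not vnd and not dpp
    # msh is defined by repression: on only where Vnd, Ind and Dpp are low.
    msh = not (vnd or ind or dpp)
    return {"vnd": vnd, "ind": ind, "msh": msh}


if __name__ == "__main__":
    print("ventral column:     ", column_genes(2, True, False))
    print("intermediate column:", column_genes(1, True, False))
    print("dorsal column:      ", column_genes(1, False, False))
    print("dorsal;dpp mutant:  ", column_genes(0, False, False))
```

With these assumptions the three wild-type columns each express a single gene, and removing both Dorsal and Dpp leaves msh on regardless of position, in line with the ectopic msh expression reported for dorsal;dpp embryos.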

Differential activation of the Toll receptor leads to the formation of a broad Dorsal nuclear gradient that specifies at least three patterning thresholds of gene activity along the dorsoventral axis of precellular embryos. The activities of the Pelle kinase and Twist basic helix-loop-helix (bHLH) transcription factor in transducing Toll signaling have been investigated. Pelle functions downstream of Toll to release Dorsal from the Cactus inhibitor. Twist is an immediate-early gene that is activated upon entry of Dorsal into nuclei. Transgenes misexpressing Pelle and Twist were introduced into different mutant backgrounds and the patterning activities were visualized using various target genes that respond to different thresholds of Toll-Dorsal signaling. These studies suggest that an anteroposterior gradient of Pelle kinase activity is sufficient to generate all known Toll-Dorsal patterning thresholds and that Twist can function as a gradient morphogen to establish at least two distinct dorsoventral patterning thresholds. How the Dorsal gradient system can be modified during metazoan evolution is discussed and it is concluded that Dorsal-Twist interactions are distinct from the interplay between Bicoid and Hunchback, which pattern the anteroposterior axis (Stathopoulos, 2002).

The snail, sim, vnd and sog expression patterns represent four different Toll-Dorsal signaling thresholds. snail is activated only by peak levels of the Dorsal gradient; sim and vnd are activated by intermediate levels, and sog is activated by the lowest levels of the gradient. These expression patterns were visualized in mutant and transgenic embryos via in situ hybridization using digoxigenin-labeled antisense RNA probes (Stathopoulos, 2002).
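The threshold readout described here can be illustrated with a short sketch. The numeric thresholds and the 0 to 1 scale of nuclear Dorsal are hypothetical values chosen only to reproduce the ordering given in the text (snail at peak levels, sim and vnd at intermediate levels, sog at the lowest levels); the function name and the optional Snail-repression step are likewise assumptions, the latter based on the repression by Snail that the study proposes as probable.

```python
# Hypothetical activation thresholds on an arbitrary 0-1 scale of nuclear Dorsal.
# sim and vnd share one tier because the text groups them as "intermediate".
THRESHOLDS = {"snail": 0.8, "sim": 0.5, "vnd": 0.5, "sog": 0.2}


def expressed_targets(dorsal_level: float, snail_represses: bool = True) -> list:
    """Return the target genes expressed at a given nuclear Dorsal level."""
    # A gene is activated when Dorsal meets or exceeds its threshold.
    active = [gene for gene, t in THRESHOLDS.items() if dorsal_level >= t]
    # Assumed simplification: wherever snail is on, it excludes the other targets.
    if snail_represses and "snail" in active:
        active = ["snail"]
    return active


if __name__ == "__main__":
    for level in (0.9, 0.6, 0.3, 0.1):
        print(level, expressed_targets(level))
```

Scanning from high to low Dorsal gives snail alone, then sim, vnd, and sog together, then sog alone, which mirrors the sequential order of the patterns described for these genes.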

Dorsal target genes are essentially silent in mutant embryos that lack an endogenous dorsoventral Dorsal nuclear gradient. Mutant embryos were collected from females that are homozygous for a null mutation in the gastrulation defective (gd) gene, which blocks the processing of the Spätzle ligand and the activation of the Toll receptor. These mutants permit the analysis of ectopic, anteroposterior Dorsal and Twist gradients in 'apolar' embryos that lack dorsoventral polarity. snail, vnd, and sog are sequentially expressed along the anteroposterior axis of mutant embryos that contain a constitutively activated form of the Toll receptor (Toll10b) misexpressed at the anterior pole using the bicoid (bcd) promoter and 3' UTR. These expression patterns depend on an ectopic anteroposterior Dorsal nuclear gradient. The repression of the vnd and sog patterns at the anterior pole is probably mediated by Snail, which normally excludes expression of these genes in the ventral mesoderm of wild-type embryos (Stathopoulos, 2002).

The activated Pelle-Tor4021 kinase also directs sequential anteroposterior patterns of snail, vnd, and sog expression in gd/gd mutant embryos. As in the case of Toll10b, the activated Pelle kinase was misexpressed at the pole using the bcd 3' UTR. The snail, vnd and sog expression patterns are similar to those obtained with the Toll10b transgene. The vnd and sog expression patterns are probably repressed at the anterior pole by Snail. These results suggest that the levels of Pelle kinase activity are sufficient to determine different Dorsal transcription thresholds (Stathopoulos, 2002).

sog is normally activated throughout the neurogenic ectoderm by the lowest levels of the Dorsal gradient. The low levels of Dorsal present in Tollrm9/Tollrm10 mutant embryos are sufficient to activate sog everywhere except the extreme termini. The twist-bcd transgene leads to the loss of sog expression in anterior regions, probably because of repression by Snail. Snail also appears to repress vnd and sog expression in anterior regions of transgenic embryos that contain the Toll10b or Pelle-Tor4021 transgenes (Stathopoulos, 2002).

The low levels of Dorsal present in Tollrm9/Tollrm10 mutant embryos are insufficient to activate sim, although there is occasional staining in the posterior pole. The twist-bcd transgene leads to the efficient activation of sim in anterior regions. Staining appears to be restricted to those regions where snail expression is lost. These results suggest that a Twist gradient is sufficient to generate multiple dorsoventral patterning thresholds (sim and snail) in the presence of low, uniform levels of Dorsal (Stathopoulos, 2002).

The twist-bcd transgene was introduced into mutant embryos that completely lack Dorsal. Without the transgene these mutants do not express twist, snail, sim, vnd or sog. Introduction of the twist-bcd transgene causes intense expression of twist in the anterior 40% of the embryo. This broad Twist gradient fails to activate snail, but succeeds in inducing weak expression of sim and somewhat stronger staining of vnd at the anterior pole. The activation of vnd in mutant embryos is comparable with the expression seen in wild-type and Tollrm9/Tollrm10 embryos. However, in both wild-type and mutant embryos the vnd pattern is transient, and lost after the completion of cellularization. These results indicate that Twist can activate dorsoventral patterning genes in the absence of Dorsal (Stathopoulos, 2002).

An anteroposterior Twist gradient generates at least two thresholds of gene activity in mutant embryos that contain decreased levels of Dorsal. High levels of Twist activate snail at the anterior pole, whereas lower levels are sufficient to induce the expression of sim in more posterior regions of embryos containing low, uniform levels of the Dorsal protein. These results demonstrate that twist gene activity is not dedicated to mesoderm formation. Instead, Twist supports expression of two regulatory genes, sim and vnd, which pattern ventral regions of the neurogenic ectoderm. The twist-bcd transgene was shown to induce weak expression of both genes even in mutant embryos that completely lack Dorsal (Stathopoulos, 2002).

Sox proteins form a family of HMG-box transcription factors related to SRY, the mammalian testis determining factor. Sox-mediated modulation of gene expression plays an important role in various developmental contexts. Drosophila SoxNeuro, a putative ortholog of the vertebrate Sox1, Sox2 and Sox3 proteins, is one of the earliest transcription factors to be expressed pan-neuroectodermally. SoxNeuro is essential for the formation of the neural progenitor cells in the central nervous system. Loss of function mutations of SoxNeuro are associated with a spatially restricted hypoplasia: neuroblast formation is severely affected in the lateral and intermediate regions of the central nervous system, whereas ventral neuroblast formation is almost normal. Evidence is presented that a requirement for SoxNeuro in ventral neuroblast formation is masked by a functional redundancy with Dichaete, a second Sox protein whose expression partially overlaps that of SoxNeuro. SoxNeuro/Dichaete double mutant embryos show a severe neural hypoplasia throughout the central nervous system, as well as a dramatic loss of achaete expressing proneural clusters and medially derived neuroblasts. Genetic interactions of SoxNeuro and the dorsoventral patterning genes ventral nervous system defective (vnd) and intermediate neuroblasts defective (ind) underlie ventral and intermediate neuroblast formation. Expression of the Achaete-Scute gene complex suggests that SoxNeuro acts upstream and in parallel with the proneural genes. The finding that Dichaete and SoxN exhibit opposite effects on achaete expression within the intermediate neuroectoderm demonstrates that each protein also has region-specific unique functions during early CNS development in the Drosophila embryo (Buescher, 2002 and Overton, 2002).

The loss of one copy of vnd or ind in a SoxN homozygous mutant background dominantly enhances the SoxN phenotype, suggesting that SoxN genetically interacts with vnd and ind. Since the expression of Vnd and Ind does not require SoxN function, it is concluded that SoxN does not act upstream of vnd and ind, but rather in parallel. In ind mutant embryos, Ac expression in the NE is derepressed in the intermediate region. Nevertheless, NBs fail to form within this region. vnd is required for Ac expression in the ventral NE. However, there seems to be no causal relationship between the loss of Ac expression and the subsequent loss of NBs, since ectopic expression of Ac does not rescue NB formation. Thus, it appears that expression of the genes of the AS-C can confer neural potential to the NE only when SoxN, vnd and ind expression is intact (Buescher, 2002).

It is presumed that the differences between Dichaete and SoxN may well reflect interactions between each Sox protein and a different partner mediated by protein domains outside the highly conserved DNA-binding domain. In accordance with this, Zhao suggests that, in the neuroectoderm, Dichaete interacts with the product of the ind gene to mediate repression of ac. Since ind is specifically expressed within the intermediate neuroectoderm, it is tempting to speculate that this protein might interact specifically with Dichaete to repress ac while it does not interact with SoxN in the same way if indeed at all. However, evidence has been provided for interactions between Dichaete and both ind and vnd in the context of NB specification. Since the data suggest that SoxN and Dichaete function is at least redundant within the vnd-positive medial row, it is very likely that Vnd interacts with SoxN as well as Dichaete (Overton, 2002).

During early CNS development, Nkx6 is co-expressed with Ventral nervous system defective (Vnd) in a subset of medial column NBs, prompting an investigation of the genetic relationship between vnd and Nkx6. Vnd expression marks medial column CNS NBs and is required for the development of these cells. Nkx6 and Vnd expression were compared in wild-type embryos. Surprisingly, while Nkx6 and Vnd are co-expressed in a subset of medial column NBs, their expression patterns are otherwise complementary. At stage 9, Nkx6 is expressed in CNS midline precursors, while Vnd is expressed in ventral neuroectoderm flanking the midline. During stage 10, low-level Nkx6 expression initiates in five Vnd-positive NBs per hemisegment. At stage 11, Vnd and Nkx6 are expressed in non-overlapping groups of GMCs and postmitotic neurons. Notably, at this stage clusters of Nkx6-expressing cells are nestled within stripes of Vnd-expressing cells. The complementary patterns of Nkx6 and Vnd in GMCs and neurons are maintained throughout embryogenesis. These data raised the possibility that opposing activities of Nkx6 and vnd help establish and maintain their respective expression patterns (Broihier, 2004).

To investigate whether the complementary expression patterns of Nkx6 and Vnd arise due to their opposing activities, it was asked if vnd misexpression represses Nkx6. These analyses focus on the genetic relationship between Nkx6 and vnd in postmitotic neurons since these genes exhibit mutually exclusive patterns in these cells. The elav-GAL4 driver was used to express vnd in postmitotic neurons and it was found that this abolishes CNS expression of Nkx6. It was not possible to obtain meaningful loss-of-function data for vnd because nearly all medial column NBs and their progeny, many of which are Nkx6-positive, fail to develop in vnd mutant embryos. The requirement of vnd to promote medial column NB formation inhibited the ability to assay the effect of removing vnd function on Nkx6. Nevertheless, the ability of vnd misexpression to abolish Nkx6 expression supports the model that vnd represses Nkx6 to help establish the complementary expression patterns of Nkx6 and Vnd (Broihier, 2004).

In the reciprocal experiment, it was found that postmitotic misexpression of Nkx6 dramatically reduces the number of Vnd-positive neurons. Normally, 10.0±1.3 neurons express Vnd per hemisegment whereas only 4.2±1.8 neurons express Vnd per hemisegment (n=53) in Nkx6 misexpression embryos. However, Vnd expression is wild type in Nkx6 mutant embryos. Thus, Nkx6 is sufficient but not necessary to repress vnd expression (Broihier, 2004).

These data suggest that while high levels of Nkx6 and Vnd are cross-repressive in postmitotic neurons, these factors function in concert with other regulators during normal development to limit each other's expression. Given the similar expression profiles of Nkx6 and Hb9 and their independent regulation, it was asked whether Nkx6 and hb9 act in parallel to repress vnd expression. As observed for Nkx6, hb9 misexpression in postmitotic neurons significantly reduces the number of Vnd-positive CNS neurons while hb9 mutants exhibit wild-type Vnd expression. However, removal of both hb9 and Nkx6 leads to an overproduction of Vnd-positive neurons; 13.6±2.1 Vnd-positive neurons (n=41) develop in double mutant embryos relative to ten in wild type. These results show that hb9 and Nkx6 act in parallel to repress vnd, and support the model that the complementary patterns of Nkx6 and vnd arise at least in part due to their opposing activities (Broihier, 2004).

Sequential patterns of vnd, ind, and msh expression respond to distinct thresholds of the Dorsal gradient

A nuclear concentration gradient of the maternal transcription factor Dorsal establishes three tissues across the dorsal-ventral axis of precellular Drosophila embryos: mesoderm, neuroectoderm, and dorsal ectoderm. Subsequent interactions among Dorsal target genes subdivide the mesoderm and dorsal ectoderm. The subdivision of the neuroectoderm by three conserved homeobox genes, ventral nervous system defective (vnd), intermediate neuroblasts defective (ind), and muscle segment homeobox (msh) has been investigated. These genes divide the ventral nerve cord into three columns along the dorsal-ventral axis. Sequential patterns of vnd, ind, and msh expression are established prior to gastrulation and evidence is presented that these genes respond to distinct thresholds of the Dorsal gradient. Maintenance of these patterns depends on cross-regulatory interactions, whereby genes expressed in ventral regions repress those expressed in more dorsal regions. This 'ventral dominance' includes regulatory genes that are expressed in the mesectoderm and mesoderm. At least some of these regulatory interactions are direct. For example, the misexpression of vnd in transgenic embryos represses ind and msh, and the addition of Vnd binding sites to a heterologous enhancer is sufficient to mediate repression. The N-terminal domain of Vnd contains a putative eh1 repression domain that binds Groucho in vitro. Mutations in this domain diminish Groucho binding and also attenuate repression in vivo. The significance of ventral dominance is discussed with respect to the patterning of the vertebrate neural tube, and ventral dominance is compared with the previously observed phenomenon of posterior prevalence, which governs sequential patterns of Hox gene expression across the anterior-posterior axis of metazoan embryos (Cowden, 2003).

The ability of Vnd to repress msh in addition to ind raises the possibility that transcriptional repressors expressed in ventral regions of the embryo can inhibit repressors active in more dorsal regions. Support for this hypothesis came from using the Krüppel enhancer to misexpress both ind and msh along the anterior-posterior axis. Ectopic Ind failed to repress vnd expression, while ectopic Msh did not repress either vnd or ind expression. To determine if 'ventral dominance' is restricted to the neuroectoderm, the mesodermal repressor snail was misexpressed in transgenic embryos using the even-skipped (eve) stripe 2 enhancer. The stripe2-snail transgene creates an ectopic domain of snail along the anterior-posterior axis. This ectopic expression leads to a gap in the sim expression pattern. The transgene also causes a gap in the vnd pattern, confirming the model that Snail excludes vnd expression in the ventral mesoderm and restricts expression to the neuroectoderm. The stripe2-snail transgene also creates a gap in the ind pattern. These results support the ventral dominance model, whereby repressors located in ventral regions inhibit repressors expressed in more dorsal regions. Consistent with this 'directionality' of repression, ectopic expression of Vnd, Ind, or Msh does not repress snail (Cowden, 2003).

Further support for ventral dominance of the Snail repressor was obtained by analyzing mutant embryos derived from CtBP germline clones. CtBP is a maternally deposited corepressor protein essential for Snail-mediated repression. Removal of this corepressor results in ventral derepression of sim and vnd into the presumptive mesoderm due to loss of Snail-mediated repression. However, this ventral expansion of vnd does not result in a transformation of mesoderm into medial neuroblasts. Instead, the expanded vnd pattern is lost at slightly later stages, and expression becomes restricted to lateral regions, similar to the endogenous expression pattern. This lateral restriction is consistent with the observation that neuroblasts are formed in lateral regions of CtBP- mutants, and not in ventral regions that normally form the mesoderm. Neuroblast segregation can be visualized using a snail antisense RNA probe, which stains all neuroblasts following gastrulation. Sim may be responsible for the late repression of vnd, because vnd expands into the ventral midline of sim mutant embryos. Repression of vnd by Sim is probably indirect because a Krüppel-sim transgene does not alter vnd expression in the lateral neuroectoderm. Perhaps Sim activates an unknown repressor that ultimately inhibits vnd expression in the midline (Cowden, 2003).

It is conceivable that the cross-regulatory interactions among the Snail, Vnd, Ind, and Msh repressors are indirect. For example, perhaps Vnd activates an unknown repressor, which in turn inhibits the expression of ind and msh in medial neuroblasts. Several experiments were done to determine whether Vnd functions as a transcriptional repressor. The first examined whether Vnd binding sites mediate activation or repression in transgenic embryos (Cowden, 2003).

The IAB5 enhancer drives the expression of a lacZ reporter gene in a series of three adjacent bands in the presumptive abdomen of cellularizing embryos. This staining pattern is maintained through gastrulation and germ band elongation. Vnd binding sites were introduced into this IAB5-lacZ transgene by inserting a 220 bp genomic DNA fragment between the IAB5 enhancer and lacZ reporter. This genomic fragment is located 3' of the ind gene and contains three Vnd binding sites. Insertion of this fragment caused a ventrolateral gap in the IAB5-lacZ staining pattern. This gap coincides with the endogenous vnd expression pattern and is maintained during germ band elongation. At this stage, there is a clear loss of lacZ expression in medial regions of the developing ventral nerve cord. The importance of the Vnd binding sites in mediating this repression was examined by mutagenizing all three sites within the 220 bp DNA fragment. Each site was converted from the 5'-CAAGTG-3' consensus to 5'-CCCGGG-3'. The mutagenized IAB5-lacZ transgene exhibits expanded expression in medial regions of the presumptive nerve cord. This observation suggests that Vnd functions as a sequence-specific transcriptional repressor (Cowden, 2003).
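The site-directed change described above (each 5'-CAAGTG-3' site converted to 5'-CCCGGG-3') can be sketched as a simple sequence operation. This is an illustration only: the example sequence is invented, not the 220 bp fragment from the study, the function names are hypothetical, and scanning for the site in both orientations is an assumption, since the text does not state on which strand the three sites lie.

```python
import re

VND_SITE = "CAAGTG"     # consensus quoted in the text
VND_SITE_RC = "CACTTG"  # reverse complement of the consensus (assumed relevant)
MUTANT_SITE = "CCCGGG"  # replacement quoted in the text; it is its own reverse complement

SITE_PATTERN = re.compile(f"{VND_SITE}|{VND_SITE_RC}")


def find_vnd_sites(seq: str) -> list:
    """Return (0-based position, matched site) for each site in either orientation."""
    return [(m.start(), m.group()) for m in SITE_PATTERN.finditer(seq.upper())]


def mutate_vnd_sites(seq: str) -> str:
    """Replace every matched site with the mutant hexamer, preserving length."""
    return SITE_PATTERN.sub(MUTANT_SITE, seq.upper())


if __name__ == "__main__":
    fragment = "TTCAAGTGAACACTTGGGCAAGTGTT"  # hypothetical test sequence
    print(find_vnd_sites(fragment))
    print(mutate_vnd_sites(fragment))
```

Because the mutant hexamer is palindromic, the same replacement applies whichever strand carries the site, and the fragment length is unchanged, so spacing between the enhancer and the reporter is preserved.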

Further evidence that Vnd is a repressor was obtained using an in vivo repression assay in transgenic embryos. The N-terminal region of Vnd contains a putative eh1 Groucho-interaction motif, FxIxxIL. This eh1 motif is present in two known transcriptional repressors, Engrailed and Goosecoid. It is also found in the Ind and Msh proteins. GST pull-down assays suggest that this motif mediates interaction between Vnd and Groucho. A GST-VEH1 fusion protein containing amino acid residues 183 to 226 from Vnd binds 35S-labeled Groucho protein produced via in vitro translation. This binding is lost when the GST-Vnd fusion protein is mutagenized to replace the phenylalanine in the FxIxxIL motif with an alanine. Various positive and negative controls were included in these experiments. For example, Groucho does not bind a GST-Ind fusion protein containing the Ind homeodomain. Weak binding is observed with a GST-Eve fusion protein containing the FKPY Groucho-interaction motif (Cowden, 2003 and references therein).
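The FxIxxIL pattern, where x stands for any residue, translates directly into a regular expression, and the phenylalanine-to-alanine substitution can be sketched the same way. The peptide below is invented for illustration and is not the Vnd sequence; the function names are hypothetical.

```python
import re

# FxIxxIL written as a regular expression: "." matches any single residue.
EH1_PATTERN = re.compile(r"F.I..IL")


def find_eh1(protein: str) -> list:
    """Return (0-based position, matched motif) for each FxIxxIL occurrence."""
    return [(m.start(), m.group()) for m in EH1_PATTERN.finditer(protein.upper())]


def phe_to_ala(protein: str) -> str:
    """Replace the leading phenylalanine of each motif with alanine."""
    return EH1_PATTERN.sub(lambda m: "A" + m.group()[1:], protein.upper())


if __name__ == "__main__":
    peptide = "MSTFSIADILNK"  # hypothetical peptide containing one FxIxxIL match
    print(find_eh1(peptide))
    print(phe_to_ala(peptide))
    print(find_eh1(phe_to_ala(peptide)))
```

After the substitution the pattern no longer matches, which is the sequence-level counterpart of the single-residue mutation used in the pull-down experiment; whether binding is lost is of course an experimental result, not something the pattern match can predict.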

A Gal4-Vnd fusion gene containing the Gal4 DNA binding domain and the N-terminal 543 codons of Vnd was placed under the control of the Krüppel 5' regulatory region. The resulting fusion gene is expressed in central regions of cellularizing embryos. Similar levels of expression were obtained with a mutagenized version of the fusion gene that contains multiple alanine substitutions in the FxIxxIL motif. The regulatory activities of the two Gal4-Vnd fusion proteins were monitored with a lacZ reporter gene that contains a modified version of the rhomboid NEE lateral stripe enhancer. The modified NEE enhancer contains three Gal4 binding sites (UAS) and lacks Snail repressor sites. The reporter gene is expressed in ventral regions, including the mesoderm and portions of the lateral neuroectoderm (Cowden, 2003).

The unmutagenized Gal4-Vnd fusion protein containing an intact FxIxxIL motif attenuates expression of the NEE-lacZ reporter gene. This result suggests that the fusion protein binds UAS sites in the modified NEE enhancer and mediates transcriptional repression, either by direct repression of the core promoter, or quenching Dorsal and other activators within the NEE. In contrast, the mutagenized Gal4-Vnd fusion protein (DeltaVEH1) fails to repress expression from the lacZ reporter gene. This result suggests that the FxIxxIL motif is essential for the repression activity of the normal Gal4-Vnd fusion protein. Altogether, these experiments, along with the analysis of Vnd binding sites, suggest that Vnd functions as a sequence-specific transcriptional repressor that might recruit the Groucho corepressor protein (Cowden, 2003).

Thus the Dorsal gradient directly subdivides the neuroectoderm into separate dorsal-ventral compartments through the differential regulation of three conserved homeobox genes, vnd, ind, and msh. Maintenance of sequential patterns of gene expression depends on cross-regulatory interactions, whereby repressors expressed in ventral regions inhibit repressors active in more dorsal regions. This ventral dominance is evocative of the posterior prevalence phenomenon that governs sequential patterns of Hox gene expression across the anterior-posterior axis of metazoan embryos. At least one of the cross-regulatory interactions is direct and evidence was presented that Vnd functions as a sequence-specific transcriptional repressor (Cowden, 2003).

The Dorsal gradient establishes at least three thresholds of gene expression across the dorsal-ventral axis of early embryos. High concentrations activate target genes such as twist and snail in ventral regions that form the mesoderm. Intermediate concentrations activate the rhomboid gene in ventral regions of the neuroectoderm. Finally, low levels of the gradient activate the sog gene in both ventral and dorsal regions of the neuroectoderm. The same low levels of Dorsal repress target genes important for the differentiation of the dorsal ectoderm, including dpp, zen, and tolloid (Cowden, 2003).
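The three thresholds can be summarized as a lookup from Dorsal concentration to the direct transcriptional input on each target. This is a minimal sketch: the numerical thresholds are arbitrary assumptions, and the function reports only the Dorsal input, ignoring downstream repressors such as Snail.

```python
# Arbitrary threshold values on a 0-1 scale; the text gives no numbers.
HIGH, INTERMEDIATE, LOW = 0.7, 0.4, 0.1


def dorsal_targets(dorsal):
    """Return the genes permitted by a given nuclear Dorsal level.

    Activation targets are listed when Dorsal exceeds their threshold;
    dpp, zen and tolloid are listed only where Dorsal is too low to repress them.
    """
    genes = set()
    if dorsal >= HIGH:
        genes |= {"twist", "snail"}
    if dorsal >= INTERMEDIATE:
        genes.add("rhomboid")
    if dorsal >= LOW:
        genes.add("sog")
    else:
        genes |= {"dpp", "zen", "tolloid"}
    return genes


if __name__ == "__main__":
    for level in (0.9, 0.5, 0.2, 0.0):
        print(level, sorted(dorsal_targets(level)))
```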

Mutant embryos lacking Dorsal fail to activate early expression of either vnd or ind. Conversely, ectopic Dorsal activity leads to a corresponding dorsal shift in the vnd and ind expression patterns. The lateral stripes of vnd expression encompass ventral regions of the neuroectoderm, similar to the rhomboid (rho) pattern. rho is a direct Dorsal target gene that is expressed in the neuroectoderm and encodes a membrane-associated protease that processes the EGFR ligand spitz. Like rho, vnd appears to be a direct target of the Dorsal gradient: an intronic enhancer containing clustered Dorsal and Twist binding sites directs lateral stripes of expression in transgenic embryos. The ind lateral stripes appear to straddle the region between the vnd/rhomboid ventrolateral stripes and the broad sog lateral stripes, and previous studies suggest that ind may be regulated in a different manner from vnd. The regulation of ind relies on both the Dorsal gradient and the EGF signaling pathway. Removal of either Dorsal or the EGF receptor results in the loss of ind expression from the neuroectoderm. It is unclear whether Dorsal directly activates ind or simply establishes a domain of EGF signaling through the regulation of rhomboid (rho). However, given the early onset of ind expression and the misexpression of ind by ectopic Dorsal, it is likely that Dorsal is essential for its regulation. Consistent with the possibility that early ind expression pattern might reflect a threshold readout of the Dorsal gradient is the finding that the low levels of Dorsal present in Tollrm9/Tollrm10 embryos are sufficient to activate ind, but not msh. Moreover, the ind lateral stripes do not extend beyond the sog expression pattern, which is known to be directly activated by vanishingly low levels of the Dorsal gradient. 
Finally, a 3' ind enhancer that encompasses the three Vnd binding sites used in this study contains optimal Dorsal and Twist binding sites, suggesting that it is directly regulated by the Dorsal and Twist gradients (Cowden, 2003).

The initial compartmentalization of the neuroectoderm appears to depend on threshold readouts of the Dorsal gradient. This strategy is different from the subdivision of the other two primary embryonic tissues, the mesoderm and dorsal ectoderm. Patterning the mesoderm depends on interactions between twist and dpp. The Snail repressor establishes the limits of mesoderm invagination, while the localized expression of Dpp restricts induction of the lateral mesoderm to dorsal-lateral regions. Similarly, subdivision of the dorsal ectoderm depends on the differential regulation of the Dorsal target genes sog and dpp. Both genes respond to the same low levels of the Dorsal gradient, but sog is activated by Dorsal, while dpp is repressed. Subsequent protein-protein interactions between Sog and Dpp establish a broad Dpp signaling gradient in the dorsal ectoderm (Cowden, 2003).

Transcriptional repression of ind by Vnd was predicted from previous genetic studies but lateral repression of msh was somewhat unexpected. Previous studies have shown that ectopic Vnd represses msh expression in the procephalic neuroectoderm, where the vnd and msh expression patterns overlap. This result was extended in the present study using a Krüppel-vnd transgene. It would appear that Vnd represses both ind and msh to specify medial neuroblasts. A similar result was seen using the eve stripe 2 enhancer to misexpress snail. Previous studies have shown that Snail acts as a transcriptional repressor to create the boundary between mesoderm and neuroectoderm. As expected, ectopic snail repressed vnd expression but surprisingly, ind was also repressed. These results suggest that the Dorsal gradient separates domains along the dorsal-ventral axis by activating a series of localized transcriptional repressors. According to this model, repressors located in ventral regions selectively repress those located more dorsally, while dorsal repressors do not inhibit ventral repressors. For example, ectopic Vnd represses ind but not snail, while ectopic Ind fails to repress vnd or snail. According to this model, ectopic Ind should repress msh expression. However, because none of the transgenic Krüppel-ind lines persisted until germband elongation when msh expression is uniform, it was not possible to determine if ectopic Ind repressed msh. Similarly, while ectopic Msh failed to repress snail, vnd, or ind expression, the lack of early target genes that are regulated by Msh prevents any definitive conclusions regarding its role as a transcriptional repressor. Both Ind and Msh contain putative eh1 domains, suggesting that they may function as Groucho dependent repressors and previous work supports such a role for Ind and Msh in the ventral nerve cord (Cowden, 2003).
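The ventral dominance model can be stated as a simple rule: among the repressors activated in a cell, only the most ventral one remains expressed. The sketch below encodes that rule under the assumption of a strict snail > vnd > ind > msh hierarchy; the Ind-on-msh step is a prediction of the model rather than a result reported in this study, and the function name is hypothetical.

```python
# Ventral-to-dorsal order assumed by the model.
HIERARCHY = ["snail", "vnd", "ind", "msh"]


def resolve(activated):
    """Return the genes that remain on after ventral-dominant repression.

    `activated` is the set of genes receiving activating input in a cell
    (endogenous or ectopic). The most ventral gene present silences the rest.
    """
    for gene in HIERARCHY:
        if gene in activated:
            return {gene}
    return set()


if __name__ == "__main__":
    # Ectopic Vnd in the ind domain: ind is lost.
    print(resolve({"ind", "vnd"}))
    # Ectopic Ind in the vnd domain gives the same input set, so vnd persists.
    print(resolve({"vnd", "ind"}))
    # Ectopic Snail in the neuroectoderm: vnd and ind are both lost.
    print(resolve({"snail", "vnd", "ind"}))
```

The asymmetry is the point of the sketch: adding a more dorsal gene to a cell never changes the outcome, whereas adding a more ventral gene always does.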

'Ventral dominance' might govern the patterning of the ventral nerve cord in older embryos, in addition to the prepatterning of the neuroectoderm in pregastrulating embryos. Sim might exclude vnd, ind, and msh expression in the ventral midline. In embryos lacking maternal CtBP products, Snail fails to act as a repressor, allowing the ventral expansion of sim and vnd into the presumptive mesoderm. However, vnd expression is ultimately lost from ventral regions, while sim expression persists. As a result, ventral regions form an expanded mesectoderm, while neuroblasts arise from lateral regions. These observations suggest that Sim excludes vnd expression from ventral regions in CtBP mutants, either directly by acting through a CNS specific enhancer or indirectly by activating an unknown repressor. This putative repressor probably does not rely on the CtBP corepressor, as it is still capable of repressing vnd in CtBP germ line clones. According to a ventral dominance scenario, the misexpression of this unknown repressor should inhibit the expression of vnd, ind, and msh in the ventral midline. One potential target for the indirect repressor could be the EGF pathway. The ventral midline is a well-characterized source of EGF signaling and both vnd and ind rely upon EGF signaling for maintenance of expression. By eliminating EGF activation, this midline repressor could prevent vnd and ind expression (Cowden, 2003).

It is conceivable that the ventral dominance model governing cross-regulatory interactions among Vnd, Ind, Msh, Snail, and possibly sim, also applies to the patterning of the vertebrate neural tube. The vertebrate homolog of vnd, Nkx2.2, is expressed in ventral regions of the neural tube, while the homologs of ind (Gsh) and msh (Msx) are expressed in intermediate and dorsal regions, respectively. These neural tube expression patterns match the dorsal-to-ventral positions of vnd, ind, and msh in the ventral nerve cord of Drosophila. Furthermore, the vertebrate homolog of Vnd, Nkx2.2, also functions as a Groucho-dependent transcriptional repressor. A clear prediction of this study is that the misexpression of Nkx2.2 throughout the vertebrate neural tube should lead to the repression of both Gsh and Msx. In contrast, the misexpression of Gsh should repress Msx, but not Nkx2.2. Thus, a cascade of homologous localized transcriptional repressors could subdivide both the vertebrate and invertebrate CNS (Cowden, 2003).

Threshold-dependent BMP-mediated repression: a model for a conserved mechanism that patterns the neuroectoderm

Subdivision of the neuroectoderm into three rows of cells along the dorsal-ventral axis by neural identity genes is a highly conserved developmental process. While neural identity genes are expressed in remarkably similar patterns in vertebrates and invertebrates, previous work suggests that these patterns may be regulated by distinct upstream genetic pathways. This study asked whether a potential conserved source of positional information provided by BMP signaling contributes to patterning the neuroectoderm. This question was addressed in two ways: (1) it was asked whether BMPs can act as bona fide morphogens to pattern the Drosophila neuroectoderm in a dose-dependent fashion, and (2) it was examined whether BMPs might act in a similar fashion in patterning the vertebrate neuroectoderm. In this study, it was shown that graded BMP signaling participates in organizing the neural axis in Drosophila by repressing expression of neural identity genes in a threshold-dependent fashion. Evidence is also provided for a similar organizing activity of BMP signaling in chick neural plate explants, which may operate by the same double negative mechanism that acts earlier during neural induction. It is proposed that BMPs played an ancestral role in patterning the metazoan neuroectoderm by threshold-dependent repression of neural identity genes (Mazutani, 2006).

The neural identity genes vnd, ind, and msh are expressed in a series of non-overlapping DV domains in the Drosophila embryo. These genes are expressed in a highly dynamic fashion and are activated in a ventral-to-dorsal sequence. The BMP antagonist Sog is expressed throughout the neuroectoderm prior to the activation of neural identity gene expression and fades dorsally as the Dorsal gradient collapses. By the time msh is expressed in a single contiguous dorsal stripe, sog expression is largely lost from these dorsal-most cells. During this same period, the BMP2/4 homolog Dpp is expressed in adjacent dorsal cells, where it represses the expression of neural genes and acts in a graded fashion to pattern the non-neural ectoderm. It is possible that Dpp also signals to the neuroectoderm, although previous single and double mutant analyses of the dpp pathway have not resolved whether Dpp acts in a graded fashion to help establish the order of the neural domains. In none of these studies was it possible to separate the contribution of BMP signaling from that of the Dorsal gradient. To determine whether Dpp acts as a morphogen to pattern the Drosophila neuroectoderm, a system was developed for selectively analyzing its effects in the absence of other DV cues (Mazutani, 2006).

In order to separate the potential patterning effect of BMP signaling in Drosophila from that imposed by the Dorsal gradient, a genetic system was designed that allowed replacement of the normal ventral-to-dorsal gradient of nuclear Dorsal with a uniform neuroectodermal level of Dorsal along the entire DV axis of the embryo. These lateralized embryos were created by first eliminating polarized DV maternal patterning acting upstream of Toll signaling and then adding back uniform adjusted levels of Dorsal across the entire DV axis using activated alleles of the Toll receptor. Uniform maternal Toll signaling was adjusted to specific levels using activated Toll alleles of differing strengths and by altering the dose of maternal Dorsal. In such lateralized embryos, the response of neural genes to an ectopic BMP gradient formed along the AP axis was then tested. This BMP gradient was created by expressing dpp under the control of the even-skipped stripe 2 enhancer (the st2-dpp construct) (Mazutani, 2006).

In lateralized embryos, pan-neuroectodermal markers such as sog are expressed around the entire circumference of the embryo. As expected from the threshold-dependent activity of Dorsal, mesodermal and dorsal ectodermal markers are absent in these same embryos. The consistent and uniform amounts of Dorsal produced in these lateralized embryos correspond to mid-neuroectodermal levels, as revealed by expression of ind along the full DV axis and the absence of vnd expression. The AP limits of ind expression are similar to those in wild-type embryos. Within this domain, msh expression is not detectable, presumably because Ind is acting in a ventral-dominant fashion to repress it. However, in more anterior cells abutting the ind domain, where msh expression normally extends further than ind, msh is expressed in a ring around the embryo. These initial studies indicate that both ind and msh can be expressed in mid-neuroectodermal lateralized embryos, and that Ind efficiently excludes msh from its domain (Mazutani, 2006).

Once conditions were established for reliably producing lateralized embryos, it was tested whether a graded Dpp response could be induced by crossing a st2-dpp construct into the lateralized background. The sole source of dpp expression in these embryos is provided by st2-dpp, except at the poles where endogenous dpp expression is independent of Dorsal regulation. The expected pattern of BMP pathway activation in such embryos, assessed by in situ detection of the phosphorylated form of the signal transducer Mothers against dpp (pMAD), is a broad band centered over the st2-dpp stripe. Expression of the epidermal Dpp target gene u-shaped (ush) was also tested as a second marker for BMP activation. Because lateralized embryos ubiquitously express the BMP inhibitor sog, neither pMAD nor ush expression could be detected near the stripe of dpp expression. However, when sog function was eliminated in st2-dpp lateralized embryos, pMAD was activated in a broad domain extending approximately eight cell diameters beyond the narrower dpp stripe. In addition, ush expression was also activated in this region. These results indicate that Dpp diffusing from a sharp stripe can elicit a graded response over significant distances (Mazutani, 2006).

The effect of graded Dpp activity on the relative patterns of ind and msh expression was examined. Multiplex in situ hybridization methods were used to examine the simultaneous expression of msh, ind, and ush, while scoring for the sog+ versus sog− genotype of the embryos. These experiments revealed a clear dose-dependent repression of ind expression characterized by strong repression near the source of dpp and graded reduction in expression extending approximately 20 cell diameters posteriorly. In contrast, the opposite effect was observed with regard to msh expression, resulting in its activation in cells expressing the lowest levels of ind. In control sog+ lateralized embryos, where BMP signaling is blocked, st2-dpp had no discernable effect on the pattern or intensity of either msh or ind expression. These results can be understood if Dpp signaling preferentially represses expression of ind in sog−; st2-dpp lateralized embryos, thereby relieving ind-mediated repression of msh in cells near the Dpp source. The induction of msh expression near the Dpp stripe followed by a zone of ind expression mimics the wild-type configuration of gene expression and provides the first evidence that BMP signaling can influence the pattern of neuroectodermal gene expression in the absence of other DV cues such as the Dorsal gradient. Similar long-range inhibition of ind and short-range induction of ectopic msh expression can be observed in sog−; eve2-dpp embryos with an intact Dorsal gradient, indicating that ind is also likely to be more sensitive than msh to BMP-mediated repression in wild-type embryos. The fact that the zone of ind repression extends considerably further from the dpp stripe than the region of msh activation indicates that msh is not responsible for ind repression, consistent with existing evidence that msh does not regulate ind. It seems likely, therefore, that BMP signaling acts directly to repress ind expression. 
These data support the prevailing ventral-dominant model for cross-regulation of neural identity genes, and exclude an alternative model in which Dpp signaling activates msh, which in turn inhibits ind (Mazutani, 2006).
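The logic of this experiment can be illustrated with a toy model in which BMP activity decays with distance from the dpp stripe, ind is repressed in a graded manner by BMP, and msh is de-repressed only where ind falls to its lowest levels. Everything numerical here is an assumption made for illustration: the exponential gradient shape, the length scale, the repression constant and the de-repression cutoff are not taken from the study, and direct repression of msh by very high BMP levels is omitted.

```python
import math


def bmp_level(distance, length_scale=8.0):
    """Assumed exponential BMP activity profile; distance in cell diameters."""
    return math.exp(-abs(distance) / length_scale)


def ind_level(distance, k_ind=0.2):
    """Relative ind expression (0-1) under graded BMP-mediated repression."""
    return 1.0 / (1.0 + bmp_level(distance) / k_ind)


def msh_on(distance, derepression_cutoff=0.3):
    """msh is expressed only where ind is too low to repress it."""
    return ind_level(distance) < derepression_cutoff


if __name__ == "__main__":
    for d in (0, 2, 5, 10, 15, 20, 40):
        print(f"distance {d:>2}: ind={ind_level(d):.2f} msh_on={msh_on(d)}")
```

With these assumed parameters the zone of reduced ind extends well beyond the zone of msh activation, which is the qualitative feature used in the text to argue that msh is not responsible for ind repression.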

Previous studies of the ventral-most neural identity gene, vnd, reported only a mild expansion of its expression domain in dpp− mutants, or no consistent effect. The sensitive lateralized system was exploited to re-examine the BMP response of vnd in order to resolve these existing ambiguities. st2-dpp was expressed in embryos with uniform levels of Dorsal corresponding to the ventral neuroectoderm, which are sufficient to induce ubiquitous expression of vnd. In such 'ventro-lateralized' embryos, both ind and msh expression are absent, presumably due to repression by vnd. Elimination of sog function in these embryos resulted in activation of BMP signaling as judged by the localized activation of the epidermal marker ush; however, vnd expression remained unaltered. When the function of both sog and the transcriptional repressor of BMP signaling, brinker (brk), was eliminated, stronger and expanded expression of ush and potent repression of vnd was observed in a broad zone centered over st2-dpp. These results indicate that vnd is indeed sensitive to BMP-mediated repression and that Brk can block the repressive as well as activating functions of BMP signaling. In analogy to what was observed in mid-lateralized embryos, it might have been expected that relief of Vnd repression in ventro-lateralized embryos would result in activation of ind in cells lacking vnd expression. However, no expression of either ind or msh was detected in these embryos, even near the edges of the vnd repression domain. These data suggest that the high levels of Dpp signaling generated under these experimental conditions are sufficient to repress vnd, as well as ind and msh. Such strong BMP signaling, which is similar to that acting in the non-neural ectoderm of wild-type embryos, may obscure potential differences in the relative sensitivities of these genes to BMP-mediated repression by repressing expression of all neural genes. 
Although it remains to be determined what the relative sensitivity of vnd is to BMP repression, the fact that vnd is subject to such repression raises the possibility that Dpp might also regulate vnd expression along its dorsal border in wild-type embryos, despite the low levels of Dpp that diffuse into that region. Since the concentration of Dorsal is limiting with regard to activating vnd in cells along this border, these cells would be expected to be the most susceptible to BMP-mediated repression (Mazutani, 2006).

This analysis of BMP signaling in lateralized embryos showed that Dpp can regulate the expression of ind and msh in a dose-dependent fashion along the AP axis, and can also repress vnd expression. To test whether Dpp plays a similar dosage-sensitive role in the regulation of neural identity genes along the DV axis in the presence of an intact gradient of nuclear Dorsal, an experiment was devised to locally inhibit the response of neural genes to Dpp within the neuroectoderm of embryos with normal DV polarity. Because Brk can suppress BMP-mediated repression of vnd, it was reasoned that mis-expression of brk with the eve-st2 enhancer might also relieve BMP repression of ind and msh. This localized expression of the st2-brk construct has the advantage of providing an internal comparison of gene expression domains within the same embryo. In embryos carrying the st2-brk construct, all three neural domains shifted dorsally at the site of brk over-expression. msh expression was de-repressed in a stripe dorsally as has been observed previously in dpp minus mutants, and the border between msh and ind shifted dorsally by approximately 4-6 cells. The dorsal shift in ind expression was observed prior to initiation of msh expression, consistent with their normal ventral-to-dorsal sequence of activation. In addition, a modest but consistent dorsal shift of 1-2 cells was observed in the ind/vnd border within the zone of st2-brk expression. The domains of msh and ind expression also shift in other situations where BMP signaling is altered in the context of an intact Dorsal gradient, which reinforces the view that BMP signaling plays a role in determining the positions and extents of these expression domains in wild-type embryos (Mazutani, 2006).

The results described above indicate that graded Dpp activity normally plays an important role in establishing the position of the border between the msh and ind domains, and to a lesser degree influences the ind/vnd border, which forms 10-12 cells from the dorsal source of Dpp. The co-ordinate shifts in the borders of neural identity gene expression in st2-brk embryos are consistent with the known ventral-dominant chain of repression among vnd, ind, and msh. This analysis also provides additional support for cis-acting vnd sequences being sensitive to BMP repression and suggests that the dorsal border of vnd expression is normally determined by balancing the opposing influences of Dorsal activation and BMP-mediated repression. It is noted that the dorsal expansion of vnd expression in st2-brk embryos does not necessarily imply that vnd is more sensitive to BMP-mediated repression than ind or msh, but instead that at limiting levels of Dorsal, even low levels of BMP signaling can exert a repressive effect on vnd expression (Mazutani, 2006).

Genetic control of dorsoventral patterning and neuroblast specification in the Drosophila central nervous system

The Drosophila embryonic CNS develops from the ventrolateral region of the embryo, the neuroectoderm. Neuroblasts arise from the neuroectoderm and acquire unique fates based on the positions in which they are formed. Previous work has identified six genes that pattern the dorsoventral axis of the neuroectoderm: Drosophila epidermal growth factor receptor (Egfr), ventral nervous system defective (vnd), intermediate neuroblast defective (ind), muscle segment homeobox (msh), Dichaete and Sox-Neuro (SoxN). The activities of these genes partition the early neuroectoderm into three parallel longitudinal columns (medial, intermediate, lateral) from which three distinct columns of neural stem cells arise. Most of the knowledge of the regulatory relationships among these genes derives from classical loss-of-function analyses. To gain a more in-depth understanding of Egfr-mediated regulation of vnd, ind and msh and to investigate potential cross-regulatory interactions among these genes, loss of function was combined with ectopic activation of Egfr activity. Ubiquitous activation of Egfr expands the expression of vnd and ind into the lateral column and reduces that of msh in the lateral column. This work has identified the genetic criteria required for the development of the medial and intermediate column cell fates. ind appears to repress vnd, adding an additional layer of complexity to the genetic regulatory hierarchy that patterns the dorsoventral axis of the CNS. This study also demonstrates that Egfr and the genes of the achaete-scute complex act in parallel to regulate the individual fate of neural stem cells (Zhao, 2007a).

The Dorsal gradient initiates patterning of the CNS via the transcriptional regulation of vnd, rhomboid and zen. Dorsal-mediated activation of vnd and of rhomboid, the rate-limiting factor in Egfr signaling, establishes the initial expression domains of two of the earliest positive activators of CNS patterning along the DV axis. Similarly, Dorsal-mediated repression in the ventral and ventrolateral ectoderm limits the expression of zen and decapentaplegic (dpp) to the dorsal ectoderm. Dpp functions as a morphogen and, via a repressive mechanism, defines the lateral limit of the developing CNS (Zhao, 2007a).

Within the CNS, vnd and rhomboid exhibit differential sensitivity to the Dorsal gradient, with vnd being activated solely within the medial column and rhomboid in both the intermediate and medial columns. Since rhomboid is the limiting factor in Egfr signaling, its presence activates Egfr signaling in the medial and intermediate columns. In wild-type embryos, Egfr activity maintains vnd expression in the medial column and is necessary to promote ind expression in the intermediate column. The ability of vnd to repress ind expression explains the restriction of ind expression to the intermediate column. vnd expression persists throughout most of the medial column until the end of embryogenesis; in contrast, ind expression is extinguished in the intermediate column neuroectoderm by stage 10 after the first two (of five) waves of NB segregation (Zhao, 2007a).

This work adds a new regulatory relationship to the genetic regulation of CNS patterning, since it was found that ind helps establish the lateral limit of vnd expression. ind could perform this function via the direct repression of vnd, a possibility supported by gain-of-function and loss-of-function experiments. If this model is correct, the mutual repression of vnd and ind would bear striking similarity to the reciprocal repressive interactions observed for the class I and class II homeodomain proteins that pattern the DV axis of the vertebrate CNS. In this context, it is important to note that the vertebrate ortholog of vnd, Nkx2.2, is a class II protein that plays a key role in patterning some of the ventral-most regions of the vertebrate CNS. Alternatively or additionally, vnd and ind could establish their mutual sharp boundary indirectly via the regulation of other factors. For example, differential regulation of homophilic cell-adhesion molecules could account for the observed phenotype. Differential expression of cell-adhesion molecules on medial versus intermediate column cells would cause these cells to associate preferentially with cells from the same column and result in a sharp boundary between the two cell populations that minimized interaction. Loss of such differences would reduce the requirement to minimize interactions and likely result in a jagged boundary. Additional work is necessary to identify the precise mechanism through which ind helps establish the lateral limit of vnd expression. Previous work has shown that misexpression of ind along the anterior-posterior axis using the Krüppel enhancer failed to repress vnd expression in the medial column. However, this does not contradict the findings of this study, which suggest that ind can repress vnd in the intermediate and lateral columns but not in the medial column. It is likely that some factors that are present in the intermediate and lateral columns but absent from the medial column help ind to repress vnd (Zhao, 2007a).

In addition, this work demonstrates that Egfr and vnd are sufficient to confer medial fate and that Egfr and ind are sufficient to confer intermediate fate. Although loss-of-function studies have shown that both Egfr and vnd are necessary for NBs to acquire medial fate, it is not clear whether Egfr functions solely through vnd. It has been shown that ectopic vnd expression results in partial transformation of the lateral column into the medial column. The current work shows that ectopic Egfr activity can induce the expression of vnd and that together Egfr and vnd fully transform the lateral column into the medial column. Therefore, Egfr likely plays additional roles in determining medial cell fate other than maintaining vnd expression in the neuroectoderm. However, it remains unclear whether Egfr contributes to intermediate column NB fate determination other than through its regulation of ind, and whether ind by itself is sufficient to confer intermediate fate. Further studies are necessary to dissect the regulatory mechanisms that control intermediate column NB fate specification. In addition, while this work did not address the roles of Dichaete and Sox-Neuro, it has been reported that ubiquitous Egfr signaling activates Dichaete expression throughout the neuroectoderm. Because Dichaete and Sox-Neuro cooperate with vnd in the medial column and ind in the intermediate column in NB fate specification, they are likely to act as co-factors with Vnd and Ind in embryos expressing Egfr over a prolonged period to specify NB fate in the lateral column (Zhao, 2007a).

These experiments also underline the importance of temporal regulation of gene expression during CNS patterning. This is most notable with respect to the dynamic regulation of ind and vnd expression by Egfr signaling. Previous work suggested that the spatial dynamics of Egfr activity in the CNS account for the transient nature of ind expression in the intermediate column. Prior to NB formation Egfr activity is present in the intermediate column and activates ind expression in this domain. Once NBs begin to form Egfr activity disappears from the intermediate column and ind expression is also lost from intermediate column neuroectodermal cells. These data supported a simple regulatory relationship in which the presence of Egfr activity is necessary for ind expression in the intermediate column. However, while Egfr is necessary to activate ind in the intermediate column and sufficient to activate ind in the entire CNS, this study finds that ind expression turns over at its normal time even in the presence of ubiquitous and prolonged Egfr activity in the CNS. Thus, even though Egfr activity is necessary and sufficient for the activation of ind, once activated ind expression in the CNS appears to become independent of Egfr activity and other factors must regulate its temporally precise downregulation in the CNS (Zhao, 2007a).

Similarly, vnd also exhibits differential sensitivity to Egfr activity as a function of time. In contrast to ind, Egfr activity is not necessary to activate vnd expression in the medial column; however, Egfr activity is required later to maintain vnd expression in this domain. Thus, vnd and ind exhibit opposite responses to Egfr signaling: ind is activated but not maintained by Egfr activity, while vnd is maintained but not activated by this pathway. It is interesting to note that vnd becomes competent to respond to Egfr signaling about the time ind loses its ability to respond to this signal. While the differential competency of the vnd and ind promoters to Egfr signaling is essential for proper DV patterning of the CNS, the molecular bases of these differences remain unknown. Some of the specificity likely resides within the promoters or regulatory regions of the genes themselves. However, since both promoters are Egfr-responsive, albeit at different times, additional levels of regulation appear necessary to explain the complexity in regulation. Alteration to higher order chromatin structure is known to play a key role in controlling the competency of different promoters to respond to specific signals and is a clear candidate to help mediate the differential responses of ind and vnd to Egfr activity. However, how chromatin structure affects the ability of ind and/or vnd to respond to Egfr activity remains unexplored. Future work that addresses the influence of modulation of chromatin structure on the ability of these and other genes to respond differentially to the same inputs should shed light on basic principles of gene regulation during development (Zhao, 2007a).

Genetic studies indicate that the activities of Egfr and the ac/sc genes converge to specify the fate of MP2 and possibly other NBs. Additional work on genes that regulate NB fate suggests that distinct convergent signals may play a general role in NB specification. For example, the transcription factor Huckebein is expressed in NB 4-2 and its associated proneural cluster and helps promote the fate of some of the neurons that develop in the 4-2 lineage. However, in the absence of huckebein function, the 4-2 lineage retains many of its wild-type characteristics. Thus additional intrinsic and extrinsic cues likely converge with huckebein to control the fate of NB4-2 and enable it to elaborate its proper cell lineage. Similar, albeit less detailed observations, have been made for runt and msh. These genes are expressed in specific NBs and the cell clusters from which they delaminate. Each gene appears to regulate only a subset of the distinguishing characteristics of the neuronal lineages that arise from their respective NBs yet none of them appears deterministic for a specific NB fate. Thus, it is speculated that convergent regulation of NB fate by multiple intrinsic and extrinsic factors is a general theme in CNS development and that classical double and triple mutant analyses will be essential to reveal convergent pathways involved in NB as well as neuronal specification (Zhao, 2007a).

A Myc-Groucho complex integrates EGF and Notch signaling to regulate neural development

Integration of patterning cues via transcriptional networks to coordinate gene expression is critical during morphogenesis and misregulated in cancer. Using DNA adenine methyltransferase (Dam)ID chromatin profiling, a protein-protein interaction between the Drosophila Myc oncogene and the Groucho corepressor was identified that regulates a subset of direct dMyc targets. Most of these shared targets affect fate or mitosis, particularly during neurogenesis, suggesting the dMyc-Groucho complex may coordinate fate acquisition with mitotic capacity during development. An antagonistic relationship was found between dMyc and Groucho that mimics the antagonistic interactions found for EGF and Notch signaling: dMyc is required to specify neuronal fate and enhance neuroblast mitosis, whereas Groucho is required to maintain epithelial fate and inhibit mitosis. The results suggest that the dMyc-Groucho complex defines a previously undescribed mechanism of Myc function and may serve as the transcriptional unit that integrates EGF and Notch inputs to regulate early neuronal development (Orian, 2007).

Gro is a downstream transducer of several signaling pathways and was placed at the crossroads of the Notch and EGF signaling pathways during patterning of the Drosophila nervous system, where EGF-induced site-specific phosphorylation of Gro attenuates its repression activity. During embryonic stage 9, the CNS matures in three bilaterally symmetrical longitudinal rows of neuroblasts, with the homeobox transcription factors Vnd, Ind, and Msh specifying the medial (ventral), intermediate, and lateral rows, respectively. EGF regulates the expression of both Vnd and Ind and is thus required for the formation of the ventral and intermediate rows. Interestingly, both Vnd and Ind are among the 38 dMyc-Gro shared targets identified in this study. Gro and dMyc, but not dMnt, are expressed in neuroblasts of stage 9 embryos. Because dMyc-Gro targets are associated with both neuroblast fate and mitosis, it is hypothesized that EGF and Notch coregulate cell fate and mitosis within the developing neuroectoderm via dMyc-Gro antagonism. Vnd expression (a shared Myc-Gro target whose expression overlaps with and is required for establishment of S1 neuroblasts), the overall number of neuroblasts, and mitotic activity in wild-type embryos were compared with those in groe47 loss-of-function (LOF) mutants (in which the maternal contribution of Gro is removed), Egfr2, or Notch55e11 [note that dMyc LOF embryos cannot be generated]. These parameters were also evaluated in embryos overexpressing either dMyc or Gro using the conditional Gal4/upstream activating sequence (UAS) expression system. Vnd expression is stronger and expanded in both Notch and gro LOF embryos, as well as in embryos overexpressing dMyc, when compared with wild type. These mutants also show neuroblast hyperplasia and elevated mitotic activity. Furthermore, Egfr LOF or Gro-overexpressing embryos show reduced Vnd expression, neuronal hypoplasia, and reduced mitotic activity, consistent with the molecular nature of the dMyc-Gro common targets (Orian, 2007).

Myc proteins are required for both cell growth/size and cell proliferation. The model in which Myc functions are mediated by heterodimerization with Max and antagonized by Mxd (Mad/Mnt) proteins has been well established. However, recent studies suggest that a set of interactions outside the canonical Myc/Max/Mxd network also regulate some of Myc's functions. Interestingly, the current studies point to a subset of dMyc direct targets that are not shared with either dMax or dMnt. Furthermore, dMnt-Dam and dMax-Dam were not recruited to these dMyc targets even in experiments where the Dam fusions were coexpressed in the presence of high levels of dMax or dMyc, respectively, suggesting that previously uncharacterized mechanisms may mediate Myc's recruitment to DNA, and proteins other than dMnt may antagonize its transcriptional activity on this set of targets. This study reports the identification of Gro as the first component in a pathway that antagonizes dMyc function independent of dMnt and operates during Drosophila neurogenesis (Orian, 2007).

Transcriptionally, dMyc was found to be positively required for the expression of dMyc-Gro targets, an activity that is antagonized by Gro. Importantly, dMyc is not a Gro target, and reducing Gro levels does not affect dMyc protein levels. Furthermore, Gro antagonism is limited only to the dMyc-Gro subset of shared targets and does not involve dMnt: there is no overlap between genes bound by dMnt or Gro, dMnt is not expressed in cells where the dMyc-Gro interaction is observed, RNAi to dMnt does not affect Myc-Gro shared target expression, and overexpression of dMnt does not affect PNS development (Orian, 2007).

Although the possibility that dMyc-Gro targets are coregulated by individual dMyc and Gro complexes cannot be excluded, the results suggest that dMyc and Gro are part of a single larger protein complex. First, the observation that RNAi to dMyc results in reduction of target expression and is restored by coreducing Gro suggests that other activators coregulate shared target expression along with dMyc. Second, biochemical purification, binding data, and DNA adenine methyltransferase (Dam)ID Southern analyses support the idea that both proteins physically interact with one another yet associate with DNA through distinct binding sites. Third, Gro does not bind directly to DNA but must be recruited to targets by sequence-specific DNA-binding transcription factors. Fourth, most of the dMyc-Gro targets lack E-box sequences associated with canonical Myc network targets, suggesting that dMyc and Gro may be recruited to shared targets via a novel mechanism or by other protein(s) yet to be identified. Candidates for recruiting Gro may be the E(spl) proteins that convey the Notch signal, antagonize the EGF pathway, interact with Gro, and exhibit similar phenotypes. Thus, the identification of the entire dMyc-Gro complex and its regulation will be an important next step (Orian, 2007).

Gro's role as a downstream transducer of Notch signaling during neurogenesis is well documented, and mounting evidence supports Myc as a key player in progenitor cell proliferation. This study has identified a previously undescribed role for dMyc, together with Gro, during Drosophila early neuronal development. dMyc and Gro are required to directly regulate key fate controlling genes such as the homeodomain proteins vnd and ind that are downstream targets of EGF signaling. Because Vnd was identified as a regulator of the proneural gene complex, the differential regulation of vnd by dMyc and Gro implicates them as antagonistic regulators upstream of proneural genes. Thus, it is proposed that dMyc is transiently required within the neuroectoderm, where it promotes specific fate acquisition and allows mitotic expansion of committed neuronal cells (Orian, 2007).

Phenotypically, it was observed that, similar to EGF, dMyc promotes neurogenesis both in the PNS and CNS, whereas Gro and Notch inhibit neuroblast formation and mitosis. This is a different role than that previously ascribed to dMyc, because it is usually associated with regulation of cell size and organismal growth, functions that are antagonized by dMnt. Consistent with this, a recent study identified EGF-induced phosphorylation of c-Myc, Max, and TLE proteins in mammalian cells. The antagonistic relationship of Myc/EGF to Gro/Notch is likely to be highly dependent on the developmental context and the specific progenitor niche. For example, in cellular contexts in which Notch promotes proliferation, such as during the development of T cells in acute leukemia, Myc is a direct target of mutated Notch1 and is required for T cell proliferation and development. The current findings also fit well with observations that N-Myc is required during mouse progenitor development, and that the fly tumor suppressor Brat regulates dMyc levels posttranscriptionally in larval neuroblasts resulting in a 'tumorous' phenotype (Orian, 2007).

Taken together, the snapshot provided by DamID data leads to the suggestion of a model in which changes in neuronal progenitor fate and mitosis are determined by the balance between EGF and Notch signaling that is likely transcriptionally mediated by the dMyc-Gro complex. During epithelial development, Notch, like Gro, is required to specify and maintain epithelial fate. It is proposed that Gro sequesters dMyc in an inactive multiprotein complex formed by associating with dMyc, preventing the activation of dMyc-Gro shared targets. Upon EGF signaling, a molecular switch takes place whereby Gro is phosphorylated, and its repression is attenuated. dMyc, as part of an as-yet-to-be-identified activation complex, is then liberated to activate zygotic transcription of a subset of targets that determines neuronal fate and enhances mitosis. One of these targets is dMax, which is specifically expressed in the neuroectoderm. Activation of dMax would be expected to establish a feed-forward loop required for the subsequent activation of (E box-containing) Myc targets to promote cell growth. As development progresses, the dMnt gene would be induced, and dMnt-dMax complexes would replace dMyc-Max complexes, thereby promoting cellular differentiation (Orian, 2007).

Finally, both EGF/dMyc and Notch/Gro misregulation and mutation are intimately involved in hematological, epithelial, and neuroectodermal cancers. Thus, identification of a dMyc-Gro complex that could serve as a molecular junction to integrate EGF and Notch signaling inputs is highly relevant for both developmental biology and cancer (Orian, 2007).

Targets of Activity

Although intermediate neuroblasts defective was identified in a screen for Tinman transcriptional targets, ind and Tinman are expressed in nonoverlapping, nonadjacent regions of the CNS and mesoderm, so it is unlikely that Tinman regulates ind directly. Furthermore, tinman mutant embryos have no change in ind expression. Therefore it was hypothesized that ind is transcriptionally regulated by Vnd, a homeodomain protein related closely to Tinman. Vnd is produced in the ventral neuroectoderm immediately adjacent to the ind-expression domain. Genetic and molecular data demonstrate that ind is transcriptionally repressed by Vnd. In wild-type embryos the two genes are expressed in adjacent but nonoverlapping portions of the neuroectoderm. vnd is expressed in the ventral column, whereas ind is expressed in the intermediate column. In vnd mutant embryos, ind expression is broader and encompasses what would normally be the vnd-expression domain. This can be observed clearly in lateral views of whole-mount embryos as well as in embryo cross sections. These genetic experiments show that vnd is required to repress ind expression within the ventral column neuroectoderm (Weiss, 1998).

To determine whether Vnd regulates ind transcription directly, bacterially expressed Vnd protein was used to perform electrophoretic mobility-shift and footprinting assays with the genomic ind DNA fragment identified in the initial screen for Tinman regulated proteins. Vnd specifically binds the fragment of ind genomic DNA isolated in the screen. Three specific binding sites of roughly equal affinity can be identified using footprinting assays. The three sites protected in the footprinting assay each contain one copy of the sequence GTGAACT, which has been found to be a recognition sequence for both Vnd and the Tinman-related Nkx2.5 vertebrate protein (Weiss, 1998 and references).

VND regulates the expression of achaete and scute in the medial column of the ventral nervous system in at least two ways (Skeath, 1994). First, vnd is essential for AS-C gene expression in the medial column of every other neuroblast row through regulatory elements located 3' to achaete. Second, through a 5' regulatory region, vnd functions to increase or maintain proneural gene expression within the proneural cluster that normally gives rise to the neuroblast (Skeath, 1994). VND also targets Enhancer of split and HLH-m5 of the Enhancer of split complex (Kramatschek, 1994).

E(spl)-C gene expression is dependent on lateral inhibition and the Notch pathway acting through Suppressor of Hairless. The role of VND in the transcriptional activation of E(spl)-C genes is currently unclear. Perhaps VND can activate proneural genes, which in turn activate E(spl)-C genes.

VND appears to autoregulate. There are 25 high affinity VND binding sites (consensus T[T/C]AAGT[G/A]G) within the 2.2 kb region 5' of the vnd transcription start site (D. Tsao, 1994 and M. Nirenberg, personal communication to F. Jimenez, 1995).
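As an illustration only, the consensus quoted above can be written as a pattern and used to look for candidate sites on both strands of a sequence. The sketch below is a minimal example, not the method used in the cited work: the function names and the toy sequence are hypothetical, and an exact-match scan of this kind says nothing about binding affinity.

```python
import re

# Consensus quoted in the text, T[T/C]AAGT[G/A]G, written as a regex.
# The lookahead lets overlapping candidate sites be reported as well.
VND_SITE = re.compile(r"(?=(T[TC]AAGT[GA]G))")
SITE_LENGTH = 8

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_vnd_sites(seq):
    """List exact consensus matches as (start, strand, site) tuples.

    `start` is the 0-based position of the leftmost base of the site
    on the forward strand, whichever strand the match lies on.
    """
    seq = seq.upper()
    hits = []
    for strand, text in (("+", seq), ("-", reverse_complement(seq))):
        for match in VND_SITE.finditer(text):
            start = match.start()
            if strand == "-":
                # Convert reverse-strand coordinates to forward coordinates.
                start = len(seq) - start - SITE_LENGTH
            hits.append((start, strand, match.group(1)))
    return sorted(hits)


# Hypothetical toy sequence: one forward site and one reverse-strand site.
toy_sequence = "GGTTAAGTGGCCCCCTACTTGAAA"
sites = find_vnd_sites(toy_sequence)
```

For the toy sequence, the scan reports one site on each strand. Applied to a real 2.2 kb upstream fragment, a scan like this would only give candidate positions to compare against experimentally mapped sites.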

Neurogenesis in Drosophila melanogaster starts by an ordered appearance of neuroblasts arranged in three columns (medial, intermediate and lateral) on each side (right and left) of the neuroectoderm. In the intermediate column, the receptor tyrosine kinase Egfr represses expression of the proneural genes achaete and scute, and is required for the formation of neuroblasts. Most of the early function of Egfr is likely to be mediated by the Ras-MAP kinase signaling pathway, which is activated in the intermediate column, since a loss of a component of this pathway leads to a phenotype identical to that of Egfr mutants. MAP-kinase activation is also observed in the medial column where escargot (esg) and proneural gene expression are unaffected by Egfr. The homeobox gene ventral nervous system defective (vnd) is required for the expression of esg and scute in the medial column. vnd acts through the negative regulatory region of the esg enhancer that mediates the Egfr signal, suggesting vnd's role is to counteract Egfr-dependent repression. Thus, the nested expression of vnd and the Egfr activator Rhomboid is crucial to subdivide the neuroectoderm into the three dorsoventral domains (Yagi, 1998).

To investigate the involvement of Egfr in neurogenesis, mutant phenotypes of Egfr and its activator rhomboid were examined at various stages of neurogenesis. The dorsoventral subdivision of the neuroectoderm in stage-6 embryos is detectable by expression of esg, which is expressed in the lateral and medial columns but not in the intermediate column. A loss-of-function, temperature-sensitive mutation of Egfr and a null mutation of rho were used for analysis throughout this work. Egfr and rho mutations cause ectopic expression of esg in the intermediate column. Repression of esg in the intermediate column is likely to require a relatively high dose of Egfr signal. To examine the potential role of Egfr in neurogenesis, expression of the proneural genes ac and sc was examined. These two proneural genes begin expression in the neuroectoderm of stage-7 embryos in a DV pattern of expression similar to that of esg in the previous stage. In Egfr and rho mutant embryos, ac and sc become ectopically expressed in the intermediate column. This phenotype is less penetrant and, occasionally, gaps of ac and sc expression are observed in the intermediate column. Since sc expression was similarly derepressed in Egfr mutant embryos, these phenotypes are likely to represent the near null phenotype of Egfr in the neuroectoderm. These data indicate that, in the intermediate column, the Egfr signal represses not only esg but also proneural genes, which are known to play key roles in neurogenesis. The effect of Egfr on neuroblast formation was monitored by the neuroblast marker Snail. Anti-Sna staining reveals three columns of SI neuroblasts in the control embryo: the intermediate column is distinguishable by the delayed onset of formation and number of Sna-positive cells. In Egfr and rho mutants, Sna-positive neuroblasts in the intermediate position are frequently missing, with a higher frequency of loss in Egfr embryos. In rho mutant embryos, the frequency of the loss of intermediate column neuroblasts is variable among embryos (Yagi, 1998).

To further examine the effect of the loss of Egfr signaling on the late events of neurogenesis, the progeny was traced for one of the intermediate neuroblasts, NB4-2. NB4-2 gives rise to the RP2 motor neurons, which can be identified by the expression of Even-skipped (Eve) and its unique position. Loss of RP2 neurons in stage 13 is observed (over half the cases examined) with the frequency of loss slightly higher in Egfr than in rho mutants, reflecting the earlier defect in neuroblast formation in stage 9. It is known that the Ras-MAPK signaling cascade is the major target of Egfr in many tissues. To understand whether Ras-MAPK signaling also mediates the Egfr signal in the neuroectoderm and to determine the relative contribution of each component of the pathway, the expression of esg and sc was examined in embryos lacking one of the Ras-MAPK signaling components. The phenotype of mutants lacking either Sos, Ras1, Draf or Dsor1 was examined. As in wild-type embryos, embryos mutant for any of the four genes examined express esg in three separate domains: procephalic neurogenic region, amnioserosa and neuroectoderm. In all cases, the anterior limit of the procephalic expression and the posterior limit of neuroectodermal expression are expanded to the terminus, consistent with the fact that Ras-MAPK is required for the terminal fate specification controlled by Torso receptor tyrosine kinase. All mutants exhibit specific defects within the neuroectoderm where esg expression is derepressed in the intermediate column. Essentially the same phenotype is also observed with sc expression, suggesting the loss of Ras-MAPK signaling has the same consequence as the loss of Egfr. All four Ras pathway mutants show, qualitatively, the same phenotype in the neuroectoderm. The neuroectoderm phenotype in Ras1 mutants is not rescued by a paternal copy of the wild-type gene, suggesting that a relatively high dose of the Ras signal is required for repression of esg and sc in the neuroectoderm (Yagi, 1998).

Rhomboid (rho) is initially expressed in the medial half of neuroectoderm, but repression of esg, ac and sc transcription by Egfr and Ras-MAPK occurs only in the intermediate column, posing a question as to whether or not the site of MAPK activation and the site of transcriptional repression exactly correspond. The spatial and temporal pattern of MAPK activation has been described by the use of an antibody that specifically reacts with the phosphorylated and activated form of MAPK (diphospho-MAPK=dpMAPK), which shows that dpMAPK is distributed in a broad domain in the neuroectoderm in stage 5-7 embryos. dpMAPK is distributed in an 8- to 10-cell-wide area in the neuroectoderm in stage-5 embryos and becomes restricted to the ventral region at the end of gastrulation. This rapidly evolving pattern of dpMAPK expression made it difficult to determine the exact correlation between distribution of dpMAPK and the DV subdomains in the neuroectoderm. A protocol was used to double label embryos with dpMAPK and antisense RNA probes to study the spatiotemporal relationship between expression of dpMAPK, its activator Rhomboid (Rho) and its downstream target, esg. Initial expression of dpMAPK overlaps with that of Rho in stage-5 embryos; dpMAPK expression remains in this broad domain when Rho expression becomes restricted to the medial column at gastrulation in stage 6, and finally narrows down to a 2- to 3-cell-wide stripe abutting the stripe of Rho at stage 7. Comparison with the mesodermal marker sna shows that the ventral border of dpMAPK expression abuts the neuroectoderm-mesoderm border. Examination of histochemically stained material reveals a sharp ventral border of dpMAPK expression, which gradually declines in the dorsal direction, resembling the pattern of Rho expression. In Egfr mutant embryos, dpMAPK staining is not detectable. These results demonstrate that MAPK activation in the neuroectoderm is dependent on Egfr and follows the spatial expression pattern of Rho, but persists for some time after termination of Rho transcription. The latter observation may reflect perdurance of Rho or its target protein, Spitz (Spi). Alternatively, a ligand other than Spi, such as Vein, might be activating Egfr. The dorsal limit of dpMAPK expression was determined relative to the three separate columns of neuroectoderm revealed by esg expression. In stage 5, the dorsal limit of dpMAPK reaches halfway within the intermediate column and subsequently retracts to the medial column in stage 6 and 7. These data indicate MAPK is activated at least in the ventral half of the intermediate column of the neuroectoderm when it is required to repress transcription of esg. It is concluded that transcription of esg is repressed by a marginal level of MAPK activation (Yagi, 1998).

Why does the high level of dpMAPK in the medial column fail to repress transcription of esg, ac and sc? One possibility is that a factor is present in the medial column that antagonizes or overcomes the events downstream of dpMAPK. A candidate for such a gene is vnd, which is expressed in the medial column in late stage 5 and is required for expression of ac. Expression of esg and sc was examined in vnd null mutant embryos: their expression in the medial column was found to be lost. To understand how vnd controls gene expression in the medial column, a target for vnd was sought in the cis-regulatory regions of an esg enhancer. Expression of esg is regulated by the neurogenic enhancer, which can be divided into two regions, the activator region, which mediates activation in the entire neuroectoderm, and the repressor region, which mediates Egfr-dependent repression. Expression of the esg-lacZ fusion genes was examined in the vnd mutant background. The construct esg-lacZ D1 containing the complete neurogenic enhancer reproduces neuroectodermal expression of esg and is regulated by vnd in the same manner as esg. In contrast, the construct esg-lacZ D5 lacks the repressor region for the Egfr-mediated regulation and is expressed in all three columns. Evidence is provided that vnd does not regulate esg-lacZ D5 and that the target site for vnd regulation is included in the repressor region. vnd is also shown not to be involved in activation of esg or Egfr; rather, it works to counteract the negative effect of Egfr (Yagi, 1998).

Given the results of the present work showing that vnd counteracts the negative regulatory effect of Egfr, a model is proposed for the DV structuring of the neuroectoderm. A gradient of nuclear localized Dorsal protein induces expression of dorsoventrally regulated genes such as dpp, sna, and twi, which determine the extent of the neuroectoderm, and the nested expression domains of rhomboid and vnd. rho determines the domain of MAPK activation, which covers the medial and intermediate columns. vnd is expressed in the medial column where it counteracts the Egfr signal to allow expression of esg. Thus the three columns in the stage 5-6 neuroectoderm are distinguished by unique combinations of activated MAPK and vnd expression. In the lateral column, MAPK is not activated and vnd is not expressed, and esg transcription is activated by default. In the intermediate column, MAPK is activated and represses esg transcription. In the medial column, vnd counteracts activated MAPK to allow the default pathway to activate esg transcription. It is possible that proneural genes are also regulated by the same mechanism. Loss of the Egfr signal leaves two domains, one with and the other without expression of vnd, a pattern likely to be reflected in the appearance of only two neuroblast columns in the later stage. Thus it is proposed that the primary role of the Egfr signal at this stage is to define the intermediate domain of the neuroectoderm, which is otherwise separated into two domains. It is possible that the Egfr signal and vnd have later roles in promoting neuroblast formation in the intermediate and medial columns, respectively (Yagi, 1998).
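The combinatorial logic of this model can be restated as a small Boolean sketch. It is a simplification for illustration only: the function and variable names are hypothetical, each column is reduced to an on/off state, and graded MAPK levels, timing and the upstream Dorsal gradient are ignored.

```python
# Wild-type state of each neuroectodermal column at stage 5-6,
# as described in the model above (simplified to on/off values).
WILD_TYPE_COLUMNS = {
    "medial": {"mapk_active": True, "vnd_expressed": True},
    "intermediate": {"mapk_active": True, "vnd_expressed": False},
    "lateral": {"mapk_active": False, "vnd_expressed": False},
}


def esg_transcribed(mapk_active, vnd_expressed):
    """esg is on by default; activated MAPK represses it unless
    vnd is present to counteract the repression."""
    return (not mapk_active) or vnd_expressed


def esg_pattern(egfr_functional=True, vnd_functional=True):
    """Predict esg expression per column for a given genotype.

    Assumption of this sketch: loss of Egfr removes MAPK activation
    in every column, and loss of vnd removes vnd from every column.
    """
    pattern = {}
    for column, state in WILD_TYPE_COLUMNS.items():
        mapk = state["mapk_active"] and egfr_functional
        vnd = state["vnd_expressed"] and vnd_functional
        pattern[column] = esg_transcribed(mapk, vnd)
    return pattern


wild_type = esg_pattern()
egfr_mutant = esg_pattern(egfr_functional=False)
vnd_mutant = esg_pattern(vnd_functional=False)
```

Under these assumptions the sketch gives esg in the medial and lateral columns in wild type, esg in all three columns when Egfr is lost, and loss of medial esg when vnd is lost, in line with the phenotypes reported above.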

The interactions responsible for the nucleotide sequence-specific binding of the vnd/NK-2 homeodomain of Drosophila to its consensus DNA binding site have been identified. A three-dimensional structure of the vnd/NK-2 homeodomain-DNA complex is presented, with emphasis on the structure of regions of observed protein-DNA contacts. This structure is based on protein-DNA distance restraints derived from NMR data, along with homology modeling, solvated molecular dynamics, and the results from methylation and ethylation interference experiments. Helix III of the homeodomain binds in the major groove of the DNA and the N-terminal arm binds in the minor groove, in analogy with other homeodomain-DNA complexes whose structures have been reported. The vnd/NK-2 homeodomain recognizes the unusual DNA consensus sequence 5'-CAAGTG-3'. The roles in sequence specificity and strength of binding of individual amino acid residues that make contact with the DNA are described. It is shown, based primarily on the observed protein-DNA contacts, that the interaction of Y54 with the DNA is the major determinant of this uncommon nucleotide binding specificity in the vnd/NK-2 homeodomain-DNA complex (Gruschus, 1997).

Snail, a zinc-finger transcriptional repressor, is a pan-neural protein, based on its extensive expression in neuroblasts. Previous results have demonstrated that Snail and related proteins, Worniu and Escargot, have redundant and essential functions in the nervous system. The Snail family of proteins control central nervous system development by regulating genes involved in asymmetry and cell division of neuroblasts. Whether the neuroblast expression of snail and worniu is regulated by proneural genes was examined. Such a result would place the snail family in the well established genetic hierarchy that controls early neuroblast differentiation. The scuteB57 deletion mutant uncovers the three pro-neural genes: achaete, scute and lethal of scute. In this mutant, the expression of worniu in neuroblasts is significantly reduced. Only a few neuroblasts within each segment exhibit staining, and the expression level is substantially lower than in the wild type. The expression of worniu is also regulated by vnd and ind, such that in these mutant embryos the whole ventral and intermediate columns of staining are missing. In the mshDelta68 mutant, no abnormal expression of worniu was detected. Previous results have shown that the neuroblast expression of snail is slightly affected in achaete-scute and vnd mutants but is not affected in a daughterless mutant. In ind and msh mutants, Snail protein expression was observed in many neuroblasts but the spatial pattern was rather disorganized. In summary, most of the proneural genes tested have profound effects on the expression of worniu, and have detectable but lesser effects on that of snail. The predominant expression of snail and worniu in neuroblasts and their regulation by proneural genes suggests that the snail family genes may have important functions within neuroblasts (Ashraf, 2001).

Nk(x)-type homeobox genes are an evolutionarily conserved family that regulate diverse developmental processes. A novel Drosophila gene, Nkx6, is described which encodes an Nk-type transcription factor most homologous to vertebrate Nkx6.1 and Nkx6.2. The homeodomains and NK decapeptide domains of all three proteins are highly conserved. Nkx6 is expressed in the embryonic brain, ventral nerve cord, hindgut, and internal head structures. Nerve cord expression is in midline precursors, several ventral and intermediate column neuroblasts, and later in neurons but not glia, similar to the known expression of Nkx6 genes in the neural tube. Nkx6 is positively regulated, directly or indirectly, by vnd in brain precursors. In vnd mutants, head neuroectoderm Nkx6 expression is abolished where it is normally co-expressed with vnd. Conversely, vnd-overexpression leads to ectopic Nkx6 expression in the brain. These findings further highlight the importance of interactions between Nk(x)-type genes in regulating their expression (Uhler, 2002).

Two overlapping Nkx6 cDNAs were isolated from a Drosophila embryonic cDNA library whose combined insert length was 3041 bp. Multiple genomic clones, which map to polytene band 70E4-5, were also isolated and used to identify the intron/exon boundaries. Both low stringency hybridization of genomic Southern blots and BLAST searches of the Drosophila genome sequence databases failed to identify other Nkx6 homologs. The cDNA has an open reading frame of 1539 nucleotides. Alignment of Drosophila Nkx6 with murine, rat and human Nkx6 sequences reveals considerable similarity within the homeodomain region, and in the amino and carboxy termini. The NK decapeptide domain is also conserved, suggesting that Nkx6 may recruit the Groucho co-repressor as shown for vertebrate Nkx6 proteins. Nkx6 does not possess the putative DNA binding interference domain present in the carboxy termini of Nkx6.1 and Nkx6.2 (Uhler, 2002).

The expression pattern of Nkx6 mRNA was determined by in situ hybridization to whole-mount embryos. Expression is detected in the developing hindgut, ventral maxillary epidermis-derived head structures, and the developing CNS. Nkx6 expression initiates at stage 6 as two bilateral clusters in the head neuroectoderm. By stage 8, expression is seen in the hindgut primordium and ventral midline precursors. Expression in the nerve cord neuroectoderm, which is limited to anterior segments, begins at stage 9. Ventral nerve cord neuroblasts begin expressing transcripts at stage 10. During early stage 10, transient weak expression occurs in most neuroblasts immediately flanking the midline (ventral neuroblasts) and some intermediate neuroblasts. Expression then becomes restricted to two bilateral clusters of 3–5 neuroblasts per neuromere by the end of stage 10, concomitant with declining midline expression. Transcripts are detected in GMCs from late stage 10 onwards. By late stage 11, only one to two neuroblasts per hemineuromere express Nkx6. Strong expression in the ventral maxillary epidermis can be distinguished from anterior neuroectodermal expression at this stage. Expression in the nerve cord shifts from a tightly clustered distribution to a disperse pattern during early stage 12. By stage 13 there are approximately 15–20 Nkx6-positive cells per hemineuromere. Expression in the CNS, head structures, and hindgut persists at least until late stage 16 (Uhler, 2002).

To identify which precursor cells express Nkx6, embryos were double-labeled for Nkx6 mRNA and cell-specific markers. Focus was first placed on dorso-ventral CNS patterning genes. During early neurogenesis, the transcription factors Single-minded (Sim), Vnd, Intermediate neuroblasts defective (Ind), and Muscle specific homeobox (Msh) subdivide the CNS into midline, ventral, intermediate and lateral columns respectively. Nkx6 is clearly expressed in midline precursors since it is co-expressed with Sim. During stage 10, Nkx6 expression changes from weak paramedian expression to strong clustered expression, with approximately one to two cells per cluster co-expressing Nkx6 mRNA and Vnd protein, and two cells per cluster co-expressing Nkx6 and Ind. No Nkx6 transcripts are detected lateral to the Ind column (Uhler, 2002).

Double-labeled embryos for Nkx6 and three transcription factors, Engrailed, Achaete and Castor, were examined. Double labeling for Nkx6 and Engrailed, which is expressed in the posterior of each neuromere, positioned the Nkx6-positive neuroblast clusters to the anterior half of each neuromere. Double-labeling for Nkx6 and Achaete, expressed in MP2 and X neuroblasts at stage 10, revealed that Nkx6 is expressed in a neuroblast directly anterior to MP2, either NB 3-1 or 2-2. One, occasionally two, Nkx6-positive neuroblasts are located just lateral to MP2, NBs 3-2 and 4-2. Castor is expressed in seven neuroblasts per hemineuromere at stage 10 and 18 by stage 11. Nkx6 and Castor do not co-localize during stage 10 (except at midline). By stage 11, they co-localize in NB 3-2, positioned just anterior to NB 4-2 which expresses Nkx6 but not Castor. Very weak co-expression is also detected in NB 2-2 and 3-1. Together, these results suggest that Nkx6 stage 10–11 positive neuroblast clusters locate anteriorly in ventral and intermediate column neuroblasts, including NBs 2-2, 3-1, 3-2 and 4-2 (Uhler, 2002).

Since Nkx6.2 is expressed in adult glia, whether Drosophila Nkx6 is expressed in embryonic glia as well as in neurons was assessed by double labeling for the specific neuroectodermal glial marker Repo. The midline glia, which develop from the mesectoderm, do not express Repo. Since Nkx6 and Repo are not co-expressed at any embryonic stage, Nkx6 expression is specific to neuronal cells (Uhler, 2002).

Next, embryos were double-labeled for Nkx6 transcripts and several well-characterized markers of interneurons and motorneurons, Even-skipped (Eve), 22C10 (Futsch), and Fasciclin II (FasII). Sensory neurons, which are in the peripheral nervous system, do not express Nkx6. At stage 11, the transcription factor Eve is expressed in the NB 4-2->GMC4-2a->RP2 lineage. Although Nkx6 is expressed in the NB 4-2 parent, no transcripts are detected in any Eve-positive cells at this or later embryonic stages. Several pioneer neurons begin expressing 22C10 and/or FasII at stages 11 and 12 (including aCC, pCC, MP1, SP1, vMP2 and dMP2), none of which co-express Nkx6. Due to poor resolution of in situ staining, it is possible that cells may co-express Nkx6 and 22C10 or FasII at later stages (Uhler, 2002).

The position of Nkx6-expressing neurons was determined using an antibody, BP102, which recognizes all CNS axons. At stage 16, Nkx6 is expressed across the entire medio-lateral axis of the nerve cord. The majority of Nkx6-positive cells are positioned at and below (ventral to) the level of the neuropile (Uhler, 2002).

The complexity of Nkx6 expression in the CNS likely reflects the activities of multiple regulators. Candidates for Nkx6 regulation include the early CNS patterning genes, vnd and sim, which co-localize with Nkx6 mRNA in the head and midline respectively shortly after sim and vnd are activated. It was asked whether either of these genes regulates Nkx6 by monitoring the distribution of Nkx6 transcripts in embryos where sim and vnd expression is perturbed. sim is expressed in midline precursors from the cellular blastoderm (stage 5) onwards, and is required for proper midline cellular and molecular differentiation. Nkx6 expression is abolished specifically in the midline of sim mutant embryos, while Nkx6 expression in the adjacent neuroblast layer is maintained, suggesting that sim positively regulates Nkx6, directly or indirectly (Uhler, 2002).

Mutant and misexpression assays were used to assess whether vnd regulates Nkx6 expression. In the head, Nkx6 is activated within an hour of Vnd protein expression: Nkx6 is activated during gastrulation (stage 6), while Vnd is expressed from stage 5 onwards. In wild-type embryos Nkx6 expression in the head localizes within the Vnd expression domain, with the later exception of a single neuroblast per lobe which begins expressing Nkx6 at stage 10. Nkx6 expression is affected in vnd mutants as early as stage 6, where expression in the head neuroectoderm is not activated. At later embryonic stages, all brain expression is absent apart from in the isolated Vnd-negative neuroblasts and their progeny. Anterior neuroectodermal expression in the nerve cord is never initiated. Nkx6 expression in nerve cord neuroblasts is reduced and disordered in vnd mutants, with reductions typically occurring at ventral positions where neuroblasts normally co-express Nkx6 and Vnd. The significance of this effect is uncertain, since many ventral neuroblasts in the nerve cord do not form and residual neuroblasts often switch to intermediate identities in vnd mutants (Uhler, 2002).

Whether vnd can alter Nkx6 expression was examined by misexpressing vnd at different times during development. vnd was ubiquitously expressed at early stages, using heat-shock overexpression beginning during gastrulation. As early as stage 8, Nkx6 mRNA expression in the head neuroectoderm is slightly expanded compared to wild-type, while expression in the nerve cord neuroectoderm is unchanged. Next, vnd was misexpressed throughout the neuroblast layer using the Gal4-UAS system. The sca-Gal4 driver directs transgene expression between stages 9 and 13. The number of cells expressing Nkx6 increases in the brains of sca-Gal4 x UAS-vnd embryos by stage 11. While ectopic vnd is detected throughout the CNS of these embryos, ectopic Nkx6 expression is restricted to only a subset of Vnd-expressing cells, suggesting that not all cells are competent to express Nkx6 (Uhler, 2002).

Drosophila Nkx6 has several general features in common with its two vertebrate counterparts. Expression of all three Nk(x)6 family members in the embryonic CNS is restricted to neurons. In Drosophila, chick, and mouse embryos, Nk(x)6 genes are transiently expressed in the ventral CNS midline early during development. During early stage 10, Drosophila Nkx6 is expressed in most ventral column neuroblasts, similar to early mouse and chick Nkx6.1 and Nkx6.2 expression in the ventral third of the neural tube (p3, pMN and p2 domains). In Drosophila, a small subset of intermediate neuroblasts also express Nkx6, paralleling chick and mouse Nkx6.2 expression in a narrow stripe of intermediate progenitors (p1). An important divergence between fly and vertebrate expression patterns is that broad CNS expression of the fly homolog begins relatively late during neurogenesis. In chick and mouse embryos, Nkx6.1 and Nkx6.2 are among the earliest genes expressed. Expression is initiated as longitudinal columns in the neural plate. The fly neuroectoderm, the equivalent of the neural plate, expresses Nkx6 only in very anterior regions, with earliest expression occurring at stage 6 (Uhler, 2002).

Despite extensive double labeling analyses, the lineages and movements of cells that express Nkx6 are not known. Although four neuroblasts in the ventral nerve cord that express the gene strongly between stages 10 and 11 of embryonic development have been identified, the identity of Nkx6-positive cells thereafter has not been determined. It is likely that transcripts are expressed in some, though not all, GMC progeny of Nkx6-positive neuroblasts, because several Nkx6-positive GMCs and neuroblasts are positioned in a manner typical of GMCs budding off the parent neuroblast. Conversely, in at least one lineage, NB 4-2->GMC4-2a->RP2, Nkx6 is not expressed beyond the neuroblast. The rapid expansion of Nkx6 expression between stages 12 and 13 suggests that Nkx6 expression is not lineage restricted (Uhler, 2002).

Sim and Vnd are expressed in the right cells and at appropriate times to potentially regulate Nkx6 expression in distinct CNS domains. The absence of midline transcripts in sim mutants suggests that Nkx6 is positively regulated by sim, as are most genes expressed in the midline. Vertebrate sim homologs are not expressed in the floorplate but in cells flanking the floorplate, several days after Nkx6.1 and Nkx6.2 activation, suggesting some evolutionary divergence in Nk(x)6 gene regulation (Uhler, 2002).

In the head, vnd likely positively regulates Nkx6 expression. Both gene products co-localize within an hour of detectable Vnd. Nkx6 expression in the head neuroectoderm and progeny is abolished in the absence of vnd, with the exception of two isolated neuroblasts that do not express Vnd. Conversely, overexpression of vnd leads to increased Nkx6 expression in the head. Although effects on Nkx6 expression in the ventral nerve cord are observed, the significance of these results cannot be interpreted (Uhler, 2002).

An emerging feature of developmental genes is that many evolutionarily conserved genes are homologous not only in sequence, but often also have conserved expression patterns (for example, scarecrow and Nkx2.1), conserved regulation (e.g. targets of apterous homologs) and in some instances also conserved functions (vnd and Nkx2.2). Overall homology of the fly Nkx6 gene is not prejudiced towards either individual vertebrate Nkx6 gene. The apparent absence of additional Nkx6 homologs, ascertained both by sequence homology searches of the Drosophila genome and by hybridization analysis, suggests that Nkx6.1 and Nkx6.2 are each derived from an ancestral Nk(x)6 gene (Uhler, 2002).

There is a surprising degree of conservation in the dorso-ventral expression pattern of the fly Nkx6 gene and its vertebrate homologs in the neural tube. However, the genes that regulate their expression in the CNS may be quite divergent, as there is no evidence that sim or Nkx2.2 are upstream of Nkx6 genes in the vertebrate CNS. Nkx6 expression is activated ventrally by Sonic hedgehog, repressed dorsally by BMP-7, and regulated across the antero-posterior axis by unknown notochord factors. Evidence in the mouse neural tube suggests that Nkx6.1 represses Nkx6.2 in the ventral region after an initial overlap. In the pancreas, however, analysis of single and double Nkx2.2/Nkx6.1 mutants indicates that Nkx2.2 is upstream of Nkx6.1, suggesting that the regulatory interaction between Nk(x)2 and Nk(x)6 type genes is evolutionarily conserved. Mouse Nkx2.2 can bind to the Nkx6.1 enhancer in vitro (Uhler, 2002).

Although the deduced amino acid sequence of Nkx6, its expression pattern, and similarity to its vertebrate homologs suggest a role in regulating cell fates, the function of Nkx6 has not been elucidated yet. Functional studies have been initiated using two independent deficiencies that cover the Nkx6 locus. In trans-heterozygous Nkx6-deficient embryos, axon scaffold defects, most commonly incomplete separation of the anterior and posterior commissures, have been found. This effect is a hallmark of abnormal midline glial development and is consistent with Nkx6 expression in midline precursors (Uhler, 2002).

Identification and analysis of Vnd/NK-2 homeodomain binding sites in genomic DNA

Vnd/NK-2 homeodomain affinity column chromatography was used to purify Drosophila DNA fragments bound by the Vnd/NK-2 homeodomain. Sequencing the selected genomic DNA fragments led to the identification of 77 Drosophila DNA fragments that were grouped into 42 Vnd/NK-2 homeodomain-binding loci. Most loci were within upstream or intronic regions, especially first introns. Nineteen of the Drosophila DNA fragments cloned correspond to one locus, termed Clone A, which is 312 bp in length and contains five Vnd/NK-2 homeodomain core consensus binding sites, 5'-AAGTG, and is part of the first intron of the Beadex gene. The interactions between Clone A and Vnd/NK-2 homeodomain protein were further analyzed by mobility-shift assay, DNase I footprinting, methylation interference, and ethylation interference. The DNase I footprinting analysis of Clone A with Vnd/NK-2 homeodomain protein revealed three strong binding sites and one weak binding site between 15 and 130 bp of Clone A. Binding of the Vnd/NK-2 homeodomain to the 5'-flanking sequence of vnd/NK-2 genomic DNA was also analyzed. The DNase I footprinting result showed that there are two strong binding sites and five weak binding sites in the fragment between -385 and -675 bp from the transcription start site of the vnd/NK-2 gene (Wang, 2005).

With the in vitro genomic DNA purification method, Drosophila genomic DNA fragments that contain nucleotide sequences that bind to the Vnd/NK-2 HD were purified by five rounds of Vnd/NK-2 HD-Sepharose affinity column chromatography. Seventy-seven purified Drosophila DNA fragments were cloned and sequenced. Most DNA sequences were cloned once, but 11 sequences were cloned two to four times, and 19 clones were obtained of one sequence (Clone A). Forty-two different sequences (i.e., loci) were purified from Drosophila genomic DNA. Six repeat fragments were isolated; however, they were within six different transposons. Therefore, the genomic DNA purification procedure resulted in an enrichment of DNA fragments that bind to the Vnd/NK-2 HD with high affinity from single-copy genomic DNA. This approach should be applicable to many other transcription factors. To improve this approach, more purified DNA fragments need to be cloned and sequenced (Wang, 2005).

Based on DNase I footprinting results, there were two strong and five weak DNase I-protected regions located between -385 and -675 bp from the transcription initiation site of the vnd/NK-2 gene. The strong DNase I footprinting site N2, 14 nucleotides in length, contained one high-affinity Vnd/NK-2 consensus binding site, 5'-T(T/C)AAGTG(G/C). The N1 site, 27 nucleotides in length, contained one high-affinity and three low-affinity binding sites. The N1 site may have more than one Vnd/NK-2 HD molecule binding to this region. Among five weak footprinting sites, n2 contained a low-affinity Vnd/NK-2 HD site, and n5 contained a TAAT core. The AAG of the n1 site and the GAG of the n4 site may be the nucleotides that are recognized in weak binding of the Vnd/NK-2 HD to DNA. It has been shown that vnd/NK-2 genomic DNA between -410 and -750 bp from the transcription start site, which contains N1 and N2 as well as n2-n5 protected regions, is regulated by Vnd/NK-2 protein in Drosophila S2 cells. The footprinting data agree with the demonstrations that the Vnd/NK-2 protein, directly or indirectly, regulates its own gene expression and that the DNase I footprinting regions containing the 5'-AAGTG core in the 5' upstream region of the vnd/NK-2 gene may be functional. The DNase I footprinting analysis of Clone A with the Vnd/NK-2 HD protein showed three strong and one weak protection sites clustered at the 5' end of Clone A. Each strongly protected site contained one Vnd/NK-2 consensus binding core (Wang, 2005).

Comparing methylation interference results of Vnd/NK-2 binding to the 5'-flanking sequence of Vnd/NK-2 genomic DNA (N2 site) and Clone A (A1 site), as well as the results of TTF-1 binding to site C, the invariant interference occurred at (+)A3, (+)A4, (+)G5, (-)A1 and (-)A6. These invariant sites are consistent with the site AAGTG derived from the footprinting results and with the analysis of the Vnd/NK-2 consensus binding site. These results demonstrate the importance of the AAGTG sequence for interactions between the Vnd/NK-2 HD and DNA (Wang, 2005).

Other groups have used chromatin profiling to identify genomewide target sequences for the Drosophila GAGA factor and have found that protein binding occurs mostly in intergenic DNA regions and introns, with very few sites in exons. A chromatin immunoprecipitation procedure also has been used to identify DNA binding sites; 203 Drosophila DNA fragments that bind the Engrailed protein were localized in intergenic (53%) or intronic (47%) regions. Among 85 isolated Krüppel-binding fragments of Drosophila genomic DNA, 42% corresponded to intergenic regions, 21% corresponded to introns, 22% overlapped intron/exon boundaries, and 15% corresponded to exons. Only 22% of binding sites for Sp1, cMyc, and p53 were located in upstream regions of genes, whereas 36% were within or were immediately 3' to well characterized genes. Among the 215 CREB binding sites located in human chromosome 22, 22% were within 10 kb of the 5' end of the gene, 4% were in exons, and 15% and 24% were in a first intron or other intron, respectively. Among 60 potential HD protein BARX2 target loci, 35% were located within introns, generally within the first or second intron, and 65% were in intergenic regions. Recently, the DNA elements located in the first intron have been shown to be functionally important for mouse phenotypic traits and Drosophila neurogenesis. However, further work is needed to determine whether the purified DNA fragments that contain Vnd/NK-2 HD binding sites and the candidate target genes are functional (Wang, 2005).

Linking pattern formation to cell-type specification: Dichaete, Ind and Vnd directly repress achaete gene expression in the Drosophila CNS

Mechanisms regulating CNS pattern formation and neural precursor formation are remarkably conserved between Drosophila and vertebrates. However, to date, few direct connections have been made between genes that pattern the early CNS and those that trigger neural precursor formation. Drosophila has been used to link directly the function of two evolutionarily conserved regulators of CNS pattern along the dorsoventral axis, the homeodomain protein Ind and the Sox-domain protein Dichaete, to the spatial regulation of the proneural gene achaete (ac) in the embryonic CNS. A minimal achaete regulatory region has been identified that recapitulates half of the wild-type ac expression pattern in the CNS; multiple putative Dichaete-, Ind-, and Vnd-binding sites have been found within this region. Consensus Dichaete sites are often found adjacent to those for Vnd and Ind, suggesting that Dichaete associates with Ind or Vnd on target promoters. Consistent with this finding, Dichaete can physically interact with Ind and Vnd. Finally, the in vivo requirement of adjacent Dichaete and Ind sites in the repression of ac gene expression has been demonstrated in the CNS. These data identify a direct link between the molecules that pattern the CNS and those that specify distinct cell-types (Zhao, 2007b).

Sox-domain proteins physically associate with other transcription factors to regulate gene transcription. Thus, the identification that Dichaete genetically interacts with Vnd and Ind suggested that Dichaete associates with Vnd and Ind to regulate gene expression in the CNS. To test this model, it was asked whether Dichaete can interact with Ind or Vnd in the yeast two-hybrid assay. Control experiments revealed that the full-length Dichaete protein as well as the region C-terminal to the high-mobility-group (HMG) DNA-binding domain (amino acids 221–384) activate transcription on their own when fused to the Gal4 DNA-binding domain, suggesting that the C-terminal region contains transcriptional activation activity. As a result, a number of distinct Dichaete fusion constructs were tested for self-activation of transcription and four were identified that were transcriptionally inert. One of these contained the HMG domain and the C-terminal region, indicating that the presence of the HMG domain may mask the transactivation properties of the C-terminal region. A prior study mapped a transactivation domain to the N-terminal region of Dichaete (Ma, 1998), yet no transactivation properties of this domain were identified in this study. Consistent with a transactivation domain residing in the C-terminal region of Dichaete, all other identified transactivation domains in Sox-family proteins map C-terminal to the HMG domain (Zhao, 2007b).

By using the four Dichaete bait constructs, it was found that the N-terminal region of Dichaete (amino acids 1–141) specifically interacted with full-length Ind protein. In a reciprocal manner, the ability of the Dichaete N-terminal region to interact with two different regions of Ind was tested: the region N-terminal to the homeodomain (amino acids 1–302) and the region including the homeodomain and all residues C-terminal to it (296–391). Both regions of Ind interacted strongly with the Dichaete N-terminal region, suggesting that this region of Dichaete can interface with two distinct regions of Ind (Zhao, 2007b).

In a similar manner, two distinct regions of Dichaete, the regions N-terminal (amino acids 1–141) and C-terminal (amino acids 221–384) to the HMG domain, interact with the full-length Vnd protein. Three different Vnd prey constructs were used to localize the regions of Vnd that interact with Dichaete. It was determined that the region of Vnd located between the TN domain (a domain common to Tinman/NK-2 proteins) and the homeodomain (amino acids 217–536) interacts with the Dichaete N-terminal domain. This result confirms and extends those of Yu (2005) who found that Vnd and Dichaete coprecipitate and that a Vnd deletion lacking the first 408 amino acids interacts with Dichaete. It was not possible to define the region of Vnd that interacts with the Dichaete C-terminal region, perhaps because the constructs interrupt the domain to which the C-terminal region of Dichaete binds or disrupt the general topology of this domain. Nonetheless, the yeast two-hybrid results indicate that Dichaete can interact with Ind and Vnd consistent with the model that Dichaete complexes with Ind and Vnd on target gene promoters to regulate transcription in the CNS (Zhao, 2007b).

A molecular understanding of how Dichaete, Ind, and Vnd pattern the CNS requires the identification and characterization of the regulatory regions of candidate direct target genes. One such candidate is the ac gene. Prior studies on ac suggested that regulatory regions important for its spatial regulation exist both 5' and 3' to the ac gene. Thus, an 8.15-kb minigene was generated that contains the ac transcription unit as well as ~4.8 kb of DNA 5' to the transcription start and ~2.4 kb of DNA 3' to the polyadenylation site, and its ability to drive ac expression in an In(1)y3PLsc8R mutant background was tested. This genetic background carries a deletion of ac and also deletes the regulatory regions necessary to drive sc expression in row 3. Thus, it allows visualization of ac expression as driven by the minigene in the absence of endogenous ac/sc gene expression in row 3. The ac minigene drives ac expression in half of its wild-type CNS pattern because ac is expressed normally in the medial and lateral clusters of row 3 but is not expressed in row 7. The dynamics of ac expression as driven by the minigene in row 3 mirror those of endogenous ac expression because ac expression in each cluster quickly becomes restricted to a single cell, the presumptive neuroblast, which then delaminates into the interior of the embryo and extinguishes ac gene expression before its first division. Thus, the DNA contained within the minigene is sufficient to activate ac in its wild-type expression pattern in row 3 and to mediate the Notch-dependent restriction of ac to the presumptive neuroblast (Zhao, 2007b).

By creating a series of 5' and 3' deletions of the initial minigene, the regulatory regions sufficient to drive ac expression in row 3 were delimited to a 2.84-kb genomic fragment (pG7), which is referred to as the row 3 element. This element contains the ac transcription unit, 1.34 kb of DNA 5' to the start of transcription and 542 base pairs of DNA 3' to the end of the transcription unit. ac minigenes were characterized for their ability to respond to the functions of Dichaete, ind, and vnd and for the presence and in vivo relevance of putative binding sites for these factors (Zhao, 2007b).

In support of Dichaete, Vnd, and Ind acting directly on the row 3 element to regulate ac expression, loss of Dichaete, vnd, or ind function affects ac expression as driven by ac-pG4 or ac-pG7 in the same way, and these defects are identical to those observed for endogenous ac expression in these mutant backgrounds. For example, loss of ind or Dichaete causes, respectively, strong or modest derepression of ac expression in the intermediate column, whereas loss of vnd results in the absence of ac expression in the medial column (Zhao, 2007b).

To see whether Dichaete, Ind, or Vnd act directly on the row 3 element to control ac expression, this element was searched for perfect matches to the consensus Vnd [CAAGTG], Sox-domain [(A/T)(A/T)CAA(A/T)G], and homeodomain [TAATGG] binding sites. The canonical Sox-domain and homeodomain binding site sequences were used because the consensus sites for Dichaete and Ind have not been determined. This search identified one match for Vnd (V) and three each for Dichaete (S1, S3, and S4) and Ind (H1, H3, and H4). Notably, predicted Dichaete/Sox-binding sites tend to reside close to predicted Vnd or Ind sites, consistent with Dichaete acting with Vnd and Ind to regulate ac expression. The sole exception is the Ind site (H1) located upstream of the transcriptional start site of ac. However, gel-shift assays identify a Dichaete-binding site 11 bp 5' of this Ind site (S2) (Zhao, 2007b).
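A perfect-match consensus search of this kind can be sketched in a few lines of Python. This is only an illustration: the source does not say which tool was used, and the function names, the choice to scan both strands, and the toy sequence in the usage note are assumptions, not data from the study.

```python
import re

# Consensus sites quoted in the text, written as regular expressions.
# (A/T) in the text becomes the character class [AT].
CONSENSUS = {
    "Vnd": "CAAGTG",
    "Sox": "[AT][AT]CAA[AT]G",
    "Homeodomain": "TAATGG",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_sites(seq):
    """List perfect consensus matches on both strands.

    Each hit is (factor, strand, start, site); start is the 0-based
    position of the leftmost base of the site on the input (+) strand.
    Scanning both strands is an assumption of this sketch.
    """
    seq = seq.upper()
    rc = reverse_complement(seq)
    n = len(seq)
    hits = []
    for name, pattern in CONSENSUS.items():
        # A lookahead group lets overlapping matches be reported too.
        regex = re.compile("(?=(" + pattern + "))")
        for m in regex.finditer(seq):
            hits.append((name, "+", m.start(1), m.group(1)))
        for m in regex.finditer(rc):
            site = m.group(1)
            # Convert the reverse-strand coordinate back to the + strand.
            hits.append((name, "-", n - m.start(1) - len(site), site))
    return sorted(hits, key=lambda h: (h[2], h[0], h[1]))
```

For example, `find_sites("GGCAAGTGCCTAATGGAAACAATGTT")` (a made-up 26-bp string, not part of the ac locus) reports one Vnd, one homeodomain, and one Sox match on the plus strand.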

Because the precise binding specificity of Ind is unknown, whether Ind can bind the predicted sites was tested by using gel-shift assays. Focus was placed on the predicted Ind site located upstream of the transcription start site because it is the only location where Dichaete and Ind sites are found adjacent to each other. It was found that Ind specifically binds this site in vitro. During these experiments, a second Ind-binding site (TAAATG) that differs slightly from the consensus homeodomain site was found 8 bp 3' to this site. Thus, Ind can bind to two sites located within 1 kb of the ac promoter, suggesting a possible molecular mechanism for Ind-dependent repression of ac (Zhao, 2007b).

The initial search for Dichaete-binding sites required a perfect match to the consensus Sox-binding site. However, bona fide transcription factor-binding sites often differ from the experimentally defined consensus by a few base pairs, indicating that the search likely underpredicted possible Dichaete-binding sites. Because of this, gel-shift assays were used to search for Dichaete-binding sites throughout the entire row 3 element (pG7). Three sites were identified to which Dichaete bound specifically. Two of these correspond to sites identified in the consensus sequence search (sites S1 and S3); whereas the third resides 11 bp 5' of the first of the two Ind sites near the transcriptional start of ac (S2); this site (GACAATG) differs from the consensus by one base pair. No binding was detected of Dichaete to one predicted Sox site (S4). Because Dichaete and ind are known to repress ac expression, the three binding sites for Ind and Dichaete upstream of the ac promoter identify a likely site of action through which these factors repress ac (Zhao, 2007b).
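The point that a perfect-match search underpredicts sites can be illustrated by counting mismatches against the degenerate Sox consensus. The sketch below is a hypothetical helper, not the authors' procedure (they used gel-shift assays); it scans one strand only, and the default tolerance of one mismatch is an assumption chosen to mirror the S2 example.

```python
# Sox-domain consensus (A/T)(A/T)CAA(A/T)G from the text, one entry per
# position; each entry lists the bases allowed at that position.
SOX_CONSENSUS = ["AT", "AT", "C", "A", "A", "AT", "G"]


def mismatches(site, consensus=SOX_CONSENSUS):
    """Count positions where `site` departs from the consensus."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    return sum(base not in allowed
               for base, allowed in zip(site.upper(), consensus))


def scan(seq, consensus=SOX_CONSENSUS, max_mismatches=1):
    """Return (start, site, n_mismatches) for every window of `seq`
    within `max_mismatches` of the consensus (forward strand only)."""
    seq = seq.upper()
    k = len(consensus)
    hits = []
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        n = mismatches(window, consensus)
        if n <= max_mismatches:
            hits.append((i, window, n))
    return hits
```

With these definitions, `mismatches("GACAATG")` gives 1, consistent with the statement that site S2 differs from the consensus by one base pair, so a scan that tolerates one mismatch would recover it while a perfect-match scan would not.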

The clustering of binding sites for Dichaete, Vnd, and Ind, together with the ability of Dichaete to interact with Vnd and Ind, supports the idea that Dichaete acts with these factors to regulate ac expression in the CNS. To test this model directly, the in vivo relevance was assayed of the adjacent Vnd and Dichaete sites as well as the adjacent Dichaete and Ind sites on ac expression. ac expression was unaltered when the Vnd-binding site, the adjacent Dichaete site, or both sites were mutated. Thus, vnd either does not regulate ac expression directly or other Vnd binding sites in the row 3 element compensate for the loss of this site (Zhao, 2007b).

The relevance of the three Dichaete- and Ind-binding sites located ~850 bp upstream of the start of ac transcription was assayed. Mutating any single site or any combination of two sites had no effect on ac expression. However, mutating all three sites derepressed ac expression in the intermediate column, a phenotype similar to that found in embryos mutant for ind or Dichaete. This result provides a direct link between genes that pattern the CNS and those that specify distinct cell types. Because the derepression of ac is less severe than that observed in ind mutant embryos, Ind and Dichaete likely act through additional sites in this element to repress ac expression fully in the intermediate column (Zhao, 2007b).

Unexpectedly, derepression of ac expression posterior to row 3 was observed upon mutation of the three sites. This posterior expansion of ac mimics the effect that removal of gooseberry function has on the expression of ac, suggesting that Gooseberry, another homeodomain protein, may bind the same sites as Ind and act with Dichaete to repress ac expression in its expression domain (Zhao, 2007b).

Protein Interactions

Drosophila Groucho, like its vertebrate Transducin-like Enhancer-of-split homologues, is a corepressor that silences gene expression in numerous developmental settings. Groucho itself does not bind DNA but is recruited to target promoters by associating with a large number of DNA-binding negative transcriptional regulators. These repressors tether Groucho via short conserved polypeptide sequences, of which two have been defined: (1) WRPW and related tetrapeptide motifs have been well characterized in several repressors; (2) a motif termed Engrailed homology 1 (eh1) has been found predominantly in homeodomain-containing transcription factors. A yeast two-hybrid screen is described that uncovered physical interactions between Groucho and transcription factors, containing eh1 motifs, with different types of DNA-binding domains. One of these, the zinc finger protein Odd-skipped, requires its eh1-like sequence for repressing specific target genes in segmentation (Goldstein, 2005).

The eh1 Gro recruitment domain was originally defined as a heptapeptide motif that is conserved in members of the En family of homeodomain proteins and their vertebrate homologues. More recently, eh1-dependent binding to Gro has also been demonstrated in vitro for various other Drosophila and mammalian proteins, nearly all of which contain homeodomains. Given that Bowl and Odd, two non-homeodomain ZnF transcription factors, contain this motif and interact with Gro, the possibility was explored that eh1 motifs are prevalent among additional non-homeodomain transcription factor families. Indeed, an unbiased yeast screen for Gro-interacting proteins selected two additional transcriptional regulators that contain eh1-like motifs, namely, Sloppy-paired (Slp; Forkhead related) and Dorsocross (Doc; T box). Alignment of the eh1-like sequences of Bowl, Odd, Slp, and Doc with those of En and Gsc revealed three conserved amino acids: phenylalanine-x-isoleucine-x-x-isoleucine (Phe-x-Ile-x-x-Ile, where x is any amino acid). Subsequent database searches for presumptive Drosophila transcription factors containing this minimal peptide sequence identified a wide range of potential negative regulators belonging to different superfamilies as classified by their distinct DNA-binding domain types. Remarkably, eh1-related motifs have been preserved in many human homologues of these fly proteins, indicating that the ability to bind Gro/TLE has been evolutionarily conserved in human transcriptional regulators and that this sequence may have been widely adopted throughout the proteome as a Gro recruitment domain (Goldstein, 2005).
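The minimal eh1-like pattern described above (Phe-x-Ile-x-x-Ile) can be expressed as a simple sequence scan. The following is only a sketch of that idea, not the search procedure used by Goldstein (2005): the function name `find_eh1_like` and the toy sequence are hypothetical, and a match indicates a candidate motif, not demonstrated Gro binding.

```python
import re

# Minimal eh1-like pattern from the text: Phe-x-Ile-x-x-Ile (x = any residue).
# The lookahead lets overlapping candidates be reported as well.
EH1_MINIMAL = re.compile(r"(?=(F[A-Z]I[A-Z]{2}I))")

def find_eh1_like(seq):
    """Return (1-based start position, hexapeptide) for each candidate motif."""
    seq = seq.upper()
    return [(m.start() + 1, m.group(1)) for m in EH1_MINIMAL.finditer(seq)]

# Made-up peptide for illustration only; it is not a real protein sequence.
toy = "MSTAFSIDNILGRPKK"
print(find_eh1_like(toy))
```

A pattern this short will match many unrelated proteins by chance, so hits would still need to be tested biochemically, as the pull-down assays described in this section do.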

Several representatives, corresponding to different transcription factor families, were tested for the ability to bind Gro in biochemical assays. Where possible, full-length expressed sequence tags encoding these proteins were obtained; otherwise, single exons containing the eh1-like sequence were PCR amplified from genomic DNA. Each polypeptide was assessed for the ability to pull down radiolabeled Gro in vitro. GST-tagged Slp and Doc (amino acids 254 to 391) readily retain Gro, as do Eyes absent (Eya) and the homeodomain proteins Ventral nervous system defective (Vnd, 1 to 465), Bagpipe (Bap, 1 to 129), BarH1, and Empty spiracles (Ems, 1 to 360), as well as the orphan nuclear hormone receptor DHR96. To confirm that these interactions rely on intact eh1-related sequences, the eh1 motif of one of these proteins, BarH1, was mutated by substituting glutamic acid for Phe at position 1; this reduced its binding to Gro by >60% (Goldstein, 2005).

At the molecular level, members of the NKx2.2 family of transcription factors establish neural compartment boundaries by repressing the expression of homeobox genes specific for adjacent domains. The Drosophila homologue, vnd, interacts genetically with the high-mobility group protein, Dichaete, in a manner suggesting co-operative activation. However, evidence for direct interactions and transcriptional activation is lacking. This study presents molecular evidence for the interaction of Vnd and Dichaete that leads to the activation of target gene expression. Two-hybrid interaction assays indicate that Dichaete binds the Vnd homeodomain, and additional Vnd sequences stabilize this interaction. In addition, Vnd has two activation domains that are typically masked in the intact protein. Whether Vnd can activate or repress transcription is context-dependent. Full-length Vnd, when expressed as a Gal4 fusion protein, acts as a repressor containing multiple repression domains. A divergent domain in the N-terminus, not found in vertebrate Vnd-like proteins, causes the strongest repression. The co-repressor, Groucho, enhances Vnd repression, and these two proteins physically interact. The data presented indicate that the activation and repression domains of Vnd are complex, and whether Vnd functions as a transcriptional repressor or activator depends on both intra- and inter-molecular interactions (Yu, 2005).

Vnd represses target gene expression in conjunction with Groucho

The transcription factor, Vnd, is a dual regulator that specifies ventral neuroblast identity in Drosophila by both repressing and activating target genes. Vnd and its homologs have a conserved amino acid sequence, the Nk-2 box or Nk specific domain, as well as a conserved DNA-binding homeodomain and an Eh1-type Groucho interaction domain. However, the function of the conserved Nk-2 box has not been fully defined. To explore its function, the Nk-2 box was deleted and the regulatory activity of mutant Vnd in transgenic over-expression assays was compared to that of the wild-type protein. No regulatory activity could be assigned to the Nk-2 box using an over-expression assay, because the mutant protein activated expression of endogenous Vnd, masking a requirement for the Nk-2 box. However, in transgenic rescue assays, Vnd lacking the Nk-2 box repressed ind expression at 30% lower levels than the wild-type protein. Moreover, in transient transfection assays using Gal4 DNA-binding domain-Vnd chimeras, the repression activity of Vnd lacking the Nk-2 box was compromised. Because Vnd represses target gene expression in conjunction with Groucho, it was asked whether the Nk-2 box affects Vnd’s ability to interact with this co-repressor. Vnd lacking the Nk-2 box binds Groucho 30% less efficiently than wild-type Vnd in co-immunoprecipitations. These data suggest that the Nk-2 box contributes to the repression activity of Vnd by stabilizing its interaction with the co-repressor, Groucho (Uhler, 2007).

The conserved Eh1 domain, characterized by the consensus amino acid sequence, FxIxxIL, is essential for the interaction of non-Hairy-type transcription factors with the co-repressor, Groucho. It has been shown that a GST fusion protein including the candidate Vnd Eh1 domain pulled down Groucho, and mutation of this domain interfered with Vnd’s capacity to repress reporter expression in transgenic embryos. The data presented in this study indicate that a secondary domain, the Nk-2 box, also modulates Vnd’s capacity to interact with Groucho, since Vnd’s interaction with this co-repressor is compromised when the Nk-2 box is deleted. The Nk-2 box deletion results in functional readouts including reduced ability to rescue the repression of ind expression in transgenic assays and reduced repression of reporter expression in a heterologous transfection assay (Uhler, 2007).

The Nk-2 box is not the first intramolecular domain identified that modulates Groucho binding to a transcription factor. It has been reported that deletion of the carboxyl terminus of Vnd, including both the homeodomain and the Nk-2 box, interferes with Groucho binding to this transcription factor. However, the domain that was deleted was relatively large, covering 200 amino acids. Another study found that the Eh1 domain of the Caenorhabditis elegans Unc-4 is insufficient for interaction with Unc-37, the worm Groucho. Mutant Unc-4 with the Eh1 domain intact, but lacking sequences amino terminal to the Eh1 domain, is deficient in in vivo repressor activity and fails to interact with Unc-37 in two-hybrid interaction assays. The fact that a secondary domain, the Nk-2 box, is involved in modulating Groucho recruitment to the Eh1 domain is not altogether unexpected, because of the complexity of events that result from stable Groucho binding to Vnd-type transcription factors in the context of a target gene enhancer. These include the recruitment of repressosome components, the deacetylation of histones on target genes, chromatin condensation, and gene silencing (Uhler, 2007).

Vnd is one of a number of transcription factors that function as dual-function regulators. Of the dual-function transcription factors characterized, the ability of the Rel-type transcription factor, Dorsal, to repress target gene expression is best understood. In the early Drosophila embryo, Dorsal activates genes in ventral regions and represses transcription of dorsal fate-determining genes. Groucho is dispensable for Dorsal-directed activation, but is essential for Dorsal-mediated ventral repression. Available evidence suggests that the context of a particular Dorsal-binding site determines whether Dorsal activates or represses target gene expression. Dorsal-dependent ventral silencers contain other elements that are required for Dorsal-dependent ventral repression in addition to Dorsal-binding sites. Mutagenesis of these additional elements [to which the transcription factors, Cut and Dead Ringer (Dri), bind] converts Dorsal from a repressor into an activator. In vitro-binding assays indicate that Dri and Dorsal work together to recruit Groucho to the template synergistically. While Dorsal-mediated repression requires Groucho, Dorsal-mediated activation depends on the co-activators, CBP (CREB-binding protein) and/or certain TAFs (TBP-associated factors) (Uhler, 2007 and references therein).

The conflicting abilities of Vnd to both activate and repress target gene expression are in part modulated by Vnd’s selective interaction with the co-activator, Dichaete, and the co-repressor, Groucho. Surprisingly, Vnd’s two activation domains map directly at the carboxyl side of the Eh1 domain and the Nk-2 box that mediate repression. The significance of the positioning of domains with opposite effects adjacent to each other is currently not well understood. Vnd’s secondary structure likely affects its interaction with Groucho, since the Eh1 domain and the Nk-2 box are at opposite ends of Vnd, separated by over 400 amino acids. Further analyses of Vnd mutant protein function in vivo and further characterization of vnd-dependent enhancers will help clarify which intramolecular interactions facilitate Vnd interacting with co-regulators that mediate opposite effects on target gene expression (Uhler, 2007).



Home page: The Interactive Fly © 1997 Thomas B. Brody, Ph.D.

The Interactive Fly resides on the
Society for Developmental Biology's Web server.