Interactive Fly, Drosophila

The POU domain of VVL and of other POU domain transcription factors is a DNA binding motif that interacts with promoter sequence elements based on the octamer consensus element ATGCAAAT. An autoregulatory enhancer of vvl, serving to regulate VVL interaction with its own promoter, consists of a 514-bp minimal enhancer located 4.5 kb upstream of the vvl transcriptional initiation site. Two response elements, VVLE1 and VVLE2, separated from each other by over 200 base pairs, consist of a divergent TAATGATATGC element and a consensus element, respectively. Disruption of VVLE1 with point mutations blocks VVL protein binding to this element and results in no detectable expression in medial glia of the midline, while expression in trachea and oenocyte clusters is normal. Disruption of VVLE2, on the other hand, gives glial expression but no expression in trachea, while oenocyte expression is normal. It is suggested that distinguishable alterations in the accessible surface of the VVL POU domain caused by flexible recognition of the variant elements VVLE1 and VVLE2 may influence interactions between DNA-bound VVL protein and tissue-specific coactivators or coregulators. Alternatively, the separable functions could be due to the proximity of each element to distinct tissue-specific cis-regulatory elements (Certel, 1996).

An efficient approach to isolate STAT regulated enhancers uncovers STAT92E fundamental role in Drosophila tracheal development

The ventral veinless (vvl) and trachealess (trh) genes are determinants of the Drosophila trachea. Early in development both genes are independently activated in the tracheal primordia by signals that are ill defined. Mutants blocking JAK/STAT signaling at any level do not form a tracheal tree, suggesting that STAT92E may be an upstream transcriptional activator of the early trachea determinants. To test this hypothesis, STAT92E responsive enhancers activating the expression of vvl and trh in the tracheal primordia were sought. STAT92E regulated enhancers can be rapidly and efficiently isolated by focusing the analysis on genomic regions with clusters of putative STAT binding sites, at least some of which are phylogenetically conserved. Detailed analysis of a vvl early tracheal enhancer shows that non-conserved sites collaborate with conserved sites for enhancer activation. It was found that STAT92E regulated enhancers can be located as far as 60 kb from the promoters. The results indicate that vvl and trh are independently activated by STAT92E, which is the most important transcription factor required for trachea specification (Sotillos, 2010).

Mutations in the JAK/STAT pathway result in embryos forming only residual trachea fragments. This is caused by the abnormal activation of the early tracheal genes trh and vvl, suggesting that the early trachea enhancers may be directly regulated by STAT92E, in which case the trachea enhancers would be associated with STAT92E binding sites (Sotillos, 2010).

To test this, the putative STAT92E binding sites were first localized in silico in the 144 kb vvl intergenic region. Enhancer activity was then tested in vivo for regions that either (1) contain putative STAT92E sites that are conserved in several Drosophila species; (2) contain non-conserved putative STAT92E sites; or (3) contain no putative STAT92E sites. Of the 12 reporter lines made, 10 have enhancer activity consistent with harboring embryonic vvl cis-regulatory elements. The expression of the enhancers either lacking STAT92E sites or containing non-conserved STAT92E sites is independent of JAK/STAT function. In contrast, of the seven vvl enhancers containing conserved sites, the expression of three required JAK/STAT regulation in the embryo. One of these enhancers drives expression in the hindgut and two are expressed in the trachea at stage 10. These results suggested that looking exclusively for conserved STAT92E sites would be sufficient to localize the STAT92E regulated sites, a prediction confirmed by isolating the trh gene tracheal enhancers. Although the presence of cryptic STAT binding sites that diverge from the ideal consensus could not be discarded in this analysis, such elements probably make a minor contribution, as ignoring their existence still allowed the main regulatory elements to be found. Analysis of only eight fragments comprising a seventh of the trh locus was sufficient to find the early tracheal STAT92E responsive elements. This analysis revealed that the locus extends at least 40 kb upstream of the trh promoter, with some enhancers located beyond the first predicted neighboring gene (Sotillos, 2010).
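The in silico localization step can be illustrated with a minimal sketch. It assumes that the "3n" and "4n" STAT92E sites mentioned in this study correspond to TTC and GAA half-sites separated by three or four arbitrary bases; that consensus, the function names, and the max_gap clustering threshold are illustrative assumptions and are not taken from the paper. Because the assumed pattern is its own reverse complement, scanning one strand is enough.

```python
import re

# Assumed consensus (not stated in the text above): TTC + 3 or 4 bases + GAA.
SPACERS = {"3n": 3, "4n": 4}


def find_stat_sites(seq):
    """Return sorted (start, kind, site) tuples for putative STAT92E sites."""
    seq = seq.upper()
    hits = []
    for kind, n in SPACERS.items():
        # The lookahead lets overlapping sites be reported as well.
        pattern = re.compile(r"(?=(TTC[ACGT]{%d}GAA))" % n)
        for m in pattern.finditer(seq):
            hits.append((m.start(), kind, m.group(1)))
    return sorted(hits)


def cluster_sites(hits, max_gap=500):
    """Group hits whose start positions lie within max_gap bp of the previous hit.

    max_gap is a hypothetical threshold chosen only for illustration.
    Groups with a single site are discarded.
    """
    groups = []
    for hit in hits:
        if groups and hit[0] - groups[-1][-1][0] <= max_gap:
            groups[-1].append(hit)
        else:
            groups.append([hit])
    return [g for g in groups if len(g) >= 2]


# Invented toy sequence with one 3n and one 4n site.
toy = "AAA" + "TTCCCGGAA" + "GGGG" + "TTCACGTGAA" + "CCC"
hits = find_stat_sites(toy)
print(hits)
print(cluster_sites(hits, max_gap=50))
```

A real analysis would additionally compare each hit across aligned genomes of other Drosophila species to separate conserved from non-conserved sites, which this sketch does not attempt.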

In both the trh and vvl genes it was found that other late tracheal enhancers are associated with conserved STAT92E sites, suggesting that the STAT92E protein may be repeatedly used to control tracheal gene expression during development. This possibility is backed by the fact that upd is transcribed up to stage 13 and phosphorylated STAT92E can be detected in the trachea well after the stage 10 early specification stage. This late trachea-specific expression of upd depends on trh, suggesting that a feedback loop maintains JAK/STAT activity during tracheal development (Sotillos, 2010).

It was confirmed that the expression of the vvl1+2 early tracheal enhancer depends on STAT92E by mutating its putative STAT92E binding sites. The results show that mutation of all three putative STAT92E sites in the vvl1+2 enhancer causes a severe loss of expression, indicating that a bona fide direct enhancer was isolated. However, other possible direct enhancers, like the vvlds1.5 hindgut element, show unexpected behavior. While vvlds1.5 perfectly recapitulates the vvl early hindgut activation, its expression does not stop at stage 15 but keeps transcribing lacZ up to stage 17, well after the endogenous vvl gut transcription has ceased. Despite this abnormal behavior, and pending direct site mutagenesis confirmation, it is believed that these experiments show convincingly that the analysis of the genomic regions containing conserved STAT binding site clusters is an efficient way to quickly identify direct STAT92E regulated enhancers (Sotillos, 2010).

An important finding of this analysis has been the observation that STAT regulated enhancers can be tens of kilobases away from the transcriptional start of the target gene, even separated from it by another predicted gene. This indicates that STAT regulated enhancers can be functional at great distances and that the search for STAT binding sites should not be restricted to the immediate vicinity of a gene (Sotillos, 2010).

Assuming there was no base content bias, the probability of finding in the genome either a 3n or a 4n STAT92E site by chance is (2 × 1/4⁶), which is close to one site every 2 kb. In the 144 kb vvl genomic region there are 85 putative STAT92E sites, that is, 15 more than the expected 70 sites. Similarly, in the 70 kb trh locus there are 53 putative sites, 19 more than the expected 34 sites. The excess of sites found can be partially explained by selective pressure, as there are 20 conserved sites in vvl. However, in the trh locus the 9 conserved STAT92E sites represent only half of the observed excess. An additional explanation for the excess STAT92E sites could be provided by the observation that in the regions where evidence was found for JAK/STAT regulation, there are clusters of conserved and non-conserved sites. It has been suggested that STAT site clustering helps form tetramers that cooperatively increase STAT transcriptional output. Although STAT92E may form similar tetramers in Drosophila, the distance between the clustered sites observed in this study is probably too large to allow tetramer formation, and their existence must serve another purpose. When the vvl1+2 enhancer was subdivided, it was observed that the distal non-conserved STAT92E site is dispensable for enhancer function. However, mutagenesis of the two conserved sites in vvl1+2 without mutating the distal non-conserved site causes only a mild decrease in enhancer expression. Only when all three sites were mutated, including the non-conserved site, was there a strong effect on vvl1+2 expression. This shows that non-conserved sites may be functional in vivo even though they are not absolutely necessary. The results indicate that sites appearing distal to a STAT92E regulated enhancer may substitute for the proximal conserved sites, suggesting a way in which novel functional sites could eventually substitute for conserved sites during evolution (Sotillos, 2010).
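The expected site counts quoted above can be reproduced with a short calculation. This is only a sketch of the arithmetic: it assumes six fixed bases per site, two spacer variants (3n and 4n), equal base frequencies, and one scan position per base pair; the function name is illustrative.

```python
def expected_sites(region_kb, fixed_bases=6, n_variants=2):
    """Expected number of chance sites in a region of region_kb kilobases.

    Each site variant fixes `fixed_bases` positions, so its chance of
    occurring at a given position is 1 / 4**fixed_bases when all four
    bases are equally likely.
    """
    p_per_position = n_variants / 4 ** fixed_bases
    return region_kb * 1000 * p_per_position


# Average spacing between chance sites: 4**6 / 2 = 2048 bp, about 2 kb.
print(4 ** 6 / 2)

# Observed counts are the ones reported in the text.
for locus, kb, observed in [("vvl", 144, 85), ("trh", 70, 53)]:
    expected = round(expected_sites(kb))
    print(locus, "expected", expected, "observed", observed,
          "excess", observed - expected)
```

With these assumptions the calculation gives about 70 expected sites for the 144 kb vvl region and about 34 for the 70 kb trh locus, matching the excesses of 15 and 19 sites stated in the text.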

The crb and dome genes, which have been shown in vivo to be STAT92E targets, also show an accumulation of STAT92E sites in the introns where the enhancers localize. In the case of dome, five STAT92E sites cluster in a 700 bp fragment. Although only two of the five sites are conserved, and dissection of the enhancer showed that the conserved sites are crucial for JAK/STAT regulation, the non-conserved sites were also required for full enhancer expression. Therefore, clustered conserved and non-conserved STAT92E sites contribute to target gene regulation. It will be important to understand how STAT92E proteins binding to these distant sites interact with other transcriptional co-factors that presumably bind to the core enhancer (Sotillos, 2010).

The earliest genes activated specifically in the tracheal primordia are trh and vvl. Both genes have cross-regulatory interactions that help maintain each other's expression in the trachea, but their early activation is independent of each other. This study has localized the early trachea enhancers in trh and vvl and has shown that their activation in both cases depends on the JAK/STAT pathway, making STAT92E the most important trachea activator. Stage 10 upd expression is consistent with a model where trh and vvl activation in the tracheal primordia is specified by a competitive interaction between the JAK/STAT and the WNT signaling pathways. STAT92E is probably acting with some other transcription factors, as inactivation of the JAK/STAT pathway does not result in a complete lack of expression of the enhancers. These other transcription factors would not necessarily have a restricted spatial expression, as the precise positional activation of vvl could be provided by upd and wg. Although the requirement of STAT92E for early tracheal specification precludes any studies at later stages, the maintained expression of upd in later tracheal development and the presence of conserved STAT92E sites associated with late tracheal enhancers suggest that the pathway is important for maintaining vvl and trh expression during tracheal development (Sotillos, 2010).

The observation that in crustaceans trh and vvl are co-expressed in the epipods that form the gills suggests that both genes were co-opted early in arthropod evolution to control the formation of the respiratory system (Franch-Marro, 2006). It would be interesting to find out whether JAK/STAT is also required for trh and vvl expression in crustaceans, as that would suggest that the pathway had been adopted early in evolution, fixing the regulation of both genes in the respiratory system (Sotillos, 2010).

Common origin of insect trachea and endocrine organs from a segmentally repeated precursor

Segmented organisms have serially repeated structures that become specialized in some segments. The Drosophila corpora allata, prothoracic glands, and trachea are shown to have a homologous origin and can convert into each other. The tracheal epithelial tubes develop from ten trunk placodes, and homologous ectodermal cells in the maxilla and labium form the corpora allata and the prothoracic glands. The early endocrine and trachea gene networks are similar, with STAT and Hox genes inducing their activation. The initial invagination of the trachea and the endocrine primordia is identical, but activation of Snail in the glands induces an epithelial-mesenchymal transition (EMT), after which the corpora allata and prothoracic gland primordia coalesce and migrate dorsally, joining the corpora cardiaca to form the ring gland. It is proposed that the arthropod ectodermal endocrine glands and respiratory organs arose through an extreme process of divergent evolution from a metameric repeated structure (Sanchez-Higueras, 2013).

The endocrine control of molting and metamorphosis in insects is regulated by the corpora allata (ca) and the prothoracic glands (pg), which secrete juvenile hormone and ecdysone, respectively. In Diptera, these glands and the corpora cardiaca (cc) fuse during development to form a tripartite endocrine organ called the ring gland. While the corpora cardiaca is known to originate from the migration of anterior mesodermal cells, the origin of the other two ring gland components is unclear (Sanchez-Higueras, 2013).

The tracheae have a completely different structure consisting of a tubular network of polarized cells. The tracheae are specified in the second thoracic to the eighth abdominal segments (T2-A8) by the activation of trachealess (trh) and ventral veinless (vvl) (Sanchez-Higueras, 2013).

The enhancers controlling trh and vvl in the tracheal primordia have been isolated and shown to be activated by JAK/STAT signaling. While the trh enhancers are restricted to the tracheal primordia in the T2-A8 segments, the vvl1+2 enhancer is also expressed in cells at homologous positions in the maxilla (Mx), labium (Lb), T1, and A9 segments in a pattern reproducing the early transcription of vvl. The fate of these nontracheal vvl-expressing cells was unknown, but it was shown that ectopic trh expression transforms these cells into tracheae. To identify their fate, vvl1+2-EGFP and mCherry constructs were made (Sanchez-Higueras, 2013).

Although the vvl1+2 enhancer drives expression transiently, the stability of the EGFP and mCherry proteins labels these cells during development. It was observed that while the T1 and A9 patches remained at the surface and integrated with the embryonic epidermis, the patches in the Mx and Lb invaginated just as the tracheal primordia did. Next, the Mx and Lb patches fused, and a group of their cells underwent an epithelial-mesenchymal transition (EMT), initiating a dorsal migration toward the anterior of the aorta, where they integrate into the ring gland. To find out what controls the EMT, the expression of the snail (sna) gene, a key EMT regulator, was studied. Besides its expression in the mesoderm primordium, it was found that sna is also transcribed in two patches of cells that become the migrating primordium. Using sna bacterial artificial chromosomes (BACs) with different cis-regulatory regions, the enhancer activating sna in the ring gland primordium (sna-rg) was identified. A sna-rg-GFP construct labels the subset of Mx and Lb vvl1+2-expressing cells that experience EMT and migrate to form the ring gland. Staining with seven-up (svp) and spalt (sal) (also known as salm) markers, which label the ca and the pg, respectively, showed that the sna-rg-GFP cells form these two endocrine glands. The sna-rg-GFP-expressing cells in the Mx activated svp, and those in the Lb activated sal before they coalesced, indicating that the ca and pg are specified in different segments before they migrate (Sanchez-Higueras, 2013).

To test whether Hox genes, the major regulators of anteroposterior segment differentiation, participate in gland morphogenesis, vvl1+2-GFP embryos were stained, and it was found that the Mx vvl1+2 primordium expressed Deformed (Dfd) and the Lb primordium Sex combs reduced (Scr), while the T1 primordium expressed very low levels of Scr. Dfd mutant embryos lacked the ca, while Scr mutant embryos lacked the pg. Dfd and Scr expression in the gland primordia was transient, suggesting that they control their specification. Consistently, in Dfd, Scr double-mutant embryos, vvl1+2 was not activated in the Mx and Lb patches, and the same was true for vvl transcription. In these mutants, the sna-rg-GFP expression was almost absent, and the ca and pg did not form. In each case, Dfd controlled the expression of the Mx patch and Scr of the Lb patch (Sanchez-Higueras, 2013).

The capacity of different Hox genes to rescue the ring gland defects of Scr, Dfd double mutants was tested. Induction of Dfd with the sal-Gal4 line in these mutants restored the expression of vvl1+2 and sna-rg-GFP in the Mx and the Lb. However, in contrast to the wild type, both segments formed a ca, as all cells express Svp. Similarly, induction of Scr also restored the vvl1+2 and sna-rg-GFP expression, but both primordia formed a pg, as they activate Sal and Phantom, an enzyme required for ecdysone synthesis. The capacity of both Dfd and Scr to restore vvl expression, regardless of the segment, led to a test of whether other Hox proteins could have the same function. Induction of Antennapedia (Antp), Ultrabithorax (Ubx), abdominal-A (abd-A), or Abdominal-B (Abd-B) restored vvl1+2 expression in the Mx and Lb, but these cells formed tubes instead of migratory gland primordia. These cephalic tubes are trachea, as they do not activate sna-rg, they express Trh, and their nuclei accumulate Tango (Tgo), a maternal protein that is only translocated to the nucleus in salivary glands and tracheal cells, indicating that the trunk Hox proteins can restore vvl expression in the Mx and Lb but induce their transformation to trachea (Sanchez-Higueras, 2013).

To investigate whether vvl and trh expression is normally under Hox control in the trunk, focus was placed on Antp, which is expressed at high levels in the tracheal pits. In double-mutant Dfd, Antp embryos, vvl1+2 was maintained in the Lb where Scr was present, while the Mx, T1, and T2 patches were missing. In T3-A8, vvl1+2 expression, although reduced, was present, probably due to the expression of Ubx, Abd-A, and Abd-B in the posterior thorax and abdomen. Thus, Antp regulates vvl expression in the tracheal T2 primordium. Surprisingly, in Dfd, Antp double mutants, Trh and Tgo were maintained in the T2 tracheal pit, indicating that although Hox genes can activate ectopic trh expression, in the tracheal primordia they may be acting redundantly with some other unidentified factor, explaining why the capacity of Hox proteins to specify trachea had not been reported previously (Sanchez-Higueras, 2013).

sna null mutants were studied to determine sna's requirement for ring gland development, but their aberrant gastrulation precluded analyzing specific ring gland defects. To investigate sna function in the gland primordia, the sna mutants were rescued with the sna-squish BAC, which drives normal Sna expression except in the ring gland. These embryos have a normal gastrulation and activate the sna-rg-GFP; however, the gland primordia degenerate and disappear. To block apoptosis, these embryos were made homozygous for the H99 deficiency, which removes three apoptotic inducers. In this situation, the ca and pg primordia invaginated and survived, but they did not undergo EMT. As a result, the gland primordia maintain epithelial polarity, do not migrate, and form small pouches that remain attached to the epidermis. Vvl is required for tracheal migration. In vvl mutant embryos, sna-rg-GFP expression was activated, but the cells degenerated. In vvl mutant embryos also mutant for H99, the primordia underwent EMT and migrated up to the primordia coalescence; however, the later dorsal migration did not progress (Sanchez-Higueras, 2013).

This study has shown that the ca and pg develop from vvl-expressing cephalic cells at positions where other segments form trachea, suggesting that they could be part of a segmentally repeated structure that is modified in each segment by the activity of different Hox proteins. As the cephalic primordia are transformed into trachea by ectopic expression of trunk Hox, tests were performed to see whether the trachea primordia could form gland cells. Ectopic expression of Dfd with arm-Gal4 resulted in the activation of sna-rg-GFP on the ventral side of the tracheal pits. These sna-rg-GFP-expressing cells also expressed vvl1+2 and Trh and had nuclear Tgo, showing that they conserve tracheal characteristics. These sna-rg-GFP-positive cells did not show EMT and remained associated with the ventral anterior tracheal branch. The strength of ectopic sna-rg-GFP expression increased when ectopic Dfd was induced in trh mutant embryos. However, migratory behaviors in the sna-rg-GFP cells were only observed if Dfd was coexpressed with Sal. Thus, sal is expressed several times in the gland primordia, first at st9-10 repressing trunk Hox expression in the cephalic segments and second from st11 in the prothoracic gland. It is uncertain whether the sal requirement for migration is linked to the first function or whether it represents an additional role (Sanchez-Higueras, 2013).

These results show that the endocrine ectodermal glands and the respiratory trachea develop as serially homologous organs in Drosophila. The identical regulation of vvl in the primordia of trachea and gland by the combined action of the JAK/STAT pathway and Hox proteins could represent the vestiges of an ancestral regulatory network retained to specify these serially repeated structures, while the activation of Sna for gland development and Trh and Tgo for trachea formation could represent network modifications recruited later by specific Hox proteins during the functional specialization of each primordium. This hypothesis or alternative possibilities should be confirmed by analyzing the expression of these gene networks in various arthropod species. The diversification of glands and respiratory organs must have occurred before the split of insects and crustaceans, as there is a correspondence between the endocrine glands in both classes, with the corpora cardiaca corresponding to the pericardial organ, the corpora allata to the mandibular organ, and the prothoracic gland to the Y gland. Despite their divergent morphology, a correspondence between the insect trachea and the crustacean gills can also be made, as both respiratory organs coexpress vvl and trh during their organogenesis. Divergence between endocrine glands and respiratory organs may have occurred when the evolution of the arthropod exoskeleton required solving two simultaneous problems: the need to molt to allow growth, and the need for specialized organs for gas exchange (Sanchez-Higueras, 2013).

Transcriptional Regulation

vvl regulation must be complex, because it is expressed through the activity of several pathways. vvl is genetically downstream of single minded in midline neural cells, and is a potential target of that basic HLH transcription factor. It is possible, however, that vvl would be normally expressed in cells of single minded mutants had they been viable, but they are not (Billin and Poole, 1995). vvl expression in the wing imaginal disc and trachea is dependent on the coordinate activities of signaling molecules such as DPP, WG and HH. In these cases, vvl appears to be downstream of torpedo EGF receptor signaling (de Celis, 1995). This gene is obviously a good candidate for promoter analysis.

Formation of the trachea occurs by the migration and fusion of clusters of ectodermal cells specified on each side of ten embryonic segments. Morphogenesis of the tracheal tree requires the activity of many genes, among them breathless (btl) and ventral veinless (vvl), whose mutations abolish tracheal cell migration. Activation of the btl receptor by branchless (bnl), its putative ligand, exerts an instructive role in the process of guiding tracheal cell migration. decapentaplegic determines vvl expression along the embryonic dorsoventral axis; expansion of dpp expression results in an increased recruitment of cells to express vvl. These cells are allocated in the expanded tracheal placodes, indicating that expansion of dpp expression causes a concomitant enlargement of the tracheal placodes and of vvl expression. vvl is also required for the maintenance of btl expression during tracheal migration (Llimargas, 1997).

The role of Castor in regulating pdm genes raises the possibility that it may regulate the expression of other POU genes. To test this, the expression domains of Cas and Drifter/Ventral veins lacking were examined, and Drf expression was examined in cas- embryos. In addition to its established role in midline glia and tracheal development, Drf is also expressed in a subset of NB progeny in both the developing brain and ventral cord. Many Cas-expressing NB sublineages also express Drf. Thus, it appears that Cas does not repress drf expression: to the contrary, a marked reduction in late-lineage Drf expression is observed in cas- embryos, suggesting that Cas either directly or indirectly plays a role in activating and/or sustaining drf expression in these sublineages. Ectopically activated Cas has no effect on Drf expression. In the absence of castor function, I-POU expression is lost in a subset of ventral cord cells, but ectopic Cas has no effect on the I-POU wild-type expression pattern. It is not known if Cas is a direct activator of drf and/or I-POU. However, the data indicate that if Cas is playing a direct activator role, it most likely requires co-factors that are not expressed outside of its normal domain (Kambadur, 1998).

Ken & barbie selectively regulates the expression of a subset of Jak/STAT pathway target genes

A limited number of evolutionarily conserved signal transduction pathways are repeatedly reused during development to regulate a wide range of processes. A new negative regulator of JAK/STAT signaling is described and a potential mechanism identified by which the pleiotropy of responses resulting from pathway activation is generated in vivo. As part of a genetic interaction screen, Ken & Barbie (Ken), which is an ortholog of the mammalian proto-oncogene BCL6, has been identified as a negative regulator of the JAK/STAT pathway. Ken genetically interacts with the pathway in vivo and recognizes a DNA consensus sequence overlapping that of STAT92E in vitro. Tissue culture-based assays demonstrate the existence of Ken-sensitive and Ken-insensitive STAT92E binding sites, while ectopically expressed Ken is sufficient to downregulate a subset of JAK/STAT pathway target genes in vivo. Finally, endogenous Ken is shown to specifically repress JAK/STAT-dependent expression of ventral veins lacking (vvl) in the posterior spiracles. Ken therefore represents a novel regulator of JAK/STAT signaling whose dynamic spatial and temporal expression is capable of selectively modulating the transcriptional repertoire elicited by activated STAT92E in vivo (Arbouzova, 2006).

Analysis of phenotypes associated with mutations in Drosophila JAK/STAT pathway components have identified a wide variety of requirements for the pathway during embryonic development and in adults. What is less clear is how the repeated stimulation of a single pathway is able to generate this pleiotropy of developmental functions. In order to identify modulators of JAK/STAT signaling that may be involved in this process, a genetic screen was undertaken for modifiers of the dominant phenotype caused by the ectopic expression of the pathway ligand Unpaired (Upd) in the developing eye imaginal disc. Such misexpression by GMR-updΔ3′ results in overgrowth of the adult eye, a phenotype sensitive to the strength of pathway signaling activity. With this assay, one genomic region, defined by Df(2R)Chi^g320, was found to enhance the GMR-updΔ3′-induced eye overgrowth phenotype. Of the genes deleted by Df(2R)Chi^g320, only mutations in ken showed consistent and reproducible enhancement of the phenotype. In addition, other dominant phenotypes induced by transgene expression from the GMR promoter are not modulated by ken mutations, indicating that Ken is unlikely to interact with the misexpression construct used (Arbouzova, 2006).

The enhancement of the GMR-updΔ3′ phenotype after removal of one copy of ken implies that Ken normally functions antagonistically to JAK/STAT signaling. Therefore phenotypes associated with mutations in other pathway components were tested to establish the reliability of this initial observation. Consistent with this, genetic interaction assays between ken mutations and the hypomorphic loss-of-function allele stat92E^HJ show a reduction in the frequency of wing vein defects normally associated with this stat92E allele. Moreover, the degree of suppression is consistent with the strength of ken alleles tested. Similarly, the frequency of “strong” posterior spiracle phenotypes caused by the dome³⁶⁷ allele of the pathway receptor is also reduced when crossed to ken alleles or the Df(2R)Chi^g320 deficiency, with a concomitant increase in “weak” phenotypes (Arbouzova, 2006).

Thus, multiple independent ken alleles all modify diverse phenotypes caused by both gain- and loss-of-function mutations in multiple JAK/STAT pathway components. Each of these components acts at a different level of the signaling cascade and shows interactions indicating that Ken consistently acts as an antagonist of the pathway (Arbouzova, 2006).

The ken locus contains three exons encoding a 601 aa protein. Ken possesses an N-terminal BTB/POZ domain between aa 17 and 131 and three C-terminal C2H2 zinc finger motifs from aa 502 to 590. Strikingly, a number of Zn finger-containing proteins that also contain BTB/POZ domains have been shown to function as transcriptional repressors, often via the recruitment of corepressors such as SMRT, mSIN3A, N-CoR, and HDAC-1 (Arbouzova, 2006).

Searches for proteins similar to Ken identified homologs in Drosophila pseudoobscura and the mosquito Anopheles gambiae. In vertebrates, human B-Cell Lymphoma 6 (BCL6) was the closest full-length homolog. Drosophila Ken and human BCL6 share the same domain structure and show 20.3% overall identity. Proteins listed as potential vertebrate homologs of Ken in Flybase are more distantly related (Arbouzova, 2006).

Expression of ken was also examined during development, where it is detected in a dynamic pattern from newly laid eggs, throughout embryogenesis, and in imaginal discs. As such, endogenous Ken is present in all tissues and stages in which genetic interactions were observed (Arbouzova, 2006).

Given the presence of potentially DNA binding Zn finger domains and the nuclear localization of GFPKen, the DNA binding properties of Ken were determined by using an in vitro selection technique termed SELEX (systematic evolution of ligands by exponential enrichment). With a GST-tagged Ken Zn finger domain and a randomized oligonucleotide library, ten successive rounds of selection were undertaken. Sequencing of the resulting oligonucleotide pool and alignment of 43 independent clones showed that all recovered plasmids were unique and each contained one, or occasionally two, copies of the motif GNGAAAK (K = G/T) (Arbouzova, 2006).
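As an illustration of how a degenerate motif such as GNGAAAK can be searched for in a sequence, the following sketch expands the ambiguity codes into a regular expression. The helper names and the test sequences are illustrative and not part of the original study; only the codes needed for this motif are included, and only the given strand is scanned.

```python
import re

# Subset of nucleotide ambiguity codes needed for GNGAAAK:
# N = any base, K = G or T (as defined in the text).
AMBIGUITY = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "[ACGT]", "K": "[GT]",
}


def motif_to_regex(motif):
    """Translate a degenerate motif into a compiled regex.

    The pattern is wrapped in a lookahead so overlapping matches are found.
    """
    body = "".join(AMBIGUITY[base] for base in motif.upper())
    return re.compile("(?=(" + body + "))")


def find_motif(seq, motif="GNGAAAK"):
    """Return (start, matched_sequence) for each motif occurrence in seq."""
    pattern = motif_to_regex(motif)
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq.upper())]


# Invented test sequence embedding the sequence GAGAAAG.
print(find_motif("CCGAGAAAGCC"))
# A G to A change at motif position 3 no longer matches.
print(find_motif("GAAAAAG"))
```

Searching the opposite strand would require running the same function on the reverse complement of the sequence.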

To confirm the SELEX results, GFPKen was expressed in tissue culture cells and these were used for electromobility shift assays (EMSA). A radioactively labeled probe containing the wild-type (wt) consensus binding site GAGAAAG gives a specific band, which can be supershifted by an anti-GFP antibody and therefore represents a GFPKen/DNA complex. In order to identify positions essential for binding, a competition assay was used in which unlabeled oligonucleotides containing single substitutions in each position from 1 to 7 were added to binding reactions. A 10-fold excess of unlabeled wild-type consensus oligonucleotide greatly diminished the intensity of the GFPKen band, while 50- and 100-fold excesses totally blocked the original signal. By contrast, competition with unlabeled m3 oligonucleotides containing a G to A substitution at position 3 failed to significantly reduce the intensity of the band even at 100-fold excess. With this approach, positions 1 and 7 were found to be dispensable for DNA binding, whereas the central GAAA core is absolutely required. Similar results were obtained in the converse experiment with labeled mutant probes, although in this case the wt probe produces a stronger signal than the m1 and m7 mutant oligonucleotides. Taken together, these experiments not only define the core sequence for Ken binding, but also demonstrate the specificity of Ken as a site-specific DNA binding molecule. Interestingly, the core consensus bound by Ken is very similar to that identified for human BCL6, with the Zn fingers of the latter binding to a DNA sequence containing a core GAAAG motif (Arbouzova, 2006).

One initial observation made is that the core GAAA essential for Ken binding overlaps the sequence recognized by STAT92E. Consistent with this overlap, a 100-fold excess of unlabeled oligonucleotide containing the STAT92E consensus is sufficient to fully compete for Ken in EMSA assays. Given this finding, it is hypothesized that the negative regulation of JAK/STAT signaling by Ken observed in genetic interaction assays may occur via a mechanism of competitive DNA binding site occupation. Due to the incomplete overlap between the STAT92E and Ken core sequences, this hypothesis also implies the existence of STAT92E DNA binding sites to which both STAT92E and Ken could bind (STAT⁺/Ken⁺) as well as sites with which Ken cannot associate (STAT⁺/Ken⁻) (Arbouzova, 2006).
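The distinction between STAT⁺/Ken⁺ and STAT⁺/Ken⁻ sites can be sketched in Python as follows. This is only an illustration under stated assumptions: the STAT92E consensus is taken to be TTCNNNGAA, which is not given in the text above, and a site is called Ken⁺ when the GAAA core overlaps either half-site on either strand. The function name and the test sequences are hypothetical.

```python
import re

# Assumption: STAT92E consensus TTCNNNGAA (not stated in the summary above).
# One flanking base on each side is captured so that a GAAA core can be detected.
STAT_SITE = re.compile(r"(?=([ACGT]TTC[ACGT]{3}GAA[ACGT]))")


def classify_sites(seq):
    """Return (start, site, label) for each STAT-like site in seq.

    start is the 0-based position of the TTC half-site. Sites at the very
    ends of seq (no flanking base) are not reported by this simple sketch.
    """
    results = []
    for m in STAT_SITE.finditer(seq):
        window = m.group(1)
        # The Ken core GAAA may lie on either strand: GAAA at the 3' half-site,
        # or its reverse complement TTTC at the 5' half-site.
        ken = window.endswith("GAAA") or window.startswith("TTTC")
        label = "STAT+/Ken+" if ken else "STAT+/Ken-"
        results.append((m.start() + 1, window[1:-1], label))
    return results


if __name__ == "__main__":
    for s in ("ATTCCGGGAAAT", "ATTCCGGGAACT", "ATTCCGGTAAAT"):
        # An empty list means no STAT-like site was found (STAT-/Ken-).
        print(s, classify_sites(s))
```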

To test this hypothesis, a cell culture-based assay was set up by using a luciferase-expressing reporter containing four STAT92E binding sites originally identified in the promoter of the Draf locus. In addition to this STAT⁺/Ken⁺ wild-type reporter, STAT⁺/Ken⁻ and STAT⁻/Ken⁻ variants identical but for the binding sequences were generated. When transfected into the hemocyte-like Kc₁₆₇ Drosophila cell line, both STAT⁺/Ken⁺ and STAT⁺/Ken⁻ reporters showed strong stimulation upon coexpression with the pathway ligand Upd, an assay previously shown to require an intact JAK/STAT cascade. When cotransfected with KenGFP, the activity of the STAT⁺/Ken⁺ reporter was reduced, an effect reproduced in three independent experiments with both KenGFP and Ken. While the reduction in reporter activity for the STAT⁺/Ken⁺ assay shown is statistically significant, the STAT⁺/Ken⁻ reporter was unaffected by the coexpression of Ken. Reporters containing binding sites mutated to prevent binding of both STAT92E and Ken (STAT⁻/Ken⁻) showed no activation after pathway stimulation and did not respond to Ken (Arbouzova, 2006).

These results indicate that Ken functions as a transcriptional repressor in this cell-culture system and show that this effect is specific to the DNA sequence determined by SELEX and EMSA. This result is also consistent with a recent whole-genome RNAi-based screen, which used a reporter containing STAT⁺/Ken⁺ binding sites and includes Ken among the list of JAK/STAT regulators identified. In addition, recent reports have also demonstrated BCL6 binding to STAT6 sites in vitro and have shown that BCL6 can act as a repressor of STAT6-dependent target gene expression in cell culture. Although this repression is mediated by the binding of corepressors to the BTB/POZ domain of BCL6, no link between BCL6 and STAT activity has been demonstrated in vivo (Arbouzova, 2006).

Finally, it should also be noted that both the STAT⁺/Ken⁺ and STAT⁺/Ken⁻ reporters contain additional GAAA sequences that are not part of the characterized STAT92E binding sequences. However, despite the presence of these potential Ken binding sites within 15 bp of the STAT92E site, Ken expression did not affect the STAT⁺/Ken⁻ reporter, suggesting that Ken may require STAT92E to influence gene expression. Although no direct association between Ken and STAT92E has been demonstrated, this possibility cannot be excluded, and further analysis remains to be undertaken (Arbouzova, 2006).

Having established that Ken functions at the level of DNA binding in cell culture, it was asked whether Ken also acts as a transcriptional repressor of JAK/STAT pathway target genes in vivo. For this, the effect of ectopically expressed Ken on the expression of putative JAK/STAT pathway target genes was examined and, given the high levels of maternally loaded STAT92E present at blastoderm stage, focus was placed on targets expressed later in embryogenesis. These include the hindgut-specific expression of vvl, the expression of trachealess (trh) and knirps (kni) in the tracheal placodes, and the dynamic expression of socs36E throughout the embryo (Arbouzova, 2006).

First, the effect of Ken was addressed on trh, whose expression precedes the formation of the tracheal pits in the embryonic segments T2 to A8. Levels of trh are greatly reduced in embryos uniformly misexpressing Ken driven by the daughterless-GAL4 (da-GAL4) line. Many tracheal placodes express little or no trh, and tracheal pits fail to form even in the presence of residual trh. Similar effects are seen in upd^OS1A mutant embryos lacking all pathway activity. Likewise, downregulation of Kni expression is also observed in embryos misexpressing ken. These results show that both endogenous trh and kni are downregulated by ectopically expressed Ken (Arbouzova, 2006).

Whether Ken can modulate the expression of socs36E, a Drosophila homolog of mouse SOCS-5, was tested. socs36E expression closely mirrors that of upd, showing JAK/STAT pathway-dependent upregulation in segmentally repeated stripes, tracheal pits, and the hindgut. By contrast to trh and kni, ectopically expressed Ken does not affect any aspect of socs36E transcription. However, controls expressing a dominant-negative form of the pathway receptor DomeΔCyt, using the same Gal4 driver line, show a strong downregulation of socs36E, an effect reproduced by the complete removal of all JAK/STAT pathway activity by the upd^OS1A allele. Taken together, these results illustrate that ectopic expression of Ken during Drosophila development is sufficient to downregulate the expression of only a subset of putative JAK/STAT pathway target genes (Arbouzova, 2006).

As part of this analysis, modulation of vvl by Ken was tested. In wild-type embryos, vvl is expressed in the developing trachea and lateral ectoderm (in a JAK/STAT-independent manner) and in the hindgut of stage 12–14 embryos, where it requires JAK/STAT signaling. In upd^OS1A mutants, no vvl expression in the hindgut can be detected, indicating that this locus is a target of pathway activation. When Ken is uniformly misexpressed throughout the embryo, vvl expression is no longer detectable in the hindgut. Thus vvl, like trh and kni, can be a target of Ken-mediated repression (Arbouzova, 2006).

Having established that ectopic Ken is sufficient to downregulate vvl in the hindgut, whether endogenous Ken performs a similar role was determined. One overlap between ken expression and regions known to require JAK/STAT signaling is the developing posterior spiracles, structures in which both the pathway ligand upd and ken are simultaneously expressed. However, vvl is never detected in the posterior spiracle primordia in wild-type embryos, despite JAK/STAT pathway activity induced by upd expression in these tissues. Intriguingly, in a heteroallelic combination of the strongest ken^k11035 allele and Df(2R)Chi^g320, vvl transcript was detected not only in its normal expression domain within the hindgut but also in the posterior spiracles. This ectopic expression is initially detected from late stage 13 and rapidly strengthens during stage 14–15. When ken^k11035/Df(2R)Chi^g320 embryos simultaneously mutant for the amorphic upd^OS1A allele were analyzed, upregulation of vvl in the presumptive posterior spiracles was never observed at the stage by which ectopic vvl expression was first detected in the ken mutant embryos. At later stages, JAK/STAT pathway activity is required for posterior spiracle morphogenesis, posterior spiracles do not form, and upregulated vvl is not present (Arbouzova, 2006).

These results demonstrate that Ken is not only sufficient to downregulate the JAK/STAT pathway-dependent expression of vvl in the hindgut, but its endogenous expression is also necessary for vvl repression in the posterior spiracles. In ken mutants, ectopic vvl expression in the posterior spiracles results from a derepression of endogenous STAT92E activity (Arbouzova, 2006).

The overlap between the consensus sequences bound by STAT92E and Ken, together with the analysis of reporters containing STAT⁺/Ken⁺ and STAT⁺/Ken⁻ binding sites, indicates that Ken is likely to selectively regulate only a subset of JAK/STAT target genes. In this model, some target genes are regulated by binding sites compatible with both STAT92E and Ken, while others contain sequences with which only STAT92E can associate. While the DNA binding site is critical in cell-culture systems, similar proof is more difficult to establish in vivo. In particular, only a limited number of JAK/STAT pathway target genes have been rigorously demonstrated to require STAT92E binding in vivo (Arbouzova, 2006).

Although studied in some detail, the regulatory domains controlling vvl expression in the developing hindgut have not been identified. Therefore, although these results predict that such a domain would contain STAT⁺/Ken⁺ binding sequences, further analysis is required to confirm this hypothesis. By contrast, the regulatory domain of socs36E required to drive gene expression in the blastoderm, tracheal pits, and hindgut comprises a 350 bp region containing three STAT⁺/Ken⁺ and two STAT⁺/Ken⁻ binding sites. Although not conclusive, the presence of STAT92E-exclusive sites in this region may explain the inability of Ken to downregulate socs36E in vivo (Arbouzova, 2006).

The findings also draw a parallel between Drosophila Ken and BCL6. The data presented demonstrate that both proteins show similar abilities to bind DNA and to mediate transcriptional repression with some evidence also linking BCL6 to JAK/STAT signaling as described here. Taken together, these similarities suggest that Ken and BCL6 represent functional orthologs of one another. Given this evolutionary conservation, it is tempting to speculate that the selective regulation of JAK/STAT pathway target genes is also conserved and may represent a general mechanism by which the pathway is modulated to elicit diverse developmental roles in vivo. Although many STAT targets undoubtedly remain to be identified, it will be intriguing to see which may also be coregulated by Ken/BCL6-dependent mechanisms (Arbouzova, 2006).

Ultraconserved non-coding DNA within Diptera and Hymenoptera

This study has taken advantage of the availability of the assembled genomic sequence of flies, mosquitos, ants and bees to explore the presence of ultraconserved sequence elements in these phylogenetic groups. Non-coding sequences found within and flanking Drosophila developmental genes were compared to homologous sequences in Ceratitis capitata and Musca domestica. Many of the conserved sequence blocks (CSBs) that constitute Drosophila cis-regulatory DNA, recognized by EvoPrinter alignment protocols, are also conserved in Ceratitis and Musca. Also conserved is the position but not necessarily the orientation of many of these ultraconserved CSBs (uCSBs) with respect to flanking genes. Using the mosquito EvoPrint algorithm, uCSBs shared among distantly related mosquito species were identified. Side-by-side comparison of bee and ant EvoPrints of selected developmental genes identifies uCSBs shared between these two Hymenoptera, as well as less conserved CSBs in either one or the other taxon but not in both. Analysis of uCSBs in these dipterans and Hymenoptera will lead to a greater understanding of their evolutionary origin and the function of their conserved non-coding sequences and aid in discovery of core elements of enhancers (Brody, 2020).

Phylogenetic footprinting of Drosophila genomic DNA has revealed that cis-regulatory enhancers can be distinguished from other essential gene regions based on their characteristic pattern of conserved sequences. Cross-species alignments have also identified conserved non-coding sequence elements associated with vertebrate developmental genes, and sequences that are conserved among ancient and modern vertebrates (e.g., the sea lamprey and mammals). Elements conserved between disparate taxa are considered to be 'ultraconserved elements'. Previous studies have identified ultraconserved elements in dipterans, including Drosophila species, sepsids and mosquitos. Consensus transcription factor binding sites from the spider Cupiennius salei and the beetle Tribolium castaneum have been shown to be functional in transgenic Drosophila (Brody, 2020).

This study describes sequence conservation of non-coding sequences within and flanking developmentally important genes in the medfly Ceratitis capitata, the house fly Musca domestica and Drosophila genomic sequences (see Genomic regions analyzed for presence of uCSBs). The house fly and Medfly have each diverged from Drosophila for ~100 and ~120 My respectively. This analysis reveals that, in many cases, CSBs that are highly conserved in Drosophila species, as detected using the Drosophila EvoPrinter algorithm, are also conserved in Ceratitis and Musca. Additionally, the linear order of these ultraconserved CSBs (uCSBs) with respect to flanking structural genes is also maintained. However, a subset of the uCSBs exhibits inverted orientation relative to the Drosophila sequence, suggesting that while enhancer location is conserved, their orientation relative to flanking genes is not (Brody, 2020).

For detection of conserved sequences in mosquitos, EvoPrinter algorithms were adapted to include 22 species of Anopheles plus Culex pipiens and Aedes aegypti. Use of Anopheles species allows for the resolution of CSB clusters that resemble those of Drosophila. Comparison of Anopheles with Culex and Aedes, separated by ∼150 million years of evolutionary divergence, reveals uCSBs shared among these taxa. Although mosquitos are Dipterans, the uCSBs conserved among mosquito species were generally not found in flies (Brody, 2020).

In addition, EvoPrinter tools were developed for sequence analysis of seven bee and thirteen ant species. Both ants and bees belong to the Hymenoptera order and have been separated by ~170 million years. Within the bees, Megachile and Dufourea are sufficiently removed from Apis and Bombus (~100 My) that only portions of CSBs are shared between species: these can be considered to be uCSBs. uCSBs are found that are shared between ant and bee species, and these are positionally conserved with respect to their associated structural genes. Finally, this study shows that ant specific and bee specific CSB clusters that are not shared between the two taxa are in fact interspersed between shared uCSBs (Brody, 2020).

A previous study of 19 consecutive in vivo tested Drosophila enhancers, contained within a 28.9 kb intergenic region located between the vvl and Prat2 genes, revealed that each CSB cluster functioned independently as a spatial/temporal cis-regulatory enhancer (Kundu, 2013). Submission of this enhancer field to the RefSeq Genome Database of Ceratitis capitata via BLASTn revealed 17 uCSBs; all 17 regions were colinear and located between the Ceratitis orthologs of Drosophila vvl and Prat2 genes. In each case the matches between Ceratitis and Drosophila corresponded to either a complete CSB or a portion of a CSB identified by the Drosophila EvoPrinter as being highly conserved among Drosophila species (Kundu, 2013). Submission of the same Drosophila region to the Musca domestica RefSeq Genome Database using BLASTn revealed 13 uCSBs that were colinearly arrayed within the Musca genome. Nine of these Ceratitis and Musca CSBs were present in both species and corresponded to CSBs contained in several of the enhancers identified in a previous study of the Drosophila enhancer field (Kundu, 2013). The conservation within one of these embryonic neuroblast enhancers, vvl-41, is shown in the following figure (Ultra-conserved sequences shared among a Drosophila ventral veins lacking enhancer and orthologous DNA within the Ceratitis capitata and Musca domestica genomes.). Each of the CSB elements in vvl-41 that are shared between Dm and Ceratitis is in the same orientation with respect to the vvl structural gene. Three-way alignments of each of the other eight uCSBs within the vvl enhancer field that are shared between Dm, Ceratitis and Musca are shown in a supplemental figure. The uCSB of vvl-49 in Ceratitis is in reverse orientation with respect to the vvl structural gene. Many of the uCSBs in Musca are in a different orientation on the contig than in Dm, indicating microinversions. 
One of the two uCSBs in Ceratitis goosecoid was in reverse orientation compared to Drosophila CSBs, while three of the four uCSBs in Musca goosecoid were in reverse orientation. One uCSB each in Ceratitis and Musca castor was in reverse orientation compared to Drosophila castor. 10 of the 15 uCSBs in the Musca wingless non-coding region were in the reverse orientation compared to the orientation in Drosophila, while all uCSBs in Ceratitis Dscam2 were in forward orientation compared to the orientation in Drosophila. It is concluded that, except for microinversions, the order and orientation is the same, with respect to flanking genes of highly conserved non-coding sequences in select developmental determinants of Drosophila, Ceratitis and Musca (Brody, 2020).
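The orientation calls described above can be illustrated with a short Python sketch that tests whether a conserved block occurs on a target contig as given or as its reverse complement. This is a simplification and an assumption on several counts: real uCSBs are found by alignment (BLASTn or EvoPrinter) rather than exact string matching, orientation here is relative to the contig strand rather than to a flanking gene, and the sequences below are invented for illustration.

```python
def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def orientation(block, contig):
    """Classify an exact-match block as 'forward', 'reverse' or 'not found'."""
    if block in contig:
        return "forward"
    if reverse_complement(block) in contig:
        return "reverse"
    return "not found"


def summarize(blocks, contig):
    """Count how many blocks fall into each orientation class."""
    counts = {"forward": 0, "reverse": 0, "not found": 0}
    for block in blocks:
        counts[orientation(block, contig)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical contig: one block in forward orientation, one microinverted.
    contig = "AAAA" + "GATTACA" + "CCCC" + reverse_complement("TTGACCGT")
    print(summarize(["GATTACA", "TTGACCGT", "GGGGGG"], contig))
```

A tally of this kind corresponds to statements such as "10 of the 15 uCSBs were in the reverse orientation", once each block has been located by alignment.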

Many of the non-coding regions in dipteran genomes contain uCSBs, especially in and around developmental determinants, and many of these are likely to be cis-regulatory elements such as those found in the vvl enhancer field. Another example is the prevalence of uCSBs found in the non-coding sequences associated with the Dm hth gene locus. A previous study identified an ultraconserved region in hth shared between Drosophila and Anopheles. This study has identified additional hth uCSBs shared among Dm, Ceratitis and Musca. A 55,100 bp upstream region of Dm hth, terminating just after the start of the first exon, was examined. A total of 11 CSBs were shared between the three species, 5 CSBs were shared between Dm and Ceratitis but not Musca, and 6 CSBs were shared between Dm and Musca, but not Ceratitis. Ceratitis exhibited 4 uCSBs and Musca exhibited 8 uCSBs that were in reversed orientation with respect to the Drosophila orthologous regions. Additional genes analyzed in this paper were also analyzed for association with uCSBs in Ceratitis and Musca, and these results are summarized in the table. In some cases, for example wingless in Ceratitis, the presence of uCSBs could not be verified because of the incomplete assembly of the genome, leaving coding sequences and uCSBs on different contigs. In another case, Dscam2 in Musca, no uCSBs were identified (Brody, 2020).

EvoPrint analysis of Drosophila hth sequences immediately upstream and including the first exon, revealed a conserved sequence cluster associated with the transcriptional start site. Two of the longer CSBs were conserved in both Ceratitis and Musca, one shorter CSB was conserved only in Musca, and a second shorter CSB was conserved only in Ceratitis. Each of the uCSBs was in the same orientation with respect to the hth structural gene (Brody, 2020).

EvoPrinting combinations of species using A. gambiae as a reference species and multiple species from the Neocellia and Myzomyia series and the Neomyzomyia provides a sufficient evolutionary distance from A. gambiae to resolve CSBs. Phylogenetic analysis has revealed that the Anopheles species diverged from ~48 My to ~30 My ago, while Aedes and Culex diverged from the Anopheles lineage in the Jurassic era or even earlier (Brody, 2020).

This study sought to identify uCSBs in selected mosquito developmental genes by comparing Anopheles species with Aedes and Culex. Non-coding sequences associated with the mosquito homolog of the morphogen wingless were examined to discover associated conserved non-coding sequences. A CSB cluster slightly more than 27,000 bp upstream of the A. gambiae wingless coding exons is shown (EvoPrint analysis of the intragenic region adjacent to the Anopheles Wnt-4 and wingless genes identifies ultra-conserved sequences shared with the evolutionary distant Culex pipiens and Aedes aegypti genomes). CSB orientation in A. gambiae was reversed with respect to the ORF when compared to the orientations of both Culex and Aedes CSBs. It is noteworthy that this EvoPrint, carried out using multiple Anopheles species, consists of a cluster of CSBs, resembling EvoPrints carried out using Drosophila species. This general pattern of CSB clusters separated by poorly conserved 'spacers' is prevalent among other developmental determinants in mosquitos. uCSBs, conserved in Culex and Aedes, coincide with CSBs revealed by EvoPrint analysis of Anopheles non-coding sequences. A supplemental figure illustrates an EvoPrinter scorecard for the non-coding wingless-associated CSB cluster described in the above figure. Scores for the first four species, all members of the gambiae complex, are similar to that of A. gambiae against itself, with subsequent scores reflecting increased divergence from A. gambiae. Culex and Aedes are distinguished from the other species by their belonging to a distinctive branch of the mosquito evolutionary tree, the Culicinae subfamily, and by their low scores against the A. gambiae input sequence. No uCSBs were detected associated with gbb or gsc, while uCSBs were readily detected associated with vvl, cas and hth. A single uCSB in Aedes cas and two uCSBs in Culex cas exhibited a reverse configuration compared to the uCSBs in Anopheles. 
One uCSB in Culex vvl and no uCSBs in Aedes vvl exhibited a reverse configuration compared to the uCSB in Anopheles. Finally, all uCSBs in Culex and Aedes hth were in forward orientation compared to Anopheles. None of the uCSBs shared between Drosophila, Ceratitis and Musca were conserved in mosquitos, with the exception of a single uCSB associated with a 3'UTR (CTTCGTTTTTGCAAGAGGCCCATATAGCTCGCCAA) that is fully conserved in the Dipteran species tested. A possible explanation for this lack of conservation is that mosquitos are only distantly related to the other Diptera examined (Brody, 2020).

Bees and ants are members of the Hymenoptera Order, representing the Apoidea (bee) and Vespoidea (ant) super-families. Current estimates suggest that the two families have evolved separately for over 100 million years. To identify conserved sequences either shared by bees and ants or unique to each family, EvoPrinter alignment tools were developed for seven bee and 13 ant species and searched for CSBs that flank developmental determinants. Three approaches were employed to identify/confirm conserved elements and their positioning within bee and ant orthologous DNAs. First, EvoPrinter analysis of bee and ant genes identified conserved sequences in either bees or ants and ultra-conserved sequence elements shared by both families. Second, BLASTn alignments of the orthologous DNAs identified/confirmed CSBs that were either bee or ant specific or shared by both. Third, side-by-side comparisons of ant and bee EvoPrints and BLASTn comparisons revealed similar positioning of orthologous CSBs relative to conserved exons (Brody, 2020).

To identify conserved sequences within bee species, EvoPrints of honey bee (Apis mellifera) genes were generated using other Apis and Bombus species. Using EvoPrints of the Dscam2 locus, clusters of conserved sequences were resolved. Dscam2 is implicated in axon guidance in Drosophila and in regulation of social immunity behavior in honeybees. The EvoPrint scorecard revealed a high score (close relationship) with the homologous region in the other two Apis species. The more distant Bombus species score lower by greater than 50%, and Habropoda represents a step down from the more closely related Bombus species. Megachile shows a significantly lower score reflecting its more distant relationship to Apis mellifera. The relaxed EvoPrint readout reveals two CSB clusters. Only one sequence cluster, the lower 3' cluster, is conserved in all six test species examined, while the 5' cluster is present in all species except Megachile. BLAST searches confirmed that the 5' cluster was absent from Megachile, a more distant species Dufourea novaeangliae, and all ant species in the RefSeq genome database. BLASTn alignments also revealed conservation of the 3' cluster in D. novaeangliae, the wasp species Polistes canadensis and two ant species, Vollenhovia emeryi and Dinoponera quadriceps (Brody, 2020).

EvoPrinter analysis of bee and ant genes that are orthologs of Drosophila neural development genes goosecoid (gsc) and castor (cas) revealed conserved non-coding DNA that is unique to either bees or ants or conserved in both. EvoPrints of the Hymenoptera orthologs identify non-coding conserved sequence clusters that contained core uCSBs shared by both ant and bee superfamilies, and these uCSBs are frequently flanked by family-specific conserved clusters. For example, analysis of the non-coding sequence upstream of the Wasmannia auropunctata (ant) cas first exon identifies both a conserved sequence cluster that contains ant and bee uCSBs and an ant specific conserved cluster that has no counterpart found in bees. It is likely that the ant specific cluster was deleted in bees, since BLASTn searches of Wasmannia against the European paper wasp Polistes dominula reveals conservation of a core sequence corresponding to this cluster. The combined evolutionary divergence in the gsc and cas EvoPrints, accomplished by use of multiple test species, reveals that many of the amino acid codon specificity positions are conserved while wobble positions in their ORFs are not. The lack of wobble conservation indicates that the combined divergence of the test species used to generate the prints afford near base pair resolution of essential DNA (Brody, 2020).
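The contrast between conserved codon positions and unconserved wobble positions can be illustrated with a short Python sketch that scores invariance separately for the first, second and third base of each codon in a set of aligned, in-frame ORF segments. The function name and the toy sequences are hypothetical; they are not taken from the gsc or cas EvoPrints.

```python
def conservation_by_codon_position(aligned_orfs):
    """Return the fraction of invariant columns at codon positions 1, 2 and 3.

    aligned_orfs: equal-length, gap-free, in-frame coding sequences
    (an assumption of this sketch; real alignments may contain gaps).
    """
    length = len(aligned_orfs[0])
    if length == 0 or length % 3 != 0:
        raise ValueError("sequences must be non-empty and a multiple of 3 long")
    if any(len(seq) != length for seq in aligned_orfs):
        raise ValueError("sequences must all have the same length")

    conserved = [0, 0, 0]
    total = [0, 0, 0]
    for i in range(length):
        column = {seq[i] for seq in aligned_orfs}
        total[i % 3] += 1
        if len(column) == 1:  # the same base in every species
            conserved[i % 3] += 1
    return [c / t for c, t in zip(conserved, total)]


if __name__ == "__main__":
    # Toy alignment: three species, codons differ only at the third (wobble) base.
    toy = ["GCTGGAAAA", "GCCGGTAAG", "GCAGGCAAA"]
    print(conservation_by_codon_position(toy))  # [1.0, 1.0, 0.0]
```

When the combined divergence of the test species is large enough, a low third-position value alongside high first- and second-position values is the pattern the text describes as near base pair resolution.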

Cross-group/side-by-side bee and ant comparison of their conserved DNA was performed using bee specific and ant specific EvoPrints and by BLASTn alignments (see Side-by-side comparison of conserved sequences within the bee and ant glass bottom boat loci identify clusters of conserved and species-specific sequences.). This figure highlights the conservation observed among bee and ant exons and flanking sequence of the glass bottom boat (gbb, 60A) locus of Apis mellifera EvoPrinted with four bee test species and the Wasmannia auropunctata gbb locus EvoPrinted with three ant species. Position and orientation of these CSB clusters and uCSBs are conserved. Similarly, EvoPrinting a single exon and flanking regions of the Apis mellifera homothorax locus with four bee species and generating an ant specific EvoPrint of the orthologous ant sequence of the Ooceraea biroi homothorax locus with ten other ant species, reveals CSBs that are conserved in both Apis and Ooceraea, as well as sequences that are restricted to one of the two Hymenopteran families (Brody, 2020).

This study describes the use of EvoPrinter to detect the presence of ultraconserved non-coding sequences in flies, including Drosophila species, Ceratitis and Musca, in mosquitos and in Hymenoptera species. uCSBs of the three fly taxa have, for the most part, maintained their linear order suggesting a functional constraint on the order of regulatory sequences. For mosquitos, an older taxon than that of flies and the Hymenoptera, uCSBs are found to be shared between Anopheles, Culex and Aedes. Importantly, in Hymenoptera, uCSBs were found within clusters of conserved sequences shared between ants and bees. This conservation of core sequences in enhancers suggests that these morphologically divergent taxa share common regulatory networks. These approaches to detection of uCSBs in flies, mosquitos and ants and bees will lead to a greater understanding of their evolutionary origin and the function of their conserved non-coding sequences. Knowledge of clusters of CSBs and of uCSBs is an important tool for discovery of the core elements of enhancers and their sequence extent (Brody, 2020).

In most cases both BLASTn and the EvoPrinter algorithm had similar sensitivities and gave comparable results. However, it is recommended that the two techniques be used in conjunction with one another to enhance CSB and uCSB detection. For example, by using both approaches, uCSBs were discovered that were identified by one tool but not both. The advantage of EvoPrinter is the presentation of an interspecies comparison as a single sequence, while the advantage of BLASTn is that it provides a sensitive detection of sequence homology in a one-on-one alignment. EMBOSS Needle alignment gives an even more sensitive detection of shorter sequences and is of use once BLAT or EvoPrinter has been used to discover shared CSBs and/or CSB clusters (Brody, 2020).
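The recommendation to combine the two tools can be sketched as a comparison of two sets of called intervals: blocks found by both tools, and blocks found by only one. The Python below is a minimal illustration that assumes each tool's output has already been reduced to half-open (start, end) coordinates on the same reference sequence; the function names and coordinates are hypothetical and do not reflect the actual output format of EvoPrinter or BLASTn.

```python
def overlaps(a, b):
    """True if half-open intervals a = (start, end) and b share at least one base."""
    return a[0] < b[1] and b[0] < a[1]


def compare_calls(evoprinter_calls, blastn_calls):
    """Split conserved-block calls into those supported by both tools or by one only."""
    both = [a for a in evoprinter_calls
            if any(overlaps(a, b) for b in blastn_calls)]
    evoprinter_only = [a for a in evoprinter_calls
                       if not any(overlaps(a, b) for b in blastn_calls)]
    blastn_only = [b for b in blastn_calls
                   if not any(overlaps(a, b) for a in evoprinter_calls)]
    return {"both": both,
            "evoprinter_only": evoprinter_only,
            "blastn_only": blastn_only}


if __name__ == "__main__":
    # Hypothetical coordinates on a shared reference sequence.
    evo = [(100, 130), (400, 420)]
    blast = [(110, 140), (900, 915)]
    print(compare_calls(evo, blast))
```

Blocks in the two "only" lists are the candidates that would be missed if a single tool were used.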

Targets of Activity

vvl binds to upstream sequences of dopa decarboxylase, which catalyzes the last step in the synthesis of serotonin and dopamine (Johnson, 1990 and Billin, 1991).

Both breathless and drifter are expressed in all tracheal cells and are essential for directed cell migrations. Ubiquitously expressed Breathless protein under control of a heterologous heat-shock promoter is able to rescue the severely disrupted tracheal phenotype associated with drifter loss-of-function mutations. In the absence of Drifter function, breathless expression is initiated normally but transcript levels fall drastically to undetectable levels as tracheal differentiation proceeds. In addition, breathless regulatory DNA contains seven high affinity Drifter binding sites similar to previously identified Drifter recognition elements. These results suggest that the Drifter protein, which maintains its own expression through a tracheal-specific autoregulatory enhancer, is not necessary for initiation of breathless expression but functions as a direct transcriptional regulator necessary for maintenance of breathless transcripts at high levels during tracheal cell migration. This example of a mechanism for maintenance of a committed cell fate offers a model for understanding how essential gene activities can be maintained throughout organogenesis (Anderson, 1996).

vvl is independently required for the specific expression in the tracheal cells of thick veins (tkv) and rhomboid (rho), two genes whose mutations disrupt only particular branches of the tracheal system. Expression in the tracheal cells of an activated form of tkv, the Decapentaplegic receptor, induces shifts in the migration of these cells, asserting the role of the dpp pathway in establishing the branching pattern of the tracheal tree. In addition, by ubiquitous expression of the btl and tkv genes in vvl mutants it is shown that both genes contribute to vvl function. These results indicate that through activation of its target genes, vvl makes the tracheal cells competent to further signaling and suggest that the btl transduction pathway could collaborate with other transduction pathways also regulated by vvl to specify the tracheal branching pattern (Llimargas, 1997).

The Drosophila tracheal system is a network of epithelial tubes that arises from the tracheal placodes, lateral clusters of ectodermal cells in ten embryonic segments. The cells of each cluster invaginate and subsequent formation of the tracheal tree occurs by cell migration and fusion of tracheal branches, without cell division. The combined action of the Decapentaplegic (Dpp), Epidermal growth factor (EGF) and breathless/branchless pathways is thought to be responsible for the pattern of tracheal branches. This study asks how these transduction pathways regulate cell migration and examines the consequences of the Dpp and EGF pathways for cell behaviour. rhomboid (rho) mutant embryos display defects not only in tracheal cell migration but also in tracheal cell invagination, unveiling a new role for EGF signaling in the formation of the tracheal system. These results indicate that the transduction pathways that control tracheal cell migration are active in different steps of tracheal formation, beginning at invagination (Llimargas, 1999).

Defects in tracheal migration are associated with defects in invagination in rho and vvl mutant embryos, but not in bnl and btl mutant embryos. This is consistent with the observation that EGF-dependent activation of MAP kinase (ERK) in the tracheal placode precedes ERK activation by the Bnl/Btl pathway. Thus the tracheal phenotype of mutations in the EGF pathway, which has been shown to result from impaired activity of the pathway in the trachea, is likely to originate before the onset of migration. It has been proposed that the EGF pathway might be required for tracheal cells in specific branches to follow the leading cell. Tracheal migration defects of rho mutant embryos are due, at least in part, to the failure of some tracheal cells to invaginate (Llimargas, 1999).

These observations also illustrate the role of vvl in tracheal formation. Since btl expression is normally initiated in vvl mutants, early but not sustained activity of the Btl pathway could cause the tracheal phenotype in vvl mutant embryos. Since vvl is also required for the tracheal expression of tkv and rho, failure to activate the Dpp and EGF pathways could also be the source of the cell shape phenotypes in vvl mutant embryos. This latter possibility is substantiated by the observation that vvl and rho mutant embryos show abnormalities in tracheal invagination that are not present in btl mutant embryos. Finally, the tkv;rho double mutant tracheal phenotype is very similar to the vvl phenotype (Llimargas, 1999).

The Drosophila Vestigial protein has been shown to play an essential role in the regulation of cell proliferation and differentiation within the developing wing imaginal disc. Cell-specific expression of vg is controlled by two separate transcriptional enhancers. The boundary enhancer controls expression in cells near the dorsoventral (DV) boundary and is regulated by the Notch signal transduction pathway, while the quadrant enhancer responds to the Decapentaplegic and Wingless morphogen gradients emanating from cells near the anteroposterior (AP) and DV boundaries, respectively. MAD-dependent activation of the vestigial quadrant enhancer results in broad expression throughout the wing pouch, but expression is excluded from cells near the DV boundary. This exclusion was previously thought to be due to direct repression by a signal from the DV boundary; however, it has been shown to be due to the absence of an additional essential activator in those cells. The Drosophila POU domain transcriptional regulator, Drifter, is expressed in all cells within the wing pouch expressing a vgQ-lacZ transgene and is also excluded from the DV boundary. Viable drifter hypomorphic mutations cause defects in cell proliferation and wing vein patterning correlated with decreased quadrant enhancer-dependent expression. Drifter misexpression at the DV boundary using the GAL4/UAS system causes ectopic outgrowths at the distal wing tip due to induction of aberrant Vestigial expression, while a dominant-negative Drifter isoform represses expression of vgQ-lacZ and causes severe notching of the adult wing. In addition, an essential evolutionarily conserved sequence element bound by the Drifter protein with high affinity has been identified adjacent to the MAD binding site within the quadrant enhancer. These results demonstrate that Drifter functions along with MAD as a direct activator of Vestigial expression in the wing pouch (Certel, 2000).

Localized activation of the Ras/Raf pathway by epidermal growth factor receptor (Egfr) signalling specifies the formation of veins in the Drosophila wing. However, little is known about how the Egfr signal regulates transcriptional responses during the vein/intervein cell fate decision. Evidence is provided that Egfr signaling induces expression of vein-specific genes by inhibiting the Capicua (Cic) HMG-box repressor, a known regulator of embryonic body patterning. Lack of Cic function causes ectopic expression of Egfr targets such as argos, ventral veinless and decapentaplegic and leads to formation of extra vein tissue. In vein cells, Egfr signaling downregulates Cic protein levels in the nucleus and relieves repression of vein-specific genes, whereas intervein cells maintain high levels of Cic throughout larval and pupal development, repressing the expression of vein-specific genes and allowing intervein differentiation. However, regulation of some Egfr targets such as rhomboid appears not to be under direct control of Cic, suggesting that Egfr signaling branches out in the nucleus and controls different targets via distinct mediator factors. These results support the idea that localized inactivation of transcriptional repressors such as Cic is a rather general mechanism for regulation of target gene expression by the Ras/Raf pathway (Roch, 2002).

There are two key aspects of Cic function as a developmental regulator: its ability to repress specific target genes in defined territories, and its inhibition by the Ras/Raf pathway to allow expression of those targets in complementary positions. In the blastoderm embryo, Cic is required for development of trunk body regions and represses genes mediating differentiation of terminal structures. Torso RTK activation at each pole of the embryo alleviates Cic-dependent repression and initiates the terminal gene expression program. A similar model is proposed for cic function during specification of vein versus intervein fate in the wing. Loss of cic function in the wing causes formation of ectopic vein tissue, implying that Cic mediates intervein specification by restricting vein formation to appropriate regions. In intervein territories, Cic behaves as a repressor of vein-specific genes such as argos and vvl but does not seem to affect directly the expression of blistered, which is required for the specification of intervein fates. Finally, Egfr signaling leads to downregulation of Cic protein levels in vein nuclei, thus relieving Cic-mediated repression and promoting vein development (Roch, 2002).

Vvl regulates decapentaplegic expression during Drosophila wing veins pupal development

The differentiation of veins in the Drosophila wing relies on localised expression of decapentaplegic in pro-vein territories during pupal development. The expression of dpp in the pupal veins requires the integrity of the shortvein region (shv), localised 5' to the coding region. It is likely that this DNA integrates positive and negative regulatory signals directing dpp transcription during pupal development. A minimal 0.9 kb fragment has been identified giving localised expression in vein L5 and a 0.5 kb fragment giving expression in all longitudinal veins. Using a combination of in vivo expression of reporter genes regulated by shv sequences, in vitro binding assays, and sequence comparisons between the shv region of different Drosophila species, binding sites were found for the vein-specific transcription factors Araucan, Knirps and Ventral veinless, as well as binding sites for the Dpp pathway effectors Mad and Med. It is concluded that conserved vein-specific enhancers regulated by transcription factors expressed in individual veins collaborate with general vein and intervein regulators to establish and maintain the expression of dpp confined to the veins during pupal development (Sotillos, 2006).

The expression of dpp in the wing disc is restricted to a narrow stripe of anterior cells localised at the anterior/posterior compartment boundary. This expression is regulated by sequences localised 3′ to the dpp coding region, and the function of the gene at this stage is required for the growth and patterning of the wing. The expression of dpp is still detected at the A/P boundary during the first 8 h of pupal development. Later, at 14 h APF, novel domains of dpp expression appear corresponding to the developing wing veins. The function of dpp during pupal development requires the integrity of the shv region, which is localised 5′ to the dpp coding region. There are two different transcripts expressed during pupal development, transcripts dpp-RA and dpp-RC, whose promoters (P5 and P3, respectively) are separated by approximately 20 kb of DNA. This DNA includes the first exon of transcript dpp-RC and corresponds to the place where all dpp^s alleles map. Because the strength of dpp^s alleles correlates with their distance to the P3 promoter, it is likely that dpp function in pupal development is mediated mainly by transcript dpp-RC. This suggests that dpp^s mutations affect regulatory sequences necessary to drive dpp expression in presumptive vein territories during pupal development. This possibility was confirmed by analysing the expression of an 8.5 kb construct containing most of the shv region fused to the reporter gene lacZ (shv^8.5–lacZ). The expression of βGal in shv^8.5–lacZ is detected exclusively in the pupal veins, indicating that this region includes all the dpp regulatory regions for the wing veins (Sotillos, 2006).

Several constructs were made using different sub-fragments from the original 8.5 kb dpp^s DNA to identify with more precision the sequences that regulate dpp expression during pupal development. These fragments were cloned in front of a dicistronic lacZ–Gal4 reporter gene and the activity of these constructs was analysed by looking at the expression of βGal in pupal wings from transgenic flies. In addition, to amplify the signal of the dicistronic lacZ–Gal4 gene, the expression of a reporter gene regulated by UAS sequences was monitored. This expression should reveal all places where the Gal4 protein is present. Several regulatory regions were detected that control dpp expression in the veins during pupal development. One regulatory sequence lies in a 1.1 kb fragment located 6.5 kb from P3, and drives high levels of expression in most pupal veins and low levels of expression in some interveins. Additional regulatory sequences that efficiently drive expression in most veins are localised in an adjacent 0.5 kb fragment, and further vein-specific regulatory sequences for L5 are localised in the 0.9 kb SalI/KpnI fragment (Sotillos, 2006).

The expression of dpp during embryogenesis is highly dynamic and several independent regulatory regions controlling embryonic dpp expression have already been identified. The shv constructs included in the 8.5 kb EcoRI fragment drive reporter expression during embryonic development from stage 12/13 mainly in three regions of the mesoderm: the oesophagus, gastric caeca and midgut. Regulatory regions controlling dpp expression in the oesophagus appear to be duplicated, because they are localised in the 2.7 kb EcoRI/SalI fragment and also in the 3 kb KpnI/KpnI fragment. Similarly, regions controlling dpp expression in the gastric caeca are also present in the two adjacent fragments 0.9 kb SalI/KpnI and 3 kb KpnI/KpnI. The regions driving reporter expression in the gut are localised in the 3′ end of the shv region (Sotillos, 2006).

To better understand the regulation of dpp expression during vein development, the interactions between a 2.5 kb region including the wing vein pupal enhancers and several proteins involved in the regulatory network controlling the formation of veins were analyzed. For this purpose, the 2.5 kb region was subdivided into overlapping fragments of 250-300 bp, which were used as probes to detect the binding of different transcription factors by Electrophoretic Mobility-Shift Assays (EMSAs). Both the products of prepattern genes that control vein development, such as Ventral veinless (Vvl) and the Araucan protein (Ara), and transcription factors belonging to the Dpp pathway (Mad and Medea) were analyzed (Sotillos, 2006).
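The subdivision of a region into overlapping probes described above can be illustrated with a short sketch. The probe length (300 bp) and overlap (50 bp) used here are hypothetical values chosen only to fall within the 250-300 bp range mentioned in the text; the actual probe boundaries used in the study are not given here, and the function name `tile_region` is an illustrative assumption.

```python
def tile_region(region_len, probe_len, overlap):
    """Return (start, end) half-open intervals of overlapping probes.

    Consecutive probes share `overlap` bp; the last probe is truncated
    at the end of the region.
    """
    if not 0 <= overlap < probe_len:
        raise ValueError("overlap must be non-negative and smaller than probe_len")
    step = probe_len - overlap
    probes = []
    start = 0
    while True:
        end = min(start + probe_len, region_len)
        probes.append((start, end))
        if end == region_len:
            break
        start += step
    return probes


# Hypothetical parameters: a 2500 bp region, 300 bp probes, 50 bp overlap.
probes = tile_region(2500, 300, 50)
for i, (s, e) in enumerate(probes, start=1):
    print(f"probe {i}: {s}-{e} ({e - s} bp)")
```

With these assumed parameters every probe is between 250 and 300 bp long and each probe overlaps its neighbour, so a binding site near a probe boundary is still fully contained in at least one probe.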

The POU-Homeodomain protein Vvl is expressed in all vein territories throughout pupal development and is required for vein differentiation. Vvl was able to bind all the probes included in the 2.5 kb SalI/SacII region. The Vvl binding was competed by cold DNA as well as by specific oligonucleotides previously described to compete Vvl binding to the vestigial quadrant enhancer. To further characterise the binding of Vvl to the shv enhancer, focus was placed on the 0.5 kb SacII fragment, which drives expression in all longitudinal veins. This fragment was subdivided into two overlapping probes (S9 and S10) and both of them bind Vvl specifically. These bindings were competed using oligonucleotides covering these probes. Vvl-binding regions were found in S9 and in S10. Interestingly, these sequences contain previously identified consensus sequences for Vvl binding. These data suggest that Vvl participates in vein formation during pupal development through the regulation of dpp expression via the shv enhancers (Sotillos, 2006).

The pattern of four longitudinal veins is very similar in all Drosophilids despite the differences in wing size and pigmentation that exist between species. This conservation suggests that the mechanisms underlying vein pattern formation are conserved. The availability of the genomic sequence for different Drosophila species allows a direct comparison between their dpp^s regions. Two Drosophila species from the melanogaster group (D. melanogaster and D. ananassae), one species from the obscura group (D. pseudoobscura) and D. virilis from the virilis group were compared. It is expected that sequence similarity in non-coding regions corresponds to functional regulatory DNA. In the 2.5 kb region analysed, several clusters of sequence conservation were found, including most of the binding sites identified by in vitro analysis. Thus, there are four highly conserved regions corresponding to the S1, S4-5, S7-8 and S9-S10 probes containing conserved binding sequences for Vvl, Mad, Med and Ara. This conservation reinforces the importance of these regions in regulating the expression of dpp in the pupal veins. In the case of Vvl, specific DNA binding to all probes was shown. However, the putative Vvl binding sequences localised in the conserved regions are only in S1, S3, S7, S8 and S10. In the case of the Dpp pathway transcription factors Mad and Med, putative binding sites are present throughout the enhancer, and accordingly their binding to all probes was shown. However, only the S5 and S10 probes contain putative binding sites in regions of high sequence conservation. Interestingly, these conserved Mad/Med binding regions contain overlapping consensus binding sequences for the Brinker repressor. This suggests a competition mechanism between Mad/Med and Brinker for binding to the shv enhancer. Competition mechanisms between activator and repressor also occur in several Dpp-downstream genes such as zen and omb.
Four consensus binding sequences were found for the transcription factors of the Knirps-complex. The kni genes are expressed in the L2 vein, where they are required for its formation. Three Kni-binding sites were found in the 1.1 kb KpnI/SacII enhancer and one in the 0.5 kb SacII region. Only two of the sequences located in the 1.1 kb KpnI/SacII enhancer show some conservation between Drosophilids. Interestingly, the 0.9 kb SalI/KpnI enhancer responsible for dpp expression in the L5 vein does not contain any putative Knirps binding sequence. Although it has not been analyzed whether Kni binds directly to the shv enhancer, the presence of Kni-binding sites in conserved regions of the enhancer suggests that, in addition to its role during imaginal development, Kni might also activate dpp transcription during pupal development (Sotillos, 2006).

Therefore, regulatory sequences that drive dpp expression in the pupal veins in 2.5 kb of the dpp^s region have been found. This regulatory DNA can be subdivided into three fragments, a 1.1 kb fragment that recapitulates almost completely the pupal expression of dpp, a 0.9 kb upstream fragment, which drives expression in the proximal part of L5, and a 0.5 kb fragment that directs expression in all veins. Binding sites were found in these fragments for general transcription factors involved in the development of all veins (Vvl) and for the downstream activators of the dpp pathway, Mad and Medea. The regulatory region also contains binding sites for transcription factors expressed and required only in specific veins, such as Ara (L3 and L5) and Kni (L2). Most of these sequences are located in highly conserved regions of the dpp gene in different Drosophila species, indicating a general conservation of dpp regulation in the Drosophilids (Sotillos, 2006).

Polarized subcellular localization of Jak/STAT components is required for efficient signaling: Regulation of vvl and crb

Three protein complexes control polarization of epithelial cells: the apicolateral Crumbs and Par-3 complexes and the basolateral Lethal giant larvae complex. Polarization results in the specific localization of proteins and lipids to different membrane domains. The receptors of the Notch, Hedgehog, and WNT pathways are among the proteins that are polarized, with subcellular receptor localization representing an important aspect of signaling regulation. For example, in the WNT pathway, differential DFz2 receptor localization results in activation of either the canonical or the planar polarity pathway. Despite the large body of research on the vertebrate JAK/STAT pathway, there are no reports indicating polarized signaling. By using the conserved Drosophila JAK/STAT pathway as a system, it was found that the receptor and its associated kinase are located in the apical membrane of epithelial cells. Unexpectedly, the transcription factor STAT is enriched in the apicolateral membrane domain of ectoderm epithelial cells in a Par-3-dependent manner. These results indicate that preassembly of STAT and the receptor/JAK complex to specific membrane domains is a key aspect for signaling efficiency. These results also suggest that receptor polarization in the ectoderm cell membrane restricts the cell's response to ligands provided by neighboring cells (Sotillos, 2008).

Besides setting up epithelial polarity, apicobasal complexes also modulate the subcellular compartmentalization or localized activation of various signaling molecules. The JAK/STAT signaling pathway is involved in processes ranging from immune response to organogenesis. In the vertebrate-signaling model, inactive STAT is shuttling from the cytoplasm to the nucleus. Ligand binding to the dimerized receptor results in the activation of JAK bound to the receptor. JAK phosphorylates itself and the receptor, creating docking sites for STAT. Inactive cytoplasmic STAT now binds to the phosphoreceptor/JAK complex, where it is phosphorylated by the kinase. Phosphorylated STAT is imported to the nucleus, where it activates the transcription of target genes. In contrast to vertebrates, in which the JAK/STAT core-signaling elements are highly redundant, the Drosophila pathway is composed of only three ligands, Unpaired (Upd), Unpaired2, and Unpaired3; one receptor, Domeless (Dome); one JAK, Hopscotch (Hop); and one transcription factor, STAT92E. Therefore, Drosophila was used as a model to investigate the polarization of the pathway (Sotillos, 2008).

dome, hop, and stat92E mRNAs are maternally provided and ubiquitously transcribed in the embryo. To analyze their protein subcellular localization, specific antibodies were used or functional tagged proteins were expressed by using UAS-dome, UAS-hop-Myc, and UAS-STAT92E-GFP. These constructs were expressed by using either mesodermal or ectodermal Gal4 drivers, and the subcellular localization of the proteins was analyzed, paying special attention to three organs where the endogenous ligand is expressed and the pathway is active: the posterior spiracles (ectodermal origin), the pharyngeal musculature (mesodermal), and the hindgut (an ectodermal tube surrounded by mesoderm) (Sotillos, 2008).

In the pharynx, as expected for a receptor, Dome localizes to the membrane, and does so in a dotted pattern that could correspond to endocytic vesicles. Hop-myc localizes to the cytoplasm, obscuring any membrane localization. This is due to the high levels of Hop-myc expressed, saturating the receptor binding sites and accumulating in the cytoplasm, as simultaneous coexpression of Hop-myc with the receptor relocates Hop to the membrane. This depends on the cytoplasmic domain of Dome, as it also occurs with a construct missing the extracellular domain but not with constructs missing the intracellular domain. STAT is detected in the cytoplasm and is more concentrated in the nuclei, as expected from the activation of the pathway in the pharynx. All of these observations agree with current knowledge of JAK/STAT activation based on vertebrate studies (Sotillos, 2008).

In contrast to the mesoderm, analysis of ectoderm cells shows a different picture. Both in the hindgut and the posterior spiracles, the Dome receptor localizes on the apical membrane. Hop is again cytoplasmic, but after coexpression with Dome both proteins localize to the apical membrane. Surprisingly, by using a specific antibody it was observed that STAT concentrates on the apical membrane of all embryonic ectodermal cells irrespective of the level of activation of the pathway. In cells in which the pathway is active, STAT also localizes to the nucleus. The signal detected by the antibody is specific; the same result is obtained by using a STAT-GFP fusion protein. STAT membrane localization is more prominent in cells in which the pathway is inactive; for instance, in the trunk epidermis or the spiracle after stage 15. This suggests that STAT translocates from the subapical membrane to the nucleus after pathway activation, returning to the membrane after inactivation (Sotillos, 2008).

To determine if STAT-GFP membrane localization is due to any other of the pathway's components, STAT-GFP localization was analyzed in upd, dome, or hop null mutants. STAT does not disappear from the membrane in a deficiency that removes all three Upd ligands. STAT membrane localization is not affected in null mutants for either dome or hop, demonstrating that apical STAT localization is independent of the pathway (Sotillos, 2008).

STAT localizes to the membrane domain in which the apical complexes are located. This, and the fact that STAT does not localize to the membrane in the mesoderm where Crb and Par-3 complexes are not formed, suggests the apical complexes could be recruiting STAT. To test this, different apical complex proteins were expressed in the mesoderm, and their capacity to modify STAT subcellular localization was studied. Neither the expression of Crb nor that of aPKC (another member of the Par-3 complex) is able to translocate STAT to the membrane. In contrast, expression of Par-3 results in efficient membrane translocation of STAT and STAT-GFP. Moreover, STAT-GFP and Par-3 coimmunoprecipitate from embryo extracts overexpressing STAT-GFP and par-3, pointing to Par-3 as the molecule responsible for STAT apical localization. In accordance, STAT-GFP is lost from the membrane in par-3 zygotic mutants, whereas in crb null mutants, where the polarity is highly compromised and Par-3 localization is severely affected, STAT remains in the membrane only of cells where Par-3 is still present. Similarly, in null aPKC embryos, STAT-GFP remains apical exclusively in cells in which Par-3 still localizes at the membrane. Thus, STAT recruitment is independent of Crb or aPKC and may directly depend on Par-3 (Sotillos, 2008).

To analyze if JAK/STAT polarization is functionally relevant, genetic interactions with polarity mutants were tested. Embryos heterozygous for polarity mutations or for stat92E are viable and normal. In contrast, embryos simultaneously heterozygous mutant for stat92E and either par-3, aPKC, or crb present phenotypes associated with JAK/STAT loss of function, including malformation of the posterior spiracles and abnormal segmentation. A specific readout of the pathway's activity was studied by analyzing the expression of a crb-spiracle enhancer that is directly activated by JAK/STAT. The expression of this enhancer is severely reduced in zygotic par-3 mutants simultaneously heterozygous for stat92E, compared to its expression in heterozygous stat92E embryos or zygotic par-3 mutants. In contrast, the expression of the JAK/STAT independent ems-spiracle enhancer is not affected in the same genetic backgrounds. The capability of Par-3 to induce STAT membrane localization and the strong genetic interaction between stat92E and cell-polarity mutations indicate that the apical polarization of JAK/STAT components is required for full signaling efficiency in the ectoderm (Sotillos, 2008).

Next, it was tested whether the apical localization of all JAK/STAT transducer components in the ectoderm results in signaling occurring exclusively through this membrane domain. For this purpose the posterior hindgut, where JAK/STAT is required in the ectoderm and in the mesoderm surrounding it, was analyzed. Upd expressed from the most anterior ectodermal cells of the hindgut activates ventral veinless (vvl) in the ectoderm and upregulates dome in the mesoderm through the dome-MESO enhancer. Thus, vvl and the dome-MESO autoregulatory enhancer can be used as readouts for JAK/STAT activation in the different hindgut tissues (Sotillos, 2008).

If signaling in the ectoderm were transduced exclusively through the apical membrane, it would be expected that vvl activation on the hindgut would not be possible if Upd is presented from the basal side. To test this Upd was expressed either in the ectoderm or in the mesoderm, and its effect on vvl activation in the ectoderm was analyzed. As a positive control the expression of dome-MESO was analyzed. When expressed throughout the ectoderm, Upd induces ectopic expression of dome-MESO in the mesoderm and of vvl in the ectoderm, behaving as the endogenous Upd. In contrast, when Upd is expressed throughout the mesoderm, dome-MESO is ectopically activated, whereas vvl is not. The unresponsiveness of the ectoderm cells to Upd from the mesoderm is consistent with the endogenous receptor being apically localized in the hindgut ectoderm and, thus, unable to receive any mesoderm signal (Sotillos, 2008).

Many proteins involved in the establishment and maintenance of cell polarity also modulate signaling pathways by modifying or restricting the localization of their signaling components. Precise subcellular distribution may help the activation of the pathway or restrict its activity by sequestering key elements. This study has shown that in the epithelial cells the localization of JAK/STAT components is highly polarized. The apical restriction of the receptor can influence transduction, since only ligand presented to the apical side of the epithelium would be detected. This may be of relevance after septic injury, when circulating haemocytes secrete the Upd3 cytokine into the haemolymph. In this case, the secreted ligand would activate its targets in the fat body without stimulating the ectoderm epithelial cells, since the cell junctions efficiently block Upd diffusion to the apical side (Sotillos, 2008).

Par-3-dependent STAT apical localization is intriguing. The localization of STAT to the subapical membrane seems important for signal transduction, since mutations reducing the amount of cell polarity proteins enhance stat loss of function phenotypes and reduce the activation of direct pathway targets. It is proposed that in ectodermal cells, where the receptor and the kinase locate apically, the existence of a subapical pool of STAT facilitates its rapid translocation to the activated receptor, increasing signaling efficiency. Future research should resolve whether this is achieved simply by the increased local concentration of apical STAT facilitating receptor binding or whether there exists some dedicated machinery to translocate STAT from the subapical region to the active receptor, similar to the one involved in nuclear import. It is interesting to note that crb expression is upregulated by JAK/STAT signaling in the follicle cells and in the posterior spiracles. Since Crb helps maintain Par-3 in the apical membrane, upregulation of crb by STAT might increase apical Par-3, reinforcing signal transduction by increasing the apical concentration of STAT (Sotillos, 2008).

There are few reports of polarized vertebrate JAK/STAT signaling. However, analysis of the subcellular localization of two IL-6 receptors in MDCK epithelial cells has shown that gp130 localizes basolaterally and CNTF-R apically. Also, in the mammary glands, the IL-4Ra receptor is localized apically in luminal cells during gestation and lactation. Recently, activated STAT3 has been transiently detected at the membrane in the nascent cell-cell contacts of squamous cell carcinoma of the head and neck. In vertebrates the Par-3 complex functions as a regulator of junction biogenesis. It will be interesting to investigate whether Par-3 also mediates the localization of STAT3 in the membrane. The results suggest that JAK/STAT polarization in epithelia may be a general feature (Sotillos, 2008).

Identification of motifs that are conserved in 12 Drosophila species and regulate midline glia vs. neuron expression

Functional complexity of the central nervous system (CNS) is reflected by the large number and diversity of genes expressed in its many different cell types. Understanding the control of gene expression within cells of the CNS will help reveal how various neurons and glia develop and function. Midline cells of Drosophila differentiate into glial cells and several types of neurons and also serve as a signaling center for surrounding tissues. This study examined regulation of the midline gene, wrapper, required for both neuron-glia interactions and viability of midline glia. A region upstream of wrapper required for midline expression was identified that is highly conserved (87%) between 12 Drosophila species. Site-directed mutagenesis identifies four motifs necessary for midline glial expression: (1) a Single-minded/Tango binding site, (2) a motif resembling a Pointed binding site, (3) a motif resembling a Sox binding site, and (4) a novel motif. An additional highly conserved 27 bp are required to restrict expression to midline glia and exclude it from midline neurons. These results suggest short, highly conserved genomic sequences flanking Drosophila midline genes are indicative of functional regulatory regions and that small changes within these sequences can alter the expression pattern of a gene (Estes, 2008).

To facilitate the identification of sequences responsible for wrapper expression in the midline glia of Drosophila, the genomic region flanking the wrapper transcription unit was examined to determine the degree of conservation between the 12 available Drosophila species. The regions most likely to contain regulatory control elements (motifs) of wrapper are tractable; the genomic regions flanking the transcription unit and the first intron are relatively small. The results of this analysis highlighted a region between -492 and -326 upstream of the transcription start site of wrapper that is highly conserved in all 12 Drosophila species examined, particularly a 70-bp region. To test if these sequences are responsible for the wrapper expression pattern in embryos, this genomic region was amplified within an 884-bp fragment and then fused to the green fluorescent protein (GFP) reporter gene within the pHstinger vector, which contains a minimal Hsp70 promoter. This DNA construct (wrapper W:GFP) was injected into D. melanogaster embryos using P element-mediated transformation to generate stable fly lines. Embryos containing this construct express GFP in midline glia beginning at stage 12 of embryogenesis and throughout larval stages. It was confirmed that GFP was expressed in midline glia by staining embryos simultaneously for either (1) wrapper and GFP or (2) sim and GFP. Because wrapper protein is found at the surface of midline glial cells, but the GFP produced by pHstinger localizes to the nucleus, wrapper protein encircles the GFP in these cells. The wrapper W:GFP reporter construct also drives expression in a few additional cells within the lateral CNS and muscles, a pattern that differs from the endogenous wrapper expression pattern. This suggests that the W fragment, although sufficient to drive high levels of expression in midline glia, lacks certain sequences that exclude expression in lateral CNS cells.
To confirm the midline expression pattern generated by the reporters, all subsequent experiments were performed by staining embryos with both sim and GFP at stage 16 of embryogenesis. These experiments revealed that GFP generated by the wrapper W:GFP reporter gene was indeed expressed in the midline glia, but not in the cells that develop into midline neurons (Estes, 2008).

Next, to determine the minimal sequences required to provide expression in midline glia, this 884-bp region was divided into several subregions, fused to GFP within the pHstinger vector and tested for the ability to drive midline expression in transgenic embryos. Region E, extending from sequences -756 to -286, is sufficient to drive high levels of GFP expression in midline glia. Moreover, a smaller 166-bp (-492 to -327) G fragment, and an even smaller 119-bp (-492 to -374) internal K fragment, that both include the highly conserved region, are also sufficient to drive GFP expression in midline glia, but the level of expression is reduced compared to that of the E fragment and the intact 884-bp W fragment. None of the other reporter constructs drove GFP expression in the midline. The K fragment is also expressed in a subset of midline neurons, including progeny of the median neuroblast, suggesting that the larger W, E, and G fragments contain a silencer, which is absent from the K fragment and normally represses expression in these midline neurons (Estes, 2008).
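The fragment sizes quoted for the G and K regions can be checked against their coordinates. The short sketch below assumes that both endpoints are counted (inclusive coordinates relative to the transcription start site); this convention is an assumption, chosen because it reproduces the stated 166-bp and 119-bp lengths. The function name `fragment_length` is illustrative only.

```python
def fragment_length(start: int, end: int) -> int:
    """Length in bp of a fragment whose two endpoints are both counted.

    Coordinates are positions relative to the transcription start site
    (negative = upstream). Inclusive counting is an assumption.
    """
    return abs(end - start) + 1


# Coordinates as given in the text for the G and K fragments.
fragments = {"G": (-492, -327), "K": (-492, -374)}
lengths = {name: fragment_length(s, e) for name, (s, e) in fragments.items()}

for name, bp in lengths.items():
    print(f"{name} fragment: {bp} bp")
```

Under this convention the G and K coordinates give 166 bp and 119 bp, matching the sizes reported above.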

Next, to determine if the observed conservation at the sequence level between Drosophila species reflects conservation in function, the corresponding E region from D. virilis was tested to see if it could drive GFP reporter expression in the midline glia of D. melanogaster. The E region is also located upstream of wrapper in D. virilis and is 476 bp in length, while it is 462 bp in melanogaster. The entire E region is 58.4% identical in the two species, and the 70-bp highly conserved section differs by only six nucleotides. The midline expression pattern provided by the D. virilis wrapper E:GFP construct in D. melanogaster flies is indistinguishable from that of the corresponding D. melanogaster E region. These results suggest that the location and function of the regulatory sequences of wrapper have been conserved between D. melanogaster and D. virilis (Estes, 2008).

To determine if previously identified midline transcription factors affect wrapper through these regulatory sequences, the wrapper W:GFP reporter gene was tested in a number of mutant backgrounds. First, the effect of sim mutations on the reporter gene was tested by placing the 884-bp wrapper W:GFP transgene into a sim^H9 mutant background, a mutation that eliminates Sim protein expression. In this background, GFP expression was abolished in most cells, suggesting that sim expression is required for wrapper transcriptional activation in the midline. A few remaining cells did express GFP and these are likely lateral CNS cells also observed in wild-type embryos containing the wrapper W:GFP reporter (Estes, 2008).

Next, the reporter gene was tested in a spitz (spi) mutant background. Spi is a signaling molecule that plays multiple roles during Drosophila development. Wrapper protein is normally found on the surface of midline glia where it mediates direct contact with the lateral CNS axons that cross the midline and promotes survival of midline glia. In wrapper mutant embryos, this intimate interaction cannot occur and additional midline glia die. The amount of spi signaling provided by lateral CNS axons determines how many midline glia survive in each segment. The spi mutation severely disrupted CNS development so that the sim positive cells remained on the ventral surface of the embryo. Only a few of the sim positive cells also express GFP driven by wrapper regulatory sequences, suggesting these are the remaining midline glia. The cells expressing sim, but not GFP, are likely midline neurons, while cells expressing GFP and not sim are lateral glia, because they also express reversed polarity (repo), a marker of lateral CNS glia. These results indicate spi mutations reduce the number of midline glia in the embryo and also reduce expression of the wrapper W:GFP reporter gene (Estes, 2008).

In addition to sim and tgo, the transcription factors Dichaete (D), a Sox HMG protein, and Dfr, a POU domain protein, regulate genes expressed in midline glia. The D protein directly interacts with the PAS domain of Sim and the POU domain of Dfr, and all three genes activate expression of slit in midline glia. The wrapper W:GFP construct was tested in both a D and a dfr mutant background. In both cases, the number and behavior of midline cells were altered and they did not migrate to the dorsal region of the ventral nerve cord, as they normally do. While development of midline cells was disrupted in these mutant backgrounds and fewer midline glia were present, robust GFP expression was still observed from the reporter construct in the midline cells that remained, suggesting that (1) D and Dfr do not directly activate wrapper via these regulatory sequences, (2) additional, redundant factors exist that can substitute for them, or (3) they can substitute for one another (Estes, 2008).

In summary, midline cell development was disrupted in sim, spi, D, and dfr mutant backgrounds. The sim^H9 mutation eliminated midline glia and neurons, while a mutation in spi eliminated most midline glia. As predicted, both sim and spi mutations severely reduced the number of cells expressing GFP driven by the wrapper W:GFP reporter gene. In the D and dfr mutants, the number of midline glia was reduced and the remaining midline glia expressed high levels of GFP (Estes, 2008).

Ectopic sim expression converts neuroectodermal cells into midline cells and activates downstream, midline genes. To test the effect of ectopic sim on wrapper expression, sim was overexpressed using the UAS/GAL4 system and it was found that wrapper was expressed in neuroectodermal cells outside of the midline, but not in all cells that overexpress sim. In the UAS-sim/da-GAL4 embryos, wrapper is activated in cells that correspond to the lateral edges of the CNS and the cells in the anterior of each segment, with gaps in the expression pattern. Next, whether overexpression of the secreted form of spi could expand wrapper to cells outside the midline was tested. Ectopic expression of secreted spi with the da-GAL4 driver also expanded wrapper expression. To determine if it is possible to expand the expression domain of wrapper further, sim was overexpressed together with spi. This caused additional expansion of the wrapper domain into broad stripes within ectodermal cells. In addition, overexpression of either sim or spi causes severe disruption in embryonic development (Estes, 2008).

Next, the ability of sim and spi, either alone or together, to expand expression of the wrapper reporter genes was tested. Expression from both the full-length reporter construct, wrapper W:GFP, and the smaller wrapper G:GFP construct expanded in the UAS-sim/da-GAL4 embryos to a greater extent than the endogenous wrapper gene. The expression pattern provided by the reporter constructs differs from the endogenous wrapper expression pattern, suggesting that either (1) some of the sequences that normally repress wrapper in tissues outside the midline glia may be missing in these wrapper W and G constructs, or (2) ectodermal cells overexpressing sim may undergo cell death and the GFP marker may be more stable in these dying cells compared to wrapper. Overexpression of spi alone also expanded reporter gene expression driven by both the wrapper W:GFP and wrapper G:GFP constructs. The GFP expression domain was expanded to a greater extent in embryos overexpressing sim together with spi compared to those overexpressing either gene alone. Taken together, the results indicate that (1) limiting the wrapper regulatory sequences and (2) increasing the cells that express sim and spi converts the highly specific expression pattern of wrapper from a single strip of CNS cells to a more general pattern throughout the ectoderm of the embryo. In addition, these results suggest that both the sim transcription factor and spi signaling molecule can activate transcription through these sequences derived from the regulatory region of wrapper (Estes, 2008).

To both (1) identify functionally important motifs needed for wrapper expression and (2) determine if all the invariant nucleotides within the conserved 70-bp region of wrapper are essential for the observed midline glial expression pattern, effects of select mutations within the wrapper G region were tested. Previous studies have demonstrated the importance of sim/tgo, D, dfr, and spi for the expression of midline glial genes and, therefore, possible binding sites for these factors were sought. To examine both predicted binding sites, as well as other conserved sequences that may contain binding sites for novel factors, the region was divided into eight motifs that were tested for their effect on midline glia expression (Estes, 2008).

Each of these conserved motifs was tested by changing 2-3 nucleotides in the context of the D. melanogaster G fragment. The altered G fragments were then inserted independently into the pHstinger vector and injected into fly embryos to test their ability to drive midline expression (Estes, 2008).

Despite the high degree of conservation within this region, only four of the eight mutations that were tested (G1, G2, G5, and G7) caused a noticeable reduction in reporter expression. Two of the mutation sets destroyed midline expression of the G reporter construct. The putative Sim/Tgo binding site (G2: CACGT) was needed for midline expression, because changing this sequence to GAAGT eliminated midline glial expression. In addition, another sequence, ATTTTATC (G5), located upstream of G2, was required for expression of the reporter gene in wild-type embryos, and changing this sequence to ATTGGATC eliminated midline glial expression. Two additional sites within the G fragment of wrapper are needed for midline expression: CGGAGAG (G7) and CACAAT (G1). If either of these motifs is altered, midline glial expression is greatly reduced, but not completely eliminated (Estes, 2008). In contrast, the other four sets of mutations had no detectable negative effect on midline glial expression of the reporter gene, even though these sequences are conserved in all 12 Drosophila species. Mutation sets G4, G6, and G8 did cause a low level of reporter gene activation in some midline neurons, suggesting that repressor proteins present in midline neurons may interact with these regions of the wrapper regulatory region. Finally, mutation G3 had no detectable positive or negative effect on expression of the reporter gene, despite being conserved in all 12 Drosophila species. In summary, the various mutations had three different effects on expression driven by the wrapper regulatory sequences: (1) some reduced midline glial expression, (2) some caused the inappropriate activation of the wrapper reporter in midline neurons, and (3) one was conserved, but apparently had no effect on wrapper regulation, in the context of the experiments presented here (Estes, 2008).
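The nucleotide changes behind the two mutations that abolished midline expression can be listed directly from the wild-type and mutant sequences given above. The following is a minimal sketch for illustration only; the helper name `substitutions` is hypothetical and is not part of any analysis described by Estes (2008).

```python
def substitutions(wild_type: str, mutant: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, wild-type base, mutant base) for each change."""
    if len(wild_type) != len(mutant):
        raise ValueError("sequences must have the same length")
    return [
        (i + 1, wt, mt)
        for i, (wt, mt) in enumerate(zip(wild_type, mutant))
        if wt != mt
    ]


# Wild-type and mutant sequences as quoted in the text.
mutations = {
    "G2": ("CACGT", "GAAGT"),
    "G5": ("ATTTTATC", "ATTGGATC"),
}

changes = {site: substitutions(wt, mt) for site, (wt, mt) in mutations.items()}
for site, diffs in changes.items():
    print(site, diffs)
```

Both mutations alter two nucleotides, in line with the 2-3 nucleotide changes described for the mutated G fragments.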

Therefore, these experiments suggest that Sim/Tgo heterodimers may directly regulate wrapper gene expression. (1) Activity of the wrapper W:GFP reporter gene is severely reduced in a sim mutant background, suggesting sim is necessary for expression of this transgene and that sim regulates wrapper by activating transcription through these sequences. (2) Midline activity of the wrapper reporter gene is abolished by eliminating the single CME (CACGT) present within this region. (3) wrapper reporter gene expression is expanded in sim overexpression embryos. Future biochemical studies will determine if Sim/Tgo heterodimers directly interact with the wrapper regulatory motif identified in this study (Estes, 2008).
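The CME is written here as CACGT, whereas the Fulkerson (2010) summary in this section gives it as ACGTG; the two are reverse complements, so they describe the same site read from opposite strands. The sketch below illustrates this with a strand-aware motif search. The test sequence is a made-up string, not wrapper genomic sequence, and the function names are illustrative assumptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_motif_both_strands(sequence: str, motif: str) -> list[tuple[int, str]]:
    """Return (0-based start, strand) for each occurrence of motif.

    A '-' hit means the top strand shows the reverse complement of the motif.
    A palindromic motif is reported once, on the '+' strand.
    """
    sequence, motif = sequence.upper(), motif.upper()
    rc = reverse_complement(motif)
    hits = []
    for i in range(len(sequence) - len(motif) + 1):
        window = sequence[i:i + len(motif)]
        if window == motif:
            hits.append((i, "+"))
        elif window == rc:
            hits.append((i, "-"))
    return hits


CME = "ACGTG"
toy_sequence = "TTCACGTAA"  # hypothetical sequence for illustration only

print(reverse_complement(CME))
print(find_motif_both_strands(toy_sequence, CME))
```

Searching both strands in this way avoids missing a site simply because it was annotated in the opposite orientation.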

The experiments described in this study demonstrate that the wrapper reporter genes are sensitive to levels of spi signaling. Mutations in spi reduce wrapper reporter gene expression, and overexpression of the secreted form of Spi together with Sim expands the expression domain not only of the endogenous wrapper gene but of the wrapper reporter genes as well. Spi binds the Epidermal Growth Factor Receptor in midline glia, leading to MAPK activation and subsequent activation of the ETS transcription factor, pnt. Therefore, it may be Pnt that directly activates wrapper transcription through the regulatory sequences examined in this study. One of the identified motifs needed for transcriptional activity of wrapper is CGGAGAG, which loosely conforms to the consensus binding site for ETS transcription factors, (C/A)GGA(A/T)(A/G)(C/T). However, further experiments are needed to determine if Pnt directly interacts with these regulatory sequences, as well as the precise mechanism whereby spi signaling regulates wrapper. Taken together with previous studies, these results suggest that the spi signaling pathway may play at least two roles in promoting survival of midline glia: (1) activating wrapper, needed for neuron-glial interactions and (2) phosphorylating, thereby inactivating Head involution defective, which would otherwise cause programmed cell death in midline glia (Estes, 2008).
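How loosely CGGAGAG conforms to the ETS consensus can be made explicit by comparing it position by position. In the sketch below the consensus (C/A)GGA(A/T)(A/G)(C/T) is rewritten with standard IUPAC ambiguity codes as MGGAWRY; that rewriting and the helper name `mismatch_positions` are conveniences introduced here, not notation used in the original study.

```python
# Subset of the IUPAC nucleotide ambiguity codes needed for this consensus.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC",  # C/A
    "W": "AT",  # A/T
    "R": "AG",  # A/G
    "Y": "CT",  # C/T
}


def mismatch_positions(site: str, consensus: str) -> list[int]:
    """1-based positions where the site's base is not allowed by the consensus."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    return [
        i + 1
        for i, (base, code) in enumerate(zip(site.upper(), consensus.upper()))
        if base not in IUPAC[code]
    ]


ETS_CONSENSUS = "MGGAWRY"  # (C/A)GGA(A/T)(A/G)(C/T)
G7_SITE = "CGGAGAG"

bad = mismatch_positions(G7_SITE, ETS_CONSENSUS)
matches = len(G7_SITE) - len(bad)
print(f"{matches} of {len(G7_SITE)} positions match; mismatches at {bad}")
```

The motif keeps the GGA core and matches five of the seven consensus positions, which is consistent with describing the fit as loose.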

Many genes expressed in the CNS of metazoan organisms are regulated through synergistic interactions between Sox HMG-containing proteins and POU domain proteins. Recently, many vertebrate genes expressed in the developing CNS have been shown to contain highly conserved noncoding DNA regions enriched for binding sites for three classes of transcription factors: Sox, POU, and homeodomain proteins. Experiments indicated that Sox and POU proteins work together to activate, while homeodomain proteins repress and limit expression of CNS genes. Interestingly, several motifs identified in this study as important for regulation in midline glia of Drosophila resemble binding sites for Sox (G1: CACAAT), POU (G4: ATGCAAAT, G6: ATGCAACA, and G8: ATGCGTGG), and homeodomain proteins (G5: ATTTTATC) (Estes, 2008).
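The resemblance of G4, G6, and G8 to a POU binding site can be quantified as the number of positions at which each differs from the octamer consensus ATGCAAAT given earlier in this document for POU domain proteins. This is a simple illustration under the assumption that a plain mismatch count is an adequate similarity measure; it says nothing about binding affinity.

```python
OCTAMER = "ATGCAAAT"  # POU octamer consensus element

# Candidate POU-like motifs as listed in the text.
pou_like_sites = {
    "G4": "ATGCAAAT",
    "G6": "ATGCAACA",
    "G8": "ATGCGTGG",
}


def hamming_distance(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


distances = {name: hamming_distance(seq, OCTAMER) for name, seq in pou_like_sites.items()}
for name, d in distances.items():
    print(f"{name}: {d} mismatch(es) to {OCTAMER}")
```

All three sites share the ATGC half of the octamer; G4 matches the consensus exactly, while G6 and G8 diverge in the second half.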

The finding that the wrapper K:GFP construct, but not the wrapper G:GFP construct, is expressed in certain midline neurons identifies a midline neural silencer in the 43-bp region present in the G fragment but absent from the K fragment. Within this region, 27 bp are highly conserved in all 12 Drosophila species, and two of the three mutations in the G fragment that cause slight activation of reporter gene expression in midline neurons are found within the 43-bp region. All three sites that lead to activation in midline neurons, G4, G6, and G8, conform to a POU domain binding site, suggesting a POU domain protein expressed in midline neurons may bind to one or more of these sites to keep the wrapper gene silent (Estes, 2008).

One POU domain protein, Dfr, binds to the sequence ATGCAAAT in other gene regulatory regions to activate transcription, including those of two genes expressed in midline glia: dfr itself and slit. This sequence is found at site G8 in the wrapper regulatory region, but changing it to ATGCTAGC caused a low level of activation in midline neurons, rather than reducing expression in midline glia. Although the number of midline glia is reduced in a dfr mutant background, those that remain show a high level of reporter gene expression driven by wrapper sequences, and the results suggest dfr is not absolutely required for wrapper reporter gene expression in midline glia (Estes, 2008).

Mutations in the POU domain motifs within the wrapper regulatory sequences suggest a notable difference between the CNS genes studied previously in vertebrates and the midline glial gene studied here. The POU domain binding sites appear to limit expression in midline neurons (rather than activate expression as in vertebrate CNS genes), and it is the Sox and homeodomain binding sites that are needed for activation. This may reflect a key difference in regulatory control of glial vs. neural genes and it is plausible that other midline glial genes excluded from midline neurons will contain silencer elements similar to the one identified in this study, but further experiments are needed to confirm this (Estes, 2008).

Common motifs shared by conserved enhancers of Drosophila midline glial genes

Coding sequences are usually the most highly conserved sectors of DNA, but genomic regions controlling the expression pattern of certain genes can also be conserved across diverse species. In this study, five enhancers were identified capable of activating transcription in the midline glia of Drosophila melanogaster and each contains sequences conserved across at least 11 Drosophila species. In addition, the conserved sequences contain reiterated motifs for binding sites of the known midline transcriptional activators, Single-minded, Tango, Dichaete, and Pointed. To understand the molecular basis for the highly conserved genomic subregions within enhancers of the midline genes, the ability of various motifs to affect midline expression, both individually and in combination, was tested within synthetic reporter constructs. Multiple copies of the binding site for the midline regulators Single-minded and Tango can drive expression in midline cells; however, small changes to the sequences flanking this transcription factor binding site can inactivate expression in midline cells and activate expression in tracheal cells instead. For the midline genes described in this study, the highly conserved sequences appear to juxtapose positive and negative regulatory factors in a configuration that activates genes specifically in the midline glia, while maintaining them inactive in other tissues, including midline neurons and tracheal cells (Fulkerson, 2010).

The results described in this study indicate that the four genes expressed in the midline glia contain enhancers with subregions conserved in 11 or 12 of the sequenced Drosophila genomes. These conserved subregions contain one or more of the four motifs previously identified in the wrapper regulatory region, are highly A/T rich, and are needed for robust expression in the midline. These results confirm the importance of several transcription factor-binding sites for midline glial activation. One of these sites, the CME, binds both Sim/Tgo and Trh/Tgo heterodimers and, when multimerized, can drive reporter gene expression in both midline and tracheal cells. Two lines of evidence indicate that the context of the CME determines whether or not it can be utilized to drive expression in these two tissues. (1) The sequences flanking the CMEs are highly conserved in the four genes discussed in this study, Glec, oatp26f, liprinγ and wrapper, suggesting that the location and sequence of other transcription factor-binding sites are constrained. (2) Changing the sequences flanking the CME in the synthetic multimers can eliminate expression in the midline, trachea, or both tissues (Fulkerson, 2010).

A multimerized CME in the context of the 4Toll:GFP reporter was expressed in both the midline and trachea and was quite sensitive to slight modifications in flanking sequences. Changing 5-7 nucleotides on either side of the CME within this multimerized construct either substantially elevated expression in the trachea and eliminated it in the midline (T rich:GFP) or eliminated expression in both tissues (Sox:GFP). Additional combinations between the CME and one of the other midline glial motifs restricted expression to the midline (Pnt:GFP) or the trachea (POU:GFP). These results indicate that testing binding sites for two different factors next to one another can disrupt the endogenous ordering and spacing of the sites within the enhancers. Significantly, the Toll:GFP and Pnt:GFP reporters, unlike the intact enhancers described in this study, drive GFP expression in both midline neurons and glia. This midline expression pattern suggests that the synthetic multimers may lack repressor-binding sites that restrict expression to midline glia. Taken together, these results demonstrate the sensitivity of CME function to flanking sequences within the midline enhancers (Fulkerson, 2010).

Existing experimental evidence suggests that, unlike most transcription factors, Sim/Tgo heterodimers (as well as Trh/Tgo heterodimers) preferentially bind one sequence over all others: ACGTG, the CME. Within the enhancers described in this study, sites flanking the CME have remained unchanged over evolutionary time due, in part, to similarities between binding sites for Sim and Trh and the molecular consequences of changing nucleotides adjacent to the CME. This conservation may ensure transcription is restricted to the midline glia and repressed in tracheal cells. In addition to the midline enhancers reported in this study, regions conserved among Drosophila species were found within the known midline enhancers. For instance, a 1.0-kb enhancer present in the first intron of slit drives expression in the midline glia and it contains a single CME and a 32-bp sequence conserved in 11 Drosophila species. It is important to note that the number of midline enhancers described in this study is limited and not all the midline glial enhancers are likely to exhibit such a high degree of conservation. For instance, a midline enhancer of the ectoderm3 gene was identified that exhibits much less conservation among Drosophila species and, at present, the basis for the observed variation among enhancers is unknown (Fulkerson, 2010).

The Ets transcriptional activator, pnt, a downstream effector of EGFR signaling, and Drifter, a POU domain protein, are expressed in both embryonic midline glia and tracheal cells. Previous studies have shown that deleting a POU domain-binding site within an enhancer of rhomboid eliminated expression in tracheal cells, but did not affect its midline glial expression. The results described in this study confirm and extend these results and suggest that the location of a POU domain-binding site relative to the CME can play a role in determining if a gene is expressed in the midline glia, the trachea or both. Moreover, swapping the PAS domains between Sim and Trh proteins indicated that additional, midline or tracheal specific cofactors bind to the PAS domains of the individual proteins and likely determine which genes are expressed in the two different cell types. This may be the reason sequences adjacent to the CME play such a critical and sensitive role in determining which tissues express the various reporter genes described here. To activate the midline genes, Sim may interact with Drifter and Pnt and bind to sequences flanked by different binding sites compared with sequences bound by Trh, Drifter, and Pnt needed to activate tracheal genes. The simplicity of the multimers studied in this paper raises the possibility that different PAS heterodimers may specifically interact with other factors, such as Drifter and Pnt, in a manner that depends on the relative location and/or distance between each binding site, as has been described for nuclear hormone receptor complexes (Fulkerson, 2010).

The results confirm those of Swanson (2010), who found binding sites can be juxtaposed in different ways within enhancers to favor particular short-range interactions, and, in this way, various combinations of transcription factor binding sites (inputs) can result in more than one output. Similarly, the motifs described in this study can be combined in different ways that result in either midline or tracheal expression. The results indicate the proximity of the CME to activators, one another and/or to repressors could contribute to the level of expression observed in the trachea and midline. This study focused on activator sites, but repressor sites are also likely present and restrict expression to certain cell types. Previous studies in Drosophila embryos have revealed the complexity of the transcriptional regulatory 'grammar' and have shown that the transcriptional output from various genes can be determined by the stoichiometry, affinity, spacing, arrangement, and distance between activator and repressor sites (Fulkerson, 2010).

The high degree of conservation within the midline enhancer subregions examined in this study belies known properties of transcription factors and their recognition sequences, as well as observations made for many early developmental regulators of Drosophila development. Most transcription factors can vary considerably in the sequences they recognize and tend to bind to related sites with different affinities. This property would suggest that enhancers need not be strictly conserved to function, in contrast to what is reported here. The pattern of conserved sequences within these identified enhancers suggests that the transcription factors that bind these regions do so in a conserved order and spacing pattern. These results suggest that Sim and Trh may interact with other proteins to form an 'enhanceosome'-like complex, similar to that observed in the regulation of the interferon-β gene, in which activators and HMG proteins interact to form a specific multiprotein complex with a defined structure. This model contrasts with the 'information display/billboard' model of enhancer function. In that model, enhancers are bound by a group of independent factors or group of factors that work together to promote or repress transcription in particular cell types. An important distinction between the two models is the arrangement of binding sites within an enhancer. Within an enhanceosome, the arrangement of binding sites relative to each other is constrained, whereas within a billboard enhancer, the relative arrangement of binding sites is rather flexible as long as a sufficient number of binding sites work together, in many possible configurations, to recruit factors for transcriptional activation (Fulkerson, 2010).

Results obtained with the midline glial genes examined in this study suggest that midline enhancers may consist of a nucleating enhanceosome-like region that combines with an 'information display/billboard' constellation of additional binding sites. This is supported by results obtained with the 70-bp conserved region of wrapper. When tested alone, it only marginally drives midline expression, whereas in the context of the 166-bp enhancer, it works quite well. Moreover, the 166-bp region of virilis cannot function on its own, but drives high levels of expression in the midline glia of melanogaster in the context of the larger, 476 bp region. That the 166-bp region from virilis cannot work efficiently in the midline suggests the transcription complex that binds to this region may be slightly different in virilis compared with melanogaster. For each enhancer described in this study, the presence of the conserved region is required to obtain expression in the midline glia (Fulkerson, 2010).

After comparing vertebrate genomes and generating reporter constructs with highly conserved noncoding sequences, Bailey (2006) noticed that many of these direct expression to regions of the CNS. It is possible that enhancers of CNS genes are more conserved compared with other gene sets, such as early developmental regulators of Drosophila that have been studied in detail. This may be due to the highly conserved nature of the transcription factors that regulate gene expression in this tissue, many of which have analogous functions in flies and mammals. Sox-binding sites are present throughout conserved regions of CNS genes, and one of the similarities between these conserved CNS genes, the extensively characterized interferon-β enhanceosome, and midline glial genes is the importance of HMG proteins. These proteins may bend the DNA, facilitating binding to highly structured, multiprotein complexes. The enhancers described here likely bind PAS and Sox proteins together with other conserved CNS regulators and it may be this combination of transcription factors that contributes to the similarly conserved arrangement of binding sites (Fulkerson, 2010).

Numerous combinations of transcription factor binding sites can be used to drive expression in many tissue types. Despite the conservation found in this study, binding sites for transcription factors do vary considerably, making it, at times, difficult to identify enhancers based on sequence conservation. In certain cases, changes within enhancers can generate diverse phenotypes between Drosophila populations. The continuing challenge is to understand both the forces constraining the enhancer sequences between Drosophila species, as well as how changes in these regions lead to significant modifications in the expression pattern of a gene, which over the long term, leads to variation among Drosophila populations and eventually, Drosophila species. For the midline genes described in this study, selection has stabilized the constellation of binding sites found within enhancers, resulting in their conservation among Drosophila species over approximately 40 million years of evolution (Fulkerson, 2010).

The Drosophila jing gene is a downstream target in the Trachealess/Tango tracheal pathway

Primary branching in the Drosophila trachea is regulated by the Trachealess (Trh) and Tango (Tgo) basic helix-loop-helix-PAS (bHLH-PAS) heterodimers, the POU protein Drifter (Dfr)/Ventral Veinless (Vvl), and the Pointed (Pnt) ETS transcription factor. The jing gene encodes a zinc finger protein also required for tracheal development. Three Trh/Tgo DNA-binding sites, known as CNS midline elements, in 1.5 kb of jing 5' cis-regulatory sequence (jing1.5) previously suggested a downstream role for jing in the pathway. This study shows that jing is a direct downstream target of Trh/Tgo and that Vvl and Pnt are also involved in jing tracheal activation. In vivo lacZ enhancer detection assays were used to identify cis-regulatory elements mediating embryonic expression patterns of jing. A 2.8-kb jing enhancer (jing2.8) drove lacZ expression in all tracheal cell lineages, the CNS midline and Engrailed-positive segmental stripes, mimicking endogenous jing expression. A 1.3-kb element within jing2.8 drove expression that was restricted to Engrailed-positive CNS midline cells and segmental ectodermal stripes. Surprisingly, jing1.5-lacZ expression was restricted to tracheal fusion cells despite the presence of consensus DNA-binding sites for bHLH-PAS, ETS, and POU domain transcription factors. Given the absence of Trh/Tgo DNA-binding sites in the jing1.3 enhancer, these results are consistent with previous observations suggesting a combinatorial basis to Trh/Tgo-mediated transcriptional regulation in the trachea (Morozova, 2010).

In the developing Drosophila trachea, transcriptional regulation must be precisely coordinated with growth factor signaling to induce the appropriate cellular response. Studies of downstream transcriptional response elements in the transforming growth factor β (TGF-β) signaling pathway show the importance of discrete sequence changes that differentiate an activating from a repressive response. Furthermore, one such activating enhancer element, in the knirps gene in this pathway, requires a cooperative effect with Trh and Tgo, possibly to direct tissue specificity in the trachea. Tracheal gene expression is also controlled combinatorially by Trh/Tgo and Dfr/Vvl or by either alone. Similarly, this study shows that Trh/Tgo response elements in the jing gene require additional elements to specify embryonic tracheal expression (Morozova, 2010).

Jing is implicated in transcriptional regulation in numerous biological processes, but its exact role is not known. This study extends previous observations of a role for jing in the trachea by establishing it as a direct downstream target of Trh/Tgo heterodimers. By analyzing jing 5' cis-regulatory regions, this study shows a combinatorial basis to Trh/Tgo-mediated jing activation. A 2.8-kb jing enhancer recapitulates endogenous jing expression in the embryonic trachea, ectodermal stripes, and CNS midline. jing2.8 includes a distal 1.5 kb of genomic DNA that has three CMEs, which are known for their involvement in combinatorial transcriptional regulation. The best evidence that Trh/Tgo complexes are able to directly activate the jing1.5 enhancer was gathered from Drosophila S2 cells by Luciferase reporter and ChIP assays. The CMEs in jing1.5-luc were required for activation by Trh/Tgo, suggesting a protein-DNA interaction. Furthermore, Trh/Tgo heterodimers associated with and activated the jing1.5 enhancer. However, the combination of DNA-binding sites for bHLH-PAS, POU, and ETS transcription factors in jing1.5 is not capable of driving tracheal β-Gal expression in a pattern similar to that of endogenous jing. The jing1.3 enhancer cannot drive tracheal expression. Evidence is shown, in vitro and in vivo, that trh, pnt, and dfr/vvl regulate jing mRNA and even jing1.5-lacZ fusion cell expression. Given these results, along with the absence of additional CMEs and consensus POU domain-binding sites in jing1.3, it is proposed that trh and dfr/vvl regulate jing tracheal expression in combination with additional elements in jing1.3 (Morozova, 2010).

jing1.5 specifies a fusion cell component of jing expression that may instead be regulated by the bHLH-PAS transcription factors, Dys/Tgo. This is consistent with the presence of preferred and less preferred Dys/Tgo DNA-binding sites in the jing1.5-lacZ enhancer. Prior to embryonic stage 12, trh is required for dys expression; Dys and Archipelago then downregulate trh specifically in fusion cells during stage 12. Therefore, Trh cannot activate jing1.5-lacZ in fusion cells from stage 12 onward, which is consistent with the presence of fusion cell lacZ expression in embryos carrying CME deletions in jing1.5. The reductions in jing1.5-lacZ expression in the fusion cells of trh mutants may therefore result from subsequent reductions in dys expression (Morozova, 2010).

This study also characterized jing cis-regulatory elements controlling different aspects of jing expression in CNS glia and Engrailed-expressing midline neurons and segmental ectodermal cells. The midline expression of jing enhancers provided an opportunity to compare jing transcriptional regulation in two tissues. The data show that jing1.5 is sufficient to drive expression in MG and neurons where Jing is normally expressed. The CNS midline identity of jing1.5-lacZ-expressing cells was shown in several ways. First, jing1.5-lacZ expression was absent in a homozygous sim mutant background. Second, the jing1.5-lacZ expression domain was expanded by activating the Spitz Egfr ligand thereby forcing midline glial survival. Lastly, MG characteristics, such as oblong shape and dorsal positions, are shown by some jing1.5-lacZ-expressing midline cells. Therefore, this enhancer is differentially activated in the CNS midline and trachea suggesting that there may be differences in the mechanism by which Sim/Tgo and Trh/Tgo heterodimers activate transcription. This is consistent with the differential abilities of Sim/Tgo and Trh/Tgo to associate with Dfr/Vvl in vitro and the inability of trh to induce ectopic CNS midline gene expression (Morozova, 2010).

Strong CNS midline expression was also driven by the jing1.3 enhancer despite the absence of Sim/Tgo or Dfr/Vvl consensus DNA-binding sites. However, upon further characterization, the jing1.3-lacZ-expressing midline cells were found to express the segment polarity gene, engrailed (en). En-expressing CNS midline cells take up the posterior-most position within each VNC segment. Another En-positive midline cell lineage includes four to six MGP which are present at stage 13 but not at stage 17. The round shape of En-positive jing1.3-lacZ-expressing midline cells suggests that they belong to the MNB lineage and its progeny and do not belong to the MGP lineage. The mechanism of midline activation of jing1.3 is not known, but the ability of Jing to function as a repressor suggests that it may function combinatorially with En in segmental patterning. Further studies will be aimed at determining whether jing plays a role in segmental ectodermal patterning and its associated gene expression programs (Morozova, 2010).

Role of architecture in the function and specificity of two Notch-regulated transcriptional enhancer modules

In Drosophila melanogaster, cis-regulatory modules that are activated by the Notch cell-cell signaling pathway all contain two types of transcription factor binding sites: those for the pathway's transducing factor Suppressor of Hairless [Su(H)] and those for one or more tissue- or cell type-specific factors called 'local activators.' The use of different 'Su(H) plus local activator' motif combinations, or codes, is critical to ensure that only the correct subset of the broadly utilized Notch pathway's target genes are activated in each developmental context. However, much less is known about the role of enhancer "architecture"--the number, order, spacing, and orientation of its component transcription factor binding motifs--in determining the module's specificity. This study investigated the relationship between architecture and function for two Notch-regulated enhancers with spatially distinct activities, each of which includes five high-affinity Su(H) sites. The first, which is active specifically in the socket cells of external sensory organs, is largely resistant to perturbations of its architecture. By contrast, the second enhancer, active in the 'non-SOP' cells of the proneural clusters from which neural precursors arise, is sensitive to even simple rearrangements of its transcription factor binding sites, responding with both loss of normal specificity and striking ectopic activity. Thus, diverse cryptic specificities can be inherent in an enhancer's particular combination of transcription factor binding motifs. It is proposed that for certain types of enhancer, architecture plays an essential role in determining specificity, not only by permitting factor-factor synergies necessary to generate the desired activity, but also by preventing other activator synergies that would otherwise lead to unwanted specificities (Liu, 2012).

Detailed analysis of two different Notch-regulated transcriptional enhancer modules has revealed that they are very differently dependent on a particular architecture for their activity and specificity. The socket cell-specific ASE5 enhancer tolerates a variety of rearrangements of its required motifs without appreciable alteration of function in either nascent or mature sockets. Even when ASE5 is impaired quantitatively as a result of mutating all of its non-essential sequences, motif rearrangement generally has only modest effects on activity level, and never modifies the enhancer's specificity. In contrast, it was found that the mα enhancer is sensitive to simple exchanges in the positions of transcription factor binding motifs, responding with both loss of normal spatial specificity and ectopic activity (Liu, 2012).

Broadly speaking, then, one might say that ASE5 is more representative of a 'billboard' model of enhancer architecture (which posits that transcription factor binding motifs contribute to enhancer function largely independently of how they are organized), while the mα enhancer might be thought of as conforming more closely to an 'enhanceosome' model (which suggests that a module's function is crucially dependent on a particular configuration of transcription factor binding sites in order to create synergy between their inputs) (Liu, 2012).

It is useful to consider the characteristics that may determine whether a given module is more likely to lie at the 'billboard' or the 'enhanceosome' end of the spectrum. Though ASE5 and the mα enhancer are both Notch-activated, they function in different biological contexts, and it is suggested that this may be relevant to their respective architectural constraints. ASE5 acts in a single post-mitotic, differentiated cell type to establish and maintain autoregulation of Su(H) for several days. In this instance, due to the availability of cell type-specific 'local activators' such as Vvl, and the strong contribution that high Su(H) levels alone can make to the enhancer's activity, the need for a constrained architecture may be quite minimal. The mα enhancer, on the other hand, is faced with the challenging task of rapidly and transiently (over a period of hours) activating expression of the E(spl)mα gene in multiple non-SOP cells per PNC, while at the same time repressing its expression in each SOP. This might be expected to create a stringent requirement for constrained spacing between the lone proneural protein binding site and one or more Su(H) sites. At the same time, other aspects of the enhancer's normal specificity rely on inputs via POU-HD and/or homeodomain binding sites -- yet these must not be permitted to promote inappropriate activity in socket cells. Again, particular binding motif configurations may be called for as a preventative. The overall point is that two parameters -- an enhancer's specific biological task and context, and its particular combination of factor binding sites -- are likely to play a major role in determining the architectural constraints to which it may be subject (Liu, 2012).

The case of the mα enhancer serves to underscore the insufficiency, in many instances, of a transcription factor binding site 'code' in predicting the specificity of a cis-regulatory module. Despite the presence of five Su(H) sites and two motifs that can be bound by Vvl, the native mα enhancer shows no meaningful activity even in adult socket cells. Yet the mα-shuffle1 and mα-shuffle2 variants, in which the positions of the Vvl motifs are altered, do exhibit substantial adult socket cell activity. Thus, it is specifically the wild-type enhancer's architecture that normally prevents this from happening. A similar conclusion derives from examining the functionality of the proneural (E) plus Su(H) (S) 'code' embodied in the mα enhancer. When the lone E box site is in its native and evolutionarily conserved position 14 bp away from one of the Su(H) sites, it provides sufficient input to drive robust expression in all wing disc PNCs. But when it is moved instead to the location of one of the Vvl sites, the module's PNC activity is severely reduced. Again, the simple presence of Su(H) and proneural binding motifs in the mα enhancer does not suffice to predict its specificity; rather, the specific arrangement of these sites has a profound effect on its ability to generate the PNC specificity (Liu, 2012).
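The distinction drawn above between a binding site 'code' and an enhancer's architecture can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the `Site` type, the helper names, and all coordinates are hypothetical (only the 14 bp E box to Su(H) spacing echoes the text), and the toy module has far fewer sites than the real mα enhancer. It shows that a position swap of the 'shuffle' kind leaves the code unchanged while altering the order of sites.

```python
from collections import Counter
from typing import NamedTuple


class Site(NamedTuple):
    factor: str  # label for the binding factor, e.g. "Su(H)", "E", "Vvl"
    start: int   # hypothetical coordinate in bp (not taken from the paper)


def motif_code(sites):
    """Order-independent 'code': how many sites of each type are present."""
    return Counter(s.factor for s in sites)


def architecture(sites):
    """Order and spacing: factor labels left to right, plus gaps between starts."""
    ordered = sorted(sites, key=lambda s: s.start)
    order = [s.factor for s in ordered]
    gaps = [b.start - a.start for a, b in zip(ordered, ordered[1:])]
    return order, gaps


def swap_positions(sites, i, j):
    """Exchange the coordinates of sites i and j, as in a 'shuffle' variant."""
    out = list(sites)
    out[i] = Site(sites[i].factor, sites[j].start)
    out[j] = Site(sites[j].factor, sites[i].start)
    return out


# Toy module with invented coordinates; the E box sits 14 bp from a Su(H) site.
native = [Site("Vvl", 10), Site("Su(H)", 60), Site("E", 74), Site("Vvl", 150)]
shuffled = swap_positions(native, 0, 2)  # exchange a Vvl site with the E box

print(motif_code(native) == motif_code(shuffled))  # same code
print(architecture(native)[0])    # site order in the native module
print(architecture(shuffled)[0])  # site order after the swap
```

In this toy case the two variants are indistinguishable by motif counts alone, which is the sense in which a code-only description cannot predict the differences in specificity reported for the shuffle constructs.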

The critical role of binding site spacing and organization in generating the transcription factor synergies necessary for the normal activity of many enhancers is becoming increasingly clear. But the mutational analyses of both ASE5 and the mα enhancer demonstrate an equally important role for architecture in preventing inappropriate synergies and hence inappropriate specificities (Liu, 2012).

Two ASE5 variants are particularly informative in illuminating the importance of motif spacing in restraining enhancer activity. ASE5M2, in which only the five Su(H) sites are intact but spacing is preserved, is completely inactive in both pupal and adult socket cells. By contrast, the ABm version of ASE5-shrink, which likewise retains only the five Su(H) sites but now places them much closer together, is strongly active in adult sockets. Thus, ASE5's native architecture serves in part to prevent the Su(H) sites from responding on their own, and in this way maintains the enhancer's dependence on inputs from the box A and/or box B sequence elements, even in adult socket cells (Liu, 2012).

Next, the wholly ectopic responsiveness of mα-shrink in both pupal and adult socket cells demonstrates clearly that the potential for unrelated and unwanted specificities can be inherent in an enhancer's particular combination of transcription factor binding motifs. Even as it functions in an inappropriate cell type, mα-shrink follows a recognizable regulatory logic. Its activity in nascent socket cells is fully dependent, as expected from ASE5, on its POU-HD and/or homeodomain sites (and not on its 'E box' proneural protein binding site), while its robust adult socket activity -- as in the case of the ABm version of ASE5-shrink -- requires only the five Su(H) sites (Liu, 2012).

Finally, the far more modest alterations represented by the 'shuffle' versions of the mα enhancer explicitly demonstrate the critical role that motif placement and spacing may have in suppressing inappropriate specificities. Simply exchanging the position of one of the module's 'Vvl' sites with that of the E box proneural site creates novel activities in both the wing imaginal disc and the socket cell (Liu, 2012).

In a recent report, Swanson (2011) identified short-range transcriptional repression as the mechanism that prevents the cone cell-specific sparkling (spa) eye enhancer, which serves the Drosophila dPax2 gene, from being ectopically active in nearby photoreceptor cells. In this instance, moving the repression-mediating sequences out of their native context apparently eliminated their ability to exert a repressive effect, permitting the module to be active in an inappropriate cell type (Liu, 2012).

It is believed that these results with the mα enhancer are most simply consistent with a different mechanism for restraining unwanted enhancer specificities. In this model, the relative positions and spacings of transcription factor binding sites are organized so as to promote functional synergies between activators that generate the desired specificity, while at the same time preventing different activator synergies that would otherwise create undesirable specificities. Note that, while this mechanism places definite constraints on the allowable motif locations in the module, it does not require that the enhancer be transcriptionally repressed in the incorrect cell type(s) (Liu, 2012).

The possibility that, despite their simplicity, both of the 'site switches' embodied in the mα-shuffle1 and mα-shuffle2 constructs have disrupted the interaction of a short-range repressor with its target activator(s) cannot strictly be ruled out. However, this is thought unlikely for a number of reasons. For example, such a repressor would have to be active in both a broad zone of wing disc tissue and in socket cells, two very different settings. It is suggested instead that the most parsimonious explanation for these findings is the synergy promotion/prevention model described above (Liu, 2012).

What might determine whether a given enhancer makes use of active repression to limit its specificity, or instead utilizes a simpler synergy prevention mechanism? One reasonable possibility is that repression is required, or more common, when the ectopic specificity that must be prevented consists of a cell or cells that are very closely related developmentally to those in which expression is wanted. Such inappropriate cells may be spatially very close to the correct cells, and/or may have a high degree of similarity in their developmental histories and gene expression profiles. In such cases, it may be difficult or impossible to evolve a motif architecture that simultaneously allows the proper activity and prevents the improper. On the other hand, when the ectopic specificity is a very different cell type or tissue, distant both temporally and spatially from the correct one, and sharing very little developmental history, perhaps motif arrangements that act to prevent inappropriate synergies are easier to evolve. Under this rubric, the use of repression by modules as different as the eve stripe 2 and spa enhancers is readily understood, just as the mα enhancer might instead be expected to inhibit socket cell activity by prevention of the necessary activator synergy. Indeed, the mα module appears to make use of both mechanisms: Activity of this enhancer in the SOP cell is antagonized by repression mediated by Su(H). As a member of the PNC, the SOP is of course surrounded by, and very closely related to, the non-SOPs (Liu, 2012).

Finally, it is interesting to consider what characteristics of an enhancer might put it particularly at risk for ectopic activity, which in turn would require the use of the preventive mechanisms that this study considers. Certainly utilizing transcription factors that are broadly expressed and active [such as Su(H)] would contribute to such a need, as would using inputs from factors that are members of paralogous families with very similar DNA-binding specificities (e.g., POU-HD proteins) (Liu, 2012).

The results described in this study have important implications for understanding enhancer evolution. It appears that, due to the specific combination of transcription factor binding motifs they employ, some (perhaps most) enhancers harbor the hidden potential to generate certain novel specificities that can be revealed through comparatively simple sequence changes. In a sense, such enhancers are 'poised' to express these silent specificities. Depending on how widespread this phenomenon is among enhancers in the whole genome, a tremendous potential may exist to explore a vast 'specificity space' through modest mutational events. Moreover, when applied to an individual enhancer, this perspective suggests that a particular novel specificity -- one that requires only relatively minor changes in motif placement to be expressed -- might be seen to evolve independently in more than one lineage (Liu, 2012).

These results also suggest that the minimum size of a given enhancer module may be subject to significant constraints, due to the need to prevent unwanted activator synergies through motif spacing. Thus, even if not all sequences in the enhancer mediate transcription factor inputs, some may be preserved evolutionarily in order to maintain distance between transcription factor binding sites (Liu, 2012).

Protein Interactions

During Drosophila embryogenesis the CNS midline cells have organizing activities that are required for proper elaboration of the axon scaffold and differentiation of neighboring neuroectodermal and mesodermal cells. CNS midline development is dependent on Single-minded, a basic-helix-loop-helix (bHLH)-PAS transcription factor. Fish-hook/Dichaete, a Sox HMG domain protein, and Drifter (Dfr), a POU domain protein, act in concert with Single-minded to control midline gene expression. single-minded, Dichaete, and drifter are all expressed in developing midline cells, and both loss- and gain-of-function assays reveal genetic interactions between these genes. The corresponding proteins bind to DNA sites present in a 1 kb midline enhancer from the slit gene and regulate the activity of this enhancer in cultured Drosophila Schneider line 2 cells. Dichaete directly associates with the PAS domain of Single-minded and the POU domain of Drifter; the three proteins can together form a ternary complex in yeast. In addition, Dichaete can form homodimers and also associates with other bHLH-PAS and POU proteins. These results indicate that midline gene regulation involves the coordinate functions of three distinct types of transcription factors. Functional interactions between members of these protein families may be important for numerous developmental and physiological processes (Ma, 2000).

To address whether the sim, Dichaete, and dfr genes might functionally interact to regulate development of the embryonic CNS midline, whether they exhibit overlapping expression in developing midline cells was examined. This was accomplished using anti-Dichaete and anti-Dfr sera, as well as a P[3.7sim-lacZ] marker that mimics sim midline expression. P[3.7sim-lacZ] embryos were immunostained using anti-β-gal and either anti-Dichaete or anti-Dfr sera. Prominent overlapping expression was detected between Sim and Dichaete in developing CNS midline cells from stage 8 throughout the remainder of germ band extension. Overlap was also detected in a subset of prospective foregut cells. Similar overlapping expression was also detected between Sim and Dfr. Midline coexpression of Dichaete and Dfr was detected by immunostaining wild-type embryos with anti-Dichaete and anti-Dfr sera. Both genes are expressed together in the CNS midline throughout germ band extension. In germ band-retracted embryos, Dichaete exhibits overlapping expression with Sim and Dfr in the midline glia. Dichaete and Dfr are also detected together in lateral cells of the thoracic ganglia and a subset of ventral epidermal cells. These analyses indicate that sim, Dichaete, and dfr are coexpressed in developing CNS midline cells. The midline expression of these three genes also overlaps that of the slit gene, which is a downstream target of Sim (Ma, 2000).

Both loss-of-function and gain-of-function assays were used to detect genetic interactions between sim, Dichaete, and dfr. Mutants are known to show genetic interactions in CNS midline differentiation and in Slit protein expression. Potential cooperative interactions between sim, Dichaete, and dfr in regulating slit gene transcription were examined through the use of a P[1.0slit-lacZ] marker. This reporter contains a portion of a slit intron that drives lacZ expression mimicking that of the native slit gene in developing midline glia; P[1.0slit-lacZ] expression is first detected in germ band-extended stage 11 embryos and is maintained throughout the remainder of embryogenesis. Dichaete null mutant embryos exhibit a misplacement and loss of midline glia, as detected via anti-β-gal immunostaining. P[1.0slit-lacZ] is expressed normally in stage 11 Dichaete mutant embryos, but during germ band retraction the number of midline glia becomes reduced from wild type, and many cells are located at aberrant ventral positions within the nerve cord. Similar, although less severe, defects are observed in dfr mutant embryos, where some midline glia are displaced from their normal positions. Notably, β-gal-expressing midline glia are still detected in both Dichaete and dfr mutants, indicating that unlike Sim, Dichaete and Dfr are not absolutely required for P[1.0slit-lacZ] expression or midline glial development (Ma, 2000).

A dfr-Dichaete double mutant strain was used to examine whether Dichaete and dfr might act together to regulate midline gene expression. Embryos mutant for both Dichaete and dfr exhibit much more severe defects in P[1.0slit-lacZ] expression than either Dichaete or dfr single mutants. Although P[1.0slit-lacZ] is activated normally in stage 11 dfr-Dichaete double mutant embryos, there is a striking loss of midline P[1.0slit-lacZ] expression during germ band retraction. This synergistic effect strongly suggests that Dichaete and Dfr function together to regulate slit transcription. These functions may be mediated directly through Dichaete and Dfr binding sites present in the slit 1 kb regulatory region. Another, nonexclusive possibility is that Dichaete and Dfr might indirectly control slit transcription by regulating the expression of sim. To address this possibility P[3.7sim-lacZ] expression was examined in wild-type and dfr-Dichaete embryos. Compared with wild-type embryos, dfr-Dichaete double mutants exhibit a severe decrease in P[3.7sim-lacZ] expression, a phenotype that first becomes apparent during germ band retraction. Thus, Dichaete and dfr also influence sim expression and hence may indirectly influence the expression of a wide array of midline genes (Ma, 2000).

Analysis of a 380 bp slit midline regulatory fragment has indicated the presence of a single CNS midline element (CME), through which Sim::Tgo heterodimers act. The CME is located within 300 bp of the distal end (farther from the promoter in the native slit gene) of this fragment. An inverted TTCAAT repeat (TTCAATTTCATTGAA) is located 20 bp proximal to the CME. This sequence resembles a (A/T)(A/T)CAAT consensus binding site for Sox proteins, although binding of Sox proteins to a TTCAAT sequence has not been reported. Because sequences present in an extended 1 kb slit DNA fragment are required for normal levels of slit expression in vivo, additional DNA sequences have been obtained. This analysis indicated that no other CMEs are present in the 1 kb slit DNA fragment. However, two perfect Dfr consensus binding sites, ATGCAAAT and CATAAAT, located within 500 bp of DNA proximal to the CME were identified. These two Dfr binding sites are separated by ~150 bp and flank a consensus Dichaete binding site, TACAAT. These data suggest that Dichaete, Sim, and Dfr may all bind to sites present in the 1 kb slit regulatory DNA fragment. To test this possibility, DNA gel mobility shift assays were performed using the Dichaete HMG domain and full-length Dfr protein on double-stranded oligonucleotide probes corresponding to sequences from the slit 1 kb fragment. The Dichaete HMG domain binds strongly to a 26 mer probe containing the TACAAT site. In contrast, Dichaete does not bind consistently to a 26 mer probe containing both TTCAAT sites, suggesting that Dichaete can distinguish between closely related DNA sequences. Dfr protein binds very strongly to a 33 mer probe that contains the ATGCAAAT site, and less strongly to a 32 mer probe containing the CATAAAT site. Dfr binds the ATGCAAAT site both as an apparent monomer and a dimer, because two distinct bands with reduced mobilities are detected. The 1 kb slit fragment thus may integrate the actions of at least three different types of regulatory proteins, represented by Sim, Dichaete, and Dfr (Ma, 2000).
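The kind of sequence inspection described above, looking for consensus sites on either strand of a regulatory fragment, can be sketched in a few lines of Python. This is a minimal illustration, not the method used by the authors: the motif strings are those quoted in the text, while the dictionary labels, the function names, and the use of the IUPAC letter W for (A/T) are assumptions made for the example. A sequence match says nothing about actual binding; as noted above, Dichaete does not bind consistently to the TTCAAT repeat even though it fits the Sox-like consensus.

```python
import re

# Consensus sites quoted in the text; the labels are illustrative only.
# W is the IUPAC code for A or T, so WWCAAT encodes (A/T)(A/T)CAAT.
MOTIFS = {
    "Dfr_octamer": "ATGCAAAT",
    "Dfr_variant": "CATAAAT",
    "Dichaete": "TACAAT",
    "Sox_like": "WWCAAT",
}

IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "W": "[AT]"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(seq, motifs=MOTIFS):
    """List (start, strand, label, matched_sequence) for every motif hit.

    Starts are 0-based coordinates on the forward strand; overlapping hits
    are reported because the pattern is wrapped in a lookahead.
    """
    seq = seq.upper()
    rc = reverse_complement(seq)
    hits = []
    for label, consensus in motifs.items():
        body = "".join(IUPAC[base] for base in consensus)
        pattern = re.compile("(?=(" + body + "))")
        for m in pattern.finditer(seq):
            hits.append((m.start(), "+", label, m.group(1)))
        for m in pattern.finditer(rc):
            # Convert the reverse-strand offset back to forward coordinates.
            start = len(seq) - m.start() - len(consensus)
            hits.append((start, "-", label, m.group(1)))
    return sorted(hits)


# The inverted repeat quoted in the text matches the Sox-like consensus
# once on each strand.
for hit in find_sites("TTCAATTTCATTGAA"):
    print(hit)
```

Because TACAAT also fits (A/T)(A/T)CAAT, a Dichaete consensus site would be reported under both the `Dichaete` and `Sox_like` labels; a real analysis would more likely use position weight matrices than exact consensus strings.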

The ability of Dichaete, Dfr, Sim, and Tgo to directly control slit transcription was examined using transient transcription assays in cultured Drosophila S2 cells. The P[1.0slit-lacZ] construct was used as a reporter with various combinations of plasmids that express Dichaete, Dfr, Sim, or Tgo. Dichaete modestly activates P[1.0slit-lacZ] transcription, indicating that in both yeast and fly cells, Dichaete can function as a direct transcriptional activator. Dfr results in little if any activation of P[1.0slit-lacZ], and Dfr and Dichaete together do not exhibit any increased activation over the levels observed for Dichaete alone. Neither Sim nor Tgo alone is able to activate the P[1.0slit-lacZ] reporter, because only background levels of expression are detected. Furthermore, Sim and Tgo together yield only minimal activation. These results imply that although Sim::Tgo heterodimers strongly activate expression of a P[6XCME-lacZ] reporter (>150 units) that contains six multimerized CMEs, additional factors are required to achieve high levels of P[1.0slit-lacZ] reporter expression. Significantly, combining Sim::Tgo with either Dichaete or Dfr results in relatively high levels of activation. Thus, both Dichaete and Dfr strongly enhance the ability of Sim::Tgo heterodimers to activate slit transcription. Comparable levels of activation are observed when all four proteins are expressed together. Taken together, the DNA binding and transcriptional activation assays provide additional evidence that regulation of slit expression in the midline glia requires functional interactions between Dichaete, Dfr, Sim, and Tgo (Ma, 2000).

Functional interactions between Sim, Dichaete, and Dfr may also regulate the midline expression of other genes, including sim and breathless (btl). Thus, sim has autoregulatory functions, and the combined functions of dfr and Dichaete are also required for sustained midline sim expression. In addition, a 2.8 kb interval in the P[3.7sim-lacZ] transgene used in this study contains six evolutionarily conserved CMEs as well as several consensus Dichaete and Dfr binding sites. btl encodes an FGF receptor homolog whose expression in the CNS midline and tracheal cells has been shown to depend, respectively, on Dfr as well as Sim and Tgo, or Trh and Tgo. A 200 bp btl midline/tracheal regulatory region contains three evolutionarily conserved CMEs. Inspection of this region also revealed the presence of a conserved consensus ATCAAT Dichaete binding site located in a 40 bp interval between CME2 and CME3, as well as a conserved consensus GATAAAT Dfr binding site located 40 bp downstream of CME3. Thus, functional interactions between Sim, Dichaete, and Dfr could be a general mechanism to regulate gene transcription during CNS midline development (Ma, 2000).

Contrary to a previous report, I-POU does not form a complex with Drifter (Treacy, 1992 and Turner, 1996).


The Interactive Fly resides on the
Society for Developmental Biology's Web server.