Interactive Fly, Drosophila

Gene name - engrailed

Synonyms -

Cytological map position - 48A3-4

Function - transcription factor

Keywords - segment polarity

Symbol - en

FlyBase ID:FBgn0000577

Genetic map position - 2-62.0

Classification - homeodomain

Cellular location - nuclear

NCBI link: Entrez Gene

en orthologs: Biolitmine

Recent literature

Dominguez-Cejudo, M. A. and Casares, F. (2015). Antero-posterior patterning of Drosophila ocelli requires an anti-repressor mechanism within the hh-pathway mediated by the Six3 gene Optix. Development 142(16):2801-9. PubMed ID: 26160900
Summary:
In addition to the compound eyes, most insects possess a set of three dorsal ocelli that develop at the vertices of a triangular cuticle patch, forming the ocellar complex. The wingless and hedgehog signaling pathways, together with the transcription factor encoded by orthodenticle, are known to play major roles in the specification and patterning of the ocellar complex. Specifically, hedgehog is responsible for the choice between ocellus and cuticle fates within the ocellar complex primordium. However, the interactions between signals and transcription factors known to date do not fully explain how this choice is controlled. This study shows that this binary choice depends on dynamic changes in the domains of hedgehog signaling. In these dynamics, the restricted expression of engrailed, a hedgehog-signaling target, is key because it defines a domain within the complex where hh transcription is maintained while the pathway activity is blocked. The Drosophila Six3, Optix, is expressed in and required for the development of the anterior ocellus specifically. Optix would not act as an ocellar selector, but rather as a patterning gene, limiting the en expression domain. These results indicate that, despite their genetic and structural similarity, anterior and posterior ocelli are under different genetic control.

De, S., Mitra, A., Cheng, Y., Pfeifer, K. and Kassis, J. A. (2016). Formation of a Polycomb-domain in the absence of strong Polycomb response elements. PLoS Genet 12: e1006200. PubMed ID: 27466807
Summary:
Polycomb group response elements (PREs) in Drosophila are DNA-elements that recruit Polycomb proteins (PcG) to chromatin and regulate gene expression. PREs are easily recognizable in the Drosophila genome as strong peaks of PcG-protein binding over discrete DNA fragments; many small but statistically significant PcG peaks are also observed in PcG domains. Surprisingly, in vivo deletion of the four characterized strong PREs from the PcG regulated invected-engrailed (inv-en) gene complex did not disrupt the formation of the H3K27me3 domain and did not affect inv-en expression in embryos or larvae, suggesting the presence of a redundant PcG recruitment mechanism. Further, the 3D-structure of the inv-en domain was only minimally altered by the deletion of the strong PREs. A reporter construct containing a 7.5kb en fragment that contains three weak peaks but no large PcG peaks forms an H3K27me3 domain and is PcG-regulated. These data suggest a model for the recruitment of PcG-complexes to Drosophila genes via interactions with multiple, weak PREs spread throughout an H3K27me3 domain.

Luo, L., Siah, C. K. and Cai, Y. (2017). Engrailed acts with Nejire to control decapentaplegic expression in the Drosophila ovarian stem cell niche. Development 144(18): 3224-3231. PubMed ID: 28928281
Summary:
Homeostasis of adult tissues is maintained by a small number of stem cells, which are sustained by their niches. In the Drosophila female germline stem cell (GSC) niche, Decapentaplegic (Dpp) is the primary factor that promotes GSC self-renewal. However, the mechanism regulating dpp expression in the niche is largely unknown. This study identified a 2.0 kb fragment located in a 5' cis-regulatory region of the dpp locus containing enhancer activity that drives its expression in the niche. This region is distinct from a previously characterized 3' cis-regulatory enhancer responsible for dpp expression in imaginal discs. These data demonstrate that Engrailed, a homeodomain-containing transcription factor that serves as a cap cell marker, binds to this region and regulates dpp expression in cap cells. Further data suggest that En forms a complex with Nejire (Nej), the Drosophila ortholog of histone acetyltransferase CBP/p300, and directs Nej to this cis-regulatory region where Nej functions as the co-activator for dpp expression. Therefore, this study defines the molecular pathway controlling dpp expression in the Drosophila ovarian stem cell niche.

Bonneaud, N., Layalle, S., Colomb, S., Jourdan, C., Ghysen, A., Severac, D., Dantec, C., Negre, N. and Maschat, F. (2017). Control of nerve cord formation by Engrailed and Gooseberry-neuro: A multi-step, coordinated process. Dev Biol. PubMed ID: 29097190
Summary:
One way to better understand the molecular mechanisms involved in the construction of a nervous system is to identify the downstream effectors of major regulatory proteins. It has been shown that Engrailed (EN) and Gooseberry-Neuro (GsbN) transcription factors act in partnership to drive the formation of posterior commissures in the central nervous system of Drosophila. This report identifies genes regulated by both EN and GsbN through chromatin immunoprecipitation ("ChIP on chip") and transcriptome experiments, combined with a genetic screen relying on the gene dose titration method. The genomic-scale approaches allowed definition of 175 potential targets of EN-GsbN regulation. A subset of these genes was chosen to examine ventral nerve cord (VNC) defects; half of the mutated targets show clear VNC phenotypes when doubly heterozygous with en or gsbn mutations, or when homozygous. This strategy revealed new groups of genes never described for their implication in the construction of the nerve cord. Their identification suggests that, to construct the nerve cord, EN-GsbN may act at three levels, in: (1) sequential control of the attractive-repulsive signaling that ensures contralateral projection of the commissural axons, (2) temporal control of the translation of some mRNAs, (3) regulation of the capability of glial cells to act as commissural guideposts for developing axons. These results illustrate how an early, coordinated transcriptional control may orchestrate the various mechanisms involved in the formation of stereotyped neuronal networks. They also validate the overall strategy to identify genes that play a crucial role in axonal pathfinding.

De, S., Cheng, Y., Sun, M. A., Gehred, N. D. and Kassis, J. A. (2019). Structure and function of an ectopic Polycomb chromatin domain. Sci Adv 5(1): eaau9739. PubMed ID: 30662949
Summary:
Polycomb group proteins (PcGs) drive target gene repression and form large chromatin domains. In Drosophila, DNA elements known as Polycomb group response elements (PREs) recruit PcGs to the DNA. This study shows that, within the invected-engrailed (inv-en) Polycomb domain, strong, constitutive PREs are dispensable for Polycomb domain structure and function. It is suggested that the endogenous chromosomal location imparts stability to this Polycomb domain. To test this possibility, a 79-kb en transgene was inserted into other chromosomal locations. This transgene is functional and forms a Polycomb domain. The spreading of the H3K27me3 repressive mark, characteristic of PcG domains, varies depending on the chromatin context of the transgene. Unlike at the endogenous locus, deletion of the strong, constitutive PREs from the transgene leads to both loss- and gain-of-function phenotypes, demonstrating the important role of these regulatory elements. These data show that chromatin context plays an important role in Polycomb domain structure and function.

Tian, Y. and Smith-Bolton, R. K. (2021). Regulation of growth and cell fate during tissue regeneration by the two SWI/SNF chromatin-remodeling complexes of Drosophila. Genetics 217(1): 1-16. PubMed ID: 33683366
Summary:
To regenerate, damaged tissue must heal the wound, regrow to the proper size, replace the correct cell types, and return to the normal gene-expression program. However, the mechanisms that temporally and spatially control the activation or repression of important genes during regeneration are not fully understood. To determine the role that chromatin modifiers play in regulating gene expression after tissue damage, ablation was induced in Drosophila melanogaster imaginal wing discs, and a screen was carried out for chromatin regulators that are required for epithelial tissue regeneration. Many of these genes are shown to be important for promoting or constraining regeneration. Specifically, the two SWI/SNF chromatin-remodeling complexes play distinct roles in regulating different aspects of regeneration. The PBAP (see Polybromo) complex regulates regenerative growth and developmental timing, and is required for the expression of JNK signaling targets and the growth promoter Myc. By contrast, the BAP complex (see Osa) ensures correct patterning and cell fate by stabilizing the expression of the posterior gene engrailed. Thus, both SWI/SNF complexes are essential for proper gene expression during tissue regeneration, but they play distinct roles in regulating growth and cell fate.

Janssen, R., Turetzek, N. and Pechmann, M. (2022). Lack of evidence for conserved parasegmental grooves in arthropods. Dev Genes Evol. PubMed ID: 35038005
Summary:
In the arthropod model species Drosophila melanogaster, a dipteran fly, segmentation of the anterior-posterior body axis is under control of a hierarchic gene cascade. Segmental boundaries that form morphological grooves are established posteriorly within the segmental expression domain of the segment-polarity gene (SPG) engrailed (en). More important for the development of the fly, however, are the parasegmental boundaries that are established at the interface of en expressing cells and anteriorly adjacent wingless (wg) expressing cells. In Drosophila, both segmental and transient parasegmental grooves form. The latter are positioned anterior to the expression of en. Although the function of the SPGs in establishing and maintaining segmental and parasegmental boundaries is highly conserved among arthropods, parasegmental grooves have only been reported for Drosophila and a spider (Cupiennius salei). This study presents new data on en expression, and re-evaluates published data from four distantly related spiders, including Cupiennius, and a distantly related chelicerate, the harvestman Phalangium opilio. Gene expression analysis of en genes in these animals does not corroborate the presence of parasegmental grooves. Consequently, these data question the general presence of parasegmental grooves in arthropods.

Kato, Y., Sawada, A., Tonai, K., Tatsuno, H., Uenoyama, T. and Itoh, M. (2022). A new allele of engrailed, en(NK14), causes supernumerary spermathecae in Drosophila melanogaster. Genes Genet Syst. PubMed ID: 35264511
Summary:
A spontaneous mutation, en^NK14, is a new allele of engrailed (en) in Drosophila melanogaster. Females of en^NK14 have three spermathecae, instead of two in wild type, under a wide range of developmental temperatures, while the males show no abnormal phenotype. Spermathecae of the mutant female can accept inseminated sperms, albeit with a delay of at least an hour until full acceptance compared with wild type. The time course of decrease in the number of stored sperms was closely similar between the mutant and wild type. en^NK14 females produced fewer progeny than wild type females despite storing a larger number of sperms. The delay of sperm entry and lower fecundity suggested some functional defects in secretory products of the spermathecae. In addition, some spermathecae in the mutant were accompanied by a mass of brown pigments in the adipose tissue surrounding the capsule. Six contiguous amino acids, Ser340-Ala345, were replaced by one Thr in en^NK14. In another mutant, en^spt, Ser325 was also shown to be substituted by a Cys. These amino acid changes were located within a serine-rich region, in which Ser325, Ser340 and Thr341 were suggested as targets of Protein Kinase C by an in silico analysis. The splicing pattern of en mRNA did not differ between en^NK14 and wild type in embryo, larva, pupa or adult. These results suggest that en plays an important role in determining the number of spermathecae as well as in sperm storage function in the Drosophila female.

Brown, J. L., Price, J. D., Erokhin, M. and Kassis, J. A. (2023). Context-dependent role of Pho binding sites in Polycomb complex recruitment in Drosophila. Genetics 224(4). PubMed ID: 37216193
Summary:
Polycomb group (PcG) proteins maintain the silenced state of key developmental genes, but how these proteins are recruited to specific regions of the genome is still not completely understood. In Drosophila, PcG proteins are recruited to Polycomb response elements (PREs) comprised of a flexible array of sites for sequence-specific DNA binding proteins, "PcG recruiters," including Pho, Spps, Cg, and GAF. Pho is thought to play a central role in PcG recruitment. Early data showed that mutation of Pho binding sites in PREs in transgenes abrogated the ability of those PREs to repress gene expression. In contrast, genome-wide experiments in pho mutants or by Pho knockdown showed that PcG proteins can bind to PREs in the absence of Pho. This study directly addressed the importance of Pho binding sites in 2 engrailed (en) PREs at the endogenous locus and in transgenes. The results show that Pho binding sites are required for PRE activity in transgenes with a single PRE. In a transgene, 2 PREs together lead to stronger, more stable repression and confer some resistance to the loss of Pho binding sites. Making the same mutation in Pho binding sites has little effect on PcG-protein binding at the endogenous en gene. Overall, these data support the model that Pho is important for PcG binding but emphasize how multiple PREs and chromatin environment increase the ability of PREs to function in the absence of Pho. This supports the view that multiple mechanisms contribute to PcG recruitment in Drosophila.

Blunk, S., Garcia-Verdugo, H., O'Sullivan, S., Camp, J., Haines, M., Coalter, T., Williams, T. A. and Nagy, L. M. (2023). Functional Divergence of the Tribolium castaneum engrailed and invected Paralogs. Insects 14(8). PubMed ID: 37623401
Summary:
Engrailed (en) and invected (inv) encode paralogous transcription factors found as a closely linked tandem duplication within holometabolous insects. Drosophila en mutants segment normally, then fail to maintain their segments. Loss of Drosophila inv is viable, while loss of both genes results in asegmental larvae. Surprisingly, the knockdown of Oncopeltus inv can result in the loss or fusion of the entire abdomen and en knockdowns in Tribolium show variable degrees of segmental loss. The consequence of losing or knocking down both paralogs on embryogenesis has not been studied beyond Drosophila. To further investigate the relative functions of each paralog and the mechanism behind the segmental loss, Tribolium double and single knockdowns of en and inv were analyzed. The most common cuticular phenotype of the double knockdowns was small, limbless, and open dorsally, with all but a single, segmentally iterated row of bristles. Less severe knockdowns had fused segments and reduced appendages. The Tribolium paralogs appear to act synergistically: the knockdown of either Tribolium gene alone was typically less severe, with all limbs present, whereas the most extreme single knockdowns mimic the most severe double knockdown phenotype. Morphological abnormalities unique to either single gene knockdown were not found. inv expression was not affected in the Tribolium en knockdowns, but hh expression was unexpectedly increased midway through development. Thus, while the segmental expression of en/inv is broadly conserved within insects, the functions of en and inv are evolving independently in different lineages.

Cheng, Y., Chan, F. and Kassis, J. A. (2023). The activity of engrailed imaginal disc enhancers is modulated epigenetically by chromatin and autoregulation. bioRxiv. PubMed ID: 37502849
Summary:
engrailed (en) encodes a homeodomain transcription factor crucial for the proper development of Drosophila embryos and adults. Like many developmental transcription factors, en expression is regulated by many enhancers, some of overlapping function, that drive expression in spatially and temporally restricted patterns. The en embryonic enhancers are located in discrete DNA fragments that can function correctly in small reporter transgenes. In contrast, the en imaginal disc enhancers (IDEs) do not function correctly in small reporter transgenes. En is expressed in the posterior compartment of wing imaginal discs; small IDE-reporter transgenes are expressed in the anterior compartment, the opposite of what is expected. The data show that the En protein binds to en IDEs, and it is suggested that En directly represses IDE function. Two en IDEs, 'O' and 'S', were identified. Deletion of either of these IDEs from a 79kb HA-en rescue transgene (HAen79) caused a loss-of-function en phenotype when the HAen79 transgene was the sole source of En. In contrast, flies with a deletion of the same IDEs from the endogenous en gene had no phenotype, suggesting a resiliency not seen in the HAen79 rescue transgene. Inserting a gypsy insulator in HAen79 between en regulatory DNA and flanking sequences strengthened the activity of HAen79, giving better function in both the ON and OFF transcriptional states. Altogether these data show that the en IDEs stimulate expression in the entire imaginal disc, and that the ON/OFF state is set by epigenetic regulators. Further, the endogenous locus imparts a stability to en function not seen even in a large transgene, reflecting the importance of both positive and negative epigenetic influences that act over relatively large distances in chromatin.

Cheng, Y., Chan, F. and Kassis, J. A. (2023). The activity of engrailed imaginal disc enhancers is modulated epigenetically by chromatin and autoregulation. PLoS Genet 19(11): e1010826. PubMed ID: 37967127
Summary:
engrailed (en) encodes a homeodomain transcription factor crucial for the proper development of Drosophila embryos and adults. Like many developmental transcription factors, en expression is regulated by many enhancers, some of overlapping function, that drive expression in spatially and temporally restricted patterns. The en embryonic enhancers are located in discrete DNA fragments that can function correctly in small reporter transgenes. In contrast, the en imaginal disc enhancers (IDEs) do not function correctly in small reporter transgenes. En is expressed in the posterior compartment of wing imaginal discs; in contrast, small IDE-reporter transgenes are expressed mainly in the anterior compartment. En was found to bind to the IDEs, suggesting that it may directly repress IDE function and modulate En expression levels. Two en IDEs, O and S, were discovered. Deletion of either of these IDEs from a 79kb HA-en rescue transgene (HAen79) caused a loss-of-function en phenotype when the HAen79 transgene was the sole source of En. In contrast, flies with a deletion of the same IDEs from an endogenous en gene had no phenotype, suggesting a resiliency not seen in the HAen79 rescue transgene. Inserting a gypsy insulator in HAen79 between en regulatory DNA and flanking sequences strengthened the activity of HAen79, giving better function in both the ON and OFF transcriptional states. Altogether these data suggest that the en IDEs stimulate expression in the entire imaginal disc, and that the ON/OFF state is determined by epigenetic memory set by the embryonic enhancers. This epigenetic regulation is similar to that of the Ultrabithorax IDEs and it is suggested that the activity of late-acting enhancers in other genes may be similarly regulated.

BIOLOGICAL OVERVIEW

Opposing transcriptional outputs of Hedgehog signaling and Engrailed control compartmental cell sorting at the Drosophila A/P boundary

The wing imaginal disc is subdivided into two nonintermingling sets of cells: the anterior (A) and posterior (P) compartments. Anterior cells require reception of the Hedgehog (Hh) signal to segregate from P cells. Evidence is provided that Hh signaling controls A/P cell segregation not by directly modifying structural components but by a Cubitus interruptus (Ci)-mediated transcriptional response. A shift in the balance between repressor and activator forms of Ci toward the activator form is necessary and sufficient to define 'A-type' cell sorting behavior. Moreover, Engrailed (En), in the absence of Ci, is sufficient to specify 'P-type' sorting. It is proposed that the opposing transcriptional activities of Ci and En control cell segregation at the A/P boundary by regulating a single cell adhesion molecule (Dahmann, 2000).

To test the role of En and Hh-signaling components in controlling cell segregation, two experimental assays were applied. Both assays are based on the presumption that cells maximize contact (intermingle) with cells of the same adhesiveness and minimize contact with (sort out from) cells of different adhesiveness. In the 'round-up assay', clones of mutant cells are assayed for their shape. Each clone is analyzed by how circular it is and how smoothly its border interfaces with surrounding tissue. The degree of roundness of the clone and smoothness of its border is taken as a measure for the difference in adhesiveness between cells inside and outside of the clone. In the wild-type wing imaginal disc, cell segregation is confined to the region of the compartment boundaries. Thus, in the more stringent 'choice assay,' clones generated in the vicinity of the A/P boundary are monitored for their sorting behavior. Clones have three choices: they can (1) remain within their compartment of origin; (2) sort completely into the territory of the adjacent compartment defining a straight border with cells of the compartment of origin at the normal position of the A/P boundary, or (3) sort out from cells of both compartments and take up positions overlapping the normal site of the A/P boundary. Depending on the genetic intervention, the compartment of origin of a clone was determined either by the state of the heritable and P-specific expression of an en-lacZ reporter gene or by the position of the 'twin spot' clone, which is composed of sibling wild-type cells. The position of the A/P boundary was inferred from the expression of a hh-lacZ reporter gene expressed exclusively in P cells (Dahmann, 2000).
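The 'round-up assay' scores how circular a clone is. The source does not state a formula, so the following is only a minimal sketch assuming one common choice, the isoperimetric circularity index 4*pi*area/perimeter^2, applied to a clone outline traced as a polygon. The function names and the example outlines are hypothetical and are not taken from Dahmann (2000).

```python
import math

def polygon_area_perimeter(points):
    """Return (area, perimeter) of a closed polygon given as (x, y) vertices."""
    if len(points) < 3:
        raise ValueError("an outline needs at least three vertices")
    twice_area = 0.0
    perimeter = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]      # wrap around to close the outline
        twice_area += x1 * y2 - x2 * y1   # shoelace formula term
        perimeter += math.hypot(x2 - x1, y2 - y1)
    return abs(twice_area) / 2.0, perimeter

def circularity(points):
    """Assumed roundness score: 1.0 for a perfect circle, lower for
    elongated or wiggly outlines."""
    area, perimeter = polygon_area_perimeter(points)
    return 4.0 * math.pi * area / perimeter ** 2

# Hypothetical outlines: a compact clone versus an elongated one.
compact_clone = [(0, 0), (1, 0), (1, 1), (0, 1)]
elongated_clone = [(0, 0), (10, 0), (10, 1), (0, 1)]
print(circularity(compact_clone), circularity(elongated_clone))
```

Under this assumed score, a clone that rounds up and has a smooth border would approach 1, whereas a clone that intermingles with its neighbors would have a long, irregular border and a lower value.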

Two forms of Ci are distinguished, a constitutively active form, Ci[act], and a repressive form, Ci[rep]. Autonomous and direct roles have been established for Ci[act] and En in specifying A and P cell segregation, respectively. Evidence is also provided that Hh signaling is sufficient to specify A-type cell segregation and that it acts by shifting the balance between Ci[rep] and Ci[act] toward low levels of Ci[rep] and high levels of Ci[act]. It is proposed that the opposing transcriptional activities of Ci[act] and Ci[rep]/En lead to differences in the activity of a cell adhesion system at the boundary of A and P cells, thereby preventing these cell populations from intermingling (Dahmann, 2000).

The smooth and straight boundary between compartments has been ascribed to distinct adhesive properties of cells on opposite sides of the boundary causing these cell populations to minimize contact and sort out. In the case of the A/P boundary of the wing, one difference that could account for the distinct sorting behavior is the exclusive presence of two transcription factors, Ci[act] and En in adjacent A and P cells, respectively. For a long time, the view prevailed that En regulates cell segregation by autonomously and directly specifying P, as opposed to A, cell adhesiveness. This hypothesis has recently been challenged by studies indicating that En acts, at least in part, by directing the expression of Hh and that Hh secreted by P cells induces A cells to acquire a distinct cell adhesiveness. These studies, however, provide conflicting results as to whether or not En also has an autonomous, Hh-independent role in specifying cell segregation at the A/P boundary. The same studies further raised, but did not address, the question of whether Hh signaling would specify cell segregation via its normal transduction pathway by leading to a transcriptional output depending on Ci. In various other systems, the activation of signaling receptors can lead to the posttranscriptional activation of small GTPases that can directly, without altering gene transcription, affect cytoskeletal components and thus conceivably cell adhesion. A key tool for addressing these questions is the choice assay. This assay allows for monitoring whether altering the activity of a gene would change a cell's compartmental preference. Using this assay, the above questions have been addressed by systematically considering three distinct situations (Dahmann, 2000 and references therein).

Situation 1: the 'ground state,' where neither Ci nor En is present.
Irrespective of their compartmental origin, clones of cells null mutant for both ci and en take up positions overlapping the normal site of the A/P boundary with smooth borders to wild-type A and P cells. Because En is not required in A cells and because ci minus single mutant A cells behave like ci,en minus double mutant A cells, it is inferred that Ci is required in A cells for their intermingling with other A cells at the compartment boundary. Since Ci acts in these cells as a transcriptional activator, it is concluded that Hh signaling leads to a Ci-dependent transcriptional response in A cells and transcription of the immediate Hh target gene relevant for A segregation is induced, rather than repressed, in anterior boundary cells. The behavior of ci,en minus double mutant clones also clarifies the role of En. Because clones of P cells lacking En and Ci form smooth borders with neighboring wild-type P cells that also lack Ci and, if in contact with A cells, sort partially into A territory, it is inferred that En has a function in specifying P segregation that is independent of Ci. Since Ci is required for all known responses to Hh signaling, it is concluded that En has a Hh-independent role in determining P segregation. The observation that clones of cells mutant for both ci and en occupy A and P territory to a similar extent leads to the conclusion that Ci and En are required for most if not all aspects of the distinct segregation properties of A and P cells, and the difference between the ground state and the 'A state' brought about by Ci[act] is similar to the difference between the ground state and the 'P state' dependent on En (Dahmann, 2000).

Situation 2: Cells expressing En but lacking Ci.
A more direct argument for a Ci/Hh-independent role of En in the specification of cell sorting behavior can be derived from the experiment in which anterior clones were programmed to express low levels of En. Such cells cease to express Ci and take up positions normally occupied only by P cells. The behavior of these cells is different from that of ground state cells that express neither Ci nor En. In contrast to ci,en minus cells, cells of A origin expressing low levels of En transgress completely into P territory, yet they do not intermingle well with P cells. This latter observation is ascribed to the unnaturally low levels of En produced in these cells (several-fold less than in wild-type P cells). These levels may not repress ci completely and might not be sufficient to fully confer P cell adhesiveness (Dahmann, 2000).

Situation 3: Cells expressing Ci but lacking En.
Posterior clones of cells expressing Ci at physiological levels, but lacking En (mutant for en^E), take up positions in the territory normally only occupied by A cells and intermingle with A cells. This behavior is dependent on Ci, since ci,en double mutant clones of P origin only partially occupy A territory and sort out from A cells. Furthermore, overexpression of Ci in P cells leads these cells to sort out from neighboring P cells, and, if in contact with A cells, sort into A territory. Together, by comparing situations (1) to (3), it is concluded that Ci is necessary and sufficient to specify A segregation, and, in the absence of Ci, En is necessary and sufficient to specify P segregation (Dahmann, 2000).
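The outcomes of situations (1) to (3) can be summarized as a small lookup table. This is a qualitative sketch only: it treats Ci and En as simply present or absent, assumes Ci is in the Hh-exposed state found near the boundary, and ignores dosage effects such as the low En levels described in situation 2. The function name and outcome labels are hypothetical.

```python
# Keys are (ci_present, en_present); values summarize the reported behavior.
SORTING_OUTCOME = {
    (False, False): "boundary",  # situation 1: ground state, straddles the A/P boundary
    (False, True): "P",          # situation 2: En without Ci gives P-type sorting
    (True, False): "A",          # situation 3: Ci without En gives A-type sorting
    (True, True): "A",           # Ci added to En-expressing P cells: they sort into A territory
}

def predict_sorting(ci_present, en_present):
    """Return the territory a clone is expected to occupy near the boundary."""
    return SORTING_OUTCOME[(bool(ci_present), bool(en_present))]

for ci in (False, True):
    for en in (False, True):
        print(f"Ci={ci!s:5} En={en!s:5} -> {predict_sorting(ci, en)}")
```

Reading the table by rows shows the asymmetry in the conclusion: En specifies P-type sorting only in the rows where Ci is absent.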

Thus En has an autonomous, Hh-independent role in specifying cell segregation. In addition, Ci is necessary and sufficient to specify A segregation. Ci is activated in anterior boundary cells by Hh, whose P-specific expression is in turn controlled by En. Thus, En controls cell segregation at the A/P boundary by both a Hh-dependent and a Hh-independent pathway. To determine the relative contributions of these two pathways, situations were generated and analyzed in which En activity was altered under conditions of constant Hh signaling, or conversely, situations in which the activity of Hh signal transduction was altered under constant En conditions. From these experiments, it is concluded that for the segregation behavior of wing cells, the state of the Hh pathway prevails over that of En activity. This conclusion is particularly well corroborated by the finding that cells in which both pathways are simultaneously 'on' (P cells expressing Ci) sort with A cells. The behavior of such cells may also explain why the late expression of en in anterior boundary cells has no deleterious effects on the integrity of the compartment boundary. Like the experimental cells, these cells are exposed to the Hh signal, coexpress ci and en, yet associate with other A cells rather than with En-expressing P cells (Dahmann, 2000).

Ci is required in A cells for proper cell segregation at the A/P boundary. Depending on the status of the Hh signaling pathway, Ci can exist in two forms with opposing transcriptional activities (Ci[rep] and Ci[act]). These two forms of Ci regulate the expression of different subsets of Hh target genes, some of which appear to be regulated exclusively by Ci[rep] or Ci[act]. It is argued that the A/P sorting of wing cells is under control of both forms of Ci. This conclusion is based on findings that both Ci[rep] and Ci[act] have a profound influence on the segregation behavior of A cells. Two observations show that Ci[rep] determines a preference for sorting into P territory. (1) A cells expressing Ci[rep] in the absence of Ci[act] or A cells overexpressing Ci[rep] in the presence of Ci[act] both take up positions occupied normally only by P cells. This is in contrast to cells lacking Ci entirely, which take up positions overlapping the normal position of the A/P boundary. (2) P cells lacking En but expressing Ci[rep] are confined to the P compartment, unlike cells that lack En and Ci or cells that only lack En. It is inferred from this that one important function of Hh signaling in its role of specifying A-type segregation properties is to prevent the formation of Ci[rep] in cells close to the A/P boundary (Dahmann, 2000).

The conclusion that not only prevention of Ci[rep] formation but also the induction of Ci[act] plays an important role in A/P sorting is deduced from the observation that cells lacking both forms of Ci do not mingle with wild-type A cells expressing Ci[act] due to their vicinity to the Hh source. Moreover, the addition of Ci to P cells, where Ci is readily converted to Ci[act], programs P cells to segregate with A cells. Because Ci[rep] influences cell segregation, one might have expected that anterior ci minus clones far away from the A/P boundary would sort out from neighboring Ci[rep]-expressing cells. However, ci minus cells intermingle well with neighboring A cells. One likely explanation for this apparent discrepancy is the partial derepression of hh transcription in ci mutant cells. These low Hh levels induce in neighboring cells the formation of some Ci[act] that might neutralize remnant levels of Ci[rep]. In support of this assumption, it has been found that clones of cells double mutant for ci and hh do sort out at anterior positions (Dahmann, 2000).

Ci and En are both DNA-binding proteins known to act as transcription factors, indicating that they control cell segregation by regulating the expression of target genes. By analogy to dpp, a Hh target gene that is also controlled by En and both forms of Ci, a model is proposed illustrating how Ci[rep], Ci[act], and En might shape the expression profile of a putative immediate target gene involved in cell segregation. Since in the absence of Ci and En, cells segregate neither with A nor with P cells, they are likely expressing an intermediate level of this gene that is different from those in A or P cells. Since Ci[rep] can control cell segregation and is present in A cells far away from the boundary, it is proposed that the basal expression of this hypothetical gene is downregulated by Ci[rep] in these cells. In A cells close to the boundary, Hh signaling prevents the formation of Ci[rep] yet causes the formation of Ci[act], from which it is inferred that in these cells the transcription of this target gene is upregulated. In P cells, En may repress this target gene, consistent with its role as a transcriptional repressor. It is proposed that the opposing transcriptional activities of Ci[act] and En lead to a large difference in the expression of this immediate target gene in cells on opposite sides of the A/P boundary (Dahmann, 2000).
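The model described above can be made concrete with a small numerical sketch. This is only an illustration, not the authors' analysis: the additive rule and every number below (BASAL, CI_REP_EFFECT, CI_ACT_EFFECT, EN_EFFECT) are hypothetical values chosen so that Ci[rep] and En lower, and Ci[act] raises, the expression of the putative target gene.

```python
# Hypothetical additive contributions to a putative target gene (arbitrary units).
# None of these numbers come from the source; they only encode the signs of the effects.
BASAL = 1.0           # assumed level in a cell with neither Ci nor En
CI_REP_EFFECT = -0.5  # assumed down-regulation by Ci[rep]
CI_ACT_EFFECT = 1.5   # assumed up-regulation by Ci[act]
EN_EFFECT = -0.5      # assumed repression by En


def ci_form(has_ci, hh_signal):
    """Return the Ci form present in a cell: 'act', 'rep', or None if Ci is absent."""
    if not has_ci:
        return None
    return "act" if hh_signal else "rep"


def target_level(has_ci, hh_signal, has_en):
    """Expression level of the hypothetical target gene in one cell."""
    level = BASAL
    form = ci_form(has_ci, hh_signal)
    if form == "rep":
        level += CI_REP_EFFECT
    elif form == "act":
        level += CI_ACT_EFFECT
    if has_en:
        level += EN_EFFECT
    return level


# (has_ci, hh_signal, has_en) for the cell types discussed in the text.
CELLS = {
    "A cell far from boundary": (True, False, False),
    "A cell at boundary": (True, True, False),
    "P cell": (False, True, True),
    "cell lacking Ci and En": (False, False, False),
    "P cell forced to express Ci": (True, True, True),
}

LEVELS = {name: target_level(*state) for name, state in CELLS.items()}

for name, level in LEVELS.items():
    print(f"{name:30s} {level:.1f}")
```

Under these assumed values, boundary A cells and P cells end up at opposite extremes, cells lacking both Ci and En sit in between, and a P cell forced to express Ci lies closer to boundary A cells than to P cells, in line with the observation that such cells sort into A territory.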

In the above model, it is assumed that Ci and En control cell segregation by transcriptionally regulating one and the same gene, although it is also possible that they regulate different genes. While at present these alternatives cannot be distinguished, the simpler model that Ci and En control the same target gene is preferred for two reasons: (1) there is a precedent case for such a gene, dpp, which is known to be regulated by both Ci and En; (2) a difference in the expression level of a single cell adhesion molecule (Shotgun or DE-cadherin) is sufficient for two cell populations to sort out. While it is conceivable that Ci and En directly regulate the expression of cell adhesion molecules like DE-cadherin, it is also possible that they act more indirectly by regulating genes whose products influence the activity of uniformly expressed cell adhesion molecules. Clones of cells lacking detectable amounts of DE-cadherin do sort out from neighboring wing disc cells; they are, however, exclusively confined to the compartment of origin, indicating that DE-cadherin is not required for the separation of cells at the A/P boundary (Dahmann, 2000).

Why does cell segregation at the A/P boundary require two transcription factors with opposing activities? Based on the results presented here, the differential activity of either Ci or En suffices for separating A and P cells. For Ci, this is best illustrated by the key finding that P cells forced to express Ci sort out from wild-type P cells and segregate into A territory. Conversely, in the absence of Ci, expression of En suffices for A cells to sort into P territory. The use of two transcription factors with opposing activities may have the advantage of increasing the fidelity of the sorting process by further contrasting the expression levels of a common putative target gene in cells on opposite sides of the A/P boundary (Dahmann, 2000).

It seems to be a general mechanism that En controls cell segregation both in a Hh-dependent and -independent manner. In the Drosophila abdomen, En has also been implicated in controlling the separation of A and P cells in Hh-dependent and -independent ways. The relative contributions of these two functions of En, however, appear to differ between the wing imaginal discs and the abdomen. While a prevalence of the Hh-dependent pathway is found in the wing disc, the two functions of En seem to contribute equally to the separation of abdominal A and P cells. This difference in dominance of the Hh-signal transduction pathway might be due to a more influential role of Ci[rep] in the sorting of imaginal versus abdominal cells. It is intriguing to note that the same intricate network that defines the strip of cells expressing Dpp also appears to restrict the activity of a putative cell adhesion molecule to the very same cells. The use of Hh/En signaling for both setting up the Dpp organizer and segregating A and P cells may ensure that the position and shape of the morphogen source that organizes both compartments is stably maintained during development. The prediction of a dpp-like expression pattern provides a novel criterion for the future identification of the elusive molecules conferring cell segregation (Dahmann, 2000).

Engrailed controls the organization of the ventral nerve cord through frazzled regulation

In Drosophila, the ventral nerve cord (VNC) architecture is built from neuroblasts that are specified during embryonic development, mainly by transcription factors. Engrailed, a homeodomain transcription factor known to be involved in the establishment of neuroblast identity, is also directly implicated in the regulation of axonal guidance cues. Posterior commissures (PC) are missing in engrailed mutant embryos, and axonal pathfinding defects are observed when Engrailed is ectopically expressed at early stages, prior to neuronal specification. frazzled, enabled, and trio, all of which are potential direct targets of Engrailed and are involved in axonal navigation, interact genetically with engrailed to form posterior commissures in the developing VNC. The regulation of frazzled expression in engrailed-expressing neuroblasts contributes significantly to the formation of the posterior commissures by acting on axon growth. A small genomic fragment within intron 1 of frazzled can mediate activation by Engrailed in vivo when fused to a GFP reporter. These results indicate that Engrailed's function during the segregation of the neuroblasts is crucial for regulating different actors that are later involved in axon guidance (Joly, 2007).

During embryogenesis, Engrailed is first expressed in posterior epidermal cells within each segment, and then later in NBs, GMCs, and neurons. Present at all developmental stages in a subpopulation of neural cells, Engrailed is a good candidate for a factor participating in neuronal determination. Several Engrailed target genes involved in neurogenesis, and in particular in axonal guidance, have been identified, including eg, con, comm, fra, ena, and trio. This suggested an important role for engrailed in this process (Joly, 2007).

Interestingly, Trio and Ena were recently found to function as effectors of Fra signalling and to act together in the formation of commissural axons. In particular, they were shown to physically interact, suggesting a potential mechanism by which Fra might coordinate the actin cytoskeletal dynamics necessary for axonal cone growth. This study shows that en genetically interacts not only with fra, but also with ena and trio, to form the posterior commissures. En thus appears to directly regulate PC formation by acting at different levels to ensure axon growth through a complex signalling network that involves Fra (Joly, 2007).

Transheterozygous embryos with alterations in both en and in any of several potential targets present axonal defects that are very similar to those observed in homozygous en mutant embryos. Overexpressing Fra using the prd-Gal4 driver cannot rescue the axonal defects of homozygous en mutant embryos. This confirms that En plays an important role in axonal guidance by regulating various target genes, including ena, trio, commissureless (comm), and transcription factors such as eg, that have been identified as potential En targets. While En is often identified as a repressor, there is no evidence for a role for En in the repression of genes that instruct neurons to choose the AC, such as Wnt5/Drl components. This study demonstrates instead that En regulates axonal guidance and growth by activating components necessary for the establishment of neuronal posterior connectives (Joly, 2007).

Several lines of evidence are provided that fra expression is directly controlled by Engrailed. For example, genomic fragment 2C5 was found to bind En in vivo, first during embryogenesis (as assayed by ChIP) and later in larvae (as assayed by immuno-FISH). In addition, this genomic fragment is shown in this study to be able to mediate activation by En in transgenic flies. However, even though it is known to bind En in embryos, 2C5 is not able to drive GFP expression during embryogenesis, suggesting that it recapitulates only a fraction of the frazzled regulatory sequences (Joly, 2007).

Genetic data are provided arguing that fra is regulated by En during embryogenesis. en and fra interact genetically to ensure the formation of a correct scaffold within the VNC. In homozygous en mutant embryos, fra expression is affected by early stage 11, and Fra immunostaining is absent in the PCs at stage 14, correlating with a loss of posterior commissures (Joly, 2007).

This study shows that PC formation requires an early function of En that acts prior to the specification of neuronal cell fate and to axon growth. Indeed, only the ectopic expression of En at early stages leads to axonal misrouting, whereas the use of pan-neuronal drivers does not cause any axonal defects. Once neurons are specified, En is no longer able to change their fate and hence affect their axonal navigation. This confirms a role for En during NB segregation, and suggests that the neuronal expression of Engrailed is not essential for the formation of the VNC (Joly, 2007).

During NB segregation, Engrailed may participate in the specification of pioneer neurons. Indeed, it was observed that not all the axons that form PCs come from en-expressing cells. Moreover, in homozygous mutant en embryos, it was found that the pioneer marker BP102 was affected in PCs. This suggests that a cluster of En-positive neurons corresponds to the pioneers, which are required for normal pathfinding by later outgrowing neurons. This could explain the absence of PCs in engrailed homozygous mutant embryos. Interestingly, the use of a late eve-Gal4 driver to ectopically express En in aCC/RP2 pioneer neurons had no effect on axonal pathfinding. This confirms that the En-sensitive period occurs before the specification of the neurons, including the pioneers (Joly, 2007).

This study shows that the En/fra interaction is important for the formation of the PCs, since PCs are not formed in transheterozygous mutant en⁻/fra⁻ embryos. This absence of PCs might result from a loss of axonal growth, which is known to involve Fra. This might also account for the PC defects that are observed in homozygous en⁻ mutant embryos (Joly, 2007).

Since the function of En in establishing the axon scaffold within the VNC is essential during NB segregation, it is suspected that the regulation of En target genes involved in axonal pathfinding might also occur at early stages. Indeed, it was possible to confirm that the axonal defects detected in en⁻/fra⁻ transheterozygous embryos required the loss of early fra activation during NB segregation. This was shown by RNA in situ hybridization and by rescue experiments: PC axons of stage 15 en^X31/fra¹ embryos only develop normally when Fra expression is recovered before the specification of the neurons, but do not form properly once neurons are formed. Therefore, one possible hypothesis is that the activation of fra in NBs allows the axonal growth of the PC pioneers (Joly, 2007).

The data suggest that the fra level in NBs and neurons is crucial for axon growth. Because mutations affecting axon growth must be dominant over axonal guidance problems, it is logical that the VNCs of both en⁻/en⁻ and en⁻/fra⁻ present the same missing PC phenotype. Indeed, with fra being a direct target of En, it can be assumed that in the absence of the En activator, fra expression will be lower or lost. Indeed, it was noticed that en⁻/fra⁻ embryos phenocopy fra⁻/fra⁻ embryos: in both cases PCs are missing, and neurons express En but show defects in their positioning. Therefore, these changes in neuronal cell fate can be attributed to a change in fra expression. One open question concerns the sensitive period of Fra in this process: frazzled is activated by Engrailed during the segregation of the NBs, but Fra protein is only detectable in neurons. One possible explanation is that Fra protein is present at early stages, but is under the threshold of detection. Another explanation can also be drawn from previous work in vertebrates, where it has been shown that growth cones possess the machinery necessary for protein translation and can translate guidance molecules locally. The resulting rapid changes in protein levels were shown to be involved in axon guidance. Therefore, one hypothesis is that the fra RNA pool in NBs is rapidly translated in growth cones in order to cause changes in the cytoskeleton necessary for axon growth and their further guidance (Joly, 2007).

These results give new insights into En function during neurogenesis and show that En can alter the VNC architecture at different levels to form PCs, playing on axonal pathfinding and axon growth. Indeed, en mutant embryos present PCs that are not properly positioned or not even formed in most segments. Further, ectopic expression of En leads to abnormal axonal pathfinding. Both loss and gain of function of en could be associated with changes in the identity of the NBs (data not shown), confirming a role for En in this process (Joly, 2007).

En functions during neurogenesis through the regulation of different target genes. One way is through the regulation of transcription factors such as eagle, but En also regulates the expression of fra, trio and ena, which are more directly involved in axon growth and which participate with En in the formation of the PCs. Indeed, monitoring eg-expressing neurons in an en⁻/fra⁻ genetic background showed that axons projecting through PCs do not grow properly, confirming that en and fra are involved in this process (Joly, 2007).

Together, these results illustrate how En can act during NB segregation to build a wild-type VNC. Recent results in vertebrates suggest that the regulatory pathway that this study has identified between En and fra (EN1 and DCC in vertebrates) may be evolutionarily conserved. Elucidating the molecular events that allow En/Fra-positive neurons to specifically project axons through PCs but not ACs will be the next challenge to explore in order to better understand axonal guidance (Joly, 2007).

Eukaryotic transcription factors can track and control their target genes using DNA antennas

Eukaryotic transcription factors (TFs) function by binding to short 6-10 bp DNA recognition sites located near their target genes, which are scattered through vast genomes. Such a process surmounts enormous specificity, efficiency and celerity challenges using a molecular mechanism that remains poorly understood. Combining biophysical experiments, theory and bioinformatics, this study dissected the interplay between the DNA-binding domain of Engrailed, a Drosophila TF, and the regulatory regions of its target genes. Engrailed binding affinity was found to be strongly amplified by the DNA regions flanking the recognition site, which contain long tracts of degenerate recognition-site repeats. Such DNA organization operates as an antenna that attracts TF molecules in a promiscuous exchange among myriads of intermediate affinity binding sites. The antenna ensures a local TF supply, enables gene tracking and fine control of the target site's basal occupancy. This mechanism illuminates puzzling gene expression data and suggests novel engineering strategies to control gene expression (Castellanos, 2020).

Eukaryotic TFs track their target genes, control site occupancy, and coordinate binding with partners to form the transcription complex. These processes must involve modes of interaction with DNA that go beyond nonspecific binding and facilitated 1D and/or 2D diffusion. By focusing on the DNA regions flanking the target site of real genes, this study has discovered, and characterized, a molecular mechanism that enables such functions. The mechanism exploits the natural tendency of biomolecules to exhibit energetic frustration, in this case manifested by binding promiscuity. Particularly, it was found that the affinity of the Drosophila TF Engrailed to the regulatory regions (RRs) of its target genes is strongly amplified by long tracts of degenerate consensus repeats that are present in such regions. The combination of a promiscuous TF and a DNA region rich in degenerate consensus binding (DCB) repeats operates as a transcription antenna. Once the DNA region becomes accessible by chromatin dynamics, and thus transcriptionally active, the antenna attracts TF molecules that remain loosely associated with the gene of interest through a highly dynamic exchange among the hundreds of mid-affinity binding sites within the antenna. In this light, it was confirmed that the short recognition sequences and promiscuous specific binding of eukaryotic TFs are a functional strategy to ensure their colocalization with the relevant genes, as has been postulated by other authors. For instance, there are ~30,000 copies of Engrailed per cell and about 200 genes estimated to be under its control. Taking the β3 tubulin antenna as example, it follows that each of these 200 genes will contain on average ~150 EngHD molecules trapped in its antenna, whereas fewer than 15 molecules will be found anywhere else in the cell (bound non-specifically or free). The pool of TF molecules inside an antenna will be in exchange between sites that are relatively weak binders, so their dissociation rates, which are faster than that of specific binding to a consensus site (SB), facilitate turnover and thus enable a nimble gene expression response (Castellanos, 2020).
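The copy-number estimate quoted above can be checked with a few lines of arithmetic. The three input numbers are taken from the text; treating all 200 target genes as carrying an antenna equivalent to that of β3 tubulin follows the text's own simplification, and the variable names are illustrative only.

```python
COPIES_PER_CELL = 30_000  # ~30,000 Engrailed copies per cell (from the text)
TARGET_GENES = 200        # ~200 genes estimated to be under Engrailed control (from the text)
MAX_ELSEWHERE = 15        # "fewer than 15" molecules outside the antennas (from the text)

# If every copy sat in an antenna, each gene would hold this many molecules.
even_split = COPIES_PER_CELL / TARGET_GENES

# If up to MAX_ELSEWHERE copies sit outside the antennas, each antenna holds at least this many.
per_antenna_lower_bound = (COPIES_PER_CELL - MAX_ELSEWHERE) / TARGET_GENES

# Corresponding minimum fraction of all copies that is held in antennas.
fraction_in_antennas = (COPIES_PER_CELL - MAX_ELSEWHERE) / COPIES_PER_CELL

print(f"even split per antenna:      {even_split:.3f}")
print(f"lower bound per antenna:     {per_antenna_lower_bound:.3f}")
print(f"minimum fraction in antennas: {fraction_in_antennas:.4%}")
```

The two per-antenna figures differ by less than one molecule, which is why the text can quote ~150 molecules per antenna while still leaving a handful of molecules elsewhere in the cell.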

The antenna mechanism sheds new light onto some puzzling observations of eukaryotic gene expression. The mechanism predicts two 'specific' binding modes: a frequent, still physically localized, but weaker binding event to antenna DCB sites, and a rare, high affinity binding to target sites (SB). These properties are in striking accord with single-molecule TF tracking experiments in mammalian cells, which have reported that only ~1% of detectable binding events (with lifetimes >0.5s) were to high affinity sites, whereas the remainder involved moderately weak binding events. Antennas also enable control of the target site's occupancy by competing locally for binding. A relatively distant (e.g., few kbp away) antenna can keep the basal SB occupancy of an activator at a suitable minimum and the TF supply still relatively local. In contrast, an antenna surrounding the target recognition site can amplify a repressor's effect. Binding events concentrated on long antennas provide a simple explanation of why crosslinking data on eukaryotes produce many more hits than expected from the number of genes under control of the given TF, and relatively weak correlations between site occupancy and gene expression levels. Furthermore, the use of transcription antennas could greatly facilitate the synchronous recruitment of various TFs to assemble into the transcription machinery. Summarizing, DNA antennas provide an elegant mechanism that sheds new light on how eukaryotic TFs operate at the molecular level and explains several paradoxes of existing eukaryotic gene expression data. These molecular devices provide an additional layer of eukaryotic transcriptional control in which the size and sequence profile of the antenna can be engineered, whether by evolution or by scientists, to modulate site occupancy, response swiftness and levels of gene expression (Castellanos, 2020).
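The idea that an antenna competes locally with the target site for binding can be illustrated with a toy equilibrium calculation. This is a minimal sketch and not the model used in the study: it assumes a single bound TF molecule, independent sites, and equilibrium weights of 1/Kd per site. The function name and both dissociation constants are hypothetical.

```python
def specific_site_share(n_antenna_sites, kd_specific_nm, kd_antenna_nm):
    """Fraction of a bound TF molecule's time spent on the specific (SB) site.

    Assumes independent sites at equilibrium, each weighted by 1/Kd.
    """
    weight_specific = 1.0 / kd_specific_nm
    weight_antenna = n_antenna_sites / kd_antenna_nm
    return weight_specific / (weight_specific + weight_antenna)


KD_SPECIFIC_NM = 1.0   # hypothetical high-affinity consensus site
KD_ANTENNA_NM = 100.0  # hypothetical mid-affinity degenerate (DCB) site

SHARES = {
    n: specific_site_share(n, KD_SPECIFIC_NM, KD_ANTENNA_NM)
    for n in (0, 100, 1000)
}

for n, share in SHARES.items():
    print(f"{n:5d} antenna sites -> share of bound time on SB site: {share:.3f}")
```

With these assumed constants, adding more weak sites steadily lowers the share of time spent on the consensus site, which is the qualitative behaviour the text describes for an antenna that keeps basal occupancy low while retaining the TF nearby.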

Lineage-specific determination of ring neuron circuitry in the central complex of Drosophila

The ellipsoid body (EB) of the Drosophila central complex mediates sensorimotor integration and action selection for adaptive behaviours. Insights into its physiological function are steadily accumulating; however, the developmental origin and genetic specification have remained largely elusive. This study identified two stem cells in the embryonic neuroectoderm as precursor cells of neuronal progeny that establish EB circuits in the adult brain. Genetic tracing of embryonic neuroblasts ppd5 and mosaic analysis with a repressible cell marker identified lineage-related progeny as Pox neuro (Poxn)-expressing EB ring neurons, R1-R4. During embryonic brain development, engrailed function is required for the initial formation of Poxn-expressing ppd5-derived progeny. Postembryonic determination of R1-R4 identity depends on lineage-specific Poxn function that separates neuronal subtypes of ppd5-derived progeny into hemi-lineages with projections either terminating in the EB ring neuropil or the superior protocerebrum (SP). Poxn knockdown in ppd5-derived progeny results in identity transformation of engrailed-expressing hemi-lineages from SP to EB-specific circuits. In contrast, lineage-specific knockdown of engrailed leads to reduced numbers of Poxn-expressing ring neurons. These findings establish neuroblasts ppd5-derived ring neurons as lineage-related sister cells that require engrailed and Poxn function for the proper formation of EB circuitry in the adult central complex of Drosophila (Bridi, 2019).

The Drosophila central complex is a composite of midline neuropils that include the protocerebral bridge, the fan-shaped body, the ellipsoid body (EB), the noduli and the lateral accessory lobes. These neuropils are interconnected in a modular way whereby columnar projection neurons leading to and from the central complex connect all its components that are themselves intersected by tangential layers of neural processes, which together form functional modules, each representing a segment of sensory space. Functional studies have identified specific roles for the central complex in higher motor control, courtship and orientation behaviours, visual memory and place learning, as well as sleep, attention, arousal and decision-making (Bridi, 2019).

In contrast to expanding insights into the physiological role of the central complex in regulating behaviour, its developmental origin and genetic specification have largely remained elusive. Earlier work described a primordial central complex at late larval/early pupal stages, which becomes fully formed by 48 hr after puparium formation. Genetic studies have identified several alleles of as-yet unidentified genes, as well as orthodenticle, Pax6/eyeless, Pox neuro (Poxn), tay-bridge, roundabout, Pdm3 and semaphorin as genes involved in normal formation of central complex sub-structures (Bridi, 2019).

This study investigates the origin and formation of EB ring neurons R1-R4 in the developing and adult brain of Drosophila. Bilateral symmetric neuroblasts ppd5 were identified in the embryonic procephalic neuroectoderm as founder cells of neuronal progeny that constitute R1-R4 subtypes of tangential ring neurons in the adult EB. Mutant analysis and targeted genetic manipulations reveal a lineage-specific requirement of engrailed (en) and Poxn activity that determines the number and identity of ppd5-derived progeny and their EB ring-specific connectivity pattern in the adult central complex of Drosophila (Bridi, 2019).

Previous studies suggested the Drosophila EB -- as part of the central complex -- develops from precursor cells that differentiate during larval development and during pupal stages generate the EB neuropil. Lineage analysis demonstrates that at least part of its origin can be traced back to the embryonic procephalic neuroectoderm. This study identified Engrailed-expressing neuroblasts ppd5 as embryonic stem cells that give rise to Poxn-expressing progeny, which ultimately differentiate into EB ring neurons. Genetic tracing with en-Gal4 identified R1-R4 ring neurons, suggesting that embryonic neuroblasts ppd5 are the major source of Poxn-expressing progeny leading to EB ring neurons detected in this study. Based on their position, morphology, gene expression patterns and axonal fasciculation, these findings suggest that ppd5-derived larval lineages correspond to previously described larval lineages variously called 'EB-A1/P1', 'DALv2/3', 'MC1' or 'DM'. It was previously demonstrated that these larval lineages express Poxn and give rise to gamma-amino butyric acid (GABA)-ergic ring neurons in the central complex of the adult brain. It therefore is proposed to (re-) name them according to their embryonic origin (Bridi, 2019).

Subclass-specific Gal4 lines together with Poxn expression identify these lineage-related, ppd5-derived sister cells as R1-R4 ring neurons. Moreover, brain-specific Poxn-Gal4 mediated labelling identifies ring neurons and their axonal projections covering all layers of the EB neuropil, thus suggesting neuroblasts ppd5 give rise to the majority, if not all, of ring neuron subtypes. The ontogenetic relationship between Engrailed-expressing neuroblasts ppd5 and Poxn-expressing EB ring neurons is affirmed by the fact that en-Gal4 and Poxn-Gal4-targeted RNAi-mediated knockdown of Poxn causes similar EB neuropil-specific phenotypes. Together, these data establish that ppd5-derived progeny are clonal units contributing to the EB ring neuron circuitry in the central complex in Drosophila (Bridi, 2019).

How are these units specified? In both insects and mammals, the patterning and specification of neural lineages is regulated by genetic programs from neurogenesis to neuronal differentiation. This study in Drosophila shows that the development and specification of EB-specific circuit elements is likewise dependent on the lineage-specific activity of developmental regulatory genes. Early formation and maintenance of Poxn-expressing ppd5 lineages requires engrailed function as revealed with a deficiency removing both engrailed orthologues, en and invected. Previous studies showed that engrailed/invected are required for the specification of neuroblast identity in the developing nervous system, suggesting that engrailed is also required for the specification of ppd5. A later, lineage-specific function of engrailed was found in the specification of ring neuron numbers, consistent with its transient expression in Poxn+ lineages in the embryonic brain but not at later developmental stages nor in adult ring neurons. engrailed codes for a homeodomain transcription factor mediating the activation and suppression of target genes, regulatory interactions that are required for neural lineage formation and specification in the procephalic neuroectoderm. In contrast, no function for Poxn in embryonic brain development has been reported, suggesting that Poxn is required for lineage and/or neuronal specification in the central brain only during later stages of development (Bridi, 2019).

Indeed, experiments identify a postembryonic requirement of Poxn in the specification of ppd5-derived progeny. Previous studies showed that zygotic mutations of Poxn perturb EB neuropil formation, in that presumptive ring neurons are unable to project their axons across the midline and as a consequence, the EB ring neuropil is not formed. In the present study, en-Gal4-targeted knockdown of Poxn reveals Engrailed-expressing cells that project across the midline and form a ring-like neuropil instead of their normal ipsilateral projections to the SP. Significantly, no ppd5-derived GFP-labelled cells were observed that project ipsilaterally towards the SP, neurons that are normally detectable with en-Gal4 targeted GFP expression in the adult brain. Furthermore, en>Poxn-IR-targeted, EB neuron-like projections do not form a toroidal ring but are rather characterised by a ventral cleft. These en>Poxn-IR cells aberrantly retain Engrailed expression even though their axonal projection and connectivity pattern clearly identify them as ring neurons that are normally devoid of Engrailed but instead express Poxn. Together these data suggest that, based on their morphology, Engrailed expression, axogenesis and ring-specific projection patterns, en>GFP cells normally projecting to the SP have been transformed into EB ring neurons in en>mCD8::GFP,Dcr2,Poxn-IR flies (Bridi, 2019).

The resulting additional ring neurons in en>mCD8::GFP,Dcr2,Poxn-IR flies are accompanied by a ventrally open EB ring neuropil. A comparable phenotype is seen in brains of Poxn(757)>Poxn-IR flies, which are characterised by an increased number of Poxn(757)-Gal4-targeted ring neurons, suggesting that increasing numbers of EB ring neurons lead to an arch-like neuropil reminiscent of the arch-like EB seen in the majority of arthropods. In support of this notion, previous work has demonstrated that in vivo amplification of ppd5-derived progenitor cells can lead to fully differentiated supernumerary GABAergic ring neurons that form functional connections often characterised by a ventrally open EB ring neuropil. Together, these data identify differential roles of Poxn activity during neuroblast lineage formation, in that Poxn is required for cell identity determination of ppd5-derived progeny, as well as for the specification of cell numbers and terminal neuronal projections of EB ring neurons (Bridi, 2019).

These Poxn functions in ppd5-derived brain lineages are reminiscent of Poxn activity in the peripheral nervous system (PNS) which mediates the specification of sensory organ precursor (SOP) cell lineages giving rise to external sense organs, the tactile and gustatory bristles, respectively. In these SOP lineages, differential Poxn activity determines progeny fate between chemosensory (gustatory) or mechanosensory (tactile) neuronal identities. Furthermore, SOP lineage-specific Poxn function specifies the number of these neurons and their connectivity pattern. The apparent functional commonalities between Poxn-mediated specification of ppd5 neuroblast-derived lineages in the brain and SOP lineages in the PNS, suggest that evolutionarily-conserved mechanisms underlie the development and specification of clonal units as cellular substrates for neural circuit and sensory organ formation (Bridi, 2019).

The cytoarchitecture of both the insect and mammalian brain is characterised by neural lineages generated during development by repeated asymmetric divisions of neural stem and progenitor cells. These ontogenetic clones are thought to constitute building blocks of the insect and mammalian brain. In support of this notion, lineage-related progeny constitutes sets of circuit elements of the mushroom bodies and antennal lobes in Drosophila. Clonal relationship also characterises the lineage-dependent circuit assembly in the mammalian brain, where stem cell-like radial glia give rise to clonally-related neurons that synapse onto each other, as has been shown for cortical columns and GABAergic interneurons in the neocortex and for striatal compartments of the basal ganglia. The current study in Drosophila shows that a pair of bilateral symmetric, engrailed-expressing embryonic stem cells, neuroblasts ppd5, give rise to R1-R4 subtypes of tangential ring neurons that contribute to the layered EB neuropil. Thus, ppd5 neuroblast lineages constitute complete sets of circuit elements intrinsic to the adult central complex in Drosophila (Bridi, 2019).

It has been suggested that clonal expansion of neural lineages contributed to the evolution of complex brains and behaviours. Key to this hypothetical scenario are ancestral circuit elements in the form of genetically encoded stem cell-derived clonal units, like the ones described in the current study. In such a scenario, lineage-related ancestral circuit elements might have been multiplied and co-opted or diversified during the course of evolution. Multiplication and co-option have been suggested for the evolution of the multiple-loop architecture of the basal ganglia that allows processing of cognitive, emotional and motor information. In line with this hypothesis, quantitative control of the transcription factor Prospero is sufficient to cause clonal expansion of ring-neuron circuitry in Drosophila (Shaw, 2018), which has been implicated in cognitive and motor information processing and shows extensive correspondences to vertebrate basal ganglia, ranging from comparable developmental genetics to behavioural manifestations and disease-related dysfunctions (Bridi, 2019).

In contrast to multiplication and co-option, the diversification of stem cell lineages can equally contribute to neural circuit evolution. The current results identify differential and tightly regulated spatio-temporal functions of engrailed and Poxn that lead to the differentiation of ppd5 progeny into hemi-lineage specific identities in the adult brain. Loss of engrailed affects the formation of precursor cells, whereas its lineage-specific knockdown affects the number of Poxn-expressing ring neurons. Correspondingly, en-Gal4-driven lineage-specific knockdown of Poxn results in an identity transformation of Engrailed-expressing neurons in the adult brain in that they no longer project to the SP, but instead reveal an EB ring-neuron identity. These data indicate a binary switch of hemi-lineage identities as the result of a feed-forward mechanism between engrailed and Poxn. engrailed may activate transcription (directly or indirectly) of Poxn, which in turn represses engrailed to permit differentiation of R1-R4 neurons, thereby regulating the specification of neuronal identities in ppd5 hemi-lineages. This hypothesis is consistent with lineage tracing and MARCM experiments, as well as the transient expression of engrailed in embryonic ppd5 lineages but not in adult EB ring neurons. However, further studies are required to elucidate the nature and extent of these putative regulatory interactions between Engrailed and Poxn (Bridi, 2019).

In summary, these findings establish a causal relationship between a pair of bilaterally symmetric embryonic stem cells, neuroblasts ppd5, and the lineage-related assembly of their EB ring neuron progeny as structural units of the central complex in Drosophila. Based on these observations, it is proposed that amplification and diversification of ontogenetic clones, together with the repurposed use or exaptation of the resulting circuitries, is a likely mechanism for the evolution of complex brains and behaviours (Bridi, 2019).

Structure and function of an ectopic Polycomb chromatin domain

Polycomb group proteins (PcGs) drive target gene repression and form large chromatin domains. In Drosophila, DNA elements known as Polycomb group response elements (PREs) recruit PcGs to the DNA. This study shows that, within the invected-engrailed (inv-en) Polycomb domain, strong, constitutive PREs are dispensable for Polycomb domain structure and function. It is suggested that the endogenous chromosomal location imparts stability to this Polycomb domain. To test this possibility, a 79-kb en transgene was inserted into other chromosomal locations. This transgene is functional and forms a Polycomb domain. The spreading of the H3K27me3 repressive mark, characteristic of PcG domains, varies depending on the chromatin context of the transgene. Unlike at the endogenous locus, deletion of the strong, constitutive PREs from the transgene leads to both loss- and gain-of-function phenotypes, demonstrating the important role of these regulatory elements. These data show that chromatin context plays an important role in Polycomb domain structure and function (De, 2019).

Polycomb group proteins (PcGs) are critical for organismal development and stem cell maintenance. PcGs were first found in Drosophila as repressors of homeotic genes, and PcG repression is one of the earliest epigenetic regulatory mechanisms to be identified. In Drosophila, nearly all PcG proteins are subunits of one of four principal protein complexes: Polycomb repressive complexes 1 and 2 (PRC1 and PRC2), Pho repressive complex (PhoRC), and Polycomb repressive deubiquitinase (PR-DUB). PcG protein complexes bind to DNA elements known as Polycomb group response elements (PREs), deposit the repressive chromatin modification mark H3K27me3, and drive chromatin compaction leading to gene repression. Genes repressed by PcG are covered with H3K27me3 and are thought to form their own topologically associating domains (TADs). In Drosophila, TADs vary from a few kilobases to several hundred kilobases in size. Regulatory DNAs present within TADs preferentially interact with genes located within the same domain, with limited contacts outside of the TAD boundaries. In Drosophila, genome-wide data suggest that mini-domains formed by actively transcribed regions form the boundaries of some TADs, while other TAD boundaries are demarcated by insulator elements. In mammals, the insulator protein CTCF colocalizes with a subset of TAD boundaries (De, 2019).

The current understanding in chromatin biology hypothesizes that the folding of chromatin into domains assists in the packaging of long stretches of DNA inside the eukaryotic nucleus. Further, it is the organization of these domains that facilitates spatial and temporal regulation of genes within them. Thus, understanding how the domains are formed is of prime importance. How large PcG domains/TADs are formed is a central question. To date, researchers have extensively studied how PcG proteins are recruited to specific DNA sequences and which proteins are present in PcG complexes. PREs are required for the recruitment of PcG proteins in Drosophila and are thought to initiate the formation of a Polycomb domain. The focus here is the 113-kb PcG domain that encompasses the invected (inv) and engrailed (en) genes of Drosophila. Unexpectedly, deletion of strong, constitutive PREs from the endogenous inv-en domain had little effect on inv-en PcG domain organization. Weak PREs present in the inv-en domain were sufficient to establish and maintain the overall domain organization, and some weak PREs overlap with enhancers present within the inv-en domain. Similarly, deletion of the bxd PRE had a mild effect on Ubx expression, whereas deletion of the iab7 PRE resulted in misexpression of Abd-B in very specific parasegments. In contrast, a recent report showed that deletion of two PREs from the dac locus causes prominent structural and functional changes in the locus (De, 2019).

The primary aim in this work was to study the effect of "chromatin context" on a PcG domain; the following main conclusions can be drawn from this study. First, the en PcG domain at ectopic genomic sites rescues the null mutants of inv-en. This shows that the information to form the chromatin domain is primarily present within the domain itself. Second, spreading of the chromatin state at different sites of the genome depends strongly on the local "context". Third, a repressive chromatin domain can interact with null chromatin (which has no chromatin mark) but stays segregated from the active chromatin. Fourth, a fragment of DNA containing the strong, constitutive PREs is required at an ectopic location to ensure PcG silencing in all tissues. Strikingly, the endogenous locus is resilient to loss of the same DNA fragment, indicating the importance of context for PcG chromatin domain formation and function. Fifth, ubiquitously expressed flanking genes may act as boundaries for enhancers within PcG domains. These data provide experimental evidence for the hypothesis that chromosomal neighborhood plays an important role in regulating gene expression (De, 2019).

Transgenic assays to test 'functional PREs' have highlighted the effect of chromatin context on reporter gene expression. For example, the strong PREs upstream of en, present in the 1.5-kb fragment examined in this study, only act to repress reporter transgene expression in about 50% of chromosomal insertion sites. Previous work has shown that these PREs are dispensable from the endogenous inv-en locus. The normal phenotype of en80Δ1.5 flies also supports these data. However, this study shows that the HAen79@attP40 transgene has a higher level of H3K27me3 accumulation than HAen79Δ1.5@attP40. This reduced level of H3K27me3 is sufficient to maintain correct en expression in embryos and imaginal discs. Notably, the absence of strong PREs in HAen79Δ1.5@attP40 caused misexpression of en during adult abdomen development. These data emphasize the importance of context itself on the stability and resiliency of the PcG domain toward modification (mutation or deletion) of regulatory sequences present within the domain. The data also show that deletion of the 1.5-kb DNA fragment that includes the strong en PREs rendered the HAen79 transgene unable to rescue inv en mutants. Strikingly, HA-en expression from HAen79Δ1.5 was very low in the embryonic nervous system, a result not seen with HAen79 or in en80Δ1.5 flies. These data suggest that the embryonic nervous system enhancers are not able to interact well with the HA-en promoter in the absence of the 1.5-kb DNA fragment. This same deletion, when present at the endogenous locus, does not cause a loss of en expression in the nervous system. This suggests that the structure of the endogenous locus is resilient to the loss of this DNA. This study suggests that the chromosomal location of the endogenous locus aids in the proper folding of en to facilitate enhancer-promoter communication in the embryonic nervous system. Another explanation for this observation is that the 1.5-kb PRE could be acting as a Trithorax response element (TRE) and bind to Trithorax group proteins for proper en expression. In any case, the essential role of the 1.5-kb PRE/TRE at the ectopic locus is alleviated by the 'local context' effect at the endogenous locus (De, 2019).

It has been proposed that establishment and inheritance of H3K27me3 are dependent on two factors: (i) sequence-specific recruitment of the chromatin modifiers and (ii) the ability of H3K27me3 to act as a template for PRC2 to bind and modify other nucleosomes present in the vicinity. The current experiments show that the spreading of H3K27me3 to the flanking chromatin surrounding the insertion site is not very efficient, although no active chromatin mark was present in the vicinity (particularly at attP3). However, 4C-seq analysis showed that the ectopic en PcG domain interacted with flanking chromatin significantly until it encountered the active domain. These observations indicate that mere interactions between the PcG domain and flanking chromatin were not able to spread the repressive mark efficiently. This observation highlights the importance of PcG recruitment via PREs for PcG domain formation (De, 2019).

One of the surprising observations in this study was the deposition of the repressive mark over the exons of Msp300 in larvae that contain HAen79@attP40. This accumulation is also observed in Kc167 cells but not in larval samples that did not have HAen79 inserted at attP40. It is assumed that some regions in the genome are more susceptible to the PcG regulation and that the weak binding peaks of PcGs near the attP40 site might be acting as PREs to facilitate spreading of the mark. It is posited that, when a PcG domain is inserted into attP40, it renders the adjacent chromatin more likely to form an H3K27me3 domain. It is noted that the Msp300 gene is covered by both H3K27me3 and H3K36me3. This is likely due to the mixed cell population in larval brains and discs. What about the spreading of the H3K27me3 mark over the exons? This is unusual; however, pre-mRNA and cotranscriptional activity have been linked to the local chromatin structure. In a genomic study of many histone modifications in human and Caenorhabditis elegans DNA, the H3K27me3 mark was enriched over exons. In addition, physical interaction between mammalian splicing factors (U2snRNP and Sf3b1) and PcG proteins (Zfp144 and Rnf2) was reported to be required for proper repression of Hox genes. It is speculated that cotranscriptional recruitment of PcG proteins over the Msp300 gene in the transgenic line might have established the exon-specific deposition of the H3K27me3 mark. It is proposed that the exon-specific H3K27me3 accumulation can act as an intermediate step to repress transcriptionally active target genes and establish PcG domains during development (De, 2019).

Two interesting questions for chromatin biologists are the following: How does a meter-long genome fit into a nucleus and how does this folding influence genome function? High-resolution Hi-C experiments have given structural insights into interphase chromosomes in eukaryotic nuclei. According to the Hi-C data, chromatin is organized into TADs or 'contact domains', and the TADs form compartments A (enriched with active domains) and B (enriched with inactive domains). While the Drosophila genome has 'compartments,' the existence of TADs in Drosophila is disputed, and despite the evidence that TADs and compartments are important for chromatin organization and function, basic information about how these structures are formed and maintained has been incomplete. The current data show that it is an intrinsic property of chromatin to segregate based on histone modifications and gene activity. This property of chromatin is not locus specific; in different chromatin contexts, repressed chromatin tends to segregate from active domains. Biochemical and molecular evidence on the antagonistic behavior between the H3K27me3 and H3K36me3 modifications also supports this claim and points to H3K36me3 as a chromatin component that restricts the PcG-mediated spread of H3K27me3. How are large PcG domains formed and maintained? The establishment of the PcG domain and the spreading of H3K27me3 start from the strong PREs present in the PcG domain during cell cycle 14. Repressive loops between PREs within PcG domains are also formed during cell cycle 14. These loops are proposed to play a synergistic role in establishing PcG domains. Surprisingly, deletion of some strong PREs in situ resulted in weak phenotypes, suggesting redundancy of PcG recruitment. This study provides evidence that apart from the minor PREs present in the large PcG domains, chromatin context itself is a critical factor that determines robustness and function of PcG domains (De, 2019).

Engrailed, Suppressor of fused and Roadkill modulate the Drosophila GLI transcription factor Cubitus interruptus at multiple levels

Morphogen gradients need to be robust, but may also need to be tailored for specific tissues. Often this type of regulation is carried out by negative regulators and negative feedback loops. In the Hedgehog (Hh) pathway, activation of patched (ptc) in response to Hh is part of a negative feedback loop limiting the range of the Hh morphogen. This study shows that in the Drosophila wing imaginal disc two other known Hh target genes feed back to modulate Hh signaling. First, anterior expression of the transcriptional repressor Engrailed modifies the Hh gradient by attenuating the expression of the Hh pathway transcription factor cubitus interruptus (ci), leading to lower levels of ptc expression. Second, the E-3 ligase Roadkill shifts the competition between the full-length activator and truncated repressor forms of Ci by preferentially targeting full-length Ci for degradation. Finally, evidence is provided that Suppressor of fused, a negative regulator of Hh signaling, has an unexpected positive role, specifically protecting full-length Ci but not the Ci repressor from Roadkill (Roberto, 2022).

This study examined the roles of three potential negative regulators of Hh signal transduction, two of which are themselves encoded by Hh target genes. In each case, interesting new aspects of the pathway's regulation were discovered. Anterior expression of en likely extends the range of the Hh gradient (Roberto, 2022).

Anterior expression of en in the wing imaginal disc was first observed 30 years ago and en is the Hh target gene requiring the highest level of Hh signaling. Its domain of expression exactly correlates with a region of lower full-length Ci protein levels. It had been proposed that the lower Ci protein levels are a consequence of Ci being particularly active and labile in this region. This study shows that the lower levels of Ci are not primarily due to it being particularly labile, but rather are a consequence of negative transcriptional regulation by En. The role of this negative feedback loop appears to be to modulate the Hh gradient by downregulating the expression of ptc in addition to its effects on dpp. This leads to Hh signaling extending further into the anterior compartment, with a corresponding anterior shift in the location of LV3 and the expression of dpp. A model is preferred in which the attenuation of ptc expression by anterior en is indirect via Ci, but in principle en could also directly negatively regulate ptc. This is thought less likely, as 'flip-out' clones expressing Ci activate high levels of ptc in the posterior compartment in the presence of en. The anterior expression of en occurs late in third instar larvae, which correlates with the downregulation of ci expression as visualized using the UAS-TT transcriptional timer and the refinement of wing vein specification (Roberto, 2022).

Ci function is modulated by two feedback loops acting at different levels. Anterior expression of the En protein attenuates Ci activity directly adjacent to the compartment boundary of the wing disc by downregulating the expression of the ci gene. Rdx and Su(fu) act at the protein level modulating the competition between the full-length (Ci FL) and repressor forms (Ci R) of Ci. Rdx specifically targets full-length Ci, whereas Su(fu) partially protects full-length Ci from Rdx-mediated degradation. Rdx degradation of full-length Ci appears to help downregulate Hh target genes in cells no longer receiving the Hh signal (Roberto, 2022).

Why did this mechanism evolve to modulate the Hh gradient? Morphogen gradients, by virtue of their central roles in the development of multiple tissues, must be robust and resistant to perturbation. Therefore, to specifically expand the range of the Hh gradient in the wing disc a new component was added, anterior expression of the ci repressor en (Roberto, 2022).

The lack of the C-terminal domain in the Ci repressor has multiple consequences. It loses the binding site for the co-activator CBP, and it loses C-terminal binding sites for Su(fu), Cos2 and Rdx. As a consequence, the Ci repressor is not sequestered in the cytoplasm by Cos2 in the absence of Hh signaling and enters the nucleus without Su(fu), whereas full-length Ci enters the nucleus only in the presence of Hh signaling and as a complex with Su(fu) (Roberto, 2022).

In order to better understand the roles of Su(fu) and Rdx, animals heterozygous for the ci^Ce2 mutation were examined. In this context, overexpression of rdx or loss of Su(fu) function leads to a complete fusion between LV3 and LV4. In addition, clones mutant for Su(fu) show dramatic reduction in the expression of the Hh target genes ptc and dpp. These results show that Su(fu) has a potential novel positive role in Hh signal transduction, improving the ability of full-length Ci to compete with the repressor form. A positive role for Su(fu) has also been found in mammals where Su(fu) appears to function as a chaperone for the full-length Gli proteins, but not the repressor forms, and is required for full activation of Gli target genes. The requirement for Drosophila Su(fu) is obviated in the absence of Rdx, suggesting that Rdx primarily targets full-length Ci and not Ci repressor, even though the repressor is not protected by Su(fu). These results are analogous to what is seen with the mammalian homologue of Rdx, SPOP, indicating that this mechanism has been conserved during evolution. SPOP is opposed by Su(fu) and degrades the full-length forms of the mammalian GLI2 and GLI3 but not the GLI3 repressor form. The competition between Rdx and Su(fu) appears to be rather finely balanced as either increasing the expression of rdx or reducing the expression of Su(fu) enhances the ability of Ci^Ce2 to compete with full-length Ci. This function of protecting full-length Ci from Rdx presumably takes place in the nucleus, as this is where the Rdx protein primarily localizes (Roberto, 2022).

However, the functional relevance of rdx being an Hh target gene has been unclear. Zygotic loss of rdx in the embryo has no visible effect on segmental patterning of the cuticle and, unlike en, knockdown of rdx along the compartment boundary in the wing disc has little effect on wing patterning. Perhaps its role is to clear full-length Ci from cells that were once within the domain of Hh signaling and have moved outside the domain of Hh signaling. Perdurance of Rdx could target full-length Ci in the nucleus allowing the Ci repressor to shut off Hh target genes. This is the situation in the eye disc with the progression of the morphogenetic furrow. Cells that recently received high level Hh signaling and activated Ci must now downregulate Ci to allow proper differentiation of the ommatidia. Rdx appears to be important for this process, as loss of rdx leads to defects in the eye. A similar situation may exist in other tissues. Looking at the temporal regulation of ptc expression with UAS-TT, cells removed from the compartment boundary in the wing disc have lower levels of destabilized GFP relative to RFP and appear to be in the process of shutting off ptc. This distinction is lost following downregulation of rdx by RNAi (Roberto, 2022).

In the domain of modest level Hh signaling (in which dpp is expressed), both full-length Ci and Ci repressor must be present in some form of reciprocal gradients. In this domain, enhancers with perfect Ci consensus binding sites are silent due to binding of Ci repressor. The dpp enhancer with imperfect Ci binding sites is expressed, and for it to be completely active, full-length Ci must be bound. Why is full-length Ci able to better compete with Ci repressor for the imperfect binding sites? Full-length Ci and the Ci repressor share the same DNA binding domain, and it would be expected that the repressor would outcompete full-length Ci for binding to target sites because the repressor is primarily nuclear, whereas full-length Ci is primarily cytoplasmic, even in the presence of Hh signaling, due to a strong nuclear export signal (NES). It is suggested that cooperativity between Ci repressor proteins at perfect Ci binding sites can account for this distinction. Another potential mechanism for preferentially recruiting full-length Ci to imperfect binding sites might be suggested by the different protein interactions observed with full-length Ci and Ci^Ce2. Full-length Ci enters the nucleus with Su(fu) while the Ci repressor is not bound to Su(fu). In addition, the Ci repressor is missing the CBP binding site. As a consequence, full-length Ci could engage in protein-protein interactions with other transcription factors that are not available to the Ci repressor. This added affinity to other proteins within the enhanceosome could allow the preferential recruitment of full-length Ci to enhancers with imperfect Ci binding sites. Differential protein-protein interactions may also explain why full-length Ci is still able to activate ptc-lacZ expression along the compartment boundary in ci^Ce2/+ heterozygotes but not the artificial enhancer 4bs-lacZ. The ptc-lacZ enhancer is a bona fide Drosophila enhancer and is likely to recruit a constellation of proteins that could interact with full-length Ci, whereas protein-protein interactions are likely to be much less robust at 4bs (Roberto, 2022).

In conclusion, these results highlight the complexity of Hh signal transduction and its modulation. Expressing en in the anterior compartment of the wing pouch modulates the Hh gradient, whereas Su(fu) has a surprising positive role in the pathway, acting to partially protect full-length Ci from the E-3 ligase Rdx that Ci activates (Roberto, 2022).

Interlocking of co-opted developmental gene networks in Drosophila and the evolution of pre-adaptive novelty

The re-use of genes in new organs forms the base of many evolutionary novelties. A well-characterised case is the recruitment of the posterior spiracle gene network to the Drosophila male genitalia. This study found that this network has also been co-opted to the testis mesoderm, where it is required for sperm liberation, providing an example of sequentially repeated developmental co-options. Associated with this co-option event, an evolutionary expression novelty appeared: the activation of the posterior segment determinant Engrailed in the anterior A8 segment, controlled by common testis and spiracle regulatory elements. Enhancer deletion shows that A8 anterior Engrailed activation is not required for spiracle development but only necessary in the testis. This study presents an example of pre-adaptive developmental novelty: the activation of the Engrailed transcription factor in the anterior compartment of the A8 segment where, despite having no specific function, it opens the possibility of this developmental factor acquiring one. It is proposed that recently co-opted networks become interlocked, so that any change to the network because of its function in one organ will be mirrored by other organs even if it provides no selective advantage to them (Molina-Gil, 2023).

It has been observed that genes playing particular roles during organ development can be recruited to perform novel functions in other organs. The re-use, or co-option, of developmental genes in new organs is the base of many evolutionary novelties. An example of this is the highly transparent crystallin proteins that refract light in the eye lens. In all vertebrates, α-crystallin evolved from the co-option of a small heat shock protein to the eye, while in birds δ-crystallin evolved from the co-option of a different protein, argininosuccinate lyase, an enzyme involved in arginine biosynthesis (Molina-Gil, 2023).

Although there is abundant research on single-gene co-option, few studies have considered the functional consequences of full-gene network co-option. One of the best-characterised co-option cases is the recruitment of the appendage-forming gene network to form the eye-spots that decorate butterfly wings in several species. Similarly, in the Drosophila melanogaster subgroup, the larval respiratory posterior spiracles and the adult male genitalia share the expression of numerous genes due to the recent co-option into the male genital disc primordium of a pre-existing gene network controlling the formation of the external larval respiratory organs. The co-option of this gene network to the male genitalia resulted in the formation of the posterior lobe, a structure present in D. simulans and D. mauritiana, closely related to Drosophila melanogaster, but not in the more distant D. biarmipes or D. ananassae species (Molina-Gil, 2023).

The formation of the posterior spiracles and the posterior lobes has been well-studied in D. melanogaster. Posterior spiracle organogenesis is regulated by a gene network activated in the eighth abdominal larval segment (A8) by the Hox protein Abdominal-B (Abd-B). The internal spiracular chamber is formed from A8 anterior compartment cells (A8a) when Abd-B activates in the dorsal ectoderm the transcription of the JAK/STAT signalling pathway ligand Unpaired (Upd) as well as the Empty spiracles (Ems) and the Cut (Ct) transcription factors. The external protruding stigmatophore is formed from both anterior and posterior compartment A8 cells when Abd-B activates the Spalt (Sal) transcription factor, which in turn activates engrailed (en) transcription in a unique A8 pattern. These primary factors activate the RhoGAP Cv-c and RhoGEF64C cytoskeletal regulators, the cell polarity gene crumbs (crb) and various Cadherins. Abd-B also modulates the expression of wingless and the EGF regulator rhomboid, generating A8-specific segmental information distinct from that in more anterior segments (Molina-Gil, 2023).

The posterior lobe is a hook-shaped structure used by the male to grasp the female during mating, which may act as a prezygotic reproductive isolation barrier facilitating speciation. In D. melanogaster, ten genes of the spiracle gene network are required for the formation of the posterior lobe, with their activation in at least seven cases being regulated in both structures by the same cis-regulatory elements (CREs). The study of two of these enhancers revealed that the same DNA-binding sites activate the CRE's expression in both organs, making this one of the best-characterised cases of whole gene network co-option (Molina-Gil, 2023).

The recruitment of a gene network to a new organ exposes it to different selective pressures that may accelerate the appearance of novelties. One such novelty is the expression of the posterior segment determinant engrailed in the anterior compartment cells of the A8 segment. Engrailed is crucial during segmentation, and its expression in the anterior compartment is surprising given that En has been localised to the posterior cells throughout arthropod evolution (Molina-Gil, 2023).

Drosophila segmentation results from the activation of the segment-polarity genes engrailed (en), hedgehog (hh) and wingless (wg) in periodic stripes of cells along the antero-posterior axis of the embryo. Once activated, the segment-polarity genes engage in cross-regulatory interactions that maintain their expression. As a result, En and its direct target hh become expressed throughout development in the posterior compartment of every segment, where they regulate cell tension and adhesive characteristics that prevent posterior cells from mixing with anterior compartment cells, generating stable signalling boundaries. Mutant embryos for either en or hh result in an almost complete fusion of segments. As the posterior spiracle is one of the few circumferential organs in the embryo, en activity in anterior A8 cells could be required for establishing the circumferential information pattern necessary for spiracle organogenesis that contributed to the evolution of the protruding posterior spiracles characteristic of dipteran larvae (Molina-Gil, 2023).

This study investigated when en was recruited to the anterior A8 cells (A8a) and the cis and trans regulatory elements responsible for it. A8a expression appeared in Diptera before the evolution of the posterior lobe, and evidence is presented that this is associated with a previous posterior spiracle gene network co-option event to the testis mesoderm. Working with Drosophila melanogaster, this study showed that Engrailed expression in the A8a compartment is not required for the spiracle's development and that the engrailed CRE controlling spiracle expression is required in the testis cyst cells for spermiation. This work presents an example of repeated sequential gene network co-option events involving tissues of different germ layers, and shows how this resulted in the generation of a bona fide pre-adaptive developmental expression novelty: the activation of the En transcription factor in the anterior compartment of the embryonic A8 segment where, despite having no specific function, it opens the possibility of this important developmental factor acquiring one in the future. The expression of en in the anterior compartment of the A8 segment was likely caused by the regulatory interlocking of the co-opted networks. It is proposed that gene network interlocking occurs as the result of the use of the same gene network in several organs, so that any change to the network because of its functionality in one organ will be mirrored in all organs even if it has no selective advantage in some of them (Molina-Gil, 2023).

Previous work showed that the ectodermally expressed posterior spiracle gene network had been co-opted to the ectodermal male genitalia in the Drosophila melanogaster clade. This study shows the same gene network has also been co-opted to the mesodermal head cyst cells (HCCs) in Drosophila. Part of the posterior spiracle gene network becomes activated in the HCCs using the same enhancers driving expression in the posterior spiracle (Molina-Gil, 2023).

Although Abd-B is expressed in the somatic testis cells during embryogenesis, it is not expressed in the adult HCCs. Two alternative explanations are suggested. The first possibility is that one of the spiracle primary targets, many of which encode transcription factors, becomes expressed in the testis HCC independently of Abd-B regulation, and this results in the activation of the other spiracle genes due to cross-regulatory network interactions. Alternatively, the embryonic Abd-B expression could be speculated to epigenetically modify the spiracle gene network leaving it in a poised state that later could become activated in the absence of Abd-B (Molina-Gil, 2023).

The finding that the posterior spiracle gene network has been co-opted twice to different organs suggests that the coordinated activation of several transcription factors and signalling molecules of the network does not necessarily cause detrimental effects. The reported effects of experimentally inducing the ectopic activation of the eye gene regulatory network in D. melanogaster, a situation akin to what may happen during co-option, may help in understanding this. eyeless activation in the imaginal primordia results in the formation of ectopic eyes and causes morphological alterations leading to lethality. However, it has been noted that ectopic Eyeless expression in the wing only induces eye development in proximal cells where Dpp is also expressed, and similar results were observed with Hh, showing that the activation of a gene network inducer does not cause developmental transformations in all cells where it is expressed. Thus, when the co-option of a gene network causes developmental transformations reducing the animal's fitness, it will be lost. However, if the co-option had no influence on local developmental processes, it could be tolerated, giving the opportunity to the gene network elements to interact with allelic variants present in the population whose interaction could result in selective pressures fixing the trait. Such a series of steps may have led to the co-option of other developmental gene networks such as the activation of the appendage gene network in the butterfly wings that resulted in the formation of novel wing spot patterns (Molina-Gil, 2023).

Expression of Engrailed in posterior metameric stripes is characteristic of arthropods, but is also observed in onychophorans and in certain worms, indicating an ancient origin. In flies, anterior En activation has rarely been reported, and in D. melanogaster, anterior expression is the exception. En activation in A8a associated with the posterior spiracle is present in D. virilis, but not in E. balteatus, suggesting it originated in the higher Diptera (Brachycera). This is supported by analyses showing that in other Diptera like Bactrocera dorsalis (Tephritidae) and Lucilia sericata (Calliphoridae), that present well-developed larval posterior spiracles, engrailed expression is restricted to the posterior A8 segment. The A8a engrailed spiracle expression does not depend on the segmentation gene network, but is regulated by the posterior spiracle network through Abd-B, Sal and the JAK/STAT signalling pathway. Despite its complex regulation, this en enhancer has no function in Drosophila melanogaster's spiracle organogenesis. Taken together, Engrailed expression in the anterior compartment of A8 was acquired after functional posterior spiracles already existed in dipteran larvae and its origin had to do with testis evolution. Future experiments should establish if this occurred due to the de novo appearance of the enD CRE, or to a silent enD cryptic enhancer becoming activated due to the recruitment of a new transregulatory factor in the spiracles/testis gene network or to changes in chromatin accessibility (Molina-Gil, 2023).

Most insects have a pair of non-protruding spiracle openings in each trunk segment (see Interlocking of co-opted gene networks). In the hemimetabolous bug Oncopeltus fasciatus nymph or the holometabolous beetle Tribolium castaneum larva these are formed by an internal spiracular chamber that expresses the Ct protein; however, in Oncopeltus, Ct expression is not regulated by Hox proteins as it is in Drosophila, but depends on the tracheal protein Trachealess. Although in Tribolium, Spalt is expressed in the lateral ectoderm, it is not associated with the spiracles. Similarly, Engrailed is restricted to a posterior stripe of cells in the segment that does not surround the spiracular opening. Analysis of E. balteatus embryos suggests Sal was recruited in Diptera to the protruding stigmatophore before engrailed was expressed in A8a. Analysis of Sal and En expression does not reveal any activation in E. balteatus testis, indicating the spiracle network is not expressed in the gonads of all Diptera. This situation changes in Drosophila species where both Sal and En are expressed in A8a associated with the spiracles as well as in the testes. The co-option of the posterior spiracle cascade to the testis and the recruitment of Engrailed expression may have been the result of the major sperm size and gonad morphology changes occurring at some point after the divergence of Episyrphus and Drosophila that could have required the selection of new genetic variants to maintain fertility. As the enD enhancer has no apparent function in the posterior spiracles, but is required for male spermiation in D. melanogaster, it is proposed that originally this CRE was selected for its testicular function and that expression in the posterior spiracles was a side effect caused by the interlocked regulation of the co-opted gene networks in both organs which was not eliminated because it had no deleterious effects.
Finally, after the divergence from the virilis group, a second co-option event of the spiracle gene network in the male genital disc of D. melanogaster and closely related species led to the evolution of the posterior lobe. Given the extreme variability of the posterior lobe sizes between various D. melanogaster genotypes it is unclear whether the enD enhancer has a direct function on lobe development or if its expression there is aphenotypic, and only the result of its interlocking with other functional elements of the co-opted gene network (Molina-Gil, 2023).

These results suggest that the selection of a new trait in a co-opted gene network in one organ, results in its activation in all organs where the network is expressed. This implies a slower acquisition of new traits in co-opted gene networks as novelties would become discarded if they had detrimental effects in any of the interlocked organs. On the other hand, if the new trait (for example the activation of a novel transcription factor) was selected for its function in one organ, this would lead to the fixation of non-functional pre-adaptive traits with functional potential in all interlocked organs. Thus, it is proposed that co-option can lead to sensu stricto pre-adaptation cases, as opposed to the associated term of exaptation. While in exaptation the co-opted character had a previous selective function that has been recruited to perform a novel function (i.e., heat shock proteins recruited to form the eye crystallin, or feathers that could have served for heat regulation or sexual display before being co-opted for flight) cases like the expression of Engrailed in the posterior spiracle provide no selective advantage but could conceivably acquire it in the future, as has happened in the anterior wing of several Diptera where En has acquired a new role in wing pigmentation (Molina-Gil, 2023).

Early treatment of engrailed in The Interactive Fly

Most animals are constructed of segments. This is as true for worm and frog as it is for fly and human. Much of the epidermis of Drosophila develops as a chain of alternating anterior (A) and posterior (P) compartments, populations of cells that differ from each other because the selector gene engrailed (en) is active in the cells of P but not A. Studies of the wing have led to a general model of how compartments and selector genes build pattern. Early in development, the state of en expression is fixed in sets of cells ('on' in P and 'off' in A), the state being inherited by all the descendants of each set. During growth, the borderlines between A and P compartments act as engines to produce positional information - the mechanism depending on a secreted molecule, Hedgehog (Hh), being made by all P cells. Hh crosses over the border to reach nearby A cells which are primed to receive it. These A cells then respond to Hh by becoming a line source of a diffusing morphogen (such as Dpp), to form a gradient with a peak near the compartment boundary - this gradient delivers information of position, polarity and dimension to both A and P compartments of the wing. Variations on this basic mechanism may be used to generate pattern in both insects and vertebrates (Lawrence, 1999, and references therein).

Two questions are discussed below. First, how is en regulated, resulting in its expression in 14 parasegments? The immediate answer is found by looking at pair-rule genes.

Six pair-rule genes (even-skipped, fushi tarazu, sloppy paired, runt, paired and odd-paired), working in concert, ensure that there are 14 stripes of EN. The hallmark of pair-rule genes is their expression in seven stripes early in embryonic development. Pair-rule patterning, producing seven stripes, is a result of the action of gap genes, whose function is to define each of the seven stripes of primary pair-rule genes. Even earlier in the developmental history of the fly, the expression of maternal genes serves to structure the expression of gap genes. Moving from maternal to gap to pair-rule to segment polarity genes (en and wingless), one of the major functions in this hierarchy of genetic expression is the regulation of en and wingless, resulting in the creation of the necessary fourteen evenly spaced segments on either side of the segmental border, thus establishing the segmental compartmentalization of the fly. Even-skipped indirectly regulates engrailed. In odd parasegments, graded expression of eve establishes the en stripes by setting the boundaries of the activator paired and the repressors runt and sloppy paired (Fujioka, 1995). Expression of en in even parasegments results from activation by Fushi tarazu (with FTZ-F1 as cofactor) (Florence, 1997). Only the most anterior cells of each ftz stripe express en and this restriction is dependent upon odd-skipped and naked (DiNardo and O'Farrell, 1987, and Mullen, 1995). For more information on the pair-rule gene regulation of en see the Transcriptional regulation section below, or specific sites for each of the pair-rule genes.

Second, what exactly does en do to define each parasegment, that is, what is the function of en in each segment? en expressing cells in the anterior compartment of each parasegment communicate with the adjacent compartment (expressing wingless) by means of the secreted protein Hedgehog. Engrailed acts cell autonomously to activate transcription of hedgehog. Hedgehog signals then effect the induction of wingless. Engrailed also acts positively to activate invected, Engrailed's partner in establishing cell identity. Engrailed acts in each anterior compartment to suppress proteins made in the posterior compartments, such as Wingless, Decapentaplegic, Patched, Deformed and Cubitus interruptus. EN also has some segment specific effects, such as downregulating Ultrabithorax in parasegment six. For more information about the targets of en, see the Targets of activity section of this site or the sites for each of Engrailed's target genes.

GENE STRUCTURE: Two related genes, engrailed and invected, are separated by approximately 25 kb. They are transcribed on opposite strands toward one another. One of the introns separates two regions encoding the homeodomain (Poole, 1985).

cDNA clone length - 2411

Bases in 5' UTR - 177

Exons - three

Bases in 3' UTR - 406

PROTEIN STRUCTURE

Amino Acids - 552

Structural Domains

Engrailed has a divergent homeodomain. The cross homology of Engrailed with Ultrabithorax and Antennapedia is 58% and 53% respectively (Fjose, 1985). Engrailed has an alanine rich region homologous to Tup1 that can function in gene repression.

The Engrailed homeoprotein is a dominantly acting, so-called 'active' transcriptional repressor, both in cultured cells and in vivo. When retargeted via a homeodomain swap to the endogenous fushi tarazu gene (ftz), Engrailed actively represses ftz, resulting in a ftz mutant phenocopy. Functional regions of Engrailed have been mapped using this in vivo repression assay. In addition to a region containing an active repression domain identified in cell culture assays, there are two evolutionarily conserved regions that contribute to activity. The one that does not flank the HD is particularly crucial to repression activity in vivo. This domain is present not only in all engrailed-class homeoproteins but also in all known members of several other classes, including goosecoid, Nk1, Nk2 (vnd) and muscle segment homeobox. The repressive domain is located in the eh1 region, known as 'region three', found several hundred amino acids N-terminal to the homeodomain. The consensus sequence, arrived at by comparing Engrailed, Msh, Gsc, Nk1 and NK2 proteins from a variety of species, consists of a 23 amino acid homologous motif found in all these proteins. Thus Engrailed's active repression function in vivo is dependent on a highly conserved interaction that was established early in the evolution of the homeobox gene superfamily. Using rescue transgenes it has been shown that the widely conserved in vivo repression domain is required for the normal function of Engrailed in the embryo (Smith, 1996).

The 2.2 Å resolution structure of the Drosophila Engrailed homeodomain bound to its optimal DNA site is reported. The original 2.8 Å resolution structure of this complex provided the first detailed three-dimensional view of how homeodomains recognize DNA, and has served as the basis for biochemical studies, structural studies and molecular modeling. The refined structure confirms the principal conclusions of the original structure, but provides important new details about the recognition interface. Biochemical and NMR studies of other homeodomains have led to the notion that Gln50 is an especially important determinant of specificity. However, the refined structure shows that this side-chain makes no direct hydrogen bonds to the DNA. The structure does reveal an extensive network of ordered water molecules that mediate contacts to several bases and phosphates (including contacts from Gln50), and the model provides a basis for detailed comparison with the structure of an Engrailed Q50K altered-specificity variant. Comparing the refined structure with the crystal structure of the free protein confirms that the N and C termini of the homeodomain become ordered upon DNA-binding. However, several key DNA contact residues in the recognition helix are found to have the same conformation in the free and bound protein, and several water molecules also are preorganized to contact the DNA. The refined structure helps provide a more complete basis for the detailed analysis of homeodomain-DNA interactions (Fraenkel, 1998).

A novel Drosophila paired-like homeobox gene, DPHD-1, has been isolated. The homeodomain of DPHD-1 shows 85% amino-acid identity with that of the C. elegans Unc-4 protein. Whole-mount in situ hybridization of embryos and third-instar larvae reveal that the DPHD-1 mRNA is specifically localized in subsets of postmitotic neurons in the central nervous system (CNS) and in the developing epidermis, where it displays a segmentally repeated pattern. Double staining with a posterior compartment marker, an anti-Engrailed antibody, has shown that DPHD-1 expressing neurons in the CNS are present in the posterior compartment, whereas DPHD-1 expression in the epidermis is restricted to the anterior compartment in each segment. This temporal and spatial expression pattern suggests that DPHD-1 may play a role in determining the distinct cell types in each segment (Tabuchi, 1998).

The homeodomain (HD) is a ubiquitous protein fold that confers DNA binding function on a superfamily of eukaryotic gene regulatory proteins. Here, the DNA binding of recognition helix variants of the HD from the engrailed gene of Drosophila was investigated by phage display. Nineteen different combinations of pairwise mutations at positions 50 and 54 were screened against a panel of four DNA sequences consisting of the Engrailed consensus, a non-specific DNA control based on the lambda repressor operator OR1 and two model sequence targets containing imperfect versions of the 5'-TAAT-3' consensus. The resulting mutant proteins can be divided into four groups that vary with respect to their affinity for DNA and specificity for the Engrailed consensus. The altered specificity phenotypes of several mutant proteins were confirmed by DNA mobility shift analysis. Lys50/Ala54 is the only mutant protein that exhibits preferential binding to a sequence other than the Engrailed consensus. Arginine is a functional replacement for Ala54. The functional combinations at 50 and 54 identified by these experiments recapitulate the distribution of naturally occurring HD sequences and illustrate how the Engrailed HD can be used as a framework to explore covariation among DNA binding residues (Connolly, 1999).

The EH1 motif in metazoan transcription factors

The Engrailed Homology 1 (EH1) motif is a small region, believed to have evolved convergently in homeobox and forkhead containing proteins, that interacts with the Drosophila protein Groucho (C. elegans unc-37, Human Transducin-like Enhancers of Split). The small size of the motif makes its reliable identification by computational means difficult. The predicted proteomes of Drosophila, C. elegans and human have been systematically searched for further instances of the motif. Motif identification methods and database searching techniques were used to examine which homeobox and forkhead domain containing proteins also have likely EH1 motifs. Despite low database search scores, there is a significant association of the motif with transcription factor function. Likely EH1 motifs are found in combination with T-Box, Zinc Finger and Doublesex domains, and other plausible candidate associations are discussed. Strong candidate EH1 motifs have been identified in basal metazoan phyla. Candidate EH1 motifs exist in combination with a variety of transcription factor domains, suggesting that these proteins have repressor functions. The distribution of the EH1 motif is suggestive of convergent evolution, although in many cases, the motif has been conserved throughout bilaterian orthologs. Groucho mediated repression was established prior to the evolution of bilateria (Copley, 2005).

Sequence motifs were sought in homeobox containing transcription factors taken from the proteins of human, Drosophila and C. elegans, by first masking known Pfam domains, and then using the expectation maximization algorithm implemented in the MEME program. The first non-subfamily specific motif identified corresponded to previously known examples, and new instances, of the EH1 motif, in 100 sites, with an E-value of < 10^-126. The same approach was applied to Forkhead containing transcription factors, identifying 25 sites with a combined E-value of < 10^-31. These motifs also appeared to conform to the consensus of the EH1 motif (Copley, 2005).
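The domain-masking step mentioned above can be illustrated with a minimal sketch. This is not the pipeline used in the study: the function name mask_domains, the toy sequence and the domain coordinates are all hypothetical, and in practice the coordinates would come from an existing domain annotation. The sketch only shows the idea of replacing annotated domain residues with a placeholder character so that a later motif search is not dominated by the conserved domain itself.

```python
from typing import Iterable, Tuple


def mask_domains(sequence: str,
                 domains: Iterable[Tuple[int, int]],
                 mask_char: str = "X") -> str:
    """Replace residues inside annotated domains with mask_char.

    Coordinates are 1-based and inclusive (an assumption made for
    this illustration; the source does not specify a convention).
    """
    residues = list(sequence)
    for start, end in domains:
        # Reject intervals that fall outside the sequence or are inverted.
        if start < 1 or end > len(residues) or start > end:
            raise ValueError(f"invalid domain interval: ({start}, {end})")
        for i in range(start - 1, end):
            residues[i] = mask_char
    return "".join(residues)


if __name__ == "__main__":
    # Hypothetical toy sequence and domain interval, for illustration only.
    toy_sequence = "MKTAYIAKQR"
    print(mask_domains(toy_sequence, [(2, 5)]))
```

Masking rather than deleting the domain keeps residue positions unchanged, so any motif found afterwards can still be reported relative to the original protein coordinates.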

To further investigate the significance of this similarity, hidden Markov models (HMM) were constructed of the motif (EH1^hox and EH1^fh) which were then searched against the complete set of predicted proteins from human, D. melanogaster and C. elegans. The highest scoring non homeobox containing domain match of EH1^hox was a Forkhead protein (human FOXL1), and the second highest scoring non-Forkhead containing match of EH1^fh was to a homeobox containing protein (Drosophila Invected). In both cases, nearly all the high scoring hits were to proteins containing domains with transcription factor function. Among the best scoring matches of the EH1^hox searches were several T-box (TBOX), Doublesex Motif (DM), Zinc finger (ZnF_C2H2) and ETS containing proteins (Copley, 2005).

The presence of EH1 motifs within various homeobox, and to a lesser extent, forkhead-containing proteins has been widely reported, although not systematically studied. EH1-like motifs co-occurring with 3 major groupings of homeobox sub-types were found: the extended-hox class, typified by Drosophila Engrailed; the paired class, including Drosophila Goosecoid, and the NK class, including Drosophila Tinman. Related to the paired class homeobox domains, a number of genes containing PAIRED domains only were also found to contain EH1-like motifs. With only a few exceptions, the EH1-like motif occurs N-terminal to the homeobox domain and C-terminal to the PAIRED domain when present. A number of these proteins have been shown to interact with Groucho or its orthologs, e. g., C. elegans cog-1, Drosophila Engrailed and Goosecoid, and in high throughput assays Drosophila Invected and Ladybird late (Copley, 2005).

A handful of EH1-like motifs are found C-terminal to homeobox domains. Of these, the best characterized is C. elegans unc-4, which has been shown to interact with the groucho ortholog unc-37; the Drosophila ortholog unc-4 also interacts with groucho in high throughput experiments. The C-terminal EH1-like motif is conserved in the closely related Drosophila paralog OdsH. The gene prediction for the human ortholog of unc-4 appears to be artefactually truncated, but the mouse ortholog (Uncx4.1) and corrected human gene models, contain EH1-like motifs both N- and C-terminal to the homeobox domain. Taken together with the fact that in the majority of related homeobox containing proteins the EH1-like motifs are N-terminal, this suggests that the N-terminal motif has been lost in Drosophila and C. elegans unc-4 orthologs (Copley, 2005).

EH1-like motifs also occur N- and C-terminal to Forkhead domains. The N-terminal class consists of the Sloppy-paired genes of Drosophila and orthologous or closely related sequences: human FOXG1, and Drosophila CG9571; the C. elegans ortholog fkh-2 contains an EH1-like motif although a cysteine residue causes a low score. The C-terminal class consists of an apparent clade including the human FOXA, FOXB, FOXC and FOXD genes, although if the EH1 motif was present in the common ancestor of this clade, multiple losses must have later occurred. The situation is complicated somewhat by an EH1-like motif at the N-terminus of C. elegans unc-130, i. e., in the FOXD like family. The EH1 motif in slp1 has been shown to interact with groucho, and FOXA type genes have been shown to interact with human groucho orthologs (Copley, 2005).

Likely EH1 motifs co-occur with T-box domains in two distinct contexts. The motif occurs C-terminal to the T-box in the Drosophila Dorsocross proteins Doc1, Doc2 and Doc3. It is found N-terminal to the T-box in 11 proteins including mls-1 and mab-9 from C. elegans; H15, Mid/Nmr2 and Bi/Omb from Drosophila; in humans there are strong matches to TBX18, TBX20 and TBX22 and more marginal matches to TBX3 and TBX2. As far as is known, none of these proteins has been shown to interact with groucho or its orthologs, although several are known to act as transcriptional repressors: for instance, in murine heart development, Tbx20 represses Tbx2 which in turn represses Nmyc; the Dorsocross genes from Drosophila repress wingless and ladybird, and Doc itself is repressed by mid/nmr2. The human proteins TBX1 and TBX10, and Drosophila Org-1 (which are all closely related to those above) do not appear to contain EH1 motifs. The human T (brachyury) protein contains a motif broadly similar to the EH1 consensus: LQYRVDHLLSA in a comparable N-terminal location to those found in other T-box containing proteins. Although this motif scores poorly against EH1^hox, the homologous regions from other T orthologs provide a more persuasive case for the presence of a functioning EH1 motif in these proteins (Copley, 2005).

The highest scoring match of EH1^hox to a C2H2 zinc finger containing protein was ces-1 from C. elegans; this protein interacts with the groucho ortholog unc-37 and can act as a repressor. The putative EH1 motif is at the N-terminal end of ces-1. In contrast, the Drosophila proteins Bowl and Odd have EH1-like motifs at their C-terminal ends. In neither case is there direct evidence from high throughput studies of an interaction with Groucho, but both can function as repressors. The human protein ZNF312 (bit score 8.6) is the ortholog of zebrafish Fezl, which contains an EH1 motif essential for repressor activity -- this motif is conserved in the human paralog and likely Drosophila ortholog CG31670 (Copley, 2005).

The Doublesex Motif (DM) was first found in proteins controlling sexual differentiation in Drosophila. Two DM containing proteins were confidently predicted to contain EH1-like motifs -- human DMRT2, and Drosophila dmrt11e. These are likely orthologs; a C. elegans protein, C27C12.6 contained a weaker match. The molecular function of these proteins is unknown (Copley, 2005).

The EH1 motif is found N- and C-terminal to homeobox, forkhead, T-box and Zn finger protein domains. Clearly, since the locations of the EH1 motif are non-homologous, the N- and C-terminal associations must have occurred independently. The short size of the motif makes it tempting to speculate that the motif itself may have arisen independently (i.e. in repeated cases it may have evolved within sequence that was already part of the gene, rather than via a recombination event). The strongest evidence for this is that, in general, the majority of domain combinations occur in a fixed N to C orientation, suggesting that recombination events combining domains are relatively rare. The fact that there have been many such events suggests that the alternative hypothesis of independent invention is more appropriate (Copley, 2005).

Groucho is orthologous to the C. elegans unc-37 gene, and the four human paralogs TLE1-4 (Transducin Like Enhancer of split). An ortholog is also found in the cnidarian Hydra magnipapillata (e.g., the EST with gi 47137860), and certain cnidarian homeobox containing genes also contain an EH1-like motif, suggesting groucho/EH1 mediated repression pre-dates the split between diploblasts and triploblasts; indeed, a sponge Bar/Bsh like homeobox containing protein also contains an EH1-like motif, as does paxb from the non-bilaterian placozoan Trichoplax adhaerens and a Tlx-like protein from a ctenophore, suggesting the repression system was in place in the earliest animals. High scoring EH1-like motifs are found in Forkhead domain containing proteins from sponges, cnidarians and ctenophores, in both the C-terminal (FOXA-D clade) and N-terminal (FOXG, sloppy paired clade) varieties. The presumed ortholog of 'T' from Trichoplax adhaerens contains an EH1-like motif. These results suggest that groucho mediated repression using a variety of transcription factors was widespread in the last common ancestor of the metazoa. The EH1 motif is suggestive of a number of instances of convergent evolution, although in many cases the motif has been conserved throughout bilaterian orthologs. Together with the existence of a cnidarian Groucho ortholog, this leads to the conclusion that EH1/Groucho mediated repression was established prior to the evolution of bilateria (Copley, 2005).

date revised: 12 January 2025

The Interactive Fly resides on the
Society for Developmental Biology's Web server.