Gene name - engrailed
Cytological map position - 48A3-4
Function - transcription factor
Keywords - segment polarity
Symbol - en
Genetic map position - 2-62.0
Classification - homeodomain
Cellular location - nuclear
Most animals are constructed of segments. This is as true for worm and frog as it is for fly and human. Much of the epidermis of Drosophila develops as a chain of alternating anterior (A) and posterior (P) compartments, populations of cells that differ from each other because the selector gene engrailed (en) is active in the cells of P but not A. Studies of the wing have led to a general model of how compartments and selector genes build pattern. Early in development, the state of en expression is fixed in sets of cells ('on' in P and 'off' in A), and this state is inherited by all the descendants of each set. During growth, the borderlines between A and P compartments act as engines to produce positional information; the mechanism depends on a secreted molecule, Hedgehog (Hh), being made by all P cells. Hh crosses over the border to reach nearby A cells, which are primed to receive it. These A cells then respond to Hh by becoming a line source of a diffusing morphogen (such as Dpp), which forms a gradient with a peak near the compartment boundary. This gradient delivers information of position, polarity and dimension to both A and P compartments of the wing. Variations on this basic mechanism may be used to generate pattern in both insects and vertebrates (Lawrence, 1999, and references therein).
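How a gradient that peaks at the compartment boundary can carry positional information is easiest to see with numbers. The following is a minimal Python sketch, not a model taken from Lawrence (1999). It assumes an exponential profile, which is the steady state for a morphogen that diffuses away from a line source and is degraded uniformly. The function names, the peak value and the decay length are hypothetical illustration values.

```python
import math

def morphogen_level(x, peak=1.0, decay_length=10.0):
    """Concentration at signed distance x (in cell diameters) from the
    A/P boundary at x = 0. Exponential decay is an assumed profile."""
    return peak * math.exp(-abs(x) / decay_length)

def distance_from_boundary(level, peak=1.0, decay_length=10.0):
    """Invert the profile: a cell that reads `level` can infer how far
    it lies from the boundary (position)."""
    return -decay_length * math.log(level / peak)

# A cell 7 diameters from the boundary reads a level and recovers its distance.
reading = morphogen_level(7.0)
print(round(distance_from_boundary(reading), 6))

# The local slope points toward the boundary, giving a polarity cue.
print(morphogen_level(6.0) > morphogen_level(7.0))
```

In this sketch the concentration value stands for position and the direction of the local slope stands for polarity; the decay length sets the range over which the gradient is informative.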
Two questions are discussed below. First, how is en regulated, resulting in its expression in 14 parasegments? The immediate answer is found by looking at pair-rule genes.
Six pair-rule genes (even-skipped, fushi tarazu, sloppy paired, runt, paired and odd-paired), working in concert, ensure that there are 14 stripes of EN. The hallmark of pair-rule genes is their expression in seven stripes early in embryonic development. Pair-rule patterning, producing seven stripes, is a result of the action of gap genes, whose function is to define each of the seven stripes of primary pair-rule genes. Even earlier in the developmental history of the fly, the expression of maternal genes serves to structure the expression of gap genes. Moving from maternal to gap to pair-rule to segment polarity genes (en and wingless), one of the major functions of this hierarchy of gene expression is the regulation of en and wingless, resulting in fourteen evenly spaced stripes of expression on either side of each border, thus establishing the segmental compartmentalization of the fly. Even-skipped indirectly regulates engrailed. In odd parasegments, graded expression of eve establishes the en stripes by setting the boundaries of the activator paired and the repressors runt and sloppy paired (Fujioka, 1995). Expression of en in even parasegments results from activation by Fushi tarazu (with FTZ-F1 as cofactor) (Florence, 1997). Only the most anterior cells of each ftz stripe express en, and this restriction is dependent upon odd-skipped and naked (DiNardo and O'Farrell, 1987, and Mullen, 1995). For more information on the pair-rule gene regulation of en, see the Transcriptional regulation section below, or specific sites for each of the pair-rule genes.
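The regulatory inputs described above can be summarized as simple Boolean rules. The sketch below is a deliberate simplification in Python, assuming each regulator is either present or absent in a cell. The function names are hypothetical, treating odd-skipped as a direct repressor is an assumption made for illustration, and the contribution of naked is left out.

```python
def en_on_in_odd_parasegment(paired: bool, runt: bool, sloppy_paired: bool) -> bool:
    """Odd parasegments: en is on where the activator paired is present
    and neither repressor (runt, sloppy paired) is present."""
    return paired and not (runt or sloppy_paired)

def en_on_in_even_parasegment(ftz: bool, ftz_f1: bool, odd_skipped: bool) -> bool:
    """Even parasegments: en is activated by Ftz together with its cofactor
    FTZ-F1. Restriction to the anterior cells of the ftz stripe is modelled
    here, as an assumption, by odd-skipped blocking activation."""
    return ftz and ftz_f1 and not odd_skipped

# Anterior cell of a ftz stripe (no odd-skipped) versus a more posterior cell.
print(en_on_in_even_parasegment(ftz=True, ftz_f1=True, odd_skipped=False))
print(en_on_in_even_parasegment(ftz=True, ftz_f1=True, odd_skipped=True))
```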
Second, what exactly does en do to define each parasegment, that is, what is the function of en in each segment? en-expressing cells in the anterior compartment of each parasegment communicate with the adjacent compartment (expressing wingless) by means of the secreted protein Hedgehog. Engrailed acts cell autonomously to activate transcription of hedgehog. Hedgehog signals then effect the induction of wingless. Engrailed also acts positively to activate invected, Engrailed's partner in establishing cell identity. Engrailed acts in each anterior compartment to suppress proteins made in the posterior compartments, such as Wingless, Decapentaplegic, Patched, Deformed and Cubitus interruptus. EN also has some segment-specific effects, such as downregulating Ultrabithorax in parasegment six. For more information about the targets of en, see the Targets of activity section of this site or the sites for each of Engrailed's target genes.
The wing imaginal disc is subdivided into two nonintermingling sets of cells: the anterior (A) and posterior (P) compartments. Anterior cells require reception of the Hedgehog (Hh) signal to segregate from P cells. Evidence is provided that Hh signaling controls A/P cell segregation not by directly modifying structural components but by a Cubitus interruptus (Ci)-mediated transcriptional response. A shift in the balance between repressor and activator forms of Ci toward the activator form is necessary and sufficient to define 'A-type' cell sorting behavior. Moreover, Engrailed (En), in the absence of Ci, is sufficient to specify 'P-type' sorting. It is proposed that the opposing transcriptional activities of Ci and En control cell segregation at the A/P boundary by regulating a single cell adhesion molecule (Dahmann, 2000).
To test the role of En and Hh-signaling components in controlling cell segregation, two experimental assays were applied. Both assays are based on the presumption that cells maximize contact (intermingle) with cells of the same adhesiveness and minimize contact with (sort out from) cells of different adhesiveness. In the 'round-up assay', clones of mutant cells are assayed for their shape. Each clone is analyzed by how circular it is and how smoothly its border interfaces with surrounding tissue. The degree of roundness of the clone and smoothness of its border is taken as a measure for the difference in adhesiveness between cells inside and outside of the clone. In the wild-type wing imaginal disc, cell segregation is confined to the region of the compartment boundaries. Thus, in the more stringent 'choice assay,' clones generated in the vicinity of the A/P boundary are monitored for their sorting behavior. Clones have three choices: they can (1) remain within their compartment of origin; (2) sort completely into the territory of the adjacent compartment defining a straight border with cells of the compartment of origin at the normal position of the A/P boundary, or (3) sort out from cells of both compartments and take up positions overlapping the normal site of the A/P boundary. Depending on the genetic intervention, the compartment of origin of a clone was determined either by the state of the heritable and P-specific expression of an en-lacZ reporter gene or by the position of the 'twin spot' clone, which is composed of sibling wild-type cells. The position of the A/P boundary was inferred from the expression of a hh-lacZ reporter gene expressed exclusively in P cells (Dahmann, 2000).
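The round-up assay scores how circular a clone is. The source does not say how roundness was quantified, so the following Python sketch uses one common shape measure as an assumption: the isoperimetric quotient 4πA/P², which equals 1 for a perfect circle and falls toward 0 for elongated or ragged outlines. The function names and the example outlines are hypothetical.

```python
import math

def polygon_area_perimeter(vertices):
    """Shoelace area and perimeter of a closed polygon given as (x, y) pairs."""
    twice_area = 0.0
    perimeter = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]  # wrap around to close the outline
        twice_area += x0 * y1 - x1 * y0
        perimeter += math.hypot(x1 - x0, y1 - y0)
    return abs(twice_area) / 2.0, perimeter

def circularity(vertices):
    """Isoperimetric quotient 4*pi*A / P**2 of a clone outline."""
    area, perimeter = polygon_area_perimeter(vertices)
    return 4.0 * math.pi * area / perimeter ** 2

# A compact outline scores higher than an elongated one of similar scale.
compact_clone = [(0, 0), (1, 0), (1, 1), (0, 1)]
elongated_clone = [(0, 0), (10, 0), (10, 1), (0, 1)]
print(circularity(compact_clone) > circularity(elongated_clone))
```

Under this assumed measure, a clone whose cells differ strongly in adhesiveness from their neighbors would be expected to score closer to 1 than a clone that intermingles freely.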
Two forms of Ci are distinguished, a constitutively active form, Ci[act], and a repressive form, Ci[rep]. Autonomous and direct roles have been established for Ci[act] and En in specifying A and P cell segregation, respectively. Evidence is also provided that Hh signaling is sufficient to specify A-type cell segregation and that it acts by shifting the balance between Ci[rep] and Ci[act] toward low levels of Ci[rep] and high levels of Ci[act]. It is proposed that the opposing transcriptional activities of Ci[act] and Ci[rep]/En lead to differences in the activity of a cell adhesion system at the boundary of A and P cells, thereby preventing these cell populations from intermingling (Dahmann, 2000).
The smooth and straight boundary between compartments has been ascribed to distinct adhesive properties of cells on opposite sides of the boundary causing these cell populations to minimize contact and sort out. In the case of the A/P boundary of the wing, one difference that could account for the distinct sorting behavior is the exclusive presence of two transcription factors, Ci[act] and En in adjacent A and P cells, respectively. For a long time, the view prevailed that En regulates cell segregation by autonomously and directly specifying P, as opposed to A, cell adhesiveness. This hypothesis has recently been challenged by studies indicating that En acts, at least in part, by directing the expression of Hh and that Hh secreted by P cells induces A cells to acquire a distinct cell adhesiveness. These studies, however, provide conflicting results as to whether or not En also has an autonomous, Hh-independent role in specifying cell segregation at the A/P boundary. The same studies further raised, but did not address, the question of whether Hh signaling would specify cell segregation via its normal transduction pathway by leading to a transcriptional output depending on Ci. In various other systems, the activation of signaling receptors can lead to the posttranscriptional activation of small GTPases that can directly, without altering gene transcription, affect cytoskeletal components and thus conceivably cell adhesion. A key tool for addressing these questions is the choice assay. This assay allows for monitoring whether altering the activity of a gene would change a cell's compartmental preference. Using this assay, the above questions have been addressed by systematically considering three distinct situations (Dahmann, 2000 and references therein).
Situation 1: the 'ground state,' where neither Ci nor En is present.
Irrespective of their compartmental origin, clones of cells null mutant for both ci and en take up positions overlapping the normal site of the A/P boundary with smooth borders to wild-type A and P cells. Because En is not required in A cells and because ci minus single mutant A cells behave like ci,en minus double mutant A cells, it is inferred that Ci is required in A cells for their intermingling with other A cells at the compartment boundary. Since Ci acts in these cells as a transcriptional activator, it is concluded that Hh signaling leads to a Ci-dependent transcriptional response in A cells and transcription of the immediate Hh target gene relevant for A segregation is induced, rather than repressed, in anterior boundary cells. The behavior of ci,en minus double mutant clones also clarifies the role of En. Because clones of P cells lacking En and Ci form smooth borders with neighboring wild-type P cells that also lack Ci and, if in contact with A cells, sort partially into A territory, it is inferred that En has a function in specifying P segregation that is independent of Ci. Since Ci is required for all known responses to Hh signaling, it is concluded that En has a Hh-independent role in determining P segregation. The observation that clones of cells mutant for both ci and en occupy A and P territory to a similar extent leads to the conclusion that Ci and En are required for most if not all aspects of the distinct segregation properties of A and P cells, and the difference between the ground state and the 'A state' brought about by Ci[act] is similar to the difference between the ground state and the 'P state' dependent on En (Dahmann, 2000).
Situation 2: Cells expressing En but lacking Ci.
A more direct argument for a Ci/Hh-independent role of En in the specification of cell sorting behavior can be derived from the experiment in which anterior clones were programmed to express low levels of En. Such cells cease to express Ci and take up positions normally occupied only by P cells. The behavior of these cells is different from that of ground state cells that express neither Ci nor En. In contrast to ci,en minus cells, cells of A origin expressing low levels of En show a complete transgression to P territory, yet they do not intermingle well with P cells. This latter observation is ascribed to the unnaturally low levels of En produced in these cells (several-fold less than in wild-type P cells). These levels may not repress ci completely and might not be sufficient to fully confer P cell adhesiveness (Dahmann, 2000).
Situation 3: Cells expressing Ci but lacking En.
Posterior clones of cells expressing Ci at physiological levels, but lacking En (mutant for enE), take up positions in the territory normally only occupied by A cells and intermingle with A cells. This behavior is dependent on Ci, since ci,en double mutant clones of P origin only partially occupy A territory and sort out from A cells. Furthermore, overexpression of Ci in P cells leads these cells to sort out from neighboring P cells, and, if in contact with A cells, sort into A territory. Together, by comparing situations (1) to (3), it is concluded that Ci is necessary and sufficient to specify A segregation, and, in the absence of Ci, En is necessary and sufficient to specify P segregation (Dahmann, 2000).
Thus En has an autonomous, Hh-independent role in specifying cell segregation. In addition, Ci is necessary and sufficient to specify A segregation. Ci is activated in anterior boundary cells by Hh whose P-specific expression is in turn controlled by En. Thus, En controls cell segregation at the A/P boundary both by a Hh-dependent as well as a Hh-independent pathway. To determine the relative contributions of these two pathways, situations were generated and analyzed in which En activity was altered under conditions of constant Hh signaling, or conversely, situations in which the activity of Hh signal transduction was altered under constant En conditions. From these experiments, it is concluded that for the segregation behavior of wing cells, the state of the Hh pathway prevails over that of En activity. This conclusion is particularly well corroborated by the finding that cells in which both pathways are simultaneously 'on' (P cells expressing Ci), sort with A cells. The behavior of such cells may also explain why the late expression of en in anterior boundary cells has no deleterious effects on the integrity of the compartment boundary. Like the experimental cells, these cells are exposed to the Hh signal, coexpress ci and en, yet associate with other A cells rather than with En-expressing P cells (Dahmann, 2000).
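The outcomes of the three situations, together with the finding that the state of the Hh pathway prevails over En activity, can be condensed into a small decision rule. The Python sketch below is only a summary of the qualitative results reported by Dahmann (2000). The function name, the Boolean encoding and the output labels are hypothetical, and graded effects (such as the incomplete intermingling of cells expressing low levels of En) are ignored.

```python
def predicted_sorting(ci_act: bool, en: bool) -> str:
    """Return the territory a clone is predicted to sort into.

    ci_act: the activator form of Ci is present (Hh pathway 'on').
    en:     Engrailed is present.
    """
    if ci_act:
        # Hh pathway state prevails over En: P cells expressing Ci sort with A.
        return "A"
    if en:
        # En without Ci specifies P-type sorting.
        return "P"
    # Ground state: neither Ci nor En; clones straddle the A/P boundary.
    return "boundary"

for ci_act in (False, True):
    for en in (False, True):
        print(f"Ci[act]={ci_act}, En={en} -> {predicted_sorting(ci_act, en)}")
```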
Ci is required in A cells for proper cell segregation at the A/P boundary. Depending on the status of the Hh signaling pathway, Ci can exist in two forms with opposing transcriptional activities (Ci[rep] and Ci[act]). These two forms of Ci regulate the expression of different subsets of Hh target genes, some of which appear to be regulated exclusively by Ci[rep] or Ci[act]. It is argued that the A/P sorting of wing cells is under control of both forms of Ci. This conclusion is based on findings that both Ci[rep] and Ci[act] have a profound influence on the segregation behavior of A cells. Two observations show that Ci[rep] determines a preference for sorting into P territory. (1) A cells expressing Ci[rep] in the absence of Ci[act] or A cells overexpressing Ci[rep] in the presence of Ci[act] both take up positions occupied normally only by P cells. This is in contrast to cells lacking Ci entirely, which take up positions overlapping the normal position of the A/P boundary. (2) P cells lacking En but expressing Ci[rep] are confined to the P compartment, unlike cells that lack En and Ci or cells that only lack En. It is inferred from this that one important function of Hh signaling in its role of specifying A-type segregation properties is to prevent the formation of Ci[rep] in cells close to the A/P boundary (Dahmann, 2000).
The conclusion that not only prevention of Ci[rep] formation but also the induction of Ci[act] plays an important role in A/P sorting is deduced from the observation that cells lacking both forms of Ci do not mingle with wild-type A cells expressing Ci[act] due to their vicinity to the Hh source. Moreover, the addition of Ci to P cells, where Ci is readily converted to Ci[act], programs P cells to segregate with A cells. Because Ci[rep] influences cell segregation, one might have expected that anterior ci minus clones far away from the A/P boundary would sort out from neighboring Ci[rep]-expressing cells. However, ci minus cells intermingle well with neighboring A cells. One likely explanation for this apparent discrepancy is the partial derepression of hh transcription in ci mutant cells. These low Hh levels induce in neighboring cells the formation of some Ci[act] that might neutralize remnant levels of Ci[rep]. In support of this assumption, it has been found that clones of cells double mutant for ci and hh do sort out at anterior positions (Dahmann, 2000).
Ci and En are both DNA-binding proteins known to act as transcription factors, indicating that they control cell segregation by regulating the expression of target genes. By analogy to dpp, a Hh target gene that is also controlled by En and both forms of Ci, a model is proposed illustrating how Ci[rep], Ci[act], and En might shape the expression profile of a putative immediate target gene involved in cell segregation. Since in the absence of Ci and En, cells segregate neither with A nor with P cells, they are likely expressing an intermediate level of this gene that is different from those in A or P cells. Since Ci[rep] can control cell segregation and is present in A cells far away from the boundary, it is proposed that the basal expression of this hypothetical gene is downregulated by Ci[rep] in these cells. In A cells close to the boundary, Hh signaling prevents the formation of Ci[rep] yet causes the formation of Ci[act], from which it is inferred that in these cells the transcription of this target gene is upregulated. In P cells, En may repress this target gene, consistent with its role as a transcriptional repressor. It is proposed that the opposing transcriptional activities of Ci[act] and En lead to a large difference in the expression of this immediate target gene in cells on opposite sides of the A/P boundary (Dahmann, 2000).
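The proposed model can be made concrete with a toy calculation. The Python sketch below assumes a basal expression level for the hypothetical target gene and multiplies it by a fold change for each regulator present. All numbers are arbitrary illustration values and the function name is hypothetical. Only the ordering of the resulting levels, not their magnitude, reflects the model described above.

```python
BASAL_LEVEL = 1.0   # ground state, neither Ci nor En (arbitrary units)
CI_ACT_FOLD = 8.0   # hypothetical upregulation by Ci[act]
CI_REP_FOLD = 0.25  # hypothetical downregulation by Ci[rep]
EN_FOLD = 0.25      # hypothetical repression by En

def target_gene_level(ci_act=False, ci_rep=False, en=False):
    """Expression of the putative segregation target gene in one cell."""
    level = BASAL_LEVEL
    if ci_act:
        level *= CI_ACT_FOLD
    if ci_rep:
        level *= CI_REP_FOLD
    if en:
        level *= EN_FOLD
    return level

cell_states = {
    "A cell far from boundary (Ci[rep])": dict(ci_rep=True),
    "A cell at boundary (Ci[act])": dict(ci_act=True),
    "P cell (En)": dict(en=True),
    "ci,en double mutant (ground state)": dict(),
}
for name, state in cell_states.items():
    print(f"{name}: {target_gene_level(**state):.2f}")
```

With these assumed values the largest difference in expression falls between anterior boundary cells and P cells, which is where the model places the sharp contrast in cell affinity.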
In the above model, it is assumed that Ci and En control cell segregation by transcriptionally regulating one and the same gene, although it is also possible that they regulate different genes. While at present these alternatives cannot be distinguished, the simpler model that Ci and En control the same target gene is preferred for two reasons: (1) there is a precedent case for such a gene, dpp, which is known to be regulated by both Ci and En; (2) a difference in the expression level of a single cell adhesion molecule (Shotgun or DE-cadherin) is sufficient for two cell populations to sort out. While it is conceivable that Ci and En directly regulate the expression of cell adhesion molecules like DE-cadherin, it is also possible that they act more indirectly by regulating genes whose products influence the activity of uniformly expressed cell adhesion molecules. Clones of cells lacking detectable amounts of DE-cadherin do sort out from neighboring wing disc cells; they are, however, exclusively confined to the compartment of origin, indicating that DE-cadherin is not required for the separation of cells at the A/P boundary (Dahmann, 2000).
Why does cell segregation at the A/P boundary require two transcription factors with opposing activities? Based on the results presented here, the differential activities of either Ci or En suffices for separating A and P cells. For Ci, this is best illustrated by the key finding that P cells forced to express Ci sort out from wild-type P cells and segregate into A territory. Conversely, in the absence of Ci, expression of En suffices for A cells to sort into P territory. The use of two transcription factors with opposing activities may have the advantage of increasing the fidelity of the sorting process by further contrasting the expression levels of a common putative target gene in cells of opposite sides of the A/P boundary (Dahmann, 2000).
It seems to be a general mechanism that En controls cell segregation both in a Hh-dependent and a Hh-independent manner. In the Drosophila abdomen, En has also been implicated in controlling the separation of A and P cells in Hh-dependent and Hh-independent ways. The relative contributions of these two functions of En, however, appear to differ between the wing imaginal discs and the abdomen. While a prevalence of the Hh-dependent pathway is found in the wing disc, the two functions of En seem to contribute equally to the separation of abdominal A and P cells. This difference in dominance of the Hh-signal transduction pathway might be due to a more influential role of Ci[rep] in the sorting of imaginal versus abdominal cells. It is intriguing to notice that the same intricate network that defines the strip of cells expressing Dpp also appears to restrict the activity of a putative cell adhesion molecule to the very same cells. The use of Hh/En signaling for both setting up the Dpp organizer and segregating A and P cells may ensure that the position and shape of the morphogen source that organizes both compartments is stably maintained during development. The prediction of a dpp-like expression pattern provides a novel criterion for the future identification of the elusive molecules conferring cell segregation (Dahmann, 2000).
In Drosophila, the ventral nerve cord (VNC) architecture is built from neuroblasts that are specified during embryonic development, mainly by transcription factors. Engrailed, a homeodomain transcription factor known to be involved in the establishment of neuroblast identity, is also directly implicated in the regulation of axonal guidance cues. Posterior commissures (PC) are missing in engrailed mutant embryos, and axonal pathfinding defects are observed when Engrailed is ectopically expressed at early stages, prior to neuronal specification. frazzled, enabled, and trio, all of which are potential direct targets of Engrailed and are involved in axonal navigation, interact genetically with engrailed to form posterior commissures in the developing VNC. The regulation of frazzled expression in engrailed-expressing neuroblasts contributes significantly to the formation of the posterior commissures by acting on axon growth. A small genomic fragment within intron 1 of frazzled can mediate activation by Engrailed in vivo when fused to a GFP reporter. These results indicate that Engrailed's function during the segregation of the neuroblasts is crucial for regulating different actors that are later involved in axon guidance (Joly, 2007).
During embryogenesis, Engrailed is first expressed in posterior epidermal cells within each segment, and then later in NBs, GMCs, and neurons. Present at all developmental stages in a subpopulation of neural cells, Engrailed is a good candidate for a factor participating in neuronal determination. Several Engrailed target genes involved in neurogenesis, and in particular in axonal guidance, have been identified, including eg, con, comm, fra, ena, and trio. This suggested an important role for engrailed in this process (Joly, 2007).
Interestingly, Trio and Ena were recently found to function as effectors of Fra signalling and to act together in the formation of commissural axons. In particular, they were shown to physically interact, suggesting a potential mechanism by which Fra might coordinate the actin cytoskeletal dynamics necessary for axonal cone growth. This study shows that en genetically interacts not only with fra, but also with ena and trio, to form the posterior commissures. En thus appears to directly regulate PC formation by acting at different levels to ensure axon growth through a complex signalling network that involves Fra (Joly, 2007).
Transheterozygous embryos with alterations in both en and in any of several potential targets present axonal defects that are very similar to those observed in homozygous en mutant embryos. Overexpressing Fra using the prd-Gal4 driver cannot rescue the axonal defects of homozygous en mutant embryos. This confirms that En plays an important role in axonal guidance by regulating various target genes, including ena, trio, commissureless (comm), and transcription factors such as eg, that have been identified as potential En targets. While En is often identified as a repressor, there is no evidence for a role for En in the repression of genes that instruct neurons to choose the AC, such as Wnt5/Drl components. This study demonstrates instead that En regulates axonal guidance and growth by activating components necessary for the establishment of neuronal posterior connectives (Joly, 2007).
Several lines of evidence are provided that fra expression is directly controlled by Engrailed. For example, genomic fragment 2C5 was found to bind En in vivo, first during embryogenesis (as assayed by ChIP) and later in larvae (as assayed by immuno-FISH). In addition, this genomic fragment is shown in this study to be able to mediate activation by En in transgenic flies. However, even though it is known to bind En in embryos, 2C5 is not able to drive GFP expression during embryogenesis, suggesting that it recapitulates only a fraction of the frazzled regulatory sequences (Joly, 2007).
Genetic data is provided arguing that fra is regulated by En during embryogenesis. en and fra interact genetically to ensure the formation of a correct scaffold within the VNC. In homozygous en mutant embryos, fra expression is affected by early stage 11, and Fra immunostaining is absent in the PCs at stage 14, correlating with a loss of posterior commissures (Joly, 2007).
This study shows that PC formation requires an early function of En that acts prior to the specification of neuronal cell fate and to axon growth. Indeed, only the ectopic expression of En at early stages leads to axonal misrouting, whereas the use of pan-neuronal drivers does not cause any axonal defects. Once neurons are specified, En is no longer able to change their fate and hence affect their axonal navigation. This confirms a role for En during NB segregation, and suggests that the neuronal expression of Engrailed is not essential for the formation of the VNC (Joly, 2007).
During NB segregation, Engrailed may participate in the specification of pioneer neurons. Indeed, it was observed that not all the axons that form PCs come from en-expressing cells. Moreover, in homozygous mutant en embryos, it was found that the pioneer marker BP102 was affected in PCs. This suggests that a cluster of En-positive neurons corresponds to the pioneers, which are normally required for normal pathfinding by later outgrowing neurons. This could explain the absence of PCs in engrailed homozygous mutant embryos. Interestingly, the use of a late eve-Gal4 driver to ectopically express En in aCC/RP2 pioneer neurons had no effect on axonal pathfinding. This confirms that the En-sensitive period occurs before the specification of the neurons, including the pioneers (Joly, 2007).
This study shows that the En/fra interaction is important for the formation of the PCs, since PCs are not formed in transheterozygous mutant en−/fra− embryos. This absence of PCs might result from a loss of axonal growth, which is known to involve Fra. This might also account for the PC defects that are observed in homozygous en− mutant embryos (Joly, 2007).
Since the function of En in establishing the axon scaffold within the VNC is essential during NB segregation, it is suspected that the regulation of En target genes involved in axonal pathfinding might also occur at early stages. Indeed, it was possible to confirm that the axonal defects detected in en−/fra− transheterozygous embryos required the loss of early fra activation during NB segregation. This was shown by RNA in situ hybridization and by rescue experiments: PC axons of stage 15 enX31/fra1 embryos only develop normally when Fra expression is recovered before the specification of the neurons, but do not form properly once neurons are formed. Therefore, one possible hypothesis is that the activation of fra in NBs allows the axonal growth of the PC pioneers (Joly, 2007).
The data suggest that the fra level in NBs and neurons is crucial for axon growth. Because mutations affecting axon growth must be dominant over axonal guidance problems, it is logical that the VNCs of both en−/en− and en−/fra− present the same missing PC phenotype. Indeed, with fra being a direct target of En, it can be assumed that in the absence of the En activator, fra expression will be lower or lost. Indeed, it was noticed that en−/fra− embryos phenocopy fra−/fra− embryos: in both cases PCs are missing, and neurons express En but show defects in their positioning. Therefore, these changes in neuronal cell fate can be attributed to a change in fra expression. One open question concerns the sensitive period of Fra in this process: frazzled is activated by Engrailed during the segregation of the NBs, but Fra protein is only detectable in neurons. One possible explanation is that Fra protein is present at early stages, but is under the threshold of detection. Another explanation can also be drawn from previous work in vertebrates, where it has been shown that growth cones possess the machinery necessary for protein translation and can translate guidance molecules locally. The resulting rapid changes in protein levels were shown to be involved in axon guidance. Therefore, one hypothesis is that the fra RNA pool in NBs is rapidly translated in growth cones in order to cause changes in the cytoskeleton necessary for axon growth and their further guidance (Joly, 2007).
These results give new insights into En function during neurogenesis and show that En can alter the VNC architecture at different levels to form PCs, playing on axonal pathfinding and axon growth. Indeed, en mutant embryos present PCs that are not properly positioned or not even formed in most segments. Further, ectopic expression of En leads to abnormal axonal pathfinding. Both loss and gain of function of en could be associated with changes in the identity of the NBs (data not shown), confirming a role for En in this process (Joly, 2007).
En functions during neurogenesis through the regulation of different target genes. One way is through the regulation of transcription factors such as eagle, but En also regulates the expression of fra, trio and ena, which are more directly involved in axon growth and which participate with En in the formation of the PCs. Indeed, monitoring eg-expressing neurons in an en−/fra− genetic background showed that axons projecting through PCs do not grow properly, confirming that en and fra are involved in this process (Joly, 2007).
Together, these results illustrate how En can act during NB segregation to build a wild-type VNC. Recent results in vertebrates suggest that the regulatory pathway that this study has identified between En and fra (EN1 and DCC in vertebrates) may be evolutionarily conserved. Elucidating the molecular events that allow En/Fra-positive neurons to specifically project axons through PCs but not ACs will be the next challenge to explore in order to better understand axonal guidance (Joly, 2007).
cDNA clone length - 2411
Bases in 5' UTR - 177
Exons - three
Bases in 3' UTR - 406
Engrailed has a divergent homeodomain. The cross homology of Engrailed with Ultrabithorax and Antennapedia is 58% and 53% respectively (Fjose, 1985). Engrailed has an alanine rich region homologous to Tup1 that can function in gene repression.
The Engrailed homeoprotein is a dominantly acting, so-called 'active' transcriptional repressor, both in cultured cells and in vivo. When retargeted via a homeodomain swap to the endogenous fushi tarazu gene (ftz), Engrailed actively represses ftz, resulting in a ftz mutant phenocopy. Functional regions of Engrailed have been mapped using this in vivo repression assay. In addition to a region containing an active repression domain identified in cell culture assays, there are two evolutionarily conserved regions that contribute to activity. The one that does not flank the HD is particularly crucial to repression activity in vivo. This domain is present not only in all engrailed-class homeoproteins but also in all known members of several other classes, including goosecoid, Nk1, Nk2 (vnd) and muscle segment homeobox. The repressive domain is located in the eh1 region, known as 'region three', found several hundred amino acids N-terminal to the homeodomain. The consensus sequence, arrived at by comparing Engrailed, Msh, Gsc, Nk1 and NK2 proteins from a variety of species, consists of a 23 amino acid homologous motif found in all these proteins. Thus Engrailed's active repression function in vivo is dependent on a highly conserved interaction that was established early in the evolution of the homeobox gene superfamily. Using rescue transgenes it has been shown that the widely conserved in vivo repression domain is required for the normal function of Engrailed in the embryo (Smith, 1996).
The 2.2 Å resolution structure of the Drosophila Engrailed homeodomain bound to its optimal DNA site is reported. The original 2.8 Å resolution structure of this complex provided the first detailed three-dimensional view of how homeodomains recognize DNA, and has served as the basis for biochemical studies, structural studies and molecular modeling. The refined structure confirms the principal conclusions of the original structure, but provides important new details about the recognition interface. Biochemical and NMR studies of other homeodomains have led to the notion that Gln50 is an especially important determinant of specificity. However, the refined structure shows that this side-chain makes no direct hydrogen bonds to the DNA. The structure does reveal an extensive network of ordered water molecules that mediate contacts to several bases and phosphates (including contacts from Gln50), and the model provides a basis for detailed comparison with the structure of an Engrailed Q50K altered-specificity variant. Comparing the refined structure with the crystal structure of the free protein confirms that the N and C termini of the homeodomain become ordered upon DNA binding. However, several key DNA contact residues in the recognition helix are found to have the same conformation in the free and bound protein, and several water molecules also are preorganized to contact the DNA. The refined structure helps provide a more complete basis for the detailed analysis of homeodomain-DNA interactions (Fraenkel, 1998).
A novel Drosophila paired-like homeobox gene, DPHD-1, has been isolated. The homeodomain of DPHD-1 shows 85% amino-acid identity with that of the C. elegans Unc-4 protein. Whole-mount in situ hybridization of embryos and third-instar larvae reveals that the DPHD-1 mRNA is specifically localized in subsets of postmitotic neurons in the central nervous system (CNS) and in the developing epidermis, where it displays a segmentally repeated pattern. Double staining with a posterior compartment marker, an anti-Engrailed antibody, has shown that DPHD-1 expressing neurons in the CNS are present in the posterior compartment, whereas DPHD-1 expression in the epidermis is restricted to the anterior compartment in each segment. This temporal and spatial expression pattern suggests that DPHD-1 may play a role in determining the distinct cell types in each segment (Tabuchi, 1998).
The homeodomain (HD) is a ubiquitous protein fold that confers DNA binding function on a superfamily of eukaryotic gene regulatory proteins. Here, the DNA binding of recognition helix variants of the HD from the engrailed gene of Drosophila was investigated by phage display. Nineteen different combinations of pairwise mutations at positions 50 and 54 were screened against a panel of four DNA sequences consisting of the Engrailed consensus, a non-specific DNA control based on the lambda repressor operator OR1 and two model sequence targets containing imperfect versions of the 5'-TAAT-3' consensus. The resulting mutant proteins can be divided into four groups that vary with respect to their affinity for DNA and specificity for the Engrailed consensus. The altered specificity phenotypes of several mutant proteins were confirmed by DNA mobility shift analysis. Lys50/Ala54 is the only mutant protein that exhibits preferential binding to a sequence other than the Engrailed consensus. Arginine is a functional replacement for Ala54. The functional combinations at 50 and 54 identified by these experiments recapitulate the distribution of naturally occurring HD sequences and illustrate how the Engrailed HD can be used as a framework to explore covariation among DNA binding residues (Connolly, 1999).
The Engrailed Homology 1 (EH1) motif is a small region, believed to have evolved convergently in homeobox- and forkhead-containing proteins, that interacts with the Drosophila protein Groucho (C. elegans unc-37, human Transducin-like Enhancers of Split). The small size of the motif makes its reliable identification by computational means difficult. The predicted proteomes of Drosophila, C. elegans and human have been systematically searched for further instances of the motif. Motif identification methods and database searching techniques were used to examine which homeobox and forkhead domain containing proteins also have likely EH1 motifs. Despite low database search scores, there is a significant association of the motif with transcription factor function. Likely EH1 motifs are found in combination with T-Box, Zinc Finger and Doublesex domains, and other plausible candidate associations are discussed. Strong candidate EH1 motifs have been identified in basal metazoan phyla. Candidate EH1 motifs exist in combination with a variety of transcription factor domains, suggesting that these proteins have repressor functions. The distribution of the EH1 motif is suggestive of convergent evolution, although in many cases the motif has been conserved throughout bilaterian orthologs. Groucho mediated repression was established prior to the evolution of bilateria (Copley, 2005).
Sequence motifs were sought in homeobox containing transcription factors taken from the proteins of human, Drosophila and C. elegans, by first masking known Pfam domains, and then using the expectation maximization algorithm implemented in the MEME program. The first non-subfamily specific motif identified corresponded to previously known examples, and new instances, of the EH1 motif, in 100 sites, with an E-value of < 10^-126. The same approach was applied to Forkhead containing transcription factors, identifying 25 sites with a combined E-value of < 10^-31. These motifs also appeared to conform to the consensus of the EH1 motif (Copley, 2005).
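The masking step can be pictured with a minimal sketch, assuming domain coordinates are already available as 1-based, inclusive (start, end) pairs. The function name, the choice of 'X' as the mask character and the toy protein are illustrative assumptions, not details taken from Copley (2005); the point is only that residues inside known domains are hidden so that motif discovery is not dominated by the domains themselves.

```python
def mask_regions(sequence: str,
                 regions: list[tuple[int, int]],
                 mask_char: str = "X") -> str:
    """Replace residues inside each 1-based, inclusive (start, end) region."""
    residues = list(sequence)
    for start, end in regions:
        if start < 1 or end > len(sequence) or start > end:
            raise ValueError(f"invalid region ({start}, {end})")
        # Convert to 0-based, half-open indexing for the slice of residues.
        for i in range(start - 1, end):
            residues[i] = mask_char
    return "".join(residues)


# Toy protein and domain coordinates (hypothetical, for illustration only).
toy_protein = "MKTLLAGGSPQRRWWNDEFH"
toy_domains = [(11, 16)]
print(mask_regions(toy_protein, toy_domains))
```

The masked sequences keep their original length, so any motif position reported afterwards still maps directly onto the unmasked protein.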
To further investigate the significance of this similarity, hidden Markov models (HMM) were constructed of the motif (EH1hox & EH1fh) which were then searched against the complete set of predicted proteins from human, D. melanogaster and C. elegans. The highest scoring non-homeobox containing match of EH1hox was a Forkhead protein (human FOXL1), and the second highest scoring non-Forkhead containing match of EH1fh was to a homeobox containing protein (Drosophila Invected). In both cases, nearly all the high scoring hits were to proteins containing domains with transcription factor function. Among the best scoring matches of the EH1hox searches were several T-box (TBOX), Doublesex Motif (DM), Zinc finger (ZnF_C2H2) and ETS containing proteins (Copley, 2005).
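To give a feel for how a short motif model is searched against a protein, the sketch below uses a position weight matrix with log-odds scores, which is a deliberately simpler stand-in for the profile hidden Markov models used in the study (it has no insert or delete states). The motif instances, the target sequence, the uniform background and the pseudocount are all assumptions chosen for illustration; none of them are the EH1hox or EH1fh models.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_log_odds(instances, pseudocount=1.0):
    """Per-position log2-odds scores against a uniform background."""
    width = len(instances[0])
    if any(len(s) != width for s in instances):
        raise ValueError("all motif instances must have the same length")
    background = 1.0 / len(AMINO_ACIDS)
    total = len(instances) + pseudocount * len(AMINO_ACIDS)
    matrix = []
    for pos in range(width):
        counts = Counter(s[pos] for s in instances)
        matrix.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return matrix


def best_window(matrix, sequence):
    """Return (score, 0-based start) of the highest-scoring window."""
    width = len(matrix)
    if len(sequence) < width:
        raise ValueError("sequence is shorter than the motif")
    best = None
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        # Residues outside the 20 standard amino acids score zero.
        score = sum(col.get(aa, 0.0) for col, aa in zip(matrix, window))
        if best is None or score > best[0]:
            best = (score, start)
    return best


# Toy motif instances and target (hypothetical, for illustration only).
toy_instances = ["FSIENIL", "FSIDNIL", "FTIDSIL", "FSVENLL"]
toy_target = "MKTAAGGFSIDNILPPQRRS"
score, start = best_window(build_log_odds(toy_instances), toy_target)
print(start, toy_target[start:start + 7])
```

With a motif this short, many unrelated windows can reach moderate scores by chance, which is one way to see why the text describes low database search scores and relies on supporting evidence such as association with transcription factor domains.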
The presence of EH1 motifs within various homeobox, and to a lesser extent, forkhead-containing proteins has been widely reported, although not systematically studied. EH1-like motifs co-occurring with 3 major groupings of homeobox sub-types were found: the extended-hox class, typified by Drosophila Engrailed; the paired class, including Drosophila Goosecoid; and the NK class, including Drosophila Tinman. Related to the paired class homeobox domains, a number of genes containing PAIRED domains only were also found to contain EH1-like motifs. With only a few exceptions, the EH1-like motif occurs N-terminal to the homeobox domain and C-terminal to the PAIRED domain when present. A number of these proteins have been shown to interact with Groucho or its orthologs, e.g., C. elegans cog-1, Drosophila Engrailed and Goosecoid, and in high throughput assays Drosophila Invected and Ladybird late (Copley, 2005).
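The N-terminal versus C-terminal distinction used here can be stated precisely with residue coordinates. The following is a minimal sketch, assuming each feature is given as a 1-based, inclusive (start, end) pair; the function name and the toy coordinates are hypothetical and do not describe any real protein.

```python
def motif_orientation(motif: tuple[int, int], domain: tuple[int, int]) -> str:
    """Classify a motif's position relative to a domain on the same protein."""
    m_start, m_end = motif
    d_start, d_end = domain
    if m_end < d_start:
        return "N-terminal"   # motif ends before the domain begins
    if m_start > d_end:
        return "C-terminal"   # motif begins after the domain ends
    return "overlapping"


# Toy coordinates (hypothetical, for illustration only).
toy_motif = (40, 50)
toy_paired_domain = (5, 30)
toy_homeobox = (200, 260)
print(motif_orientation(toy_motif, toy_paired_domain))  # relative to PAIRED
print(motif_orientation(toy_motif, toy_homeobox))       # relative to homeobox
```

A single motif can therefore be C-terminal to one domain and N-terminal to another, which is the arrangement described for proteins carrying both a PAIRED domain and a homeobox.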
A handful of EH1-like motifs are found C-terminal to homeobox domains. Of these, the best characterized is C. elegans unc-4, which has been shown to interact with the groucho ortholog unc-37; the Drosophila ortholog unc-4 also interacts with groucho in high throughput experiments. The C-terminal EH1-like motif is conserved in the closely related Drosophila paralog OdsH. The gene prediction for the human ortholog of unc-4 appears to be artefactually truncated, but the mouse ortholog (Uncx4.1) and corrected human gene models contain EH1-like motifs both N- and C-terminal to the homeobox domain. Taken together with the fact that in the majority of related homeobox containing proteins the EH1-like motifs are N-terminal, this suggests that the N-terminal motif has been lost in Drosophila and C. elegans unc-4 orthologs (Copley, 2005).
EH1-like motifs also occur N- and C-terminal to Forkhead domains. The N-terminal class consists of the Sloppy-paired genes of Drosophila and orthologous or closely related sequences: human FOXG1, and Drosophila CG9571; the C. elegans ortholog fkh-2 contains an EH1-like motif although a cysteine residue causes a low score. The C-terminal class consists of an apparent clade including the human FOXA, FOXB, FOXC and FOXD genes, although if the EH1 motif was present in the common ancestor of this clade, multiple losses must have later occurred. The situation is complicated somewhat by an EH1-like motif at the N-terminus of C. elegans unc-130, i.e., in the FOXD-like family. The EH1 motif in slp1 has been shown to interact with groucho, and FOXA type genes have been shown to interact with human groucho orthologs (Copley, 2005).
Likely EH1 motifs co-occur with T-Box domains in two distinct contexts. The motif occurs C-terminal to the T-box in the Drosophila Dorsocross proteins Doc1, Doc2 and Doc3. It is found N-terminal to the T-box in 11 proteins including mls-1 and mab-9 from C. elegans; H15, Mid/Nmr2 and Bi/Omb from Drosophila; in humans there are strong matches to TBX18, TBX20 and TBX22 and more marginal matches to TBX3 and TBX2. As far as is known, none of these proteins has been shown to interact with groucho or its orthologs, although several are known to act as transcriptional repressors: for instance, in murine heart development, Tbx20 represses Tbx2 which in turn represses Nmyc; the Dorsocross genes from Drosophila repress wingless and ladybird, and Doc itself is repressed by mid/nmr2. The human proteins TBX1 and TBX10, and Drosophila Org-1 (which are all closely related to those above) do not appear to contain EH1 motifs. The human T (brachyury) protein contains a motif broadly similar to the EH1 consensus, LQYRVDHLLSA, in a comparable N-terminal location to those found in other T-box containing proteins. Although this motif scores poorly against EH1hox, the homologous regions from other T orthologs provide a more persuasive case for the presence of a functioning EH1 motif in these proteins (Copley, 2005).
The highest scoring match of EH1hox to a C2H2 zinc finger containing protein was ces-1 from C. elegans; this protein interacts with the groucho ortholog unc-37 and can act as a repressor. The putative EH1 motif is at the N-terminal end of ces-1. In contrast, the Drosophila proteins Bowl and Odd have EH1-like motifs at their C-terminal ends. In neither case is there direct evidence from high throughput studies of an interaction with Groucho, but both can function as repressors. The human protein ZNF312 (bit score 8.6) is the ortholog of zebrafish Fezl, which contains an EH1 motif essential for repressor activity -- this motif is conserved in the human paralog and likely Drosophila ortholog CG31670 (Copley, 2005).
The Doublesex Motif (DM) was first found in proteins controlling sexual differentiation in Drosophila. Two DM containing proteins were confidently predicted to contain EH1-like motifs -- human DMRT2, and Drosophila dmrt11e. These are likely orthologs; a C. elegans protein, C27C12.6 contained a weaker match. The molecular function of these proteins is unknown (Copley, 2005).
The EH1 motif is found N- and C-terminal to homeobox, forkhead, T-box and Zn finger protein domains. Clearly, since the locations of the EH1 motif are non-homologous, the N- and C-terminal associations must have occurred independently. The short size of the motif makes it tempting to speculate that the motif itself may have arisen independently (i.e., in repeated cases it may have evolved within sequence that was already part of the gene, rather than via a recombination event). The strongest evidence for this is that, in general, the majority of domain combinations occur in a fixed N to C orientation, suggesting that recombination events combining domains are relatively rare. Because explaining the observed N- and C-terminal arrangements by recombination would require many such events, the alternative hypothesis of independent invention is more appropriate (Copley, 2005).
Groucho is orthologous to the C. elegans unc-37 gene, and the four human paralogs TLE1-4 (Transducin Like Enhancer of split). An ortholog is also found in the cnidarian Hydra magnipapillata (e.g., the EST with gi 47137860), and certain cnidarian homeobox containing genes also contain an EH1-like motif, suggesting groucho/EH1 mediated repression pre-dates the split between diploblasts and triploblasts; indeed, a sponge Bar/Bsh like homeobox containing protein also contains an EH1-like motif, as does paxb from the non-bilaterian placozoan Trichoplax adhaerens and a Tlx-like protein from a ctenophore, suggesting the repression system was in place in the earliest animals. High scoring EH1-like motifs are found in Forkhead domain containing proteins from sponges, cnidarians and ctenophores, in both the C-terminal (FOXA-D clade) and N-terminal (FOXG, sloppy paired clade) varieties. The presumed ortholog of 'T' from Trichoplax adhaerens contains an EH1-like motif. These results suggest that groucho mediated repression using a variety of transcription factors was widespread in the last common ancestor of the metazoa. The distribution of the EH1 motif is suggestive of a number of instances of convergent evolution, although in many cases the motif has been conserved throughout bilaterian orthologs. Together with the existence of a cnidarian Groucho ortholog, this leads to the conclusion that EH1/Groucho mediated repression was established prior to the evolution of bilateria (Copley, 2005).
date revised: 5 Dec 97
Home page: The Interactive Fly © 1995, 1996 Thomas B. Brody, Ph.D.
The Interactive Fly resides on the
Society for Developmental Biology's Web server.