Drosophila gene families: The Enhancer of split complex

The Interactive Fly

Zygotically transcribed genes

The Enhancer of split complex

(part 2/2)

Target specificities of Enhancer of split basic helix-loop-helix proteins

Seven Enhancer of split genes in Drosophila melanogaster encode basic-helix-loop-helix transcription factors that are components of the Notch signaling pathway. They are expressed in response to Notch activation and mediate some effects of the pathway by regulating the expression of target genes. Using random oligonucleotide selection, the optimal DNA binding site for the Enhancer of split proteins has been determined to be a palindromic 12-bp sequence, 5'-TGGCACGTG(C/T)(C/T)A-3', which contains an E-box core (CACGTG). This site is recognized by all of the individual Enhancer of split basic helix-loop-helix proteins, consistent with their ability to regulate similar target genes in vivo. The 3 base pairs flanking the E-box core are intrinsic to DNA recognition by these proteins and the Enhancer of split and proneural proteins can compete for binding on specific DNA sequences. Furthermore, the regulation conferred on a reporter gene in Drosophila by three closely related sequences demonstrates that even subtle sequence changes within an E box or flanking bases have dramatic consequences on the overall repertoire of proteins that can bind in vivo (Jennings, 1999).
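
As an illustration only, the consensus quoted above can be written as a pattern and checked for the properties the paragraph describes (E-box core, palindromic symmetry). This is a minimal sketch, not the method used by Jennings (1999); the function names are hypothetical, and "palindromic" is taken here to mean that a site equals its own reverse complement.

```python
import re

# Consensus from the text: 5'-TGGCACGTG(C/T)(C/T)A-3'
ESE_PATTERN = re.compile(r"TGGCACGTG[CT][CT]A")
E_BOX_CORE = "CACGTG"  # class B E-box core named in the text

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_ese_box(site: str) -> bool:
    """True if the whole 12-bp site matches the quoted consensus."""
    return ESE_PATTERN.fullmatch(site.upper()) is not None


def is_palindromic(site: str) -> bool:
    """True if the site reads the same on both strands."""
    s = site.upper()
    return s == reverse_complement(s)


def ese_variants() -> list[str]:
    """Enumerate the four sequences permitted by the (C/T)(C/T) positions."""
    return [f"TGGCACGTG{a}{b}A" for a in "CT" for b in "CT"]


if __name__ == "__main__":
    for variant in ese_variants():
        print(variant, "palindromic:", is_palindromic(variant))
```

Under this strict definition, only the variant with C at both degenerate positions (TGGCACGTGCCA) is a perfect palindrome; the other three permitted variants deviate from the symmetry at one or two flanking positions.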

The three related sequences studied were the B1, A1 and A2 sites. The B1 and A1 sites contain optimal flanking bases and differ by a single-base substitution that switches the E box from a class B site (B1) to a class A site (A1). The A2 site has the class A E-box core but suboptimal flanking sequences. The consensus binding site for the E(spl)bHLH proteins contains a class B canonical E-box (CACGTG). This is compatible with the presence of the key arginine residue in the basic region that is characteristic of all other bHLH proteins that recognize class B sites and which contacts the central G. The selected site differs from the previously identified N box (CANNAG), indicating that the latter may not be generally representative of E(spl) target sites; in vitro, the N box is a much lower affinity target than the class B site. The 12-bp palindrome (the optimal site) is termed the ESE box. Flanking bases have been implicated in DNA binding by other bHLH proteins, including c-Myc and Hairy, based on in vitro assays and X-ray crystallography studies that reveal interactions between bHLH proteins and bases outside the E-box core. However, the flanking bases preferred by c-Myc and Hairy differ from those selected by E(spl)bHLH proteins, indicating that in vivo, the sequences immediately surrounding an E box are important for determining exactly which bHLH proteins bind there to regulate transcription (Jennings, 1999 and references).

The interactions with flanking bases help to explain the specificity in vivo of different bHLH proteins, an important factor given the large number of bHLH proteins identified to date. The in vivo expression patterns produced by E boxes with different flanking bases in these experiments emphasize the significance of the flanking sequences. For example, a comparison between the A2 and A1 sites demonstrates that the former is a target for many more transcriptional activators. These experiments also illustrate the relevance of different E-box core sequences, since a single-base difference within the E-box core (A1 to B1) is sufficient to prevent binding of proneural proteins and other activators. This is in agreement with earlier studies that argued that proneural proteins and E(spl)bHLH repressors recognize sites with distinct types of E-box cores. However, these results show that E(spl)bHLH repressors prefer the class B core, which is recognized by many different bHLH activators and repressors, over the class C core. Class C has been designated the target for repressor bHLH proteins that contain a proline residue in the basic domain. The class C site (CACGCG) is the optimal binding site for the Drosophila Hairy protein, whose basic domain contains a proline residue but differs from E(spl)bHLH proteins in 7 of the 11 remaining residues, which could account for the different profile of DNA binding specificities. The distinctions in the DNA binding specificities could be significant for studies of the vertebrate homologs of the E(spl)bHLHs and Hairy. Overall, the in vitro binding experiments and the activity of different sites in vivo demonstrate that the bHLH proteins that were tested can recognize a specific range of target sequences and that both core and flanking bases are important for determining the binding specificity (Jennings, 1999 and references).

Although flanking bases may distinguish sites for different types of E-box binding proteins, there are no significant differences in the bases recognized by individual E(spl) proteins; the same consensus binding site was derived for each of three proteins tested. There were subtle differences in the ranges of oligonucleotides, with Mdelta selecting a broader range of variants at the flanking sites than Mgamma and M3 and the latter two proteins exhibiting more tolerance for variants in the core E box, but experiments comparing the affinity of the proteins for these variant sites reveal no detectable bias (Jennings, 1999).

The binding specificities observed are all for homodimers of individual E(spl) proteins. In places where more than one E(spl)bHLH protein is expressed (e.g., proneural clusters), it is possible that the proteins form heterodimers among themselves to bind DNA and repress transcription. However, given that the amino acid sequences of the DNA binding domains and the DNA binding preferences of the individual E(spl)bHLH proteins are so similar, it seems unlikely that heterodimers between E(spl)bHLH proteins would differ greatly from homodimers in their DNA binding sequence preferences. In addition, during several developmental processes, a single E(spl)bHLH protein predominates (e.g., Mbeta in the presumptive intervein region of the wing), indicating that E(spl)bHLH proteins are likely to function as homodimers. There is also no evidence to suggest that the E(spl)bHLH proteins are required to form heterodimers with other bHLH family members to bind DNA and repress gene transcription in response to Notch signaling. Thus, the homodimers analyzed in these experiments likely represent complexes that are functional in vivo (Jennings, 1999 and references).

The overall similarity in the binding of different E(spl) proteins in vitro suggests that they are capable of recognizing the same targets in vivo and is consistent with the phenotypes observed when the individual proteins are expressed ectopically. Ectopic expression of M8, M5, Mbeta, Mdelta, or M7 produces phenotypes of vein and bristle loss. Both Mbeta and M7 are able to interact with DNA sequences regulating achaete. The ability to recognize the same DNA target sequences could explain the apparent redundancy between the E(spl) genes, as they would all have the potential to act in the same processes. The observation that specific E(spl)bHLH proteins are more or less efficient in regulating different processes (e.g., Mbeta more effective at suppressing veins and M8 more effective at suppressing bristles) is thus more likely to be a consequence of differences in protein:protein interactions than of differences in target recognition (Jennings, 1999 and references).

In the absence of E(spl)bHLH proteins, proneural protein expression persists at high levels in all cells of a proneural cluster. Thus, one action of E(spl)bHLH proteins is to antagonize the proneural proteins, with the ultimate consequence that proneural gene expression is repressed. It has been proposed that E(spl)bHLH proteins exert their influence by binding to regulatory regions within the AS-C and repressing transcription of the proneural genes. This hypothesis is supported by the observations that expression of Achaete is induced by M7ACT and MbetaACT and that induction of ectopic bristles in the Drosophila wing and notum by M7ACT is abolished in the absence of proneural proteins. One putative binding site for the E(spl)bHLH proteins, that upstream of the achaete gene, has the sequence 5'-CGGCACGCGACA-3' (Hairy site). Mgamma will bind this site in vitro, and M7 can bind this sequence and repress transcription in a cotransfection assay in Drosophila S2 cultured cells. However, mutation of this site in vivo results in a phenotype resembling that caused by mutations in hairy rather than in the E(spl)-C. This fits with the observation that this sequence conforms to an optimal Hairy DNA binding site but is a suboptimal site for the E(spl) proteins and indicates that the E(spl) proteins do not recognize this sequence in vivo. Thus, if E(spl) proteins are directly repressing achaete expression, there should be more optimal target sites elsewhere within the AS-C. Indeed, a search of recently available AS-C genomic sequence identifies >10 sequences with good matches to ESE boxes, in addition to the sites that have been identified by in vitro binding assays (Jennings, 1999 and references).
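
The comparison between the Hairy site and the ESE consensus, and the idea of searching genomic sequence for "good matches", can be sketched as a simple position-by-position mismatch count. This is a hedged illustration, not the search procedure used in the cited work: the function names, the single-strand scan, and the `max_mismatches` threshold are all assumptions, and the demonstration sequence is an invented toy string rather than AS-C sequence.

```python
# Allowed bases at each of the 12 positions of 5'-TGGCACGTG(C/T)(C/T)A-3'
ESE_CONSENSUS = ["T", "G", "G", "C", "A", "C", "G", "T", "G", "CT", "CT", "A"]

HAIRY_SITE = "CGGCACGCGACA"  # site upstream of achaete, as quoted in the text


def count_mismatches(site: str) -> int:
    """Number of positions at which a 12-bp site departs from the consensus."""
    if len(site) != len(ESE_CONSENSUS):
        raise ValueError("site must be 12 bp long")
    return sum(
        1 for base, allowed in zip(site.upper(), ESE_CONSENSUS)
        if base not in allowed
    )


def scan_for_ese(sequence: str, max_mismatches: int = 0):
    """Slide a 12-bp window along one strand and report near-consensus sites.

    Returns a list of (start_index, window, mismatches) tuples.
    The threshold is an arbitrary, illustrative definition of a "good match".
    """
    seq = sequence.upper()
    width = len(ESE_CONSENSUS)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        mm = count_mismatches(window)
        if mm <= max_mismatches:
            hits.append((start, window, mm))
    return hits


if __name__ == "__main__":
    # Toy sequence (invented): one consensus site followed by the Hairy site.
    toy = "AAAA" + "TGGCACGTGTCA" + "CCCC" + HAIRY_SITE + "GG"
    print("Hairy site mismatches:", count_mismatches(HAIRY_SITE))
    print("Exact ESE hits:", scan_for_ese(toy))
```

By this count the Hairy site differs from the ESE consensus at three positions, one within the E-box core (CACGCG rather than CACGTG) and two in the flanking bases, in line with its description as a suboptimal E(spl) site. A fuller search would also scan the reverse strand and would need a biologically justified scoring scheme.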

An alternative hypothesis is that the primary function of the E(spl)bHLH proteins is to antagonize the actions of proneural proteins posttranscriptionally. Evidence in support of this comes from experiments in which L'sc is ectopically expressed using a heterologous promoter that is not subject to direct regulation by E(spl)bHLH proteins. Under these conditions L'sc expression results in isolated ectopic bristles, rather than clusters of bristles, demonstrating that lateral inhibition is still able to restrict neural fate to a single cell even though l'sc transcription is insensitive to Notch signaling. This implies that E(spl)bHLH proteins are able to antagonize proneural genes in ways other than by repressing their transcription. One possibility is that the E(spl) proteins can interact with the same targets as proneural proteins, but that they repress rather than activate transcription. The ability of E(spl) proteins to bind to the B1 and A1 sequences and repress transcription from a heterologous promoter is consistent with this model, as is the observation that M7ACT can induce certain ectopic leg bristles in the absence of the achaete and scute genes. In the latter context, M7ACT is likely to be acting on genes with functions downstream of the proneural proteins to cause neural differentiation. In addition, the E(spl)bHLH proteins are involved with developmental processes that do not involve the proneural proteins, e.g. wing vein development; thus, they cannot act solely to repress proneural gene transcription during development (Jennings, 1999 and references).

How might E(spl)bHLH repress transcription of target genes? The closely related protein Hairy has been shown to repress transcription in a dominant manner even when its binding sites are located at some distance from the promoter, leading to the hypothesis that Hairy is able to mediate stable, inheritable repression of the target genes. It is anticipated that E(spl)bHLH repression will be transitory, so that if Notch signaling were terminated, the E(spl) proteins would decay and the target genes would be susceptible to reactivation. Although proneural and E(spl)bHLH proteins optimally prefer different core E-box binding sites, so that independent binding to target genes appears likely, the importance of the bases flanking the E box in target recognition means that there is potential for overlap in the binding sites of the two groups of proteins. Thus, in cells where expression of E(spl)bHLH proteins is induced by Notch signaling, the proteins accumulate to high levels and could compete for binding to proneural protein target sites of the A1 type described here. Among the E-box sequences recognized by proneural proteins in vitro that have been described, at least a subset have good matches with the ESE consensus and thus could be recognized by both classes of proteins. Now that the sequence preferences of the E(spl)bHLH proteins have been identified, when target genes of proneural and E(spl)bHLH proteins have also been identified and their regulatory regions analyzed, it will be possible to determine whether the sites present offer the potential for competition (e.g., by resembling A1 sites) or whether they have the features of completely distinct binding sites for E(spl)bHLH, Hairy, proneural, and other bHLH proteins (Jennings, 1999 and references).

Enhancer of split complex proteins target twist

One of the first steps in embryonic mesodermal differentiation is allocation of cells to particular tissue fates. In Drosophila, this process of mesodermal subdivision requires regulation of the bHLH transcription factor Twist. During subdivision, Twist expression is modulated into stripes of low and high levels within each mesodermal segment. High Twist levels direct cells to the body wall muscle fate, whereas low levels are permissive for gut muscle and fat body fate. Su(H)-mediated Notch signaling represses Twist expression during subdivision and thus plays a critical role in patterning mesodermal segments. This work demonstrates that Notch acts as a transcriptional switch on mesodermal target genes, and it suggests that Notch/Su(H) directly regulates twist, as well as indirectly regulating twist by activating proteins that repress Twist. It is proposed that Notch signaling targets two distinct 'Repressors of twist': the proteins encoded by the Enhancer of split complex [E(spl)C] and the HLH gene extra macrochaetae (emc). Hence, the patterning of Drosophila mesodermal segments relies on Notch signaling changing the activities of a network of bHLH transcriptional regulators, which, in turn, control mesodermal cell fate. Since this same cassette of Notch, Su(H) and bHLH regulators is active during vertebrate mesodermal segmentation and/or subdivision, this work suggests a conserved mechanism for Notch in early mesodermal patterning (Tapanes-Castillo, 2004).

Analysis of Notch mutant embryos revealed that Notch signaling is essential for Twist regulation at mesodermal subdivision. However, comparison of Notch and Su(H) mutant embryos indicated that Notch regulates Twist differently from Su(H). At stage 10, uniform high Twist expression was maintained in Nnull mutants; by contrast, Su(H)null mutants have a wild-type-like Twist pattern. Furthermore, while constitutive activation of Notch represses Twist expression at stage 10, constitutive expression of a transactivating form of Su(H) [Su(H)-VP16] increases Twist expression. Despite these differences, double mutant analysis and rescue experiments demonstrate that Notch requires Su(H) to repress Twist. Moreover, further rescue experiments show that Notch signaling acts as a transcriptional switch, which alleviates Su(H)-mediated repression and promotes transcription. In addition, genetics, combined with promoter analysis, suggest that Notch and Su(H) have multiple inputs into twist. Notch/Su(H) signaling both directly activates twist and indirectly represses twist expression by activating proteins that repress Twist. Finally, the data indicate that Notch targets two distinct 'Repressors of twist' - E(spl)-C genes and Emc. It is proposed that Notch signaling activates expression of E(spl)-C genes, which then act directly on the twist promoter to repress transcription. Since removing groucho enhances the phenotype of the E(spl)-C mutant embryos, it is suggested that the corepressor, Groucho, acts with E(spl)-C proteins and the Hairless/Su(H) repressive complex to mediate direct repression of twist. The second 'Repressor of twist', Emc, mediates repression of Twist in an alternative fashion. It is hypothesized Emc activity inhibits dimerization of Da with itself or another bHLH protein. This, in turn, prevents Da from binding DNA and activating twist transcription. 
Since Emc is expressed in the embryo prior to stage 10, it is likely that the transition from uniform high Twist expression to a modulated Twist pattern involves Emc inhibition of Da activity at stage 9. In conclusion, this work uncovers how Notch signaling impacts a network of mesodermal genes, and specifically Twist expression. Given that Notch signaling directs cell fate decisions in many Drosophila embryonic and adult tissues and that Notch regulates Twist in adult flight muscles, these data may suggest a more universal mode of Notch regulation (Tapanes-Castillo, 2004).

The distinct mesodermal phenotypes of Notch and Su(H) mutants can be explained by Notch acting as a transcriptional switch. This aspect of Notch signaling has been described in other systems, and the early Drosophila mesoderm appears no different in this regard. However, these data suggest that there is more to the phenotypes; that is, additional layers of Notch regulation in the transcriptional control of twist (Tapanes-Castillo, 2004).

Genetic experiments, as well as promoter analysis, raised the hypothesis that Notch signaling regulates twist directly, as well as indirectly by activating expression of a 'repressor of twist.' This indirect repression of twist concurs with the role of Notch in activating E(spl) transcriptional repressors. Moreover, a mechanism involving direct and indirect regulation is consistent with Su(H) mutant phenotypes. In Su(H)null embryos, neither twist nor repressor of twist (for example, emc) are repressed. The de-repression of both genes at the same time results in Twist expression appearing 'wild-type-like'. When a constitutively activating form of Su(H) is expressed, both twist and repressor of twist are activated. In these embryos, high Twist domains are expanded, but uniform high Twist expression is not observed because repressor of twist is expressed (Tapanes-Castillo, 2004).

However, simple direct and indirect regulation [through emc and E(spl)-C genes] by Notch still does not fully explain the phenotypes of Notch mutants. Both twist and repressor of twist should be repressed in Nnull embryos because Su(H) will remain in its repressor state. While the Nnull phenotype was consistent with repressor of twist being repressed, twist was still strongly expressed. Additionally, constitutive Notch activation should cause both twist and repressor of twist to be expressed. Consequently, Nintra was expected to cause a phenotype similar to that caused by Su(H)-VP16. Contrary to these predictions, panmesodermal expression of Nintra represses Twist, consistent with only repressor of twist being strongly expressed. Taken together, these results suggested that at stage 10, the twist promoter is less receptive to Notch/Su(H) activation than to Notch/Su(H) repression. As a result, constitutive activation of Notch represses twist, while loss of Notch activates twist ectopically (Tapanes-Castillo, 2004).

While Notch signaling has the ability to activate twist, Notch/Su(H) signaling ultimately leads to repression of twist at stage 10. This predominance of repression can be explained in two ways: (1) direct Notch activation of the twist promoter is overpowered by Notch activated repressors of twist; and (2) a repressor of twist gene, such as E(spl), is more responsive to Notch/Su(H) activation than twist. These ideas are discussed below in light of the results (Tapanes-Castillo, 2004).

The first model proposes that while Notch signaling might directly promote both twist and repressor of twist activation, repressors of twist might suppress an increase in twist transcription. The data suggest that Notch regulates multiple repressors of twist, including E(spl)-C genes and Emc. On the twist promoter, these multiple repressors could overwhelm Su(H) activation. Hence, twist would be transcriptionally repressed rather than activated. In Su(H)-VP16 embryos, the constitutive activating ability of Su(H) on the twist promoter might inhibit some of this repression. Consequently, Twist is ectopically expressed at high levels (Tapanes-Castillo, 2004).

The data are also consistent with the second model, which proposes that twist and a repressor of twist gene, such as E(spl), respond differently to Notch activation. The reason for this differential response is provided by the concept of Notch instructive and permissive genes. Transcription of Notch instructive genes requires the intracellular domain of Notch (Nicd) first to alleviate Su(H)-mediated repression and then to serve as a coactivator for Su(H). Transcription of Notch permissive target genes requires Nicd solely to de-repress Su(H); Su(H) bound to other coactivators and/or other transcriptional activators is necessary for permissive gene activation. Since panmesodermal expression of Nintra does not activate twist, it is concluded that simple de-repression of Su(H) is insufficient to activate twist expression and that other factors are required. Hence, Notch acts permissively on the twist promoter. By contrast, panmesodermal expression of Nintra is sufficient to activate a repressor of twist, resulting in the strong Twist repression. Since E(spl)-C genes have been categorized as Notch instructive target genes, it is suggested that E(spl)-C genes are the Notch instructive repressor of twist genes in this system. Although Notch can upregulate Emc expression, the inability to see a change in Emc expression in Nnull and Su(H)null mutants suggests Emc is not a Notch instructive target gene. Thus, based on all of this work, the instructive and permissive target gene regulation model is currently favored (Tapanes-Castillo, 2004).

The Enhancer of split complex of Drosophilids derived from simple ur-complexes preserved in mosquito and honeybee

In Drosophila melanogaster the Enhancer of split-Complex [E(spl)-C] consists of seven highly related genes encoding basic helix-loop-helix (bHLH) repressors, intermingled with four genes that belong to the Bearded (Brd) family. Both gene classes are targets of the Notch signalling pathway. The Achaete-Scute-Complex [AS-C] comprises four genes encoding bHLH activators. Focussing on Diptera and the hymenopteran Apis mellifera, the question arose how these complexes evolved with regard to gene number in the evolution of insects. In Drosophilids, both gene complexes are highly conserved, spanning roughly 40 million years of evolution. However, in species more diverged, like Anopheles or Apis, dramatic differences are found. Here, the E(spl)-C consists of one bHLH (mß) and one Brd family member (malpha) in a head to head arrangement. Interestingly, in Apis but not in Anopheles, there are two more E(spl) bHLH like genes within 250 kb, which may reflect duplication events in the honeybee that occurred independently of those in Diptera. The AS-C may have arisen from a single sc/l'sc like gene, which is well conserved in Apis and Anopheles, and a second ase like gene that is highly diverged but located within 50 kb. Thus, E(spl)-C and AS-C presumably evolved by gene duplication to the current complex composition in Drosophilids in order to govern the accurate expression patterns typical for these highly evolved insects. The ancestral ur-complexes, however, consisted most likely of just two genes: (1) E(spl)-C contains one bHLH member of mß type and one Brd family member of malpha type, and (2) AS-C contains one sc/l'sc and a highly diverged ase like gene (Schlatter, 2005).

In total, 12 genes in D. melanogaster are known to encode Hairy/E(spl)-like proteins, i.e. bHLH proteins that also have the orange domain and a WRPW-type Gro-binding motif. Apart from the seven E(spl) bHLH proteins, these include Hairy, Deadpan, Side, Hey and Her. Moreover, there is similarity to Stich1/Sticky, which has a bHLH and an orange domain but not the typical Gro-binding motif. Since the number of E(spl) bHLH genes is not conserved in honeybee and mosquito, it was interesting to ask whether all the other genes were present. The Ensembl database was searched with the respective D. melanogaster protein sequences: orthologs were found of all genes except Her in both species. However, most of the predictions are incomplete. It is known from D. melanogaster that these genes contain introns, which complicates the search for potential coding sequences within genomic DNA. Thus, the protein sequence predictions are uncertain. With the sole exception of Dpn, all the proteins are better conserved between Drosophila and Anopheles than between Drosophila and Apis, confirming the evolutionary relationship. The best conserved proteins are Hey and Hairy. The Hey orthologs are 76% identical between Drosophila and Anopheles and 66% between Drosophila and Apis and the Hairy orthologs between 72% and 65%, respectively. Less conservation is found for Side, Dpn and Stich1 (62%/57% Side identity, 57%/59% Dpn identity and 60%/57% Stich1 identity), comparing fly with mosquito and honeybee, respectively. All proteins share the bHLH and orange domains. The WRPW motif of Hairy, Dpn and Side as well as the YRPW motif of Hey is present in the orthologs (Schlatter, 2005).

Extensive genome analyses in recent years revealed that there are not many examples of large gene complexes that are widely conserved. Prominent examples are the HOX (homeobox) complexes, which contain homeotic genes in Drosophila. HOX complexes are well conserved in metazoans despite some variations in gene number. HOX-genes encode regulatory proteins with specific individual functions and mutations affect different aspects of the body plan. Not surprisingly, it is almost only the homeodomain, which serves as sequence-specific DNA binding motif, that is conserved amongst different species. In contrast, similarity among bHLH proteins encoded by the E(spl)-C extends over the entire length, even within the same species, indicating rather recent duplication events. The D. melanogaster proteins M8/M5 and Mß/M3 are most similar with over 70% identity, whereas Mdelta is the most diverged. However, Mdelta still shares at least 50% identity with other E(spl) bHLH protein members. More interesting is the analysis of the overall similarity among these proteins. Here, any one of the proteins is compared with the other six and the result is averaged. Clearly, Mß (73%/64%, similarity/identity) closely followed by Mgamma (72%/63%) is most similar to all others, whereas Mdelta (66%/55%) shows the lowest values. One interpretation might be that the different bHLH genes evolved by duplication out of mß or mgamma. Remarkably, these two bHLH proteins in addition to M3 are the best conserved in the three Drosophila species. It is postulated that these are the most ancient proteins with the most general function and, therefore, the highest selection pressure. This hypothesis is supported by the finding that mß has the most general expression pattern from which the others can be derived by a decrease of gene activity. The conspicuous conservation of M3 might hint at an important function during egg development as this gene is also expressed maternally. 
The high degree of conservation of all E(spl) bHLH orthologous proteins in Drosophilids, which is clearly higher than the similarity within this protein family in D. melanogaster, indicates specific and non-redundant roles during development. Some of these functions have been identified in the past. It is conceivable that regulatory sequences were not duplicated or evolved more rapidly so that now highly dynamic expression patterns of these genes are found (Schlatter, 2005).
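
The averaging step described above (each protein compared with every other family member, then the results averaged) can be sketched as follows. This is only an illustration of the arithmetic: the function name is hypothetical, and the protein labels and percentages in the example are invented placeholders, not values from Schlatter (2005).

```python
from statistics import mean


def mean_identity_to_others(pairwise: dict, proteins: list) -> dict:
    """Average each protein's pairwise identity against all other proteins.

    `pairwise` maps frozenset({a, b}) to a percent identity, so the order of
    the two proteins in a pair does not matter.
    """
    averages = {}
    for protein in proteins:
        scores = [
            pairwise[frozenset((protein, other))]
            for other in proteins
            if other != protein
        ]
        averages[protein] = mean(scores)
    return averages


if __name__ == "__main__":
    # Placeholder data for three hypothetical proteins (not real measurements).
    proteins = ["P1", "P2", "P3"]
    pairwise = {
        frozenset(("P1", "P2")): 70,
        frozenset(("P1", "P3")): 60,
        frozenset(("P2", "P3")): 50,
    }
    averages = mean_identity_to_others(pairwise, proteins)
    for name, value in sorted(averages.items(), key=lambda kv: -kv[1]):
        print(name, value)
```

In this toy data set P1 has the highest mean identity to the others, which is the kind of ranking that places Mß first and Mdelta last in the comparison described in the text.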

mß appears to be the ancestral bHLH gene of the E(spl)-C in Drosophilids based on its great similarity with all the other bHLH proteins. This assumption is strongly supported by the sequence conservation of the E(spl) bHLH proteins in A. gambiae and A. mellifera. The single E(spl) bHLH protein encoded by the mosquito genome has the highest identity to Mß. The genome of honeybee contains three prospective genes that encode proteins most highly related to E(spl) D.m.Mß and D.m.Mgamma. All three are clustered within a single sequence contig, although they span a large segment of about 250 kb, whereas the whole E(spl)-C in D. melanogaster comprises roughly 50 kb. Despite the fact that two of these genes possess introns just within the bHLH domain and at positions close to the ones found in the D. melanogaster genes dpn, hairy or Her, the amino acid sequence similarity classifies them clearly as E(spl) bHLH proteins. A comparison of Anopheles and Apis proteins reveals that the presumptive Mß homologs have highest similarity (83%) and identity (76%), whereas the protein classified as A.m.Mgamma is just 70% similar and 66% identical to A.g.Mß (Schlatter, 2005).

In Drosophilids, malpha is located close to mß and is transcribed in the opposite direction (head to head). This arrangement is likewise found in Anopheles and Apis. Notably, A.m.malpha is next to A.m.mß, whereas the two Apis A.m.mgamma and A.m.mß' genes are much further apart. This arrangement is thought to be very ancient. In the beetle Tribolium, which on the tree of evolution is found even more deeply rooted (~300 Myr to Dipterans), two similar genes coding for Mß-like proteins (~65% and ~67% identity to D.m.Mß) are found and one is within ~18 kb of a gene coding for an Malpha-like protein (~52% identity to D.m.Malpha). It is postulated that the ur-complex consists of these two ancestral genes, malpha and mß. It is intriguing that they belong to the two different classes of Notch-responsive genes in the E(spl)-C, the bHLH and the Brd-class. In the fly, malpha and bHLH genes are similarly expressed. It is not unlikely that they share common regulatory elements that could explain their co-segregation in the process of evolution (Schlatter, 2005).

The Achaete-Scute complex (AS-C) is well conserved in D. virilis: all four genes, achaete (ac), lethal of scute (l'sc), scute (sc) and asense (ase), are found in the same order and orientation on the X-chromosome. As in D. melanogaster, the genes are without introns. All proteins share the typical bHLH motif of the AS-C proteins and this domain reveals the lowest evolutionary rate. However, compared with the bHLH proteins of the E(spl)-C, the bHLH proteins of the AS-C evolve faster. The complex can be separated into two clusters that are distinguished by their rates of conservation. L'sc and Sc are well conserved with an identity between D. melanogaster and D. virilis of more than 75%; in contrast, Ac and Ase are conserved with an identity of less than 69%. Note that the highest divergence that was found between these two species in the E(spl)-C was for M8 with still almost 81% identity (Schlatter, 2005).

Of the four AS-C gene members in D. melanogaster, ase stands out because it is much larger than the other three. In D. virilis, the size increase is even more striking: D.v.Ase is predicted to comprise 619 residues, whereas D.m.Ase is only 486 residues in length. This extension of more than 20% additional residues is caused by multiple insertions of repetitive sequences that code for poly-glutamine (Q), poly-alanine (A) and poly-asparagine (N) stretches. As in D. melanogaster, the unrelated gene pepsinogen-like (pcl) is located between l'sc and ase (Schlatter, 2005).

Genes related to achaete or scute have been identified in a large number of species, from hydra to mouse, and so these are also to be expected in the different insects. The AS-C was most intensely studied in various species of Schizophora flies, apart from Drosophila. The number of genes varies between one and four but is not strictly correlated with the position in the phylogenetic tree. For example, AS-C of Calliphora vicina contains three genes, whereas other dipteran flies like Drosophila contain four. Two genes are found in the branchiopod crustacean Triops longicaudatus. In Dipteran flies the expression patterns of the proneural genes are largely varied. This is regulated by positional information through the Iroquois Complex and pannier and in addition by a transcriptional feed-back loop involving AS-C proteins. Eventually, neural precursors are selected by the repressive activity of E(spl) bHLH proteins. In this way, location and number of the large bristles on the notum are precisely controlled. The mosquito is covered with rows of large sensory bristles, whose number and position vary between individuals. This is in accordance with the fact that there is only one scute-like gene, A.g.ash, that is expressed all over the presumptive notum in a modular pattern. Recently it was shown that the Anopheles A.g.ash gene can mimic the endogenous Drosophila genes and that overexpression leads to many ectopic bristles (Schlatter, 2005).

Although the bristle pattern on the notum of different Drosophilids varies slightly, bristle number and position are highly stereotyped. Therefore, it is not surprising to find the AS-C highly conserved within Drosophilids. Yet the rate of change outside of the bHLH domain is unexpected and quite remarkable. Compared to E(spl) bHLH proteins, those encoded by the AS-C have a rather low degree of similarity, most notably Ac. In fact, the blowfly Calliphora vicina, which like Drosophila belongs to the Schizophora, lacks the ac gene entirely and yet is covered with bristles. In agreement, no ac was found in Anopheles or Apis, arguing for rapid evolution. The best conservation is found in Sc and L'sc, suggesting high evolutionary pressure and perhaps common ancestry. In addition to the bHLH domain, two small stretches outside it (aa 203; SPTPS in D. melanogaster L'sc) and the C-terminus are highly similar, the latter being identical in Calliphora. Presumably these protein domains are of functional importance. Indeed, the C-terminus acts as a transcriptional activation domain and is also used to recruit E(spl) bHLH proteins. Although the alignments of the respective genes of honeybee and mosquito to sc and l'sc are very similar, the tendency is toward a closer relationship to l'sc. However, it is proposed that this gene pair arose by duplication in the course of Drosophilid evolution, such that a common ancestor may be present in the other two species (Schlatter, 2005).

Conservation is very limited among the Ase homologs. Reasonable conservation is found within the bHLH domain, and a further well-conserved box is present (NGxQYxRIPGTNTxQxL; x marks differences between A. gambiae and D. melanogaster). This sequence is likewise detected in the Ase protein of C. vicina, which shares many more similarities with D.m.Ase. In Apis, there is no such conservation outside of the bHLH domain, which itself is highly diverged. The overall degree of conservation is so poor that further statements about the relationship are difficult. It is argued that this gene represents A.m.ase because of its close proximity to A.m.ash, although other interpretations are equally possible. An analysis of its expression pattern in honeybee may help to resolve these questions (Schlatter, 2005).
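A conserved box written with variable positions, such as NGxQYxRIPGTNTxQxL, can be searched for in a protein sequence by treating each x as a wildcard. The sketch below is one possible way to do this and is not the method used in the cited study. The helper names and the toy sequence are hypothetical; the toy sequence is invented purely to exercise the code and is not a real Ase sequence. Treating x as "any residue" is also an assumption, since in the text x only marks positions that differ between A. gambiae and D. melanogaster.

```python
import re

# Conserved box quoted in the text; lowercase "x" marks variable positions
ASE_BOX = "NGxQYxRIPGTNTxQxL"


def motif_to_regex(motif: str) -> re.Pattern:
    """Translate a motif into a regex; 'x' matches any single uppercase residue letter."""
    parts = ["[A-Z]" if ch == "x" else re.escape(ch) for ch in motif]
    return re.compile("".join(parts))


def find_motif(sequence: str, motif: str = ASE_BOX) -> list[int]:
    """Return 0-based start positions of non-overlapping motif matches in the sequence."""
    return [m.start() for m in motif_to_regex(motif).finditer(sequence)]


# Invented toy sequence (NOT real data): 4 residues, one box instance, 3 residues
toy_sequence = "MAAA" + "NGAQYSRIPGTNTLQKL" + "GGG"
print(find_motif(toy_sequence))
```

For real sequences, a position-specific alignment would be more informative than a strict pattern match, because a single substitution at a fixed position makes the regular expression miss the box altogether.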

In conclusion, this study found that both the E(spl)-C and the AS-C expanded rather recently, because they are present in their current complex structures only in Drosophilids. In Apis and in Anopheles, very similar arrangements are found, indicative of an ancient ur-complex. The E(spl)-C seems to have evolved from two genes, one HES-like and one Brd-like, that are arranged in a head-to-head orientation. Both types of genes are responsive to Notch signalling in Drosophila. The data suggest that the most ancient genes are E(spl) bHLH and E(spl) malpha, from which the other E(spl)-C genes derived by duplication and subsequent change. Moreover, an E(spl) ur-complex is likewise detected in Tribolium castaneum, which belongs to the order Coleoptera. In Drosophila the complex also gained unrelated genes like m1 and gro. The latter is highly conserved but located at different genomic positions. Whereas in Anopheles the ur-complex seems to exist in its original form, two additional HES-like bHLH genes that possess introns are found in the Apis genome. These introns are at positions similar to those of the introns of two other HES-like genes, dpn and h, which themselves are highly conserved in the three insect species, arguing for a common evolutionary history. Presumably, the introns are evolutionarily ancient because they are also found in the C. elegans E(spl)/h-like gene lin-22. The AS-C seems to originate from a single sc/l'sc-like bHLH gene and a second, largely diverged bHLH gene that shares similarity with Drosophila ase. The high degree of variation in the latter makes it difficult to decide conclusively on the original arrangement of this gene complex (Schlatter, 2005).


REFERENCES

Jennings, B. H., Tyler, D. M. and Bray, S. J. (1999). Target specificities of Drosophila Enhancer of split basic helix-loop-helix proteins. Mol. Cell. Biol. 19: 4600-4610.

Schlatter, R. and Maier, D. (2005). The Enhancer of split and Achaete-Scute complexes of Drosophilids derived from simple ur-complexes preserved in mosquito and honeybee. BMC Evol. Biol. 5: 67. 16293187

Tapanes-Castillo, A. and Baylies, M. K. (2004). Notch signaling patterns Drosophila mesodermal segments by regulating the bHLH transcription factor Twist. Development 131: 2359-2372. 15128668




    Home page: The Interactive Fly © 1995, 1996 Thomas B. Brody, Ph.D.

    The Interactive Fly resides on the
    Society for Developmental Biology's Web server.