scratch

Gene name - scratch

Synonyms -

Cytological map position - 63B1-B12

Function - transcription factor

Keyword(s) - pan-neural, transcriptional repressor

Symbol - scrt

FlyBase ID:FBgn0004880

Genetic map position - 3-[15]

Classification - zinc finger

Cellular location - nuclear



NCBI link: Entrez Gene

scratch orthologs: Biolitmine
Recent literature
Ramat, A., Audibert, A., Louvet-Vallée, S., Simon, F., Fichelson, P. and Gho, M. (2016). Escargot and Scratch regulate neural commitment by antagonizing Notch-activity in Drosophila sensory organs. Development [Epub ahead of print]. PubMed ID: 27471258
Summary:
During Notch (N)-mediated binary cell fate decisions, cells adopt two different fates according to the levels of N-pathway activation: an Noff-dependent or an Non-dependent fate. How cells maintain these N-activity levels over time remains largely unknown. This study addresses this question in the cell lineage that gives rise to the Drosophila mechanosensory organs. In this lineage a primary precursor cell undergoes a stereotyped sequence of oriented asymmetric cell divisions and transits through two different neural precursor states before acquiring a neuron identity. Using a combination of genetic and cell biology strategies, it was shown that Escargot and Scratch, two transcription factors belonging to the Snail superfamily, maintain an Noff neural commitment by blocking directly the transcription of N-gene targets. The study proposes that Snail factors act by displacing proneural transcription activators from DNA binding sites. As such, Snail factors maintain the Noff state in neural precursor cells by buffering any ectopic variation in the level of N-activity. Since Escargot and Scratch orthologs are present in other precursor cells, these findings are essential for the understanding of precursor cell fate acquisition in other systems.


BIOLOGICAL OVERVIEW

The Drosophila scratch (scrt) gene is expressed in most or all neuronal precursor cells and encodes a predicted zinc finger transcription factor closely related to the product of the mesoderm determination gene snail (sna). Adult flies homozygous for scrt null alleles have a reduced number of photoreceptors in the eye, and embryos lacking the function of both scrt and the pan-neural gene deadpan (dpn), which encodes a basic helix-loop-helix (bHLH) protein, exhibit a significant loss of neurons. Conversely, ectopic expression of a scrt transgene during embryonic and adult development leads to the production of supernumerary neurons. Consistent with scrt functioning as a transcription factor, various genes are more broadly expressed than normal in scrt null mutants. Reciprocally, these same genes are expressed at reduced levels in response to ectopic scrt expression. It is proposed that scrt promotes neuronal cell fates by suppressing expression of genes promoting non-neuronal cell fates. The similarities are discussed between the roles of the ancestrally related scrt, sna, and escargot (esc) genes in regulating cell fate choices (Roark, 1995).

A variety of evidence suggests that the normal function of the scrt gene is to promote neuronal development. Consistent with a role in specifying neuronal cell fates, scrt expression is restricted to the nervous system at all developmental stages examined. Although embryos homozygous for null scrt alleles appear morphologically normal and can survive to adulthood if cultured with care, adult scrt minus escapers have slightly roughened eyes reflecting a reduction in photoreceptor number. Deeper within the eye the scrt mutant phenotype is much more severe, with large spaces separating broken ommatidial clusters. In addition, there are genetic interactions between scrt and dpn, another pan-neural gene, which aggravate the adult eye phenotype. The synergistic action of dpn and scrt is particularly striking in dpn; scrt homozygous double mutant embryos that have significant reductions in neuron number. Consistent with loss-of-function scrt phenotypes leading to neuron loss, ectopic expression of scrt leads to the production of supernumerary neurons during embryogenesis and development of the adult nervous system. These data suggest that scrt normally plays a role in promoting neurogenesis but that additional genes (e.g., dpn) act in parallel with scrt. The strong genetic interaction between scrt and dpn provides evidence that dpn is also likely to play a role in promoting nervous system development (Roark, 1995).

scrt encodes a zinc finger protein related to the products of the sna and esc genes, which have been shown to repress expression of various target genes. To determine whether scrt might function analogously as a repressor of non-neuronal genes, the expression of a variety of neuronal and non-neuronal markers was examined in mutants lacking scrt and in HS-scrt individuals induced to express scrt ubiquitously. These experiments have identified several potential scrt target genes such as the Egf-r gene. Ectopic Egf-r expression is sporadically observed in the neuroblast layer of scrt mutant embryos. This phenotype is enhanced in dpn; scrt double mutant embryos, suggesting that dpn also contributes to repression of Egf-r expression in the nervous system. Egf-r also is expressed at higher than normal levels in developing photoreceptor cells in scrt eye discs. Reciprocally, Egf-r is strongly down-regulated in epidermal cells when scrt is expressed ubiquitously during embryogenesis or adult development (Roark, 1995).

In general, loss of scrt function leads to ectopic expression of potential target genes, whereas, reciprocally, ubiquitous scrt expression leads to a reduction in expression of these genes. Because scrt functions to promote the formation of neurons at the level of cell fate specification, it is proposed that scrt represses transcription of genes such as Egf-r that promote the establishment of non-neuronal cell fates. In contrast, expression of all neuronal markers that were examined was normal in scrt, dpn;scrt, and HS-scrt embryos. These data are consistent with scrt acting like Sna and Esc to repress expression of target genes. However, more potential direct target genes must be identified and the cis-acting elements of putative scrt responsive genes must be analyzed for functional scrt repressor binding sites to establish direct repression as a mechanism of scrt action (Roark, 1995).

Whereas some of the effects that were observed on gene expression patterns may be attributable to the direct action of scrt as a transcription factor, the anterior expansion of hairy expression in scrt mutant eye discs must be indirect because cells anterior to the furrow normally do not express scrt. It is possible that the expansion of hh expression posterior to the furrow in scrt mutants plays some role in mediating this effect, although existing data support models in which Hh diffusing over a short distance induces expression of dpp, which in turn encodes a long-range signal to promote furrow progression. Interestingly, a recently described role for hairy in combination with extramacrochaetae (emc) in eye development suggests that these negative regulators of neurogenesis function to retard progression of the furrow. Thus, scrt expression posterior to the furrow may indirectly promote furrow progression by suppressing expression of signals required for activating genes such as hairy ahead of the furrow, which slow furrow progression. The precocious appearance of neuroblasts and primary PNS precursor cells in HS-scrt embryos similarly could be explained by models in which scrt functions normally to initiate neurogenesis, perhaps by repressing expression of genes that antagonize neurogenesis (Roark, 1995).

It is worth noting that pan-neural expression of several vertebrate genes depends on repression of these genes in non-neuronal cells by a factor belonging to a large subfamily of zinc finger proteins that includes scrt and sna. Thus, nervous system-specific gene expression depends on two forms of negative regulation: (1) repression of non-neuronal genes in the nervous system, and (2) repression of nervous system-specific gene expression in non-neuronal tissues. This suggests that repression must be considered on par with activation as a general mechanism for achieving nervous system-specific gene expression (Roark, 1995).

Data described above suggest that scrt and dpn collaborate to repress expression of non-neuronal genes in neuroblasts. Although DNA binding has not yet been demonstrated for scrt, the amino acid sequence in the DNA-binding zinc finger region is highly similar to Sna, which does bind specific DNA sequences and has been shown to behave as a repressor. Similarly, the bHLH region of Dpn is closely related to Hairy, which binds functionally important sequences in the achaete promoter to repress gene expression. Recently, Dpn also has been found to bind DNA. Likewise, E(spl)m8, another protein with a bHLH region related to Hairy, binds DNA and this activity is required to mediate the E(spl) minus phenotype. Sna has been shown to function as a repressor over short distances (< 100-150 bp). Dpn, Hairy, and E(spl) proteins repress transcription of target genes by recruiting Groucho through an interaction with the carboxy-terminal WRPW motif, although it remains to be determined whether this takes place over short or long distances (>1 kb). Thus, it is possible that scrt and dpn collaborate through different mechanisms to repress expression of target genes. In this case, the partial redundant function of these two genes would not be attributable to one gene substituting for the other as has been observed in the case of the structurally related genes comprising the AS-C, the E(spl) complex, and the myogenic family of bHLH-encoding genes (Roark, 1995).

Other known pan-neural genes may collaborate to establish neuronal fates by different mechanisms. For example, asense is likely to function as an activator of neuronal genes, pros turns off expression of pan-neural genes such as dpn in GMCs, scabrous (sca) encodes a secreted factor that inhibits neighboring non-neuronal cells from adopting neuronal fates, and cyclin A most likely functions to regulate neuron-specific cell cycle progression because maternal cyclin stores have largely disappeared by the time these late embryonic cell divisions take place. Thus, neuron-specific gene expression appears to be accomplished by a combination of negative transcription factors such as scrt and Dpn repressing non-neuronal gene expression, and positive factors such as Asense and the recently identified vertebrate bHLH protein NeuroD, which activate expression of neuron-specific genes. These neuron autonomous functions in combination with lateral inhibition of neighbors mediated by secreted factors such as Sca represent three of the most obvious mechanisms by which pan-neural genes might function to promote primary neuronal precursor fates. Expression of these primary precursor genes is then terminated by pros, which permits these cells to move on to the next developmental stage of neurogenesis. The diversity of pan-neural gene function and the regulation of pan-neural genes by different primary upstream regulators suggest that the neuronal tissue type identity is established by distinct parallel functions rather than by a single orchestrating master gene (Roark, 1995).

It is noteworthy that cells devoted to forming mesoderm and neuronal tissues express the highly related Scrt and Sna zinc finger proteins in tissue-specific patterns. A global role in establishing a common tissue identity also has been proposed for Esc, which participates in distinguishing imaginal diploid cells from other differentiated larval cells that become polyploid. A collaboration between bHLH proteins and zinc finger proteins may be an important parallel between formation of the mesoderm and the nervous system. In the mesoderm, Sna directly represses the expression of nonmesodermal target genes such as rho, which participates in specification of the neuroectoderm, and the bHLH protein encoded by twist (Twi) acts as an activator of mesoderm-specific genes. In the nervous system, bHLH genes of the AS-C (including the pan-neural asense gene) are required for activation of neuronal genes whereas scrt and dpn directly or indirectly repress expression of the epidermal Egf-r gene. Whereas single mutant phenotypes of asense, scrt, and dpn are subtle, the combined action of negative and positive transcription factors during neurogenesis may be analogous to that of Sna and Twi during myogenesis. It will be interesting to determine whether a protein in the Dpn, Hairy, and E(spl) repressor subclass of WRPW bHLH proteins also contributes to myogenesis, perhaps by collaborating with Sna to repress expression of neuroectodermal genes (Roark, 1995).


GENE STRUCTURE

cDNA clone length - 6.5 kb

Exons - two


PROTEIN STRUCTURE

Amino Acids - 644

Structural Domains

There are five zinc fingers near the carboxyl terminus (Roark, 1995).

Evolutionary Homologs

Scratch is most closely related to Snail, a gene essential for mesodermal determination, and Escargot, a gene that controls polyploidy during imaginal disc growth. Xenopus protein Snail and chick protein Slug are also part of this family (Roark, 1995).

The ces-1 and ces-2 genes of C. elegans control the programmed deaths of specific neurons. Genetic evidence suggests that ces-2 functions to kill these neurons by negatively regulating the protective activity of ces-1. ces-2 encodes a protein closely related to the vertebrate PAR family of bZIP transcription factors, and a ces-2/ces-1-like pathway may play a role in regulating programmed cell death in mammalian lymphocytes. ces-1 encodes a Snail family zinc finger protein, most similar in sequence to the Drosophila neuronal differentiation protein Scratch. An element important for ces-1 regulation is defined, and evidence is provided that CES-2 can bind to a site within this element and thus may directly repress ces-1 transcription. These results suggest that a transcriptional cascade controls the deaths of specific cells in C. elegans (Metzstein, 1999).

Members of the Snail family of zinc finger transcription factors are known to play critical roles in neurogenesis in invertebrates, but none of these factors has been linked to vertebrate neuronal differentiation. Expression of a mammalian Snail family member is restricted to the nervous system. Human and murine Scratch (Scrt) share 81% and 69% identity to Drosophila Scrt and the Caenorhabditis elegans neuronal antiapoptotic protein, CES-1, respectively, across the five zinc finger domain. Expression of mammalian Scrt is predominantly confined to the brain and spinal cord, appearing in newly differentiating, postmitotic neurons and persisting into postnatal life. Additional expression is seen in the retina and, significantly, in neuroendocrine (NE) cells of the lung. In a parallel fashion, hScrt expression is detected in lung cancers with NE features, especially small cell lung cancer. hScrt shares the capacity of other Snail family members to bind to E-box enhancer motifs, which are targets of basic helix-loop-helix (bHLH) transcription factors. hScrt directly antagonizes the function of heterodimers of the proneural bHLH protein achaete-scute homolog-1 and E12, leading to active transcriptional repression at E-box motifs. Thus, Scrt has the potential to function in newly differentiating, postmitotic neurons and in cancers with NE features by modulating the action of bHLH transcription factors critical for neuronal differentiation (Nakakura, 2001).

Like other Snail family members, hScrt is a nuclear protein that functions as a transcriptional repressor. Repressor activity resides within the N-terminal non-zinc finger region. However, the conserved N-terminal eight amino acids that hScrt shares with other SNAG domain containing proteins are not required for repressor function. This observation is in contrast to other reports ascribing important repressor function to the N-terminal twenty amino acids of the SNAG domain of vertebrate Snail, Slug, Smuc, and Gfi1 proteins. Though modest nuclear targeting activity has been described for this N-terminal region of Gfi1, effective nuclear localization depends on the full-length protein, including the zinc finger domain. The N-terminal non-zinc finger region of hScrt is not sufficient for proper expression in the nucleus; conversely, information within the zinc finger domain is necessary and sufficient to effect nuclear localization (Nakakura, 2001).

During vertebrate development, Mash1 and Neurogenin1 and -2 are transiently expressed in proliferating neurons of the nervous system and exhibit determination and differentiation functions. mScrt is expressed in an adjacent layer characteristic of newly differentiating neurons and hScrt can repress hASH1-E12-mediated reporter transactivation. Taken together, Scrt may modulate the effects of ASH1-E12 on common target genes, thereby potentially affecting neuronal determination and differentiation. Because Scrt expression is more widespread than Mash1, it is possible that Scrt may interact functionally with other bHLH transcription factors, such as the Neurogenins. Functional interactions occur between Escargot and Scute-Daughterless, as well as Smuc and MyoD-E12. Thus, interactions between Snail family and bHLH factors may be a common theme in development. Further studies of Scrt should provide insight into programs of neural differentiation that appear conserved in normal and neoplastic tissues (Nakakura, 2001).

Mammalian Scrt is the first vertebrate Snail family member known to be expressed highly and specifically in neural tissues. Outside the nervous system and lung NE cells, no significant levels of Scrt expression are detected. In flies, scrt is expressed in dividing neuronal precursors and persists in postmitotic neurons. In contrast, mScrt transcripts were not detected in regions known to contain proliferating neurons: the VZ of the telencephalon, spinal cord, and retina. Rather, prominent domains of mScrt expression were seen adjacent to the VZ in these tissues. Specifically, mScrt expression is detected in a distribution similar to BrdUrd-/Tuj1+ neurons in the telencephalon. Thus, mScrt expression in the CNS appears to be confined to newly differentiating, postmitotic neurons, suggesting a potential role in neuronal differentiation (Nakakura, 2002).

Because Snail family proteins share DNA binding specificity to the heptanucleotide sequence ACAGGTG, whether hScrt exhibits similar binding properties was investigated. hScrt protein was produced by IVT. In an electrophoretic mobility gel shift assay, hScrt bound to an oligonucleotide containing the wild-type Snail family consensus binding site. Specificity of binding was supported by absence of a shift when a mutant probe containing three base changes was used. Unlabeled wild-type oligonucleotide competes with labeled probe for binding in a dose-dependent manner. This competition is specific; unlabeled mutant oligonucleotide does not compete for binding, even at 2,000-fold molar excess to labeled probe (Nakakura, 2002).

Because the Snail consensus binding sequence contains the E-box motif CAGGTG recognized by bHLH transcription factors, the ability of hASH1 to bind to the same sequence was tested. When either hASH1 or E12 (a heterodimerizing partner for bHLH proteins) was used alone, no or only a modest shift of the wild-type Snail probe was seen, respectively. However, a mixture containing hASH1 and E12 produced a distinct shifted band of the wild-type probe. No shift was detected with the mutant probe. Competition for binding of the wild-type probe was observed with an excess of unlabeled wild-type, but not mutant, oligonucleotide in a dose-dependent manner. Therefore, hASH1-E12 heterodimers can bind specifically to similar sequences as hScrt (Nakakura, 2002).
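The overlap between the two recognition sequences described above can be checked with a few lines of code. The following is a minimal illustrative sketch, not part of the cited study; the names find_eboxes and EBOX_PATTERN are assumptions introduced here. It confirms that the seven-base Snail family site ACAGGTG contains the hexameric E-box CAGGTG, which in turn fits the general E-box pattern CANNTG.

```python
import re

SNAIL_SITE = "ACAGGTG"  # Snail family consensus site quoted in the text
EBOX_SITE = "CAGGTG"    # E-box motif quoted in the text

# Generic E-box, CANNTG. The lookahead lets overlapping matches be reported too.
EBOX_PATTERN = re.compile(r"(?=(CA[ACGT]{2}TG))")


def find_eboxes(seq):
    """Return (0-based start, hexamer) for each CANNTG match on the given strand."""
    return [(m.start(), m.group(1)) for m in EBOX_PATTERN.finditer(seq.upper())]


if __name__ == "__main__":
    # Is the E-box a substring of the Snail family site?
    print(EBOX_SITE in SNAIL_SITE)
    # Where does a CANNTG hexamer sit inside the Snail family site?
    print(find_eboxes(SNAIL_SITE))
```

Only the strand supplied is scanned; a fuller treatment would also consider the reverse complement.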

Because hScrt shares DNA binding specificity with hASH1-E12 and both hScrt and hASH1 are expressed in similar normal tissues and lung cancer cell lines, whether they could interact functionally was tested. Before testing the actions of various hScrt constructs in this context, their subcellular expression was detected by confocal microscopy. Wild-type hScrt is expressed in the nucleus. An hScrt mutant (Delta Zinc Fingers, amino acids 1-190) containing the N-terminal half but lacking all five zinc fingers was expressed diffusely throughout the nucleus and cytoplasm. However, a mutant containing only the zinc finger domain of hScrt (Delta N terminus, amino acids 173-348) localizes to the nucleus indistinguishably from the wild-type protein. Therefore, a critical nuclear localization signal is present within the zinc finger region of hScrt, and the N-terminal SNAG-like domain is dispensable for nuclear localization (Nakakura, 2002).

To determine whether hScrt exhibits active repressor activity, various hScrt domains were fused in-frame with the GAL4 DNA binding domain and targeted to a reporter containing five GAL4 binding sites upstream of the TK promoter. Expression of all constructs was verified by Western blot. GAL4-fusion constructs containing full-length hScrt (GAL4-hScratch), hScrt lacking all five zinc fingers (GAL4-Delta Zinc Fingers), and hScrt minus the first eight amino acids of the SNAG domain (GAL4-Delta 8) all repressed reporter activity as effectively as a construct containing the known repressor MeCP2. In contrast, when the first 40 amino acids of hScrt were expressed as a GAL4-fusion protein (GAL4-N terminus), no repression was seen. Therefore, hScrt repressor activity resides in the non-zinc finger region and is not dependent on the conserved N-terminal eight amino acids of the SNAG domain, which has been reported to be important for repressor activity in Snail family and other zinc finger transcription factors (Nakakura, 2002).


REGULATION

Promoter Structure

Analysis of the promoters of deadpan and scratch reveals separate regulatory regions conferring expression in the central and peripheral nervous systems. This separate regulation represents an enigma, since regulation by proneural genes of the achaete-scute complex could jointly regulate pan-neural expression. The number of scratch-expressing cells is reduced but not eliminated in proneural mutants. The most parsimonious explanation is that there are unknown regulators of scratch restricted to the CNS or PNS. Such putative regulators are likely to include a repressor of expression in postmitotic CNS neurons, which acts on the PNS elements to make them completely PNS specific. So in order to maintain orderly development in the fly, even repressors are regulated by repressors (Emery, 1995).

The scrt promoter has separable elements responsible for PNS and CNS expression. The proximal region is responsible for the full range of expression in the PNS, while an upstream region gives CNS expression during late waves of neuroblast delamination. Another fragment, between the proximal and distal regions, drives expression in both PNS and CNS. The proximal region also drives expression later in subsets of cells in the CNS and along the ventral midline (Emery, 1995).

Targets of Activity

Scratch functions to repress expression of various genes promoting non-neuronal cell fates. These targets include torpedo (the EGF receptor), orthodenticle, scabrous, hedgehog, and hairy (Roark, 1995).

Sequence conservation and combinatorial complexity of Drosophila neural precursor cell enhancers

The presence of highly conserved sequences within cis-regulatory regions can serve as a valuable starting point for elucidating the basis of enhancer function. This study focuses on regulation of gene expression during the early events of Drosophila neural development. EvoPrinter and cis-Decoder, a suite of interrelated phylogenetic footprinting and alignment programs, were used to characterize highly conserved sequences that are shared among co-regulating enhancers. Analysis of in vivo characterized enhancers that drive neural precursor gene expression has revealed that they contain clusters of highly conserved sequence blocks (CSBs) made up of shorter shared sequence elements which are present in different combinations and orientations within the different co-regulating enhancers; these elements contain either known consensus transcription factor binding sites or consist of novel sequences that have not been functionally characterized. The CSBs of co-regulated enhancers share a large number of sequence elements, suggesting that a diverse repertoire of transcription factors may interact in a highly combinatorial fashion to coordinately regulate gene expression. Information gained from the comparative analysis was used to discover an enhancer that directs expression of the nervy gene in neural precursor cells of the CNS and PNS. The combined use of EvoPrinter and cis-Decoder has yielded important insights into the combinatorial appearance of fundamental sequence elements required for neural enhancer function. Each of the 30 enhancers examined conformed to a pattern of highly conserved blocks of sequences containing shared constituent elements. These data establish a basis for further analysis and understanding of neural enhancer function (Brody, 2008).

To determine the extent to which neural precursor cell enhancers share highly conserved sequence elements, cis-Decoder analysis was performed of in vivo characterized enhancers. This analysis revealed the presence of both novel elements and sequences that contained consensus DNA-binding sites for known regulators of early neurogenesis. None of the illustrated conserved neural specific sequence elements within two or more neural precursor cell enhancers were present in a collection of 819 CSBs from in vivo characterized mesodermal enhancers, thus ensuring their enrichment in neural enhancers. Consensus binding sites for known TFs were represented: basic helix-loop-helix (bHLH) factors and Suppressor of Hairless [Su(H)], respectively acting in proneural and neurogenic pathways; Antennapedia class homeodomain proteins, identified by their core ATTA binding sequence; and the ubiquitously expressed Pbx- (Pre-B Cell Leukemia TF) class homeodomain protein Extradenticle, a cofactor of many TFs, identified by the core binding sequence of ATCA. More than half the conserved elements, termed cis-Decoder tags or cDTs, were novel, without identified interacting proteins. Many of the CSBs consisted of 8 or more bp, and often contained core sequences identical to binding sites for known factors as well as other core sequences that aligned with shorter novel cDTs, suggesting that the longer cDTs may contain core recognition sequences for two or more TFs (Brody, 2008).

Most cDTs discovered in this analysis represent elements that are shared pairwise, i.e., by only two of the NB enhancers examined (see the website for a list of cDTs that are shared by only two of the enhancers examined). The fact that the majority of cDTs are shared two ways, with only a small subset of sequences being shared three or more ways, suggests that the cis-regulation of early neural precursor genes is carried out by a large number of factors acting combinatorially and/or that many of the identified cDTs may in fact represent interlocking sites for multiple factors, and the exact orientation and spacing of these sites may differ among enhancers (Brody, 2008).

During Drosophila neurogenesis, bHLH proteins function as proneural TFs to initiate neurogenesis in both the central and peripheral nervous system. TFs encoded by the achaete-scute complex function in both systems, while the related Atonal bHLH protein functions exclusively in the PNS. Different proneural bHLH TFs, acting together with the ubiquitous dimerization partner Daughterless, bind to distinct E-boxes that contain different core sequences. In addition to the core recognition sequence, flanking bases are important to the DNA binding specificity of bHLH factors (Brody, 2008).

One of the principal observations of this study was that the core central two bases of the hexameric E-box DNA-binding site (CANNTG, where NN denotes the two core bases) were conserved in all the species used to generate the EvoPrint. All of the enhancers included in this study contained one or more conserved bHLH-binding sites, with NB and PNS enhancers averaging 3.9 and 4.1 binding sites respectively. More than a third of the core bases in NB bHLH sites contained a core GC sequence, and more than a third of the core bases in PNS bHLH sites contained either a core GC or a GG sequence. The most common E-box among the NB CSBs was CAGCTG with 14 sites in four of the six enhancers. The CAGCTG and CAGGTG E-boxes are high-affinity sites for Achaete/Scute bHLH proteins. However, the CAGCTG site itself is not specific to NB enhancers, as evidenced by its presence in four of the mesodermal enhancer CSBs. The most common bHLH-binding site among PNS enhancers was also the CAGCTG E-box with 11 occurrences in six of the 13 enhancers. In contrast, the most common bHLH motif in enhancers of the E(spl)-complex was CAAGTG, with 16 occurrences in 8 of the 11 enhancers. CAGGTG, previously shown to be an Atonal DNA-binding site, was also common in E(spl) enhancers, with 9 occurrences in 8 of the 13 enhancers, but was less prevalent among NB enhancers. The CAGGTG box was also overrepresented in PNS and E(spl) enhancers relative to its appearance in NB enhancers, and it was also present in four of the characterized mesodermal enhancer CSBs. The CAGATG box was present six times among PNS enhancers but not at all among NB enhancers. Thus there appears to be some specificity of E-boxes in the different enhancer types. The fact that each of these E-boxes is conserved in all the species in the analysis suggests that there is a high degree of specificity conferred by the E-box core sequence (Brody, 2008).
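The kind of tally described above, counting which two core bases occur in each CANNTG site, can be sketched in a few lines. This is an illustration only and not the EvoPrinter or cis-Decoder software; the function name ebox_core_counts and the sequence TOY_SEQUENCE are hypothetical and invented for the example, and the toy sequence is not taken from any real enhancer.

```python
import re
from collections import Counter

# Generic E-box, CANNTG. The lookahead lets overlapping sites be counted.
EBOX_PATTERN = re.compile(r"(?=(CA[ACGT]{2}TG))")


def ebox_core_counts(seq):
    """Count the central two bases (NN) of every CANNTG site on the given strand."""
    return Counter(m.group(1)[2:4] for m in EBOX_PATTERN.finditer(seq.upper()))


# Hypothetical toy sequence, made up for illustration (not real enhancer data).
TOY_SEQUENCE = "ttCAGCTGaaCAGGTGccCAAGTGtt"

if __name__ == "__main__":
    for core, n in sorted(ebox_core_counts(TOY_SEQUENCE).items()):
        print(core, n)
```

Because only one strand is scanned, a site such as CAGGTG is counted with core GG, whereas the same site read from the opposite strand (CACCTG) would be counted with core CC; whether the two should be pooled is a design choice left open here.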

The analysis also revealed that not only are the core bases of E-boxes shared between similarly regulated enhancers, but bases flanking the E-box are also highly conserved and frequently shared by these enhancers. Among the E-boxes found in CSBs of NB enhancers (many are illustrated in the accompanying Table), aaCAGCTG (core bases of E-box are bold, flanking bases lower case) is repeated three times in nerfin-1 and once in scrt; gCACTTG is repeated three times in scrt; CAGCTGCA is repeated twice in wor, and CAGCTGctg is repeated twice in scrt. In the dpn CNS NB enhancer, the E-box CAGCTG is found twice, separated by a single base (CAGCTGaCAGCTG). None of these sequences were present in mesodermal enhancers examined, but each is found in PNS enhancers; CAGCTGCA is repeated multiple times among PNS enhancers. Among the conserved PNS enhancer E-boxes (CAAATGca, gcCAAATG, cacCAAATGg, CACATGttg, gCACGTGtgc, ttgCACGTG, agCACGTGcc, aCAGATG, ggCAGATGt, CAGCTGccg, CAGCTGcaattt, gCAGGTGta and cCAGGTGa) each, including flanking bases, is found in two or three PNS enhancers, and these are distributed among all 13 enhancers. Of these, only agCACGTGcc, CAGCTGccg and cCAGGTGa were found once in the sample of neuroblast enhancers and none were found in the sample of mesodermal enhancers. The sequence aaCAAGTG is found in 4 E(spl) complex enhancers, those for E(spl)m8, mγ, HLHmδ and m6, and the sequence aCAGCTGc is found twice in E(spl)m8 and once in m4 and m6; neither sequence was found in the mesodermal enhancers. Therefore, although a given hexameric sequence may often be shared by all three types of enhancers, NB, PNS and E(spl), when flanking bases are taken into account there appears to be enhancer type-specific enrichment for different E-boxes (Brody, 2008).

Antennapedia class homeodomain proteins play essential roles in multiple aspects of neural development including cell proliferation and cell identity. The segmental identity of Drosophila NBs is conferred by input from TFs encoded by homeotic loci of the Antennapedia and bithorax complexes. For example, ectopic expression of abd-A, which specifies the NB6-4a lineage, down-regulates levels of the G1 cyclin, CycE. Loss of Polycomb group factors has been shown to lead to aberrant derepression of posterior Hox gene expression in postembryonic NBs, which causes NB death and termination of proliferation in the mutant clones (Brody, 2008).

This study examined the enhancer-type specificity of sequences flanking the Antennapedia class core DNA-binding sequence, ATTA. Nearly 25% of the NB and PNS CSBs examined in this study contain this core recognition sequence. ATTA-containing sites were found multiple times in selected NB and PNS enhancers. The cis-Decoder analysis identified 18 different neural-specific ATTA-containing cDTs that were exclusively shared by two or more PNS enhancers or CNS enhancers, and 10 were found to be shared between PNS and CNS. The most common cDT, ATTAgca, was shared by two CNS and two PNS enhancers (consensus homeodomain-binding sites are bold, flanking sequence lower case). In addition, 6 homeodomain-binding site cDTs were found twice in wor CSBs, aATTAccg, tttgaATTA, aatcaATTA, ATTAATctt and aaacaaATTAg, but not in other CNS or PNS enhancer CSBs. In some cases these cDTs were found repeated in given enhancer CSBs. Only one of these cDTs aligned with CSBs of enhancers of the E(spl) complex. Given that 2/3 of the occurrences of HOX sites in these promoters can be accounted for by cDTs whose flanking sequences are shared between enhancers, it is unlikely that the appearance of these shared sequences occurs by chance (Brody, 2008).

In summary, the appearance of Hox sites in the context of conserved sequences shared by functionally related enhancers suggests that the specificity of consensus homeodomain-binding sites is conferred by adjacent bases, either through recognition of adjacent bases by the TF itself or in conjunction with one or more co-factors (Brody, 2008).

Examination of the cDTs from Drosophila NB and PNS enhancers revealed that many contained the core Pbx/Extradenticle docking site ATGA. In Drosophila, Extradenticle has been shown to have Hox-dependent and independent functions. Studies have also shown that Pbx factors provide DNA-binding specificity for homeodomain TFs, facilitating specification of distinct structures along the body axis. In the CNS enhancers of Drosophila, most predicted Pbx/Extradenticle sites are not, however, found adjacent to Hox sites (Brody, 2008).

Cytoscape analysis of Pbx motifs revealed that 8 were shared between CNS and PNS enhancer types, and 16 were shared between similarly expressed enhancers, indicating that there is some degree of specificity to Pbx site function when flanking bases are taken into account. Three of the Pbx binding-site containing elements also exhibit ATTA Hox sites: 1) the dodecamer GATGATTAATCT (Pbx site is ATGA, Hox sites in bold), shared by the PNS enhancers edl and amos, contains a homeodomain ATTA site that overlaps the Pbx site by a single base, and 2) the smaller heptamer ATGATTA, shared by pfe and ato, likewise contains a homeodomain ATTA site (bold) that overlaps the ATGA Pbx site by a single base. Adjacent Hox and Pbx sites have been documented to facilitate synergy between the two factors. Taken together these findings suggest that, as with homeodomain-binding sites, the conserved bases flanking putative Pbx sites are functionally important. These flanking bases are likely to confer different DNA-binding affinities for Pbx factors or are required for binding of other TFs (Brody, 2008).
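The single-base overlap between the ATGA Pbx site and the ATTA Hox site in the two elements above can be checked with a small sketch. The helper names `find_all` and `site_overlaps` are hypothetical, and the code only compares positions on the strand as written; it is an illustration, not part of the published analysis.

```python
def find_all(seq, motif):
    """Return every start position of motif in seq, overlaps included."""
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions

def site_overlaps(seq, pbx="ATGA", hox="ATTA"):
    """List (pbx_start, hox_start, shared_bases) for sites sharing a base."""
    seq = seq.upper()
    result = []
    for i in find_all(seq, pbx):
        for j in find_all(seq, hox):
            shared = min(i + len(pbx), j + len(hox)) - max(i, j)
            if shared > 0:
                result.append((i, j, shared))
    return result

print(site_overlaps("GATGATTAATCT"))  # dodecamer shared by edl and amos
print(site_overlaps("ATGATTA"))       # heptamer shared by pfe and ato
```

Positions are 0-based. Both calls report one Pbx/Hox pair with a single shared base, matching the description in the text.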

Also indicating a degree of biological specificity of enhancer types is the distribution of Suppressor of Hairless [Su(H)] binding sites among neural enhancers. Su(H) is the Notch pathway effector TF of Drosophila. The members of the E(spl) complex, both the multiple basic helix-loop-helix (bHLH) repressor genes and the Bearded family members, have been shown to be Su(H) targets. The consensus in vitro DNA binding site for Su(H) is RTGRGAR (where R = A or G). Notch signaling via Su(H) occurs through conserved single or paired sites, and the presence of conserved sites for other transcription regulators associated with CSBs containing Su(H) binding sites has been documented (Brody, 2008).

Within the CSBs of the six NB enhancers examined, only two, dpn and wor, contained conserved putative Su(H)-binding sites; two dpn sites matched one of the Su(H) consensus sites (GTGGGAA) and two wor sites matched the sequence ATGGGAA. Only one of the two dpn sites contained flanking bases conforming to the widely distributed CGTGGGAA site of E(spl) Su(H) binding sites, and none of the NB enhancers contained paired Su(H) sites typical of the E(spl) enhancers. Of the 13 PNS cis-regulatory regions examined, only four enhancers contained putative Su(H)-binding sites [sna and ato (ATGGGAA), brd (GTGGGAG) and dpn (GTGGGAA)]. dpn also contained a pair of sites that conforms to the SPS configuration frequently found in Su(H) enhancers (CSB sequence: AATGTGAGAAAAAAACTTTCTCACGATCACCTT, Su(H) sites in bold, Pbx site is ATCA). The lack of Su(H) sites in PNS enhancers has been noted in a previous study, and it was suggested that these enhancers are directly regulated by the proneural proteins but not activated in response to Notch-mediated lateral inhibitory signaling. Among the conserved sequences of E(spl) gene enhancers there is an average of 3.4 consensus Su(H) binding sites per enhancer, with most enhancers containing both types of sites, i.e., those with either A or G in the central position (Brody, 2008).
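As an illustration of how the RTGRGAR consensus (R = A or G) maps onto the dpn CSB quoted above, the following sketch expands the consensus into a regular expression and scans both strands. The function names and the small IUPAC table are assumptions made for this example; this is not the search used in the study.

```python
import re

# Minimal IUPAC table: only the codes needed for RTGRGAR and its reverse complement.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]", "Y": "[CT]"}
COMPLEMENT = str.maketrans("ACGTRY", "TGCAYR")

def revcomp(consensus):
    """Reverse complement of a consensus written with A, C, G, T, R, Y."""
    return consensus.translate(COMPLEMENT)[::-1]

def consensus_regex(consensus):
    """Compile a consensus into a regex that also reports overlapping hits."""
    body = "".join(IUPAC[base] for base in consensus)
    return re.compile("(?=(" + body + "))")

def find_sites(seq, consensus="RTGRGAR"):
    """Return sorted (start, strand, matched_text) for both orientations."""
    seq = seq.upper()
    hits = []
    for strand, motif in (("+", consensus), ("-", revcomp(consensus))):
        for match in consensus_regex(motif).finditer(seq):
            hits.append((match.start(), strand, match.group(1)))
    return sorted(hits)

dpn_csb = "AATGTGAGAAAAAAACTTTCTCACGATCACCTT"  # CSB sequence quoted in the text
print(find_sites(dpn_csb))
```

The scan reports GTGAGAA on the written strand at 0-based position 3 and TTCTCAC (the reverse complement of GTGAGAA) at position 17, i.e., two sites in opposite orientation with a central A, consistent with the paired configuration described above.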

This study offers three insights with respect to Su(H) binding sites. First, although in vitro DNA-binding studies suggest there is flexibility in the Su(H) binding site, like the bHLH E-box, comparative analysis shows that within any one of the Su(H) sites there is no sequence flexibility. Except for the pair of Su(H) sites in the dpn PNS enhancer, none of the CNS or PNS sites contained a central A; fewer than a quarter of the E(spl) sites contained a central A, and all these were conserved across all species examined. In light of the high conservation in these regions, the invariant core and flanking sequences are important for the unique Su(H) function at any particular site (Brody, 2008).

A second finding was the extensive conservation of bases flanking the consensus Su(H) sequence in the E(spl) complex genes. For example, the cDT GTGGGAAACACACGAC [Su(H) site bold] was present in HLHm3 and HLHm5 enhancer CSBs, and ACCGTGGGAAAC was conserved in HLHm3 and HLHmβ enhancers. The conservation of bases flanking the consensus Su(H) binding site suggests that the Su(H) site may be flanked by additional binding sites for co-operative or competitive factors, or else, that Su(H) contacts additional bases besides the consensus heptamer (Brody, 2008).

A third observation is that in most cases Su(H) binding sites are embedded in larger CSBs, suggesting that CSB function is regulated by the integrated function of multiple TFs. For example, the dpn NB enhancer Su(H) site is embedded in a CSB of 24 bases, and the atonal PNS enhancer Su(H) site is embedded in a CSB of 45 bases. In the E(spl) complex, CSB #6 of HLHmγ, consisting of 30 bases, and CSB #13 of m8, consisting of 31 bases, each contain a GTGGGAA Su(H) site, a CACGAG element conforming to the Hairy N-box consensus CACNAG, and an AGGA Tramtrack (Ttk) DNA-binding core recognition sequence, but the order and context of these three sites is different for each enhancer. Although Su(H) binding sites were present in only a minority of NB and PNS enhancers, the conservation of core bases, as well as the complexity of their flanking conserved sequences, points to a diversity of Su(H) function and interaction with other factors (Brody, 2008).

Neural specific cDTs contain core DNA-binding sites for other known TFs. Two of these elements, one exclusively present in NB enhancers (CAGGATA) and a second exclusively present in PNS enhancers (GTAGGA), contained consensus core AGGA DNA-binding sites for Ttk, a BTB domain TF that has been shown to regulate pair rule genes during segmentation and to repress neural cell fates. Another site (CACCCCA), shared by both NB and PNS enhancers, conforms to the consensus binding site of IA-1 (ACCCCA), the vertebrate homolog of nerfin-1. Most of the neural specific sequence elements illustrated in the paper do not contain sequences corresponding to consensus binding-sites of known regulators of NB expression. The fact that they are represented multiple times in NB CSB sequences suggests that they contain binding sites for unknown regulators of neurogenesis in Drosophila (Brody, 2008).

Neural enriched cDTs that are shared between multiple NB enhancers and also exhibit a low frequency in the sample of mesodermal enhancers examined in this study serve as a resource for understanding enhancer elements that may not have an exclusive neural function [see cis-Decoder tags with multiple hits on two or more NB enhancers]. Notable here is the presence of CAGCTG bHLH DNA binding sites (all with flanking A, CC and TC) and the Antennapedia class homeobox (Hox) core DNA binding site ATTA, as well as additional Ttk and Pbx/Extradenticle sites. Present in this list are portions of sequences conforming to Su(H) binding sites. Of particular interest are sequences that are also enriched in the PNS; these sites may bind factors that play similar developmental roles in different tissues. For example, the presumptive Ttk site, AAAGGA (core sequence in bold), is highly enriched in segmental enhancers. Thus, some of these sites can be identified as targets of known TFs, but the identity of most is as yet unknown. These elements shared by multiple enhancers may be useful in identifying other enhancers driving expression in NBs (Brody, 2008).

EvoPrint analysis revealed that all of the enhancer regions examined in this study contained multiple CSBs that were greater than 15 to 20 bases in length. The occurrence of overlapping DNA-binding sites for different TFs is currently the best explanation for the maintenance of intact CSB sequences across ~160 million years of collective species divergence. This analysis has revealed that the sequence context, order and orientation of shared cDTs can differ between co-regulating enhancers (Brody, 2008).

Two examples are given here of the complex contextual appearance of cDTs. Each of the eight illustrated CSBs shown was nearly fully 'covered' by cDTs of the NB library, suggesting that each contains multiple overlapping binding sites for a number of TFs. In these two examples, there are no consistent spatial constraints on the association of known TF-binding sites (i.e., bHLH-binding E-box sites) with novel cDTs; the picture that emerges is one of combinatorial complexity, in which known or novel cDTs are associated with each other in different contexts on different CSBs (Brody, 2008).

The information derived from cis-Decoder analysis of neural precursor cell enhancers was used to search for other genomic sequences with similar cis-regulatory properties. Having identified cDTs found multiple times among NB enhancers, the genomic search tool FlyEnhancer was used to identify Drosophila melanogaster genomic sequences that contained clusters of the following cDTs (number in parenthesis is the total number of each cDT in the sample of six NB enhancers): GGCACG (6), GGAATC (4), TGACAG (6), TGGGGT (4), CAGCTG (14), TGATTT (9), CAAGTG (7), CATATTT (5), TGATCC (7) and CTAAGC (6). As a lower limit, a minimum of three CAGCTG bHLH sites was set for this search, because of the prevalence of this site in nerfin-1 and deadpan NB enhancers. Each sequence detected by this search was subjected to EvoPrinter analysis to determine the extent of its sequence conservation. Among the cDT clusters identified, the search identified a 5' region adjacent to the nervy gene that contained three conserved CAGCTG sites as well as five other sites identical to TGACAG, GGAATC, TGGGGT, GGCACG and CATATTT. nervy, originally identified as a target of homeotic gene regulation, is expressed in a subset of early CNS NBs, as well as in PNS SOP cells. Later studies have implicated nervy, along with cyclic adenosine monophosphate (cAMP)-dependent protein kinase (PKA), in antagonizing Sema-1a-PlexA-mediated axonal repulsion, and nervy has been shown to promote mechanosensory organ development by enhancing Notch signaling (Brody, 2008).
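The cluster criterion described above (at least three CAGCTG sites together with other NB cDTs) can be sketched as a sliding-window scan. This is not FlyEnhancer's algorithm or interface; the function names, the window and step sizes, and the `min_other_tags` threshold are all assumptions chosen for illustration. Only the cDT list and the minimum of three CAGCTG sites come from the text.

```python
# cDTs listed in the text for the NB enhancer cluster search.
CDT_TAGS = ["GGCACG", "GGAATC", "TGACAG", "TGGGGT", "CAGCTG",
            "TGATTT", "CAAGTG", "CATATTT", "TGATCC", "CTAAGC"]

_COMP = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    return seq.translate(_COMP)[::-1]

def count_sites(seq, motif):
    """Count occurrences on both strands; a palindromic motif is counted once."""
    total = 0
    for variant in {motif, revcomp(motif)}:
        start = seq.find(variant)
        while start != -1:
            total += 1
            start = seq.find(variant, start + 1)
    return total

def scan_clusters(seq, window=500, step=50, min_ebox=3, min_other_tags=3):
    """Return (window_start, n_CAGCTG, other_tags_present) for passing windows.

    window, step and min_other_tags are hypothetical parameters.
    """
    seq = seq.upper()
    hits = []
    last_start = max(len(seq) - window, 0)
    for start in range(0, last_start + 1, step):
        chunk = seq[start:start + window]
        n_ebox = count_sites(chunk, "CAGCTG")
        others = [tag for tag in CDT_TAGS
                  if tag != "CAGCTG" and count_sites(chunk, tag) > 0]
        if n_ebox >= min_ebox and len(others) >= min_other_tags:
            hits.append((start, n_ebox, others))
    return hits

# Toy sequence (an assumption): three CAGCTG sites plus three other cDTs.
toy = "CAGCTGAAAACAGCTGTTTTCAGCTGAAGGCACGAATGACAGAAGGAATC"
print(scan_clusters(toy, window=100))
```

In practice, candidate windows found this way would still need a conservation check such as the EvoPrinter step described in the text, since tag clustering alone says nothing about evolutionary constraint.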

EvoPrinter analysis revealed that the cluster of neural precursor cell enhancer cDTs positioned 90 bp upstream from the nervy transcribed sequence contains highly conserved sequences. This region contains 10 CSBs that include six conserved E-boxes, three of which conform to the CAGCTG sequence that was prominent in nerfin-1 and deadpan promoters. To determine if this region functions as a neural precursor cell enhancer, transformant lines were generated containing the nervy CSB cluster linked to a minimal promoter/GFP reporter transgene. Analysis of the reporter expression driven by the nervy upstream fragment revealed a pattern indistinguishable from early nervy mRNA expression. Specifically, expression was detected in a large subset of early delaminating NBs and in SOPs and secondary precursor cells of the PNS. Significantly, the nervy enhancer, unlike nerfin-1 and deadpan NB enhancers, activates reporter expression in the PNS and not just in early NBs (Brody, 2008).

The major finding of this study is that enhancers of co-regulated genes in neural precursor cells possess complex combinatorial arrangements of highly conserved cDT elements. Comparisons between NB and PNS enhancers identified CNS and PNS type-specific cDTs and cDTs that were enriched in one or another enhancer type. cis-Decoder analysis also revealed that many of the conserved sequences contain DNA-binding sites for classical regulators of neurogenesis, including bHLH, Hox, Pbx, and Su(H) factors. Although in vitro DNA-binding studies have shown that many of these factors have a certain degree of flexibility in the sequences to which they bind, defined in terms of a position weight matrix, the studies described in this paper show that for any given appearance these sites are actually highly conserved across all species of the Drosophila genus. The genus invariant conservation in many of these characterized binding sites indicates that there are distinct constraints to that sequence in terms of its function (Brody, 2008).

The high degree of conservation displayed in the enhancer CSBs could derive from unique sequence requirements of individual TFs, or the intertwined nature of multiple DNA-binding sites for different TFs. Thus there is a higher degree of biological specificity to these sites than the flexibility that is detected using in vitro DNA-binding studies. As an example, the requirement for a specific core for the bHLH binding site, i.e., for a CAGCTG E-box for nerfin-1, deadpan and nervy, suggests that it is the TF itself that demands sequence conservation; however, the requirement for conserved flanking sequences suggests that additional specific factors may be involved. Although the inter-species conservation of core and flanking sites has been noted by others, the extent of this conservation is rather surprising. To what extent and how evolutionary changes in enhancer function take place, given the conservation of core enhancer sequences, remains a question for future investigation (Brody, 2008).

In addition to classic regulators of neurogenesis, cis-Decoder reveals additional conserved novel elements that are widely distributed or only detected in pairs of enhancers. Many of these novel elements flank known transcription binding motifs in one CSB, but appear independent of known motifs in another. The appearance of novel elements in multiple contexts suggests that they may represent DNA-binding sites for additional factors that are essential for enhancer function. Only through discovery of the factors binding these sequences will it become clear what role they play in enhancer function (Brody, 2008).

Preliminary functional analysis of CSBs within the nerfin-1 neuroblast enhancer reveals that CSBs carry out different regulatory roles. Altering cDT sequences within the nerfin-1 CSBs reveals that most are required for cell-specific activation or repression or for normal enhancer expression levels. CSB swapping studies reveal that, for the most part, the order and arrangement of a number of tested CSBs was not important for enhancer function in reporter studies. The discovery of the nervy neural enhancer by searching the genome with commonly occurring NB cDTs underscores the potential use of EvoPrinter and cis-Decoder analysis for the identification of additional neural enhancers. By starting with known enhancers and building cDT libraries from their CSBs, one now has the ability to search for other genes expressed during any biological event (Brody, 2008).


DEVELOPMENTAL BIOLOGY

Embryonic

scratch transcripts are first detected in a single row of ectodermal cells flanking the ventral midline. During early germ band extension, scratch-expressing cells delaminate and contribute to the first of three rows of S1 neuroblasts in the ventral CNS. scratch is subsequently expressed in all S1 neuroblasts, as well as neuroblasts formed during later rounds of segregation. scratch continues to be expressed in ganglion mother cells (GMCs), the immediate progeny of neuroblasts, and later in postmitotic neurons.

Unlike scratch, deadpan is not expressed in GMCs. In the peripheral nervous system scratch is first expressed in primary sensory mother cells, then in secondary precursor cells, and finally in postmitotic neurons. scratch is expressed prior to deadpan in the first row of neuroblasts, but subsequently deadpan expression precedes scratch in the CNS and sensory mother cells. Likewise snail and scratch expression partially overlap. Ectopic expression of scratch generates extra neurons (Roark, 1995).

Wild-type zygotic scrt expression first appears in the S1 neuroblasts as they delaminate. As in the case of dpn, scrt expression extends to each subsequent wave of neuroblast delamination. Unlike dpn, however, scrt continues to be expressed in GMCs. It is likely that expression in GMCs derived from divisions of the S1 neuroblasts obscures the rosette pattern of neuroblast expression observed with dpn. By stage 12, most or all cells in the CNS express scrt, with strongest expression in a group of 5-7 cells per abdominal hemisegment arranged in a crescent. A small subset of cells along the ventral midline also expresses scrt at this time. Expression in the PNS begins with delamination of the earliest precursors and continues as subsequent precursors segregate from the epithelium. Postmitotic cells in both the CNS and PNS express scrt prior to their morphological differentiation. In the PNS, where all postmitotic neurons can be identified, it is possible to determine that scrt is expressed in every neuron. In the CNS, there are still some GMCs present after germband retraction, and it is possible that scrt is expressed in these cells and/or in glia as well as in postmitotic neurons (Emery, 1995).

Larval

scratch is expressed in neuronal cells of the wing disc, in developing neurons in pupal wings, in cells posterior to the morphogenetic furrow in the eye-antennal disc, in leg neuronal precursors, and in many cells of the third-instar larval brain and ventral nerve cord (Roark, 1995).

Effects of mutation or deletion

scratch null mutants have scarred facets in the eye, a phenotype that gives the gene its name. It is also a male sterile mutation, producing defective spermatids (Roark, 1995).

scratch interacts genetically with deadpan. These two genes have similar pan-neural expression patterns but encode unrelated proteins. Loss of function of either of these genes alone does not lead to obvious morphological disruption of the embryonic nervous system. Under optimal conditions, animals homozygous null for each of these genes occasionally complete development and eclose. In contrast, animals null for both genes never hatch and frequently exhibit a dramatic reduction of the nervous system. In addition, axon projections are frequently disorganized. Double mutants have missing and disorganized longitudinal and commissural axon tracts. CNS defects are evident early during neurogenesis as there are fewer hunchback expressing cells contributing to the S1 wave of neuroblasts than in wild type embryos (Roark, 1995).


REFERENCES

Brody, T., Rasband, W., Baler, K., Kuzin, A., Kundu, M. and Odenwald, W. F. (2008). Sequence conservation and combinatorial complexity of Drosophila neural precursor cell enhancers. BMC Genomics 9: 371. PubMed ID: 18673565

Emery, J. F. and Bier, E. (1995). Specificity of CNS and PNS regulatory subelements comprising pan-neural enhancers of the deadpan and scratch genes is achieved by repression. Development 121: 3549-3560. PubMed ID: 8582269

Metzstein, M. M. and Horvitz, H. R. (1999). The C. elegans cell death specification gene ces-1 encodes a snail family zinc finger protein. Mol. Cell 4(3): 309-319. PubMed ID: 10518212

Nakakura, E. K., et al. (2001). Mammalian Scratch: A neural-specific Snail family transcriptional repressor. Proc. Natl. Acad. Sci. 98: 4010-4015. PubMed ID: 11274425

Nakakura, E. K., et al. (2002). Mammalian Scratch participates in neuronal differentiation in P19 embryonal carcinoma cells. Brain Res. Mol. Brain Res. 95(1-2): 162-166. PubMed ID: 11687288

Roark, M., Sturtevant, M. A., Emery, J., Vaessin, H., Grell, E. and Bier, E. (1995). scratch, a panneural gene encoding a zinc finger protein related to snail, promotes neuronal development. Genes Dev 9: 2384-2398. PubMed ID: 7557390

date revised:  15 July 2015
Home page: The Interactive Fly © 1995, 1996 Thomas B. Brody, Ph.D.

The Interactive Fly resides on the
Society for Developmental Biology's Web server.