nautilus
There is no nau expression in dorsal or twist mutants (Michelson, 1990).
When normal cell-cell interactions between mesoderm and ectoderm are inhibited, most
cells in gastrulation-arrested embryos do not differentiate, but they do express latent germ layer-specific
genes appropriate for their position. Mesoderm cells require proximity to ectoderm to express
several muscle-specific genes. Ventral ectoderm induces mesoderm cells to express
nautilus and to differentiate somatic myofibers, whereas dorsal ectoderm
induces mesoderm cells to express visceral and cardiac muscle-specific genes (Baker, 1995).
Ectopic expression of muscle segment homeobox in the mesoderm results in altered
expression of the S59/NK1 and nau genes, leading to a loss of some muscles and defects in the
patterning of others, suggesting that the muscle defects are at the level of recruitment and/or
patterning of muscle precursor cells (Lord, 1995).
The neurogenic genes also control mesoderm development.
Embryonic cells that express nautilus are overproduced in each of seven
neurogenic mutants (Notch, Delta, Enhancer of split, big brain, mastermind, neuralized, and
almondex), at the apparent expense of neighboring, nonexpressing mesodermal cells. The
mesodermal defect does not appear to be a simple consequence of associated neural hypertrophy,
suggesting that the neurogenic genes may function similarly and independently in establishing cell
fates in both ectoderm and mesoderm. Altered patterns of beta 3-tubulin and myosin heavy chain
gene expression in the mutants indicate a role for the neurogenic genes in development of most
visceral and somatic muscles (Corbin, 1991).
Correct diversification of cell types during development ensures the formation of functional organs. The evolutionarily conserved homeobox genes from ladybird/Lbx family were found to act as cell identity genes in a number of embryonic tissues. A prior genetic analysis showed that during Drosophila muscle and heart development ladybird is required for the specification of a subset of muscular and cardiac precursors. To learn how ladybird genes exert their cell identity functions, muscle and heart-targeted genome-wide transcriptional profiling and a chromatin immunoprecipitation (ChIP)-on-chip search were performed for direct Ladybird targets. The data reveal that ladybird not only contributes to the combinatorial code of transcription factors specifying the identity of muscle and cardiac precursors, but also regulates a large number of genes involved in setting cell shape, adhesion, and motility. Among direct ladybird targets, bric-a-brac 2 gene was identified as a new component of identity code and inflated encoding αPS2-integrin playing a pivotal role in cell-cell interactions. Unexpectedly, ladybird also contributes to the regulation of terminal differentiation genes encoding structural muscle proteins or contributing to muscle contractility. Thus, the identity gene-governed diversification of cell types is a multistep process involving the transcriptional control of genes determining both morphological and functional properties of cells (Junion, 2007).
Uncovering how the cell fate-specifying genes exert their functions and determine unique properties of cells in a tissue is central to understanding the basic rules governing normal and pathological development. To approach the cell fate determination process at a whole genome level a search was performed for transcriptional targets of the homeobox transcription factor Lb known to be evolutionarily conserved and required for specification of a subset of cardiac and muscular precursors. To this end the targeted expression profiling and the novel ChIP-on-chip method ChEST were combined. The data revealed an unexpectedly complex gene network operating downstream from lb, which appears to act not only by regulating components of the cell identity code but also as a modulator of pan-muscular gene expression at fiber-type level. Of note, the role of Drosophila lb in regulating segment border muscle (SBM) founder motility appears reminiscent of the role of its vertebrate ortholog Lbx1, known to control the migration of leg myoblasts (Vasyutina, 2005) in mouse embryos (Junion, 2007).
Earlier genetic studies revealed that within the same competence domain the cell fate specifying factors acted as repressors to down-regulate genes determining the identity of neighboring cells. Consistent with this finding, lb was found to repress msh and kruppel (kr) during diversification of lateral muscle precursors and even skipped (eve) within the heart primordium. This study found that additional identity code components are regulated negatively by lb. In the lateral muscle domain lb acts as a repressor of the MyoD ortholog nau and the NK homeobox gene slou, both known to be required for the specification of a subset of somatic muscles. This suggests that a particularly complex network of transcription factors (Ap, Msh, Kr, Nau, Slou) controls the specification of individual muscle fates in the lateral domain. Interestingly, none of these factors is coexpressed with lb in the SBM, which appears to be a functionally distinct muscle requiring a specific developmental program. Besides factors with well-documented roles in diversification of muscle fibers, the global approach identified a few novel potential players in the muscle identity network. Among those that are expressed in somatic muscle precursors are the Pdp1 gene encoding Par domain factor and the CG32611 gene containing a zinc finger motif (Junion, 2007).
Interestingly, in the cardiac domain the data demonstrate that lb is able to positively regulate the expression of tin and of pointed (pnt), an effector of the RTK pathway, both of which are involved in cardiac cell fate specification. These findings are consistent with earlier observations that forced lb expression leads to ectopic tin-positive cells within the dorsal vessel. Also, during early cardiogenesis lb directly represses bric a brac 2 (bab2), which emerges as a novel component of the genetic cascade controlling the diversification of cardiac cells. Thus, the ability of Lb to act either as repressor or as activator suggests a context-dependent interaction with cofactors. Of note, several microarray-identified Lb targets have also been found in the RNAi-based screen for genes involved in heart morphogenesis (Junion, 2007).
The data indicate that lb exerts its muscle identity functions via regulation of pan-muscular genes that control cell movements, cell shapes and cell-cell interactions including myoblast fusion, myotube growth, and attachment events (Junion, 2007).
Misexpression of nautilus induces myogenesis in cardioblasts and alters the pattern of somatic muscle fibers. Ectopic expression of nautilus results in lethality throughout fly development. Antibody staining with anti-myosin heavy chain reveals abnormalities that include an absence of cardial cells, coincident with the appearance of novel muscle fibers adjacent to the dorsal vessel. Moreover, many cardioblasts express increased levels of muscle-specific genes such as myosin, actin 57B and Mlp60A (a protein that is restricted to the somatic, visceral and pharyngeal muscles). It appears that the missing cardial cells have been transformed into cells with properties similar to those of the somatic muscles. In addition, ubiquitous expression of nautilus in the somatic muscle cells of these embryos results in muscle pattern defects. Specifically, muscles that do not normally express nautilus are frequently absent, and novel fibers are observed in positions reminiscent of nautilus-expressing muscles. Thus nautilus can alter the developmental program of muscle precursors (Keller, 1997).
Specification of muscle identity in Drosophila is a multistep process: early positional information defines competence groups termed promuscular clusters, from which muscle progenitors are selected, followed by asymmetric division of progenitors into muscle founder cells (FCs). Each FC seeds the formation of an individual muscle with morphological and functional properties that have been proposed to reflect the combination of transcription factors expressed by its founder. However, it is still unclear how early patterning and muscle-specific differentiation are linked. This question was addressed using Collier (Col; also known as Knot) expression as both a determinant and read-out of DA3 muscle identity. Characterization of the col upstream region driving DA3 muscle specific expression revealed the existence of three separate phases of cis-regulation, correlating with conserved binding sites for different mesodermal transcription factors. Examination of col transcription in col and nautilus (nau) loss-of-function and gain-of-function conditions showed that both factors are required for col activation in the 'naive' myoblasts that fuse with the DA3 FC, thereby ensuring that all DA3 myofibre nuclei express the same identity programme. Together, these results indicate that separate sets of cis-regulatory elements control the expression of identity factors in muscle progenitors and myofibre nuclei and directly support the concept of combinatorial control of muscle identity (Dubois, 2007).
col belongs to the class of Drosophila regulatory genes
with numerous introns, large amounts of flanking sequence and multiple
expression sites. During embryogenesis, col is expressed in the
MD2/PS0 head region, the somatic DA3 muscle, precursor cells of the lymph
gland, a small set of multidendritic (md) neurons of the peripheral nervous
system and specific neurons of the central nervous system (CNS). A lacZ reporter transgene (P{5col::lacZ}, abbreviated P5cl) contains 5 kb of
col upstream DNA, which faithfully reproduced col
transcription both in the MD2/PS0 and the DA3 muscle, starting at the
progenitor stage and not in promuscular cluster(s). To identify the missing cis-regulatory information, a longer construct was tested containing the entire 9 kb region separating col from CG10200, the next predicted upstream gene. In
addition to the head and DA3 muscle, P9cl expression reproduced
col expression in md neurons and a subset of neurons in the CNS. A
DNA fragment located further upstream, between CG10200 and the next predicted
gene CG10202, was independently shown to drive col expression in the
anteroposterior organiser of the wing imaginal disc
(Hersh, 2005). However, neither this construct nor P9cl reproduced Col expression in
promuscular clusters. The col transcription unit is immediately flanked at its 3' end by another gene, BEAF32, which makes the presence of cis-regulatory
elements within this region rather unlikely. However, the col transcription unit itself contains ten introns, with a total length of around 30 kb, the cis-regulatory content of which remains to be assessed (Dubois, 2007).
To delineate more precisely the CRM driving col expression in the
DA3 muscle, a series of constructs was tested containing 2.6, 2.3, 1.6 and 0.9
kb of DNA upstream of the col transcription start site, respectively. P2.6cl retained the information necessary for col expression in MD2/PS0 and
the DA3 progenitor and muscle, although it was noted that P2.6cl expression in muscle progenitors was less robust than P9cl. P2.3cl was also activated in
MD2/PS0 at stage 6 and the DA3 muscle. However, unlike P9cl or
P2.6cl, P2.3cl was not activated in the DA3/DO5 progenitor but only
at the FC stage (ectopic lacZ expression was also observed in clusters of neuroectodermal cells at embryonic stage 11). This difference indicated that cis-regulatory elements required for col expression in the DA3/DO5 progenitor reside
between positions -2.6 and -2.3 and act separately from those required for
expression in the DA3 FC and muscle. P1.6cl was active only in
MD2/PS0, whereas no expression at all could be detected with P0.9cl. Together, expression data from this series of reporter constructs allowed the mapping of the CRM required for col-specific expression in the DA3/DO5 muscle progenitor and DA3 FC/myofibre to a DNA fragment between positions -2.6 and -1.6 upstream of the col
transcription start (Dubois, 2007).
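The mapping logic of this 5' deletion series can be sketched as simple interval reasoning. The Python fragment below is only an illustration and is not part of the cited study: the function name `map_required_interval` and the data layout are hypothetical, the expression calls are transcribed from the paragraph above, and the ectopic neuroectodermal expression of P2.3cl is ignored. It assumes that expression is lost monotonically as upstream DNA is removed.

```python
# Upstream extent (kb) of each reporter construct and the domains in which
# lacZ expression was reported in the text (ectopic expression is ignored).
CONSTRUCTS = {
    "P2.6cl": (2.6, {"MD2/PS0", "DA3/DO5 progenitor", "DA3 FC/muscle"}),
    "P2.3cl": (2.3, {"MD2/PS0", "DA3 FC/muscle"}),
    "P1.6cl": (1.6, {"MD2/PS0"}),
    "P0.9cl": (0.9, set()),
}


def map_required_interval(constructs, domain):
    """Return (distal_kb, proximal_kb): the upstream interval whose removal
    abolishes reporter expression in `domain`, or None if no construct
    is expressed there."""
    expressing = [kb for kb, domains in constructs.values() if domain in domains]
    if not expressing:
        return None
    # Shortest construct that still drives expression in this domain.
    shortest_active = min(expressing)
    # Longest construct, shorter than that one, that is silent in this domain.
    shorter_silent = [
        kb
        for kb, domains in constructs.values()
        if domain not in domains and kb < shortest_active
    ]
    proximal = max(shorter_silent) if shorter_silent else 0.0
    return (shortest_active, proximal)


for domain in ("DA3/DO5 progenitor", "DA3 FC/muscle", "MD2/PS0"):
    print(domain, map_required_interval(CONSTRUCTS, domain))
```

Under these assumptions the progenitor element falls between -2.6 and -2.3 and the FC/muscle element between -2.3 and -1.6, which together span the -2.6 to -1.6 fragment described above.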
Advantage was taken of the recently available genome sequences of several
Drosophila species to search for conserved motifs in the col
upstream DNA, an approach that has often proven effective for identifying functionally
important cis-regulatory elements. Among these species, D. virilis (D. vir) is the most distant from D. melanogaster (D. mel). It was first verified that Col expression in D. vir was similar to that in D. mel embryos, from which it could be inferred that the regulatory information controlling col
transcription in the DA3 muscle lineage has been conserved. Sequence
comparison of 9 kb of the col upstream region between D. mel, D.
vir and four other Drosophila species, D. yakuba, D.
ananassae, D. pseudoobscura and D. mojavensis revealed numerous
stretches of high sequence conservation, of sizes up to 100 bp. Ten conserved motifs of size >20 bp, numbered 1 to 10 from 5' to 3', were found in the same order and at the
same relative position between position -2.6 and the start of transcription in
all six Drosophila species. To test the
relevance of this conservation, lacZ reporter constructs were created
containing either D. vir or D. mel DNA (Dubois, 2007).
P3.4clvir corresponds to D. mel P2.6cl, whereas
P3.4-1.3clvir and P2.6-0.9cl are truncated
versions covering motifs 1 to 10. All four reporter genes showed expression in
the DA3 muscle, starting at the progenitor stage, confirming the evolutionary
conservation of a DA3-muscle-specific CRM. A Gal4 driver line
containing only the -2.6 to -1.6 region (P2.6-1.6cG), harbouring only
motifs 1 to 7, was also specifically expressed in the DA3 muscle. This confirmed that
the DA3 muscle CRM is located between positions -2.6 and -1.6. It was noticed,
however, that expression of P2.6-1.6cG was weaker and more sporadic
than P2.6-0.9cl, suggesting the existence of cis-regulatory
element(s) between positions -1.6 and -0.9 contributing to robust DA3 muscle
expression. The conserved motifs 1 to 10 were searched for consensus
binding sites of known TFs that could account for col activation in
the DA3 muscle. This identified a binding site for the mesodermal basic
helix-loop-helix (bHLH) protein Twi (within motif 2), correlating well with the position of the muscle progenitor cis-element, and a potential EBF/Col-binding site within motif 7. Further visual inspection of the sequence alignments identified other conserved TF-binding sites, including one Mef2-binding site within the -1.6 to -0.9 fragment and one consensus binding site for Nau (Huang, 1996; Kophengnavong, 2000). The position of the Mef2 site correlated well with the requirement of the -1.6 to -0.9 fragment for robust DA3 muscle expression. The presence of a Nau-binding site was particularly intriguing since Nau is required for DA3 muscle formation. Potential binding sites for other TFs could be found in the DA3 CRM, but the annotation was limited to the conserved sites. The relative paucity of known TF-binding sites in the conserved sequence motifs found in the DA3 muscle CRM leaves largely open the question of the roles of these motifs in col regulation (Dubois, 2007).
Functional dissection of the DA3 muscle CRM present in the col
upstream region showed that col expression in the DA3 FC can be
separated from its expression in the DA3/DO5 progenitor and the promuscular
cluster. It thus revealed the existence of three steps in the transcriptional
control of muscle identity. That col expression in the DA3/DO5 progenitor could
be uncoupled from that in promuscular clusters was in apparent contradiction
with the previous conclusion from pioneering studies on Eve expression in
dorsal muscle progenitors that this expression issued from Eve activation in
promuscular clusters. Restriction of Eve expression to progenitors was
considered a secondary step, mediated by N-signalling during progenitor
selection by lateral inhibition. To reconcile these data with this model, it is proposed that the DA3 muscle CRM is active only in the DA3/DO5 progenitor because it lacks some
positively acting cis-elements necessary to counteract N-mediated repression
of col transcription. It has been shown that col
transcription is repressed by N during the progenitor selection process.
It is also noted that a Twi-binding site is present in the 'progenitor' subdomain
of the DA3 CRM. The functional importance of this site is supported by its in vivo occupancy in 4- to 6-hour-old embryos when selection of the DA3/DO5 progenitor takes place (Sandmann, 2007). Together, Twi in vivo binding and the col/P2.6cl/P2.3cl expression data suggest that Twi activity contributes to col expression in the DA3/DO5 progenitor but may not be sufficient to override N
repression of col transcription before progenitor selection.
Additional binding sites for Twi present in the col upstream region,
between positions -8.7 and -8.3, are also bound by Twi in vivo
(Sandmann, 2007) and probably contribute to the robustness of P9cl expression in
progenitor cells, but the question of which cis-regulatory elements mediate
col activation in promuscular clusters remains open. From Eve
expression studies, a computational framework has been developed to identify other FC-specific genes (Estrada, 2006; Philippakis, 2006). This framework, named Codefinder, integrates transcriptome data and clustering
of combinations of binding sites for five different TFs (Pnt, dTCF, Mad, Twi
and Tin). col/kn was selected by Codefinder owing to the presence of
five clusters of binding sites, four of which are located within introns
(Philippakis, 2006). It remains to be determined which of these could be responsible for col activation in promuscular clusters, but it is interesting to note that another
in vivo Twi-binding site in 4-6-hour-old embryos correlates with the
3'-most cluster (Sandmann, 2007). In addition to Twi, conserved binding sites for Nau and
Mef2 are found within the DA3 CRM. The Mef2 binding site is located in a
region required for robust DA3-muscle expression of a reporter gene. A direct control of col transcription by Mef2 during the muscle fusion process is further supported by the recent finding (Sandmann, 2006) that Mef2 binds in vivo to the col upstream region between 6 and 8 hours of embryonic development (Dubois, 2007).
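The idea of scoring a locus by clusters of binding sites for several TFs can be illustrated with a toy sliding-window search. This sketch is not the Codefinder algorithm, whose details are not given here. The function name `find_site_clusters`, the window size, the threshold and all site coordinates are hypothetical values chosen for illustration only.

```python
def find_site_clusters(sites, window=500, min_factors=3):
    """Report windows of at most `window` bp that contain binding sites
    for at least `min_factors` distinct factors.

    sites: list of (position_bp, factor_name) tuples.
    Returns a list of (start_bp, end_bp, sorted_factor_names).
    """
    sites = sorted(sites)
    clusters = []
    for i, (start, _) in enumerate(sites):
        # All sites lying within `window` bp downstream of this site.
        in_window = [(pos, tf) for pos, tf in sites[i:] if pos - start <= window]
        factors = {tf for _, tf in in_window}
        if len(factors) < min_factors:
            continue
        end = in_window[-1][0]
        # Skip windows already contained in the previously reported cluster.
        if clusters and end <= clusters[-1][1]:
            continue
        clusters.append((start, end, sorted(factors)))
    return clusters


# Hypothetical coordinates (bp), invented for this example.
toy_sites = [
    (100, "Twi"), (180, "Tin"), (260, "Mad"),
    (2000, "Pnt"), (2100, "Twi"),
    (5000, "dTCF"), (5100, "Pnt"), (5300, "Mad"), (5350, "Twi"),
]

for cluster in find_site_clusters(toy_sites):
    print(cluster)
```

A genome-wide method would additionally weight sites by motif quality and combine the result with expression data, as the text describes for Codefinder. The sketch shows only the clustering step.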
Detailed analysis of col auto-activation revealed a reiterative,
two-step process: import of pre-existing Col protein in the fusion competent myoblast nuclei that incorporate into the growing DA3 myofibre precedes activation of col
transcription. This process ensures that all incorporated FCM nuclei acquire the same identity. Nau is required for maintaining col transcription in the DA3 muscle
precursor and this control is probably direct. The presence of a putative
EBF-binding site in the DA3 muscle CRM also correlates with the Col
requirement for maintaining its own transcription beyond the FC stage.
Thus, despite the failure to detect strong Col binding to this
site in vitro, it appears to be essential for col auto-regulation in
vivo. This suggests that in vivo binding is potentiated by one or more
specific co-factor(s) present in the DA3 muscle. One co-factor is probably
Nau, as the ability of Col to activate its own transcription in newly
recruited fusion competent myoblasts is dependent upon Nau activity. Nau is not
sufficient, however, as many muscles containing both Nau and Col proteins do
not activate col transcription. Interestingly, mouse
EBF (also known as Ebf1 and Olf1 - Mouse Genome Informatics) and E2A (Tcfe2a -
Mouse Genome Informatics), a bHLH protein of the same subgroup as MyoD, have
been shown to act on the same target promoter and synergistically upregulate
transcription of B-lymphocyte-specific genes, although no direct physical
interaction between EBF and E2A could be found in vitro. This suggested that
functional interaction of EBF and E2A, similar to Col and Nau, requires yet
another factor. Taking into account the restricted pattern of ectopic
col activation in hs-col conditions, it is hypothesised that Vg
could be another component of the DA3 combinatorial identity. However, Vg was
found not to be required for DA3 muscle specification, leaving open the
question of which factor may bridge Col and Nau functions (Dubois, 2007).
Unlike col or P2.6cl, P2.3cl is expressed in the DA3 FC and muscle
precursor but not the DA3/DO5 progenitor, showing that col
transcription in the progenitor and muscle precursor is under separate
control. These two phases of col regulation are intimately linked,
however, as Col is required for activating its own transcription in the nuclei
of FCM recruited by the DA3 FC. This regulatory cascade may explain how
pre-patterning of the somatic mesoderm and muscle identity are
transcriptionally linked in the Drosophila embryo. As discussed
above, the ability of Col to auto-regulate depends upon the presence of Nau, another muscle identity TF. Col and Nau act as obligatory co-factors for maintenance/activation of Col expression in all nuclei of the DA3 muscle, thus bringing to light a clear case of combinatorial coding of muscle identity (Dubois, 2007).
Hox transcription factors control many aspects of animal morphogenetic diversity. The segmental pattern of Drosophila larval muscles shows stereotyped variations along the anteroposterior body axis. Each muscle is seeded by a founder cell and the properties specific to each muscle reflect the expression by each founder cell of a specific combination of 'identity' transcription factors. Founder cells originate from asymmetric division of progenitor cells specified at fixed positions. Using the dorsal DA3 muscle lineage as a paradigm, this study shows that Hox proteins play a decisive role in establishing the pattern of Drosophila muscles by controlling the expression of identity transcription factors, such as Nautilus and Collier (Col), at the progenitor stage. High-resolution analysis, using newly designed intron-containing reporter genes to detect primary transcripts, shows that the progenitor stage is the key step at which segment-specific information carried by Hox proteins is superimposed on intrasegmental positional information. Differential control of col transcription by the Antennapedia and Ultrabithorax/Abdominal-A paralogs is mediated by separate cis-regulatory modules (CRMs). Hox proteins also control the segment-specific number of myoblasts allocated to the DA3 muscle. It is concluded that Hox proteins both regulate and contribute to the combinatorial code of transcription factors that specify muscle identity and act at several steps during the muscle-specification process to generate muscle diversity (Enriquez, 2010).
Eve expression in the DA1 muscle lineage provided the first paradigm for studying the early steps of muscle specification. Detailed characterization of an eve muscle CRM showed that positional and tissue-specific information were directly integrated at the level of CRMs via the binding of multiple transcription factors, including dTCF, Mad, Pnt, Tin and Twi. Based on this transcription factor code and using the ModuleFinder computational approach, this study has identified a CRM, CRM276, that precisely reproduces the early phase of col transcription. This CRM also drove expression in cells of the lymph gland, another organ that is issued from the dorsal mesoderm where col is expressed. Parallel to this study, two col genomic fragments were selectively retrieved in chromatin immunoprecipitation (ChIP-on-chip) experiments designed to identify in vivo binding sites for Twi, Tin or Mef2 in early embryos. One fragment overlaps with CRM276. Based on this overlap and interspecies sequence conservation, a 1.4 kb subfragment of CRM276 that retained most of the transcription factor binding sites identified by ModuleFinder was tested, and it was found to specifically reproduce promuscular col expression. This in vivo validation shows that intersecting computational predictions and ChIP-on-chip data should provide a very efficient approach to identify functional CRMs on a genome-wide scale (Enriquez, 2010).
The eve and col early mesodermal CRMs are activated at distinct A/P and D/V positions. It is now possible to undertake a comparison of these two CRMs, in terms of the number and relative spacing of common activator and repressor sites and their expanded combinatorial code, in order to understand how different mesodermal cis elements perform a specific interpretation of positional information (Enriquez, 2010).
A progenitor is selected from the Col promuscular cluster in T2 and T3 but not T1. One cell issued from the Col-expressing promuscular cluster in T1 nevertheless shows transiently enhanced Col expression, suggesting that the generic process of progenitor selection is correctly initiated in T1. This process aborts, however, in the absence of a Hox input, as shown by the loss of progenitor Col expression and DA3 muscle in specific segments in Hox mutants. The similar changes in Nau and Col expression observed under Hox gain-of-function conditions lead to the conclusion that the expression of 'identity' transcription factors (iTFs) is regulated by Hox factors at the progenitor stage. The superimposition of Hox information onto the intrasegmental information thereby implements the iTF code in a segment-specific manner and establishes the final muscle pattern. Unlike DA3, a number of specific muscles are found in both T1 and T2-A7, such as the Eve-expressing DA1 muscle; other muscles form in either abdominal or thoracic segments, as illustrated by the pattern of Nau expression in stage 16 embryos. This diversity in segment-specific patterns indicates that Hox regulation of iTF expression is iTF and/or progenitor specific (Enriquez, 2010).
As early as 1994, Hox proteins were proposed to regulate the segment-specific expression of iTFs. Seven years later, an apterous mesodermal enhancer (apME680) active in the LT1-4 muscles was characterized and it was proposed that regulation by Antp was direct. However, mutation of the predicted Antp binding sites present in apME680 abolished its activity also in A segments, suggesting that some of the same sites were bound by Ubx/AbdA. Evidence is now available that the regulation of col expression by Ubx/AbdA in muscle progenitors is direct and involves a single Hox binding site. However, regulation by Antp does require other cis elements. It remains to be seen whether regulation by Antp is also direct. Since Antp, Ubx and AbdA display indistinguishable DNA-binding preferences in vitro, the modular regulation of col expression by different Hox paralogs suggests that other cis elements and/or Hox collaborators contribute to Hox specificity. Direct regulation of col by Ubx has previously been documented in another cellular context, that of the larval imaginal haltere disc, via a wing-specific enhancer. In this case, Ubx directly represses col expression by binding to several sites, contrasting with col-positive regulation via a single site in muscle progenitors. This is the second example, in addition to CG13222 regulation in the haltere disc, of direct positive regulation by Ubx via a single binding site. Hox 'selector' proteins collaborate on some cis elements with 'effector' transcription factors that are downstream of cell-cell signaling pathways. In the DA3 lineage, it seems that Dpp, Wg and Ras signaling act on one col cis element and the Hox proteins on others. The regulation of col expression by Hox proteins in different tissues via different CRMs provides a new paradigm to decipher how different Hox paralogs cooperate and/or collaborate with tissue- and lineage-specific factors to specify cellular identity (Enriquez, 2010).
The DA3 muscle displays fewer nuclei in T2 and T3 than in A1-A7, an opposite situation to that described for an aggregate of the four LT1-4 muscles. It has been proposed that the variation in the number of LT1-4 nuclei was controlled by Hox proteins. These studies of the DA3 muscle extend this conclusion by showing that the variations due to Hox control are specific to each muscle and are exerted at the level of FCs. Since the number of nuclei is both muscle- and segment-specific, Hox proteins must cooperate and/or collaborate with various iTFs to differentially regulate the nucleus-counting process. As such, Hox proteins contribute to the combinatorial code of muscle identity. Identifying the nature of the cellular events and genes that act downstream of the iTF/Hox combinatorial code and that are involved in the nucleus-counting process represents a new challenge (Enriquez, 2010).
Home page: The Interactive Fly © 1997 Thomas B. Brody, Ph.D.
The Interactive Fly resides on the
Society for Developmental Biology's Web server.