Interactive Fly, Drosophila

bagpipe, found downstream of tinman, is involved in the differentiation of ventral visceral mesoderm (gut musculature). While the defect in ventral visceral mesoderm invagination is partial in tinman mutants, it is absolute in bagpipe mutants (Azpiazu, 1993).

The function of the Drosophila mef2 gene, a member of the MADS box supergene family of transcription factors, is critical for terminal differentiation of the three major muscle cell types, namely somatic, visceral, and cardiac. During embryogenesis, mef2 undergoes multiple phases of expression, which are characterized by initial broad mesodermal expression, followed by restricted expression in the dorsal mesoderm, specific expression in muscle progenitors, and sustained expression in the differentiated musculatures. Evidence is presented that temporally and spatially specific mef2 expression is controlled by a complex array of cis-acting regulatory modules that are responsive to different genetic signals. Functional testing of approximately 12 kb of 5' flanking region of the mef2 gene shows that the initial widespread mesodermal expression is achieved through a 280-bp twist-dependent enhancer. Subsequent dorsal mesoderm-restricted mef2 expression is mediated through a 460-bp dpp-responsive regulatory module, which involves the function of the Smad4 homolog Medea and contains several binding sites for Medea and Mad. Regulated mef2 expression in the caudal and trunk visceral mesoderm, which give rise to longitudinal and circular gut musculatures, respectively, is under the control of distinct enhancer elements. In addition, mef2 expression in the cardioblasts of the heart is dependent on at least two distinct enhancers, which are active at different periods during embryogenesis. Mef2 expressing cells are coincident with those expressing Tinman. Notably, both Mef2 and Tinman expression are in four of six cardioblasts that are present per hemisegment. The complete overlap between the two expression patterns suggests that the activity of this enhancer element could be dependent on tinman function or under similar regulatory controls as is tinman. 
The cardiac enhancer that functions at later stages also drives mef2 expression in the caudal visceral mesoderm as well as in the somatic mesoderm. Moreover, multiple regulatory elements are differentially activated for specific expression in presumptive muscle founders, prefusion myoblasts, and differentiated muscle fibers. Taken together, the presented data suggest that specific expression of the mef2 gene in myogenic lineages in the Drosophila embryo is the result of multiple genetic inputs that act in an additive manner on distinct enhancers in the 5' flanking region (Nguyen, 1999).

The MADS-box transcription factor MEF2 is expressed specifically in developing cardiac, somatic, and visceral muscle cell lineages during Drosophila embryogenesis and is required for myoblast differentiation and muscle morphogenesis. To define the mechanisms that regulate Mef2 transcription, the Mef2 upstream region was analyzed for sequences sufficient to recapitulate the expression pattern of the gene in Drosophila embryos. Described here is a complex enhancer located 5.8 kb upstream of the Drosophila Mef2 gene that controls transcription in cardial cells of the dorsal vessel, a subset of somatic muscle founder cells, and the visceral muscle cells. The 237-bp cardial enhancer is located between -5907 and -5670 upstream of the Mef2 gene. The core of this enhancer contains two evolutionarily conserved binding sites for the homeodomain protein Tinman (Tin), expressed in developing cardiac, somatic, and visceral muscle lineages. Both Tin binding sites are required for enhancer activity in all three muscle cell lineages. Whereas the 237-bp enhancer core alone is sufficient for expression in cardiac cells, expression in somatic founder cells and visceral muscle is dependent on the core enhancer plus unique flanking sequences that include an evolutionarily conserved E box. These results reveal an essential role for Tin in the activation of Mef2 transcription in multiple myogenic lineages and demonstrate that transcriptional activity of Tin is dependent on combinatorial interactions with other factors unique to different muscle cell types (Cripps, 1999).

Expression of ladybird genes in the subset of cardioblast and pericardial cell precursors is critically dependent on mesodermal tinman function, epidermal Wingless signaling and the coordinate action of neurogenic genes. lb-expressing heart progenitors contribute to the increased number of cardiac precursor cells in Notch, Delta, Enhancer of split, mastermind, big brain and neuralized mutants. Negative regulation by hedgehog is required to restrict ladybird expression to two out of six cardioblasts in each hemisegment. Overexpression of ladybird causes a hyperplasia of heart precursors and alters the identity of even-skipped-positive pericardial cells. Surprisingly, the number of eve-expressing pericardial cells is strongly reduced in overexpressors. These lb expressing cells are transformed into l-paracardial cells. Loss of ladybird function leads to the opposite transformation, suggesting that ladybird participates in the determination of heart lineages and is required to specify the identities of subpopulations of heart cells. Both early Wingless signaling and ladybird-dependent late Wingless signaling are required for proper heart formation. Thus, it is proposed that ladybird plays a dual role in cardiogenesis: (1) during the early phase, it is involved in specification of a segmental subset of heart precursors as a component of the cardiogenic tinman-cascade and (2) during the late phase, it is needed for maintaining wingless activity and thereby sustaining the heart pattern process. These events result in a diversification of heart cell identities within each segment. Since tinman, bagpipe, S59 and ladybird genes are all part of the same homeobox gene cluster, it is likely that their association has to do with the orchestrated diversification of mesoderm (Jagla, 1997).

In an effort to isolate genes required for heart development and to further the understanding of cardiac specification at the molecular level, PlacZ enhancer trap lines were screened for expression in the Drosophila heart. One of the lines generated in this screen, designated B2-2-15, is particularly interesting because of its early pattern of expression in cardiac precursor cells, an expression pattern dependent on the homeobox gene tinman, a key determinant of heart development in Drosophila. A gene, apontic (apt), was isolated and characterized in the vicinity of B2-2-15; its expression pattern is identical to that of the enhancer-trap reporter gene. apontic mutant embryos show distinct abnormalities in heart morphology as early as mid-embryonic stages when the heart tube assembles: segments of heart cells (those of myocardial and pericardial identity) are often missing. These abnormalities become obvious shortly before the assembly of the heart precursor cells at the dorsal midline. The defects in heart tube formation are seen with three markers: (1) Even-skipped, which is present in a subset of pericardial cells (EPC); (2) Mef2, which marks the cardial cells of the heart, and (3) Zfh-1, which is primarily present in the non-EPC pericardial cells. No obvious defects are observed in somatic and visceral muscle patterning, suggesting a specific requirement for apt in heart formation, as opposed to other mesodermal derivatives. Since the initial cardiac mesoderm seems to form normally in these mutants, it seems likely that apt is primarily required for a late differentiation step, such as the correct assembly of the heart tube. This would be consistent with a cell autonomous function of apt in the developing heart (Su, 1999).

During Drosophila embryogenesis, the beta3 tubulin gene is expressed in the visceral and somatic mesoderm as well as in the dorsal vessel. Transcription of the gene is limited to four pairs of cardioblasts per segment. Its expression in the dorsal vessel (dv) is mediated by a 333-bp enhancer located upstream of the gene (between -21705 and -21385 bp). The homeodomain protein Tinman is expressed in these cardioblasts, implying that Tinman might be a key regulator of the beta3 tubulin gene. Gel retardation and footprint assays have indeed revealed two Tinman binding sites within the dv-specific enhancer. The relevance of the Tinman binding sites was analyzed in a transgenic fly assay, and distinct functions were observed for the two sites. The BS(Tin-1460) site is absolutely required for expression in cardioblasts, while BS(Tin-1425) is needed for high-level expression. Thus, these two Tinman binding sites act in concert to drive beta3 tubulin gene expression during heart development. Tinman initially functions in the specification of visceral mesoderm and heart progenitors, but remains expressed in cardioblasts until dorsal closure. Overall, these data demonstrate a late function for Tinman in the regulation of beta3 tubulin gene expression in the forming heart of Drosophila (Kremser, 1999a).

D-mef2 is a target for Tinman activation during Drosophila heart development

Genetic analyses indicate that tinman and D-mef2 act at early and late steps, respectively, in the cardiac lineage. D-mef2 expression in the developing heart requires a novel upstream enhancer containing two Tinman binding sites, both of which are essential for enhancer function in cardiac muscle cells. The upstream enhancer is located 5.4 kb upstream of the structural gene for D-mef2. Transcriptional activity of this cardiac enhancer is dependent on tinman function, and ectopic Tinman expression activates the enhancer outside of the cardiac lineage. These results define the only known in vivo target for transcriptional activation by Tinman and demonstrate that D-mef2 lies directly downstream of tinman in the genetic cascade controlling heart formation in Drosophila. Higher up in the cascade, both DPP and Wingless expression in the ectoderm are required for tinman expression in the dorsal mesoderm (Gajewski, 1997).

The Drosophila mef2 gene encodes a MADS domain transcription factor required for the differentiation of cardiac, somatic, and visceral muscles during embryogenesis and the patterning of adult indirect flight muscles assembled during metamorphosis. A prerequisite for Mef-2 function in myogenesis is its precise expression in multiple cell types. Novel enhancers for Mef-2 transcription in cardiac and adult muscle precursor cells have been identified, and their regulation by the Tinman and Twist myogenic factors has been demonstrated. However, these results suggest the existence of additional regulators and provide limited information on the specification of progenitor cells for different muscle lineages. The heart enhancer has been further characterized and shown to be part of a complex regulatory region controlling the activation and repression of Mef-2 transcription in several cell types. The presence of two Tinman binding sites is necessary but not sufficient for enhancer function; additional sequences are required for cardial cell expression. The mutation of a GATA sequence in the enhancer changes its specificity from cardial to pericardial cells. Also, the addition of flanking sequences to the heart enhancer results in the expression of Mef-2 in a new cell type: the founder cells for a subset of body wall muscles. Since tinman function is required for Mef-2 expression in both the cardial and founder cells, these results define a shared regulatory DNA that functions in distinct lineages due to the combinatorial activity of Tinman and other factors that work through adjacent sequences. The forced mesodermal expression of Twist causes a repression of the enhancer element in founder cells while allowing normal function in cardial cells. The analysis of Mef-2-lacZ fusion genes in mutant embryos reveals that the specification of the muscle precursor cells involves the wingless gene.
Wg is required both in the formation of specific founder cells and in the specification of the progenitors of the cardial cells. Ectodermal cells must have a ventral identity for the formation of founder cells. These results demonstrate that the cell fate status of ectodermal cells adjacent to the domain of ventral founder cell specification is crucial for the proper formation of these cells. Mesodermal cell fate also depends on the activation of a receptor tyrosine kinase signaling pathway. Ectopic expression of an activated form of Ras1 throughout the mesoderm results in a substantial overproduction of the ventral founder cells as compared to control embryos. This signal may be transduced through the mesodermally-active EGF or FGF receptor tyrosine kinases (Gajewski, 1998).

The zinc finger proteins Pannier and GATA4 function as cardiogenic factors in Drosophila

The regulation of cardiac gene expression by GATA zinc finger transcription factors is well documented in vertebrates. However, genetic studies in mice have failed to demonstrate a function for these proteins in cardiomyocyte specification. In Drosophila, the existence of a cardiogenic GATA factor has been implicated through the analysis of a cardial cell enhancer of the muscle differentiation gene Mef2. The GATA gene pannier is expressed in the dorsal mesoderm and is required for cardial cell formation while repressing a pericardial cell fate. Ectopic expression of Pannier results in cardial cell overproduction, while co-expression of Pannier and the homeodomain protein Tinman synergistically activates cardiac gene expression and induces cardial cells. The related GATA4 protein of mice likewise functions as a cardiogenic factor in Drosophila, demonstrating an evolutionarily conserved function between Pannier and GATA4 in heart development (Gajewski, 1999).

tinman gene function is required for heart development in Drosophila. The initial programming of the cardiac lineage occurs at a time when tin is broadly expressed in the dorsal mesoderm. A subset of the tin-expressing cells will become heart precursors, appearing in 11 clusters along the dorsalmost part of the mesoderm. The Mef2 enhancer-lacZ fusion gene marks heart progenitors at stage 11 and will eventually be expressed in four pairs of cardial cells per segment of the dorsal vessel. Thus, the tin expression domain is significantly larger than the territory of heart precursor specification, suggesting the involvement of additional factors in the formation of these cells. The Mef2 heart enhancer requires the presence of at least three elements for its activity, including two Tin binding sites and one GATA sequence. The GATA gene pnr is expressed in cells of the dorsal ectoderm around the time of heart cell specification. However, there is no report of pnr transcription in the mesoderm. To investigate this possibility, embryos were stained for PNR mRNA and embryo cross-sections were examined. At late stage 10, gene expression is observed in the dorsal ectoderm of the germband-extended embryo. PNR mRNA was detected in four clusters of cells located in the dorsalmost part of the mesoderm that corresponds to the cardiogenic region. Additionally, a pnr mesodermal enhancer has been identified that directs lacZ expression in the heart-forming region, but not in the overlying ectoderm. Therefore, pnr is expressed in the cardiogenic mesoderm where it could function in cardial cell specification and the regulation of Mef2 transcription (Gajewski, 1999).

Certain NK-2 class homeodomain and GATA family proteins have been shown to physically interact in their cooperative activation of gene expression in cell culture systems. To test the possibility that Tin and GATA factors could functionally interact in an embryological context, the tin, pnr and mGATA4 genes were expressed independently or in combination in Drosophila embryos. When tin, pnr or mGATA4 are expressed alone in the twi enhancer-expressing cells, the Mef2 heart enhancer is activated ectopically in the cephalic (tin) or dorsal (pnr and mGATA4) mesoderm. Since Tin is a known regulator of the Mef2 enhancer, it could be activating the Mef2 sequence in the head region through its fortuitous interaction with a co-factor normally expressed in these cells. The results are striking when both Tin and either of the GATA factors are co-expressed under the control of the twi-Gal4 driver. A cardial cell marker is now activated in both the cephalic region and throughout the dorsal and ventral trunk mesoderm. Likewise, a strong ectopic expression of the Mef2 heart enhancer is detected in ventral midline cells of the developing CNS. The data point to a combinatorial interaction of Tin and the two GATA factors in the de novo activation of the cardial cell marker in both mesodermal and non-mesodermal cells. They also suggest these genetic combinations are inducing a cardial cell fate along the ventral midline of the CNS (Gajewski, 1999).

In summary, the discovery of early heart phenotypes in pnr mutant embryos, coupled with the demonstration of uniquely conserved cardiogenic abilities of Pnr and GATA4, provides novel evidence for the function of GATA family members in the specification of a heart cell type. In an embryological context, these proteins can work with the Tin homeodomain factor to program cells into an apparent cardial fate in both mesodermal and non-mesodermal cell types. This genetic combination appears to be essential, but not necessarily sufficient, for cellular commitment to the cardiac lineage, as other factors may contribute to the specification process. Additional studies using the Drosophila cardiogenic assay should prove instrumental in revealing other key members of this genetic program (Gajewski, 1999).

Pannier is a transcriptional target and partner of Tinman during Drosophila cardiogenesis

During Drosophila embryogenesis, the homeobox gene tinman is expressed in the dorsal mesoderm where it functions in the specification of precursor cells of the heart, visceral, and dorsal body wall muscles. The GATA factor gene pannier is similarly expressed in the dorsal-most part of the mesoderm where it is required for the formation of the cardial cell lineage. Despite these overlapping expression and functional properties, potential genetic and molecular interactions between the two genes remain largely unexplored. pannier has been shown to be a direct transcriptional target of Tinman in the heart-forming region. The resulting coexpression of the two factors allows them to function combinatorially in the regulation of cardiac gene expression, and a physical interaction of the proteins has been demonstrated in cultured cells. Functional domains of Tinman and Pannier have been described that are required for their synergistic activation of the D-mef2 differentiation gene in vivo. Together, these results provide important insights into the genetic mechanisms controlling heart formation in the Drosophila model system (Gajewski, 2001).

Around the time of heart precursor cell specification, pnr RNA is detected in the dorsal mesoderm in cells that also express the tin gene. A pnr enhancer has been identified that is active in this heart-forming region and maps to a 457-bp DNA immediately upstream of the gene. Because the enhancer contains two putative Tin recognition sites, the binding of Tin to this DNA was investigated. A GST-Tin fusion protein was used in a gel-shift assay to test its ability to bind to the Tin1 sequence TCAAGTG, a known recognition element of the homeodomain protein in mesodermal enhancers of the D-mef2, tin, and beta3 tubulin genes. Tin can bind specifically to the Tin1 consensus but not to a mutant version of the sequence. DNase I protection assays were also performed with the fusion protein on the pnr DNA, and two separate footprints were obtained that correspond to the Tin1 and Tin2 sequences. These in vitro experiments demonstrate that Tin can recognize and bind to two sites within the pnr dorsal mesoderm enhancer, suggesting the regulatory DNA might be a direct target of Tin transcriptional activity (Gajewski, 2001).
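The search for candidate Tin recognition sites described above can be illustrated with a short script. This is only a minimal sketch, not the method used in the study: it looks for exact matches to the TCAAGTG sequence quoted in the text on both DNA strands. The function names and the test fragment are hypothetical, and the fragment is not the real pnr enhancer sequence.

```python
# Tin1 recognition sequence quoted in the text above.
TIN1_SITE = "TCAAGTG"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_sites(seq: str, motif: str = TIN1_SITE) -> list:
    """Return (0-based start, strand) for each exact motif match on either strand."""
    seq = seq.upper()
    rc_motif = reverse_complement(motif)
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        if window == motif:
            hits.append((i, "+"))
        elif window == rc_motif:
            hits.append((i, "-"))
    return hits


# Hypothetical fragment for illustration only (NOT the pnr enhancer).
example = "GGATCAAGTGCCTTCACTTGAAA"
print(find_sites(example))  # [(3, '+'), (14, '-')]
```

Exact matching is the simplest choice; a sequence match of this kind only nominates candidate sites, which is why the text relies on gel-shift and DNase I protection assays to show actual binding.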

The defined pnr enhancer functions in the dorsal-most cells of the mesoderm and in cells of the amnioserosa. To determine whether its activity is regulated by Tin around the time of heart cell formation, the expression of a pnr enhancer-lacZ fusion gene was monitored in tin gain and loss of function embryos. Initially, the Gal4/UAS binary system was used to express tin throughout the mesoderm and mesectoderm under the control of the twi-Gal4 driver. An expanded function of the enhancer was observed within the mesoderm, coupled with ectopic activity in midline cells of the central nervous system (CNS) due to the forced expression of Tin. Conversely, in tin null embryos, a complete absence of beta-galactosidase expression is observed, which demonstrates a requirement of Tin function for enhancer activity (Gajewski, 2001).

The D-mef2 gene is a direct transcriptional target of Tin and Pnr in cardioblasts. A defined heart enhancer for the gene contains a pair of essential Tin binding sites and a required GATA element located in close proximity to one of the Tin recognition sequences. Coexpression of the two factors in CNS midline cells results in the ectopic activation of the D-mef2 enhancer normally expressed only in cardial cells. This result is compatible with the nuclear colocalization and physical interaction of Tin and Pnr in cultured cells and provides an embryological assay for identifying regions of the proteins that are essential for their functional synergism. Nine deleted or point mutant versions of Tin were tested in the synergism assay. Tin(N351Q) has a single amino acid change in the homeodomain and is unable to bind DNA. Coexpression of this mutant with wild-type Pnr fails to activate the D-mef2 enhancer. While a competent homeodomain must be present in Tin for synergism with Pnr, this region by itself is not sufficient as it fails in the coactivation assay. The TN domain is a highly conserved 12 amino acid region found in Tin and most other NK-2 class proteins. A 10-amino acid deletion was made within this domain to generate the Tin(1-35, 46-416) mutant, but this altered protein is still able to function combinatorially with Pnr. Thus, the TN domain is dispensable in the synergism assay (Gajewski, 2001).

A transcriptional activation domain has been mapped to the N-terminal 110 amino acids of Tin by using a cell transfection strategy. To determine whether this region is required for functional interaction with Pnr, the Tin(111-416) deletion was generated and tested. This truncated protein remained competent to synergize with Pnr in the activation of the D-mef2 enhancer, showing that the Tin transactivation domain is not required. However, larger N-terminal deletions result in Tin proteins that are functionally inactive. Specifically, removal of an additional 41 amino acids in Tin(152-416) has identified residues 111 to 151 as essential for Tin synergism with Pnr. The Tin(1-109, 192-416) variant that contains the transactivation domain and homeodomain, but lacks internal sequences including the required 41-amino acid region, is likewise nonfunctional in the D-mef2 enhancer coactivation assay. Therefore, these studies identify two distinct regions of Tin needed for its combinatorial function with Pnr: an internal segment of 41 amino acids adjacent to the transactivation domain, and the conserved homeodomain (Gajewski, 2001).
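The deletion-mapping logic in the preceding paragraphs can be retraced with simple set arithmetic. The sketch below is an illustration only: it takes the construct names and the reported outcomes from the text, assumes a 416-residue protein (as implied by the construct names), and assumes that any residue absent from a functional construct is dispensable in this assay. The function and variable names are hypothetical, and the homeodomain point mutant is not modeled.

```python
FULL_LENGTH = 416  # assumed from construct names such as Tin(111-416)

# Retained residue ranges (inclusive) and whether the construct
# synergizes with Pnr, as reported in the text.
CONSTRUCTS = {
    "Tin(1-35, 46-416)": ([(1, 35), (46, 416)], True),
    "Tin(111-416)": ([(111, 416)], True),
    "Tin(152-416)": ([(152, 416)], False),
    "Tin(1-109, 192-416)": ([(1, 109), (192, 416)], False),
}


def residues(ranges):
    """Expand inclusive (start, end) ranges into a set of residue numbers."""
    out = set()
    for start, end in ranges:
        out.update(range(start, end + 1))
    return out


def map_required_region(constructs, length=FULL_LENGTH):
    """Return the sorted residues whose loss is shared by all inactive constructs."""
    everything = residues([(1, length)])
    # Residues missing from any functional construct are treated as dispensable.
    dispensable = set()
    for ranges, works in constructs.values():
        if works:
            dispensable |= everything - residues(ranges)
    # Intersect the non-dispensable losses of every nonfunctional construct.
    candidates = None
    for ranges, works in constructs.values():
        if not works:
            lost = (everything - residues(ranges)) - dispensable
            candidates = lost if candidates is None else candidates & lost
    return sorted(candidates or [])


region = map_required_region(CONSTRUCTS)
print(region[0], region[-1], len(region))  # 111 151 41
```

Under these assumptions the script recovers residues 111 to 151, the same 41-amino acid segment identified in the text.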

The ectopic activation assay was used to determine those regions of Pnr that are essential for its functional synergism with Tin. Six deleted or point mutant forms were tested for enhancer activation in CNS midline cells. Pnr(1-457) represents a C-terminal truncation of the GATA factor that maintains zinc fingers 1 and 2, but deletes two putative amphipathic α helices. This C-terminal region has been shown to contain a transcriptional activation domain, and the inability of the truncated protein to synergize with Tin demonstrates an essential requirement of this Pnr sequence. Pnr(E168K) and Pnr(C190S) contain single amino acid changes in the N-terminal zinc finger that correspond to mutations found in dominant alleles of pnr. These mutations may affect the formation of the first zinc finger and result in proteins that heterodimerize poorly with the Ush antagonist. However, these two dominant mutant Pnr proteins are able to synergize with Tin and direct D-mef2 expression in the CNS. In contrast, the mutation of a conserved cysteine residue in zinc finger 2 in Pnr(C247S) inactivates the protein in the synergism assay. This amino acid change is likely to influence the formation of the C-terminal zinc finger and identifies this region as an essential functional domain of Pnr in the coactivation of D-mef2. It is important to note that, although Pnr(1-457) and Pnr(C247S) fail to synergize with Tin, they are competent to bind the homeodomain protein in the GST pull-down assay. In combination, these results substantiate that intrinsic functional properties of Pannier are perturbed in the two mutant forms of the GATA factor (Gajewski, 2001).

An unexpected finding of this work is that, while the C-terminal transactivation domain of Pnr is required in the combinatorial assay, the N-terminal transactivation domain of Tin is not. One could envision a mechanism wherein the presence of the single domain provided by Pnr is sufficient for the activation properties of the heterodimeric complex. Additionally, it cannot be ruled out that a second transactivation domain exists in Tin that was not revealed previously in cell transfection studies. Also of note is the nonrequirement of a proposed cardiogenic domain of Tin that maps to the N-terminus of the protein. Specifically, Tin(111-416) is competent to work with Pnr in the cooperative activation of the D-mef2 heart enhancer, despite the absence of residues 1 through 42. Instead, an internal 41-amino acid region between the Tin transactivation domain and homeodomain has emerged as a vital sequence for functional interaction with Pnr. A repressor activity of Tin has been ascribed to residues 111 through 188, and it is plausible that, depending on the biological assay being used, multiple functional characteristics may be uncovered within this region (Gajewski, 2001).

In the context of Tin's synergistic interaction with Pnr in regulating a defined cardiac enhancer, association of the two through this domain may prevent Pnr from interacting with other proteins such as Ush. At the same time, because Tin has the potential to act as a transcriptional repressor that recruits Groucho via this domain, the interaction of Tin and Pnr through the essential 111 to 151 subregion may be beneficial to Tin in its role as a transcriptional activator by eliminating its possible association with inhibitory cofactors. Preliminary results suggest the molecular interaction of Tin and Pnr may be due in part to the presence of this domain (Gajewski, 2001).

Ras pathway specificity is determined by the integration of multiple signal-activated and tissue-restricted transcription factors

Ras signaling elicits diverse outputs, yet how Ras specificity is generated remains incompletely understood. Wingless and Decapentaplegic confer competence for receptor tyrosine kinase-mediated induction of a subset of Drosophila muscle and cardiac progenitors by acting both upstream of and in parallel to Ras. In addition to regulating the expression of proximal Ras pathway components, Wg and Dpp coordinate the direct effects of three signal-activated transcription factors (dTCF, Mad, and Pointed, which function in the Wg, Dpp, and Ras/MAPK pathways, respectively) and two tissue-restricted transcription factors (Twist and Tinman) on an enhancer of even-skipped, a progenitor identity gene. The integration of Pointed with the combinatorial effects of dTCF, Mad, Twist, and Tinman determines inductive Ras signaling specificity in muscle and heart development (Halfon, 2000).

Cell fate specification in the somatic mesoderm of the Drosophila embryo has been examined as a model for dissecting the molecular basis of combinatorial signaling involving receptor tyrosine kinases (RTKs). The somatic musculature and the cells that compose the heart develop from specialized cells called progenitors. Each progenitor divides asymmetrically to produce two founder cells that possess information that specifies individual muscle fate and that seed the formation of multinucleate myofibers. The focus of this study has been a small subset of somatic mesodermal cells that express the transcription factor Even skipped. Eve is expressed in the progenitors and founders of both the dorsal muscle fiber DA1 and a pair of heart accessory cells, the Eve pericardial cells or EPCs. Since eve is the earliest known marker for these cells and is required for their formation, eve is referred to here as a progenitor identity gene (Halfon, 2000).

Given that the eve muscle and heart enhancer (MHE) recapitulates early mesodermal Eve expression, a determination was made of whether this enhancer contains binding sites for candidate signal-dependent and mesoderm-specific transcription factors. Focus was placed on two mesoderm-specific factors, Tin and Twi, as well as the nuclear factors that act downstream of Wg (dTCF), Dpp (Mad) and Ras (Pnt, Yan). A computer-based search of the MHE sequence has suggested the presence of potential binding sites for each of these transcription factors. Gel-shift assays confirm that these putative sites actually bind the relevant factors. This analysis establishes the existence of one binding site for dTCF, six for Mad, two for Twist, and four each for Tin and Pnt. Since Yan binds to each of the Pnt sites, these are referred to as Ets sites (Halfon, 2000).

To ascertain whether these in vitro binding sites have in vivo functional significance, the sites were mutated, both singly and in combination, within the context of the entire MHE. All mutagenesis was by base substitution so as not to affect the spacing between other potential cis-regulatory elements. The ability of the mutated MHEs to drive reporter gene expression was tested in transgenic embryos and this expression was compared to that of endogenous Eve. Of the six Mad sites, only Mad4, 5, and 6 are critical for MHE function when inactivated singly or in combination. Mutation of the single dTCF site or of individual binding sites for Twi, Tin, or the Ets factors also leads to loss of reporter gene expression in some, but not all, Eve-expressing cells, with some mutant sites associated with a more severe loss than others. Of note, both the EPC and DA1 lineages are affected equally by all of the mutations. In addition, the activity level in those Eve-expressing cells that do maintain reporter gene expression is on average lower than that seen with the wild-type MHE. In contrast to the single site mutants, mutation of the two Twi, all four Tin, or all four Ets sites completely eliminates MHE activity. It is concluded that binding sites for two tissue-specific and three signal-responsive transcription factors are required for full activity of the MHE in both the muscle and the heart lineages (Halfon, 2000).

The finding that three Wg-dependent factors, dTCF, Twi, and Tin, directly regulate eve could explain why activated Ras is incapable of bypassing Wg in the induction of Eve progenitors. Therefore, attempts were made to rescue Eve expression in wg mutant embryos by ectopically expressing Twi and Tin together with activated Ras. However, Eve progenitors were not recovered by this manipulation, perhaps due to the direct requirement of dTCF for eve MHE activity. While activated Arm can supply the missing downstream Wg transcription factor in this rescue experiment, Arm alone is capable of fully rescuing not only the Eve progenitors but also all of the Wg-dependent factors that regulate the MHE, including Twi, Tin, and the RTK/Ras pathway components. Thus, the combined effects of the MHE transcription factors could not be further evaluated in the absence of Wg signaling. Nevertheless, the rescue and enhancer mutagenesis data strongly support the involvement of Wg as a mesodermal competence determinant both upstream of the Ras pathway and directly (via dTCF) as well as indirectly (via Twi and Tin) in the transcriptional response to inductive RTK signaling (Halfon, 2000).

Since mutation of any single transcription factor binding site in the MHE causes only a partial loss of enhancer activity, it was considered whether different sites might function together synergistically. To test this possibility, binding site mutations for two different activators were combined. Simultaneous mutation of the dTCF and Twi1 sites led to reporter gene expression in approximately 5-fold fewer cells than would be expected from the additive independent effects of each mutation. A similar, though slightly less robust, synergy was observed when the dTCF and Ets3 mutations were combined (Halfon, 2000).
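The comparison behind the "5-fold fewer cells" statement can be illustrated with a small calculation. The sketch below is not taken from Halfon (2000): it assumes that "independent effects" means the retained fractions of reporter-positive cells multiply, and all numbers are hypothetical values chosen only to reproduce a 5-fold ratio.

```python
def expected_independent(frac_a: float, frac_b: float) -> float:
    """Fraction of Eve cells expected to keep reporter expression if the two
    site mutations act independently (assumption: retained fractions multiply)."""
    return frac_a * frac_b


def synergy_fold(frac_a: float, frac_b: float, observed: float) -> float:
    """How many fold fewer cells are observed than independence predicts."""
    if observed <= 0:
        raise ValueError("observed fraction must be positive")
    return expected_independent(frac_a, frac_b) / observed


# Hypothetical fractions of Eve-expressing cells that retain reporter activity.
frac_dtcf_mutant = 0.5    # dTCF site mutated alone (made-up value)
frac_twi1_mutant = 0.4    # Twi1 site mutated alone (made-up value)
frac_double_mutant = 0.04 # both sites mutated (made-up value)

fold = synergy_fold(frac_dtcf_mutant, frac_twi1_mutant, frac_double_mutant)
print(f"expected if independent: {expected_independent(frac_dtcf_mutant, frac_twi1_mutant):.2f}")
print(f"fold fewer cells than expected: {fold:.1f}")
```

A fold value near 1 would indicate that the two sites contribute independently under this assumption; values well above 1 indicate synergy between the sites.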

An assessment was made of whether ectopic coexpression of individual transcription factors or upstream signals would lead to cooperative effects on endogenous Eve expression. As previously reported, ectopic Wg has no effect on Eve expression at late stage 11, activated Ras1 induces extra Eve progenitors, and ectopic Wg plus activated Ras1 cause a lateral expansion of the progenitor clusters. When Twi is expressed using a twi-Gal4 driver, a few Eve-positive cells develop at ectopic positions. The magnitude of this effect is increased by coexpression of Wg and Twi, and even more so by coexpression of Twi with activated Ras1. The latter effect strikingly resembles that of Wg plus activated Ras1. With the simultaneous ectopic expression of Wg, Twi, and activated Ras1, Eve progenitors form an almost continuous anteroposterior stripe confined to the dorsal mesoderm. These results demonstrate a synergistic induction of Eve progenitors by various combinations of Wg, Twi, and activated Ras1 that parallels the synergistic loss of MHE activity seen by mutating the dTCF, Twi, and Ets binding sites. Taken together, these loss- and gain-of-function findings suggest that dTCF, Twi, and Pnt cooperate at the MHE to synergistically regulate Eve transcription and, by extension, to induce the specification of Eve progenitor fates (Halfon, 2000).

It is concluded that Wg and Dpp coordinate a series of signal-activated (dTCF and Mad) and mesoderm-specific (Twi and Tin) transcription factors in a temporal and spatial pattern that facilitates cooperation with the Ras transcriptional effector Pnt. The synergistic integration of these five transcription factors by a single enhancer generates a specific developmental response to Ras/MAPK signaling. Moreover, Wg and Dpp exert proximal effects in this signaling network by enabling Ras/MAPK activation through the regulated localized expression of upstream components of the RTK signal transduction machinery. A model governing the acquisition of developmental competence, signal integration and response specificity in this system is presented. Wg and Dpp provide competence through the regulation of tissue-specific transcription factors (Tin and Twi), signal-responsive transcription factors (Mad and dTCF), and proximal components of the RTK/Ras pathways (Htl, Hbr, and Rho). The Ras signaling cascade leads to activation of the inductive transcription factor, Pnt, and inactivation of the Yan repressor. While a direct role for Mad in regulating Tin expression has been demonstrated, Wg regulation of Tin, Twi, Htl, Hbr, and Rho may be either direct or indirect. Dpp has additional effects on the proximal RTK factors. The five transcriptional activators assemble at and are integrated by the MHE, where they function synergistically to promote eve expression. Specificity of the response to inductive RTK/Ras signaling derives from the combinatorial effects of the tissue-restricted and signal-activated transcription factors that converge at the MHE. In the absence of inductive signaling, Yan would repress eve by binding to the Ets sites. Since eve is a muscle and heart identity gene, the regulatory mechanisms are inferred to have a more general function in determining progenitor fates. Additional complexity attendant upon the control of RTK activity in this system derives from positive feedback regulation of the Ras/MAPK cascade and from reciprocal regulatory interactions between the Ras and Notch pathways (Halfon, 2000).

The beta3 tubulin gene of Drosophila is expressed in the major mesodermal derivatives during their differentiation. The gene is subject to complex stage- and tissue-specific transcriptional control by upstream as well as downstream regions. Analysis of the vm1 enhancer, which is responsible for tissue-specific expression in the visceral mesoderm and is localized in an intron, reveals a complex modular arrangement of regulatory elements. In vitro and in vivo experiments uncovered two binding sites [termed UBX1 and UBX2, for the product of the homeotic gene Ultrabithorax (Ubx)] that are required for high-level expression in pPS6 and PS7. Further analysis of the vm1 enhancer has revealed that deletion of a specific element, termed element 7 (e7), abolishes transcription of the lacZ reporter gene in all parasegments except pPS6/PS7. Gel-retardation and footprint analysis has identified a binding site for the homeodomain protein Tinman, which is essential for the specification of the dorsal mesoderm, within e7. Simultaneous deletion of two further sequence blocks in the vm1 enhancer, named element 3 (e3) and element 6 (e6), results in a reduction analogous to that caused by removal of e7. The e6 sequence contains conserved motifs also found in the visceral enhancer of the Ubx gene. It is therefore concluded that these elements act in concert with the Tinman binding site to achieve high expression levels. Thus the vm1 enhancer of the beta3 tubulin gene contains a complex array of elements that are involved in transactivation by a combination of tissue- and position-specific factors, including Tinman and UBX (Kremser, 1999b).

biniou (FoxF), a central component in a regulatory network controlling visceral mesoderm development and midgut morphogenesis in Drosophila

The subdivision of the lateral mesoderm into a visceral (splanchnic) and a somatic layer is a crucial event during early mesoderm development in both arthropod and vertebrate embryos. In Drosophila, this subdivision leads to the differential development of gut musculature versus body wall musculature. biniou, the sole Drosophila representative of the FoxF subfamily of forkhead domain genes, has a key role in the development of the visceral mesoderm and the derived gut musculature. biniou expression is activated in the trunk visceral mesoderm primordia downstream of dpp, tinman, and bagpipe and is maintained in all types of developing gut muscles (Zaffran, 2001).

bagpipe-expressing domains are defined by the intersecting dorsal activities of dpp/tin, which act positively, and segmentally modulated activities of wg/slp, which have repressing effects. bin also requires tin activity for normal expression in the trunk visceral mesoderm primordia. Whereas bap expression is virtually absent in these cells upon loss of tin activity, residual bin expression is observed in small clusters of cells. To test the possibility that residual expression of bin in tin mutant embryos is due to direct inputs from Dpp, bin expression was examined in embryos in which dpp expression was induced ectopically in the entire mesoderm. Ectopic dpp in a wild-type background, which causes tin expression to be expanded ventrally, results in an analogous expansion of the bin domains. Notably, ventral expansion of the bin domains is also observed upon ectopic dpp expression in the absence of tin activity, although the domains are narrow. Thus, Dpp is able to induce bin in the absence of tin, although tin activity is required for normal expression levels. The residual expression of bin in tin mutant embryos is unstable and not maintained in later stages of development (Zaffran, 2001).

Similar to tin, bap activity is also required for normal bin expression. This result is in agreement with the temporal sequence of bap and bin expression and with the observed expansion of bin throughout most of the dorsal mesoderm upon ectopic bap expression in the mesoderm. These data suggest that bin is furthest downstream within a mesoderm-intrinsic cascade of gene activation: twist -> tin -> bap -> bin. Moreover, bin itself is required for normal bin expression. Although bin expression initiates normally in stage 10 bin mutant embryos, it disappears at early stage 11 in the trunk visceral mesoderm primordia of bin mutants, except for those in PS1 and 2. bin expression in these two parasegments is also less sensitive to the loss of tin and bap activity. Furthermore, the expression of bin in foregut, hindgut, and caudal visceral mesoderm does not depend on any of the genes examined in the present study (Zaffran, 2001).

Whereas the above data show that maintenance of bin expression in most of the presumptive trunk visceral mesoderm requires positive autoregulation, they do not establish whether this autoregulatory loop is direct or indirect. Of note, maintenance of bap during stage 11 (but not its initiation during stage 10) also requires bin activity. Therefore, it is possible that, at least during stage 11, bin and bap maintain each other's expression through a cross-regulatory feedback loop (Zaffran, 2001).
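The genetic relationships described above (twist -> tin -> bap -> bin, the Dpp inputs, bin autoregulation, and bin-dependent maintenance of bap) can be written down as a small directed graph. The following is only an illustrative sketch: the edge list is a simplified reading of the text, the edges are genetic dependencies that may be direct or indirect, and the names `REGULATES` and `upstream_of` are hypothetical.

```python
# Genetic (not necessarily direct) positive inputs in the trunk visceral
# mesoderm, as summarized in the text. Each key acts on the listed targets.
REGULATES = {
    "twist": ["tin"],
    "dpp": ["tin", "bap", "bin"],
    "tin": ["bap", "bin"],
    "bap": ["bin"],
    "bin": ["bin", "bap"],  # autoregulation and maintenance of bap (stage 11)
}


def upstream_of(target: str, edges: dict = REGULATES) -> set:
    """Return every gene that lies upstream of `target`, following edges
    backwards (a gene inside a feedback loop can be upstream of itself)."""
    found = set()
    frontier = [target]
    while frontier:
        gene = frontier.pop()
        for regulator, targets in edges.items():
            if gene in targets and regulator not in found:
                found.add(regulator)
                frontier.append(regulator)
    return found


for gene in ("tin", "bap", "bin"):
    print(gene, "<-", sorted(upstream_of(gene)))
```

Because of the bin/bap cross-regulatory loop, each of these two genes appears in the other's upstream set, which is the graph-level counterpart of the mutual maintenance proposed for stage 11.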

Jelly belly: A Drosophila LDL receptor repeat-containing signal required for mesoderm migration and differentiation

A screen was performed to identify genes that are transcriptionally regulated by the homeodomain protein Tinman. Tin, a member of the NK family of homeodomain proteins, is required for organogenesis of the embryonic heart and visceral mesoderm. The screening method relies on genetic selection in yeast for a protein-DNA interaction. A library was screened that represents 15% of the Drosophila genomic DNA and six DNA fragments were obtained that satisfied genetic criteria in yeast for Tin binding sites. Most of the genomic DNA fragments were isolated multiple times. Sequence analysis has confirmed the presence of core recognition sites for NK class homeodomains in all of the fragments. To show that these fragments function as Tin-responsive enhancers in vivo, it was asked if they could drive expression of a reporter gene in patterns consistent with Tin regulation (Weiss, 2001).

The screen is surprisingly specific for genes regulated by Tinman (or closely related genes), as demonstrated both by the reporter-construct results and the genes that are located adjacent to the Tinman binding sites. Four fragments identified in the screen were inserted upstream of a lacZ reporter. Three of the four reporter constructs, tested as transgenes, are active in patterns consistent with Tin regulation. One fragment lies adjacent to jelly belly (jeb), a gene expressed in ventral, early mesoderm. The Tin binding site that led to the identification of jeb contains two Tin/NK2 class homeodomain recognition sites oriented as an imperfect inverted repeat. This genomic fragment was mapped to interval 48E9 of polytene chromosome 2R by in situ hybridization and based on the Drosophila genome sequence. The Tin binding sites lie adjacent to a P element insertion within a large intron of the jeb gene (Weiss, 2001).

jeb expression in tin mutant embryos is scarcely different from wild-type, though it may be somewhat reduced. Tin activation of jeb transcription is likely to be redundant with other regulators of mesoderm development. To test the sufficiency of Tin for activating jeb, embryos in which tin was ectopically expressed were assessed for ectopic jeb expression. Misexpression of tin in the ectoderm with an engrailed GAL4 driver does not alter jeb expression. Misexpression of tin throughout the mesoderm is sufficient to activate jeb expression at a late time (stage 12) when it is not expressed in wild-type embryos, and in cells where jeb is not normally expressed. A cofactor in the mesoderm may be required for Tin-mediated activation of jeb transcription. The expression domains of tin and jeb imply that Tin's role in the regulation of jeb is restricted to the earliest stages of jeb expression, since at late stage 10, Tin is only in dorsal mesoderm and Jeb is in ventral mesoderm (Weiss, 2001).

The ability of Tin to activate jeb transcription ectopically in the mesoderm implies that Tin plays an early and redundant function in the regulation of jeb. Other regulators that may play roles in the regulation of jeb include the bHLH protein Twist and the Pax domain protein Pox Meso (Weiss, 2001).

The T-box genes midline and H15 are conserved regulators of heart development: The expression of midline and H15 is dependent on Wingless signaling and tinman and pannier

The Drosophila melanogaster genes midline and H15 encode predicted T-box transcription factors homologous to vertebrate Tbx20 genes. All identified vertebrate Tbx20 genes are expressed in the embryonic heart and both midline and H15 are expressed in the cardioblasts of the dorsal vessel, the insect organ equivalent to the vertebrate heart. The midline mRNA is first detected in dorsal mesoderm at embryonic stage 12 in the two progenitors per hemisegment that will divide to give rise to all six cardioblasts. Expression of H15 mRNA in the dorsal mesoderm is detected first in four to six cells per hemisegment at stage 13. The expression of midline and H15 in the dorsal vessel is dependent on Wingless signaling and the transcription factors tinman and pannier. The selection of two midline-expressing cells from a pool of competent progenitors is dependent on Notch signaling. Embryos deleted for both midline and H15 have defects in the alignment of the cardioblasts and associated pericardial cells. Embryos null for midline have weaker and less penetrant phenotypes while embryos deficient for H15 have morphologically normal hearts, suggesting that the two genes are partially redundant in heart development. Despite the dorsal vessel defects, embryos mutant for both midline and H15 have normal numbers of cardioblasts, suggesting that cardiac cell fate specification is not disrupted. However, ectopic expression of midline in the dorsal mesoderm can lead to dramatic increases in the expression of cardiac markers, suggesting that midline and H15 participate in cardiac fate specification and may normally act redundantly with other cardiogenic factors. Conservation of Tbx20 expression and function in cardiac development lends further support for a common ancestral origin of the insect dorsal vessel and the vertebrate heart (Miskolczi-McCallum, 2005).

In order to determine where mid and H15 fit in the genetic hierarchy controlling heart development, their expression was examined in several mutant backgrounds. The initiation of mid expression in the dorsal mesoderm in early stage 12 occurs after the expression of tin and pnr, as well as after the period of Wg signaling in the dorsal mesoderm, suggesting that mid and H15 are regulated downstream of the factors that confer cardiac fate. Indeed, the dorsal vessel expression of mid and H15 is completely lost in both wg^cx4 and tin^ec40 mutant embryos, which fail to specify dorsal mesoderm. Embryos mutant for pnr have greatly decreased numbers of cardioblasts. Accordingly, mid and H15 expression is variably lost in pnr^vx6 null mutant embryos, with most embryos completely lacking mid expression in the dorsal mesoderm. Ectopic expression of pnr throughout the mesoderm using the GAL4/UAS system is able to induce ectopic expression of mid and H15. These results indicate that the initiation of mid and H15 in the dorsal mesoderm is downstream of factors required for the specification of cardiac fate (Miskolczi-McCallum, 2005).

A second study by Qian (2005) provides more detailed information on a cellular fate switch accompanying loss- and gain-of-function of this gene pair. Referring to midline and H15 by the alternative name neuromancer (nmr), the Qian study shows that loss of gene function causes a switch in cell fates in the cardiogenic region, in that the progenitors expressing the homeobox gene even skipped (eve) are expanded, accompanied by a corresponding reduction of the progenitors expressing the homeobox gene ladybird (lbe). As a result, the number of differentiating myocardial cells is severely reduced whereas pericardial cell populations are expanded. Conversely, pan-mesodermal expression of nmr represses eve, while causing an expansion of cardiac lbe expression, as well as ectopic mesodermal expression of the homeobox gene tinman. In addition, nmr mutants with less severe penetrance exhibit cell alignment defects of the myocardium at the dorsal midline, suggesting nmr is also required for cell polarity acquisition of the heart tube. In exploring the regulation of nmr, it was found that the GATA factor Pannier is essential for cardiac expression, and acts synergistically with Tinman in promoting nmr expression. Moreover, reducing nmr function in the absence of pannier further aggravates the deficit in cardiac mesoderm specification. Taken together, the data suggest that nmr acts both in concert with and subsequent to pannier and tinman in cardiac specification and differentiation. It is proposed that nmr is another determinant of cardiogenesis, along with tinman and pannier (Qian, 2005).

Integration of positive Dpp signals, antagonistic Wg inputs and mesodermal competence factors and their impact on Bagpipe expression during Drosophila visceral mesoderm induction

Tissue induction during embryonic development relies to a significant degree on the integration of combinatorial regulatory inputs at the enhancer level of target genes. During mesodermal tissue induction in Drosophila, various combinations of inductive signals and mesoderm-intrinsic transcription factors cooperate to induce the progenitors of different types of muscle and heart precursors at precisely defined positions within the mesoderm layer. Dpp signals are required in cooperation with the mesoderm-specific NK homeodomain transcription factor Tinman (Tin) to induce all dorsal mesodermal tissue derivatives, which include dorsal somatic muscles, the dorsal vessel, and visceral muscles of the midgut. Wingless (Wg) signals modulate the responses to Dpp/Tin along anteroposterior positions by cooperating with Dpp/Tin during dorsal vessel and somatic muscle induction while antagonizing Dpp/Tin during visceral mesoderm induction. As a result, dorsal muscle and cardiac progenitors form in a pattern that is reciprocal to that of visceral muscle precursors along the anteroposterior axis. The present study addresses how positive Dpp signals and antagonistic Wg inputs are integrated at the enhancer level of bagpipe (bap), a NK homeobox gene that serves as an early regulator of visceral mesoderm development. An evolutionarily conserved bap enhancer element requires combinatorial binding sites for Tin and Dpp-activated Smad proteins for its activity. Adjacent binding sites for the FoxG transcription factors encoded by the Sloppy paired genes (slp1 and slp2), which are direct targets of the Wg signaling cascade, serve to block the synergistic activity of Tin and activated Smads during bap induction. In addition, binding sites for yet unknown repressors are essential to prevent the induction of the bap enhancer by Dpp in the dorsal ectoderm. These data illustrate how the same signal combinations can have opposite effects on different targets in the same cells during tissue induction (Lee, 2005).

To investigate whether the bap regulators identified genetically, including tin, dpp, slp (downstream of wg) and biniou (bin), can act directly on the early TVM regulatory element of bap, DNaseI protection experiments with recombinant Tin, Bap, Smad (Mad and Medea), Slp and Bin proteins were performed on the 180 bp bap3.2.1 DNA sequence from D. melanogaster. The DNA footprinting results demonstrate that both Tin and Bap proteins can bind to the predicted Tin-binding site, which includes a perfect match to the canonical Tin-binding motif TCAAGTG. In addition to the Tin-binding site, a site with a TAAG core motif can strongly bind Bap but not Tin (CTTA in opposite strand; note that the same core motif is found in binding sites of a Bap ortholog, Nkx3.2). With regard to Dpp signaling mediators, there are five Mad-protected regions, three of which are also protected by recombinant Medea (Mad/Medea-1 to -3). Site 1 includes an AGAC motif that was initially identified as a Smad binding motif in vertebrates whereas sites 3-5 contain GC-rich sequences with CGGC motifs that were first shown to bind Smad proteins in Drosophila. Site 2 may be a combination of the two types (TGAC motif and CG-rich sequences). No clear correlation of either type of site was observed with the binding of Mad versus Medea. Finally, recombinant Slp proteins protect a wide stretch that includes an inverted repeat of core binding motifs for forkhead transcription factors (TAAACA), but extends further downstream. Slp can bind to tandem repeats of CAAA sequences, which are present in three copies in the 3' region of the protected region. Gel mobility shift and competition assays with Slp using wild-type oligonucleotides and a version in which the TAAACA motifs were mutated indicate that Slp can bind to both the TAAACA and the CAAA motifs with roughly equal affinity. In addition, the FoxF family protein Bin binds to the TAAACA inverted repeat region, but less well to the CAAA repeat region when compared with Slp. Taken together, these binding data are consistent with the hypothesis that the known mesodermal regulators of bap, namely Tin and Bin (and possibly autoregulatory Bap), as well as the signaling inputs from Dpp and Wg (through Smads and Slp, respectively) are integrated via direct binding to the early TVM enhancer of bap (Lee, 2005).
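As an illustration of how the core motifs named above (TCAAGTG, TAAG, AGAC, CGGC, TAAACA) could be located on both strands of a candidate sequence, a minimal sketch follows. The test sequence is invented for the example and is not the bap3.2.1 sequence; exact-match scanning is also a simplification, since footprinted sites in the study extend beyond these cores.

```python
# Core motifs mentioned in the text (exact matches only; a simplification).
CORE_MOTIFS = {
    "Tin": "TCAAGTG",
    "Bap core": "TAAG",
    "Smad (AGAC type)": "AGAC",
    "Smad (GC-rich type)": "CGGC",
    "Forkhead (Slp/Bin)": "TAAACA",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_motif(seq: str, motif: str) -> list:
    """Return sorted (start, strand) hits, 0-based on the given strand.
    '-' hits are matches to the reverse complement of the motif."""
    seq = seq.upper()
    hits = []
    patterns = [("+", motif)]
    if reverse_complement(motif) != motif:  # skip duplicate for palindromes
        patterns.append(("-", reverse_complement(motif)))
    for strand, pattern in patterns:
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)  # allow overlapping hits
    return sorted(hits)


# Hypothetical sequence containing a Tin motif, a Bap core on the opposite
# strand (CTTA), and an inverted repeat of the forkhead core.
example = "GGTCAAGTGCCTTAGGTGTTTATAAACACC"
for name, motif in CORE_MOTIFS.items():
    print(name, find_motif(example, motif))
```

In the invented sequence, the forkhead core is found once on each strand at adjacent positions, which is how an inverted repeat such as the one protected by Slp and Bin would show up in this kind of scan.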

The present study describes an example of an enhancer whose response to Dpp is suppressed by Wg signals. A comparison of the functional organization of these enhancers provides new insight into molecular strategies of nuclear signal integration to produce differential developmental responses. The data show that bap is a direct target of Dpp signals. Thus, an indirect pathway of bap being activated solely by tin, whose mRNA expression is known to depend on Dpp inputs during the time of bap activation, can be ruled out. Rather, tin acts simultaneously and synergistically with Dpp. In fact, recent data with tin alleles lacking the Dpp-responsive enhancer show that bap can be induced in the absence of Dpp-induced tin products, as long as the twist-activated tin products are present. The molecular basis for this observed synergism of tin and dpp relies on the combinatorial binding of Tin and Dpp-activated Smad proteins to the bap enhancer. Several possible molecular mechanisms could underlie the strict requirement for combinatorial binding of Tin and Smads. For example, the relatively low binding affinity and specificity of Smads might be enhanced by bound Tin, which can engage in protein interactions with Mad and Medea. The combined presence of Tin and Smads in close vicinity or in complexes may also be a prerequisite for the assembly of higher order complexes with transcriptional co-activators such as CBP/p300. In addition, Tin may counteract the function of yet unknown repressors of nuclear Dpp signaling activity so that they can only repress in the ectoderm (Lee, 2005).

Unlike Dpp, Wg signals act indirectly upon the early bap enhancer. Previous genetic and molecular data have shown that Wg induces the expression of the forkhead domain-encoding gene slp via crucial dTCF/Lef-1 binding sites in both mesoderm and ectoderm. slp, in turn, functions as a repressor of bap. The present data show that slp products exert this function by direct binding to the Dpp-responsive bap enhancer, which obviously results in a suppression of the synergistic activity of bound Tin and Smad complexes. Slp proteins contain eh1 motifs that can potentially bind the Groucho co-repressor and Slp has known repressor activities in other contexts. In addition, the vertebrate counterpart of Slp, FoxG (BF-1), is known to interact with Groucho and histone deacetylases (Yao, 2001). Thus, it is proposed that Slp overrides nuclear Dpp signaling activities by dominantly establishing an inactive state of the chromatin at the bap locus (Lee, 2005).

Why would induction of tin and bap in the mesoderm require Tin as a co-factor of Smads, whereas in the ectoderm, which lacks Tin, the induction of tin and bap needs to be actively repressed? In the case of the tin enhancer, the ectodermal repressor elements overlap with the Tin-binding sites. Based upon this situation, a model is proposed in which the repressor would be present in both germ layers, but in cells of the mesoderm it is competed away from binding to the enhancer by Tin. This model is compatible with data showing that ectopic expression of Tin in the ectoderm is able to activate the Dpp-responsive enhancer of tin, even in the presence of the putative repressor binding elements. However, unlike full-length Tin, an N-terminally truncated version with an intact homeodomain is not able to allow induction of the tin enhancer in the ectoderm. Furthermore, the putative repressor binding sites in the bap enhancer are separate from the Tin site. Hence, Tin does not compete for binding but may rather block or override the repressor factor(s) functionally. Thus, the positive activity of Tin would dominate over the negative action of this repressor in the mesoderm. By contrast, the repressing activity of Slp dominates over the positive action of Tin. Through this intricate balance of positive and negative switches, Tin could ensure that bap is induced by Dpp only in the mesoderm, while bound Slp prevents Tin from promoting Dpp inputs towards bap in striped domains within this germ layer. However, it still cannot be fully explained why the absence of both the functional Tin and ectodermal repressor sites allows enhancer induction in the ectoderm, while preventing it in the mesoderm. The additional positive and negative binding factors involved will need to be identified to gain a full understanding of the germ layer-specific induction of these Dpp-responsive enhancers (Lee, 2005).

The bap enhancer described in this study represents the third example of well-characterized Dpp-responsive enhancers from mesodermal control genes. The other two are from tin, which is induced in the entire dorsal mesoderm, and eve, which is active in a small number of somatic muscle founder cells and pericardial progenitors in the dorsal mesoderm. The activities of the bap and eve enhancers along the anteroposterior axis are reciprocal, because the eve enhancer requires inputs from Wg, whereas bap enhancer activity is suppressed by Wg. A comparison of the molecular architecture of these three enhancers reveals that they all share a number of important features. Most notably, all three enhancers feature several Tin- and Smad-binding sites in close vicinity that are essential for the activation of the enhancer in the mesoderm. Each enhancer includes both types of known Smad-binding motifs, which have 'AGAC' and 'CG'-rich cores, respectively. Hence, the basic activation mechanisms of each of the three enhancers downstream of Dpp are likely to be closely related. In the enhancers of both tin and bap, binding sites for a nuclear repressor of Dpp signals are key for the germ layer specificity of the inductive response. Although it is not known whether the same repressive mechanism operates at the eve enhancer, it is noted that motifs related to the presumed repressor binding motifs are present and their function can now be tested in vivo. As in the case of bap, the tin enhancer also includes additional sites that are required for Dpp-inducible enhancer activity, which may bind essential Smad co-factors. However, based upon the divergent sequences of these sites (C1 site in the bap and 'CAATGT' motifs in the tin enhancer), they appear to bind different types of factors in each case (Lee, 2005).

On top of this basic arrangement that allows the enhancer to be active in the dorsal mesoderm, the enhancers from bap and eve, but not tin, include binding sites that make them respond to Wg inputs in an opposite fashion. In the case of bap, Wg-induced Slp binds and dominantly suppresses the activity of bound Smad effectors. For the eve enhancer it has been proposed that there is an analogous repressive activity; however, in this case, it is exerted by bound Wg signal effectors, i.e., dTCF/Lef-1, in the absence of Wg signals. In the domains with active Wg signaling, the repressive activity of dTCF/Lef-1 is neutralized by the Wg signaling cascade, which allows the Dpp effectors to be active at the eve enhancer (since it lacks Slp binding sites). Through these switches, the bap and eve enhancers become induced in reciprocal AP patterns. In addition, the eve enhancer includes binding sites for activators and repressors downstream of receptor tyrosine kinases and Notch, respectively, which serve to restrict eve activity to specific subsets of cells within the domains of overlapping Dpp and Wg activities. Clearly, many of the molecular details still need to be clarified. Nevertheless, the basic principles of how differential inputs from inductive signals and tissue-specific activities can be integrated at the enhancer level to achieve distinct patterns of target gene expression during early tissue induction in the Drosophila mesoderm are now beginning to be understood (Lee, 2005).
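The reciprocal behavior of the bap and eve enhancers described above can be summarized as a Boolean sketch. This is a deliberately coarse simplification for illustration, not a model from Lee (2005): signals are treated as on/off, Tin presence is equated with mesodermal identity, Notch input is ignored, and the names `CellState`, `bap_enhancer_active` and `eve_enhancer_active` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CellState:
    mesoderm: bool     # Tin present (mesodermal competence factor)
    dpp: bool          # cell receives Dpp (activated Smads in the nucleus)
    wg: bool           # cell receives Wg
    ras: bool = False  # inductive RTK/Ras signaling (relevant for eve only)


def bap_enhancer_active(cell: CellState) -> bool:
    """Tin and Smads synergize unless Wg-induced Slp is bound, which dominates."""
    slp_bound = cell.wg  # assumption: Wg signaling implies Slp expression
    return cell.mesoderm and cell.dpp and not slp_bound


def eve_enhancer_active(cell: CellState) -> bool:
    """dTCF represses without Wg; Wg relieves this so Tin/Smads can act, and
    RTK/Ras input further restricts activity to a subset of cells."""
    dtcf_repressing = not cell.wg
    return cell.mesoderm and cell.dpp and not dtcf_repressing and cell.ras


# Dorsal mesoderm cells inside and outside the Wg domain, plus dorsal ectoderm.
cells = {
    "dorsal mesoderm, Wg domain, Ras on": CellState(True, True, True, True),
    "dorsal mesoderm, outside Wg domain": CellState(True, True, False, False),
    "dorsal ectoderm, Dpp on": CellState(False, True, False, False),
}
for label, cell in cells.items():
    print(f"{label}: bap={bap_enhancer_active(cell)}, eve={eve_enhancer_active(cell)}")
```

Under these assumptions the two enhancers are never active in the same cell, which mirrors the reciprocal anteroposterior patterns, and neither responds to Dpp in the ectoderm, where Tin is absent and the ectodermal repressors act.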

The Dorsocross T-box genes are key components of the regulatory network controlling early cardiogenesis in Drosophila; Dorsocross genes act in parallel with tinman to activate the expression of pannier

Cardiac induction in Drosophila relies on combinatorial Dpp and Wg signaling activities that are derived from the ectoderm. Although some of the actions of Dpp during this process have been clarified, the exact roles of Wg, particularly with respect to myocardial cell specification, have not been well defined. The present study identifies the Dorsocross T-box genes as key mediators of combined Dpp and Wg signals during this process. The Dorsocross genes are induced within the segmental areas of the dorsal mesoderm that receive intersecting Dpp and Wg inputs. Dorsocross activity is required for the formation of all myocardial and pericardial cell types, with the exception of the Eve-positive pericardial cells. In an early step, the Dorsocross genes act in parallel with tinman to activate the expression of pannier, a cardiogenic gene encoding a Gata factor. Loss- and gain-of-function studies, as well as the observed genetic interactions among Dorsocross, tinman and pannier, suggest that co-expression of these three genes in the cardiac mesoderm, which also involves cross-regulation, plays a major role in the specification of cardiac progenitors. After cardioblast specification, the Dorsocross genes are re-expressed in a segmental subset of cardioblasts, which in the heart region develop into inflow valves (ostia). The integration of this new information with previous findings has allowed drawing a more complete pathway of regulatory events during cardiac induction and differentiation in Drosophila (Reim, 2005b).

In vertebrate species, genetic studies with loss-of-function alleles have implicated Tbx1, Tbx2, Tbx5 and Tbx20 in the control of heart morphogenesis and the regulation of cardiac differentiation markers. In the case of Tbx5, a small number of cardiac differentiation genes have been identified as direct downstream targets. However, owing to the complexity of the system, the respective positions of these genes within a regulatory network during early cardiogenesis are still poorly understood (Reim, 2005b).

Drosophila offers a simpler system to study regulatory networks in cardiogenesis. The Tbx20-related T-box genes mid and H15 have been shown to play a role in cardiac development downstream of the early function of the NK homeobox gene tin and the Gata gene pannier (pnr). Whereas the role of these genes in the morphogenesis of the cardiac tube is minor, they are involved in processes of cardiac patterning and differentiation during the second half of cardiogenesis, which includes the activation of tin expression in myocardial cells (Reim, 2005a). The present report characterizes the roles of the Tbx6-related Dorsocross T-box genes (which may actually have arisen from a common ancestor of the vertebrate Tbx4, Tbx5 and Tbx6 genes), in Drosophila cardiogenesis. The Doc genes have a fundamental early role; they are required for the specification of all cardiac progenitors that generate pure myocardial and pericardial lineages. They are not required for generating dorsal somatic muscle progenitors and lineages with mixed pericardial/somatic muscle, even though their early expression domains also include cells giving rise to these lineages (Reim, 2005b).

The new information on the regulation and function of Doc fills a major gap in the understanding of early Drosophila cardiogenesis. Previous data have shown that the combinatorial activities of Wg and Dpp are required for the formation of both myocardial and pericardial cells. In addition, the homeobox gene even-skipped (eve) is a direct target of the combined Wg and Dpp signaling inputs in specific pericardial cell/dorsal somatic muscle progenitors. Current data identify the Doc genes as downstream mediators and potential direct targets of combined Wg and Dpp signals during the induction of myocardial and Eve-negative pericardial cell progenitors. The induction of Doc expression by Wg and Dpp occurs concurrently with the induction of tin by Dpp alone, at a time when the mesoderm still consists of a single layer of cells. As a result, tin and Doc are co-expressed in a segmental subset of dorsal mesodermal cells that include the presumptive cardiogenic mesoderm. Conversely, in the intervening subset of dorsal mesodermal cells (the presumptive visceral mesoderm precursors) tin is co-expressed with bagpipe (bap) and biniou (bin), which are both negatively regulated by Wg via the Wg target sloppy paired (slp). Ultimately, these shared responses to Dpp, differential responses to Wg and the specific genetic activities of Doc versus bap and bin lead to the reciprocal arrangement of cardiac versus visceral mesoderm precursors in the dorsal mesoderm (Reim, 2005b and references therein).
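The input logic described in this paragraph can be condensed into a small truth table. The Python sketch below is only an illustrative simplification and is not taken from Reim (2005b): it assumes that each signal is simply present or absent in a dorsal mesodermal cell, that bap and bin require Dpp (one reading of "shared responses to Dpp"), and that timing, cross-regulation, and the competence role of Tin can be ignored. The function name `dorsal_mesoderm_response` and the dictionary keys are hypothetical.

```python
def dorsal_mesoderm_response(dpp: bool, wg: bool) -> dict:
    """Toy boolean summary of early dorsal mesoderm gene responses.

    Assumption: every input and output is all-or-none; real expression
    is graded and dynamic.
    """
    tin = dpp                   # tin is induced by Dpp alone
    doc = dpp and wg            # Doc needs intersecting Dpp and Wg inputs
    slp = wg                    # slp is a Wg target
    bap_bin = dpp and not slp   # bap/bin are repressed by Wg via slp

    if tin and doc:
        fate = "presumptive cardiogenic mesoderm"
    elif tin and bap_bin:
        fate = "presumptive visceral mesoderm"
    else:
        fate = "other"
    return {"tin": tin, "doc": doc, "bap_bin": bap_bin, "fate": fate}


# Enumerate all four input combinations.
for dpp in (True, False):
    for wg in (True, False):
        print(dpp, wg, dorsal_mesoderm_response(dpp, wg)["fate"])
```

In this simplified model, cells receiving both signals and cells receiving Dpp alone end up in alternating domains, which mirrors the reciprocal arrangement of cardiac versus visceral mesoderm precursors described in the text.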

Although the Dpp signaling pathway (and likewise, the Wg pathway) is activated in both ectodermal and mesodermal germ layers, tin and bap respond to it only in the mesoderm. The germ layer-specific response of these genes to Dpp relies on two probably interconnected mechanisms. The first of these involves the additional requirement for Tin protein as a mesodermal competence factor for Dpp signals, which is initially produced in the mesoderm downstream of twist. The second involves the specific repression of the responses of tin and bap to Dpp in the ectoderm by yet unidentified factors that bind to the Dpp-responsive enhancers of these two genes. By contrast, the Doc genes are induced by Dpp and Wg with the same spatial and temporal expression patterns in both germ layers. This implies that the (yet unknown) Dpp and Wg-responsive enhancer(s) of the Doc genes are not subject to the ectodermal repressor activities acting on the tin and bap enhancers, and fits with the observation that induction of Doc in the mesoderm does not require Tin as a mesodermal competence factor. However, because of the distinct roles of Doc in the ectoderm and mesoderm, this situation also implies that Doc must act in combination with germ layer-specific co-factors to exert its respective functions. These data suggest that, in the early mesoderm, Doc acts in combination with tin (Reim, 2005b).

A key gene requiring combinatorial Doc and Tin activities for its activation in the cardiac mesoderm is the Gata factor-encoding gene pannier (pnr). pnr expression is activated in the cardiac mesoderm shortly after the induction of Doc and tin, at a time when Doc expression has narrowed to the mesodermal precursors giving rise to pure cardiac lineages. The mechanisms restricting Doc expression to the cardiac mesoderm are currently not known, but as a consequence, pnr expression is also limited to the cardiac mesoderm. It is conceivable that Doc receives continued inputs during this period from the ectoderm through Dpp, whose expression domain narrows towards the dorsal leading edge by then. Together with the observed feedback regulation of pnr on tin and Doc, this situation leads to a prolonged co-expression of Tin, Doc and Pnr in the cardiac mesoderm of stage 11 to stage 12 embryos. Based upon the onset of the expression of early markers such as mid and svp, this is precisely the period when cardiac progenitors become specified (Reim, 2005b).

It is anticipated that the activation of some downstream targets in presumptive cardiac progenitors requires the combination of two, or perhaps all three, of these cardiogenic factors. Potential target genes include mid, svp and hand. However, none of these candidates is essential for generating cardiac progenitors, although mid and svp are known to be required for the normal diversification of cardioblasts within each segment (Reim, 2005b).

The observation that forced expression of Pnr in the absence of any Doc partially rescues cardiogenesis could indicate that the early, combinatorial functions of tin and Doc are primarily mediated by pnr. Alternatively, or in addition, this observation and the fact that a few cardioblasts can be generated without Doc could point to the existence of some degree of functional redundancy among these three factors. In the context of the latter possibility, it is tempting to speculate that the functional redundancy among T-box, Nkx and Gata factors during early cardiogenesis has further increased during the evolution of the vertebrate lineages. This would explain the less dramatic effects of the functional ablation of Tbx5, Nkx2-5 and Gata4/5/6 on vertebrate heart development as compared to the severe effects of Doc, tin or pnr mutations on dorsal vessel formation in Drosophila. Like the related Drosophila genes, these vertebrate genes are co-expressed in the cardiogenic region and developing heart of vertebrate embryos, which at least for Nkx2.5 and Gata6 also involves cross-regulatory interactions that reinforce their mutual expression (Reim, 2005b).

The observed co-expression of Doc, Tin and Pnr allows for the possibility that, in addition to combinatorial binding to target enhancers, protein interactions among these factors play a role in providing synergistic activities during cardiac specification. Physical interactions of Tbx5 with Gata4 and Nkx2-5, as well as between Nkx2-5 and Gata4, have been demonstrated in vitro in mammalian systems, as have synergistic activities in cell culture assays, and these interactions may be relevant to human heart disease. In Drosophila, the genetic interactions between Doc, tin and pnr observed both in loss- and gain-of-function experiments reveal similar synergistic activities of the encoded factors during early cardiogenesis. Altogether, these observations make it likely that these Drosophila factors also act through combinatorial DNA binding and mutual protein interactions to turn on target genes required for the specification of cardiac progenitors (Reim, 2005b).

Whereas pnr is expressed only transiently during early cardiogenesis, tin and Doc continue to be expressed in developing myocardial cells, suggesting that they act both in specification and differentiation events. Recently it was shown that the T-box gene mid is required for re-activating tin in cardioblasts (Reim, 2005a). Of note, owing to the action of svp, Doc and tin are expressed in complementary subsets of cardioblasts within each segment. This mutually exclusive expression of tin and Doc implies that they are not acting combinatorially but, instead, act differentially during later stages of myocardial development. Hence, their activities could result in the differential activation of some differentiation genes such as Sulfonylurea receptor (Sur), which is specifically expressed in the four Tin-positive cardioblasts in each hemisegment (Nasonkin, 1999; Lo, 2001), and wingless (wg), which is only turned on in the two Doc-positive cells in each hemisegment of the heart that generate the ostia. Surprisingly, even the activation of some genes that are expressed uniformly in all cardioblasts has turned out to result from differential regulation within the Tin-positive versus Doc-positive cardioblasts. For example, regulatory sequences from the Mef2 gene for the two types of cardioblasts are separable and those active within the four Tin-positive cells are directly targeted by Tin. Likewise, regulatory sequences from a cardioblast-specific enhancer of Toll have been shown to receive differential inputs from Doc and Tin, respectively, in the two types of cardioblasts. In parallel with this differential regulation, it is anticipated that yet unknown differentiation genes are activated uniformly in all cardioblasts downstream of mid/H15 and hand. The integration of the new information on the roles of Doc in cardiogenesis has now provided a basic framework of signaling and gene interactions through all stages of embryonic heart development, which in the future can be further refined upon the identification of new components and additional molecular interactions (Reim, 2005b).

Expression, regulation, and requirement of the Toll transmembrane protein during dorsal vessel formation; The Toll transcriptional enhancer is regulated by both Doc and Tin

Early heart development in Drosophila and vertebrates involves the specification of cardiac precursor cells within paired progenitor fields, followed by their movement into a linear heart tube structure. The latter process requires coordinated cell interactions, migration, and differentiation as the primitive heart develops toward status as a functional organ. In the Drosophila embryo, cardioblasts emerge from bilateral dorsal mesoderm primordia, followed by alignment as rows of cells that meet at the midline and morph into a dorsal vessel. Genes that function in coordinating cardioblast organization, migration, and assembly are integral to heart development, and their encoded proteins need to be understood as to their roles in this vital morphogenetic process. The Toll transmembrane protein is expressed in a secondary phase of heart formation, at lateral cardioblast surfaces as they align, migrate to the midline, and form the linear tube. The Toll dorsal vessel enhancer has been characterized, with its activity controlled by Dorsocross and Tinman transcription factors. Consistent with the observed protein expression pattern, phenotype analyses demonstrate Toll function is essential for normal dorsal vessel formation. Such findings implicate Toll as a critical cell adhesion molecule in the alignment and migration of cardioblasts during dorsal vessel morphogenesis (Wang, 2005).

At the time dorsal-ventral polarity is established during early Drosophila development, Toll is associated with the plasma membrane around the entire syncytial blastoderm embryo. Thereafter, Toll exhibits zygotic expression on several cell surfaces, including a specific dorsal cell type in late-stage embryos. These were identified at first as leading-edge cells of the two epidermal sheets moving toward the dorsal midline. Toll expression in dorsal aspects of the embryo has been reevaluated and, to the contrary, it has now been concluded that the gene is expressed in cardioblasts of the developing and formed dorsal vessel (Wang, 2005).

Initially, Toll mRNA accumulation was analyzed by in situ hybridization, with gene transcripts first detected in dorsal cell populations in stage 12 embryos and later in two converging rows of cells during the process of dorsal closure. The likelihood of the Toll-positive cells being cardioblasts was strongly implied by the pattern of mRNA accumulation in stage 16 embryos. Toll expression was detected in roughly 50 cell pairs, and the organization of said cells was reminiscent of cardioblasts within structurally identifiable aorta and heart regions of the assembled dorsal vessel. The pattern of Toll protein expression was also investigated, with results comparable to those obtained in the RNA analysis. The transmembrane protein was detected in dorsal cells in late stage 12/early stage 13 embryos. Thereafter, it showed a clear presence on lateral surfaces of all cells aligned within two contiguous rows as they migrate toward the dorsal midline. By stage 16, the Toll-positive cells populate the core of the dorsal vessel, again within defined aorta and heart subregions. Toll was found exclusively on cardioblast surfaces, while organ-associated pericardial, lymph gland, and ring gland cells failed to express the protein. High-resolution analysis by confocal microscopy demonstrated Toll presence at lateral points of contact between all cardioblasts of the mature dorsal vessel (Wang, 2005).

Toll zygotic transcription is complex based on the numerous cell and tissue types that express the gene. Through efforts to identify a regulatory sequence controlling Toll expression in central nervous system (CNS) midline glial cells, Wharton (1993) located three regions upstream of the gene that possessed transcriptional enhancer activity. Relevant to the demonstration of Toll expression in the dorsal vessel, a 6.5-kb DNA was fortuitously found to direct lacZ reporter expression in all cardioblasts, and in pharyngeal and body wall muscles as well. Due to the interest in understanding how this expression might be regulated, the Toll cardioblast enhancer was delimited within the defined upstream region. At first, the analysis involved testing Toll 5'-flanking DNAs for the ability to drive lacZ expression in embryos of transgenic strains. A 7.1-kb region located between ~9.3 and ~2.2 kb upstream of the gene showed strong enhancer function in all cardioblasts of the dorsal vessel. The DNA was subdivided into five overlapping segments, and only the most distal 1.7-kb DNA maintained cardioblast activity. Subsequently, five fragments spanning this 1.7-kb interval were tested for enhancer function, and dorsal vessel activity was mapped to a 305-bp sequence located between ~8.3 and ~8.0 kb upstream of the Toll transcription start site. Consistent with the timing of Toll mRNA and protein accumulation in cardioblasts, the 305-bp enhancer becomes active during stage 12 and maintains its activity through all subsequent events of dorsal vessel morphogenesis. It is noteworthy that this small DNA also functions in amnioserosa cells from stage 11 through stage 15 (Wang, 2005).

Since Toll encodes a transmembrane protein with leucine-rich repeats in its extracellular domain, a prediction was made that Toll could function as a homophilic cell adhesion molecule, in addition to its well-characterized role as a signal-transducing receptor. In support of this hypothesis, induced expression of the protein in the nonadhesive Schneider 2 cell line causes cellular aggregation, with Toll accumulating at sites of cell-cell interaction. Such a localization property is characteristic of cellular adhesion molecules. Given the highly specialized localization, and structural and functional features of the protein, it is likely that Toll contributes prominently to the molecular environment that aligns and stabilizes cardioblasts on their path toward assembly within the dorsal vessel (Wang, 2005).

The observation of structurally defective dorsal vessels within Toll mutant embryos is consistent with the pattern of Toll expression in cardioblasts. D-MEF2 serves as a marker for all cardioblasts, from their early appearance through their organization within the mature organ. Based on D-MEF2 staining, it appears appropriate numbers of cardioblasts are specified in mutant embryos, but deviations are observed from the normal process of cardioblast alignment and synchronous migration as two contiguous rows of 52 cells. Several other markers for the formed dorsal vessel identified random gaps in the linear organ due to missing and/or abnormally located cardioblasts. Such cardiac phenotypes are reminiscent of those presented by faint sausage (fas) mutant embryos; mutations of the immunoglobulin-like cell adhesion molecule also led to cardioblast alignment problems. Whether Toll and Fas work in combination for the proper alignment and migration of these cells remains to be investigated. Additionally, while structural and phenotypic properties are consistent with its role as a cardioblast adhesion molecule, a function for Toll in mediating signaling events between neighboring cardiac cells cannot be ruled out. So far, no indicators exist for the latter possibility; it was not possible to demonstrate expression of potential Toll transcriptional effectors (Dorsal and Dif) in cells of the dorsal vessel. Either way, these molecular and genetic findings identify Toll as a vital player in dorsal vessel formation (Wang, 2005).

The regulation of Toll expression in cardioblasts was pursued due to an interest in further defining the transcriptional network controlling heart development in Drosophila. The studies demonstrated that Toll heart expression is controlled by a 305-bp DNA located 8.0 kb upstream of the transcription start site. This regulatory module contains multiple binding sites for Doc T-box proteins and a single recognition site for the Tin homeodomain protein, a TCAAGTG sequence at nucleotides 163 to 169 of the enhancer. The evidence is strong for the transcriptional enhancer being regulated by both of these cardiogenic factors. Doc and Tin are expressed in adjacent but nonoverlapping sets of cardioblasts within segments of the dorsal vessel; together, they make up the complete population of inner cardiac cells. A deletion of the distal part of the Toll 305-bp enhancer that removes the strong Doc-A footprint sequence, which likely binds multiple Doc molecules through T-box domain recognition of GTG motifs, eliminates enhancer function in Svp/Doc cells while maintaining activity in Tin cells. Systematically adding back T-box core binding elements to partially, then fully, reestablish the Doc-A binding site restores enhancer function in the Svp/Doc population (Wang, 2005).

As for Tin, mutation of its recognition element in the Toll 305 DNA leads to decreased and variable enhancer activity in both Tin and Svp/Doc cardioblasts. This result suggests that Tin is required not only for the activation of Toll expression in the four cardioblasts per hemisegment that are Tin positive after stage 12 but also for its initiation in all six cardioblasts in each hemisegment during early stage 12. The residual activity of the mutated Toll 305 DNA may reflect some degree of Tin regulation through cryptic, low-affinity binding sites present in the enhancer. Indeed, perusal of the Toll sequence identifies three candidate Tin elements that match the binding consensus at six of seven nucleotide pairs, and other Tin-regulated enhancers of genes such as D-mef2, β3-tubulin, and pnr also employ more than one Tin binding site (Wang, 2005).
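The kind of sequence perusal mentioned here, looking for the exact TCAAGTG element and for candidate sites that match at six of seven positions, can be illustrated with a short scan. The Python sketch below is a hedged example only: the function `find_motif_sites`, the mismatch-counting rule, and the toy input string are assumptions made for illustration, and the toy string is not the Toll enhancer sequence. It is not the procedure used by Wang (2005). Coordinates are reported as 1-based, inclusive positions, the same convention as "nucleotides 163 to 169".

```python
def find_motif_sites(seq: str, motif: str = "TCAAGTG", max_mismatches: int = 1):
    """Scan both strands of seq for motif, allowing a few mismatches.

    Returns a list of (start, end, strand, mismatches) tuples using
    1-based inclusive coordinates on the given (top) strand.
    """
    complement = str.maketrans("ACGT", "TGCA")
    seq = seq.upper()
    motif = motif.upper()
    # A site on the bottom strand reads as the reverse complement on top.
    rc_motif = motif.translate(complement)[::-1]
    k = len(motif)

    hits = []
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        for strand, pattern in (("+", motif), ("-", rc_motif)):
            mismatches = sum(a != b for a, b in zip(window, pattern))
            if mismatches <= max_mismatches:
                hits.append((i + 1, i + k, strand, mismatches))
    return hits


# Hypothetical toy sequence (NOT the real enhancer): one exact site and
# one site that differs from TCAAGTG at a single position.
toy = "AAATCAAGTGAAATCAAGAGAAA"
for hit in find_motif_sites(toy, max_mismatches=1):
    print(hit)
```

Setting `max_mismatches=0` restricts the scan to exact matches, while `max_mismatches=1` corresponds to the six-of-seven criterion. A simple mismatch count treats all positions equally, which a real binding-site model would not necessarily do.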

In contrast to the Toll 305 enhancer element, mutation of the same Tin site in the Toll 258 DNA completely silenced the enhancer in the normally Tin-active cells. This result strongly implies that Tin, and at least one other factor working through the distal 47 bp of DNA, are required for activating the Toll gene. Candidates for such factors are the Doc T-box proteins, which are initially expressed in all cardioblast progenitors during mid stage 12, as well as products of the T-box genes H15 and Midline (Mid), which are expressed in all cardioblasts from mid stage 12 onward. Mid can bind to the same regions of Toll DNA as Doc, although the relevance of such interactions remains to be investigated. A combinatorial requirement for T-box proteins and Tin during the initiation and/or maintenance of Toll expression is further supported by the observation that derivatives of the enhancer containing only the Doc-A sequences fail to show activity in Svp/Doc cells. Together, these molecular data point to a mechanism wherein T-box proteins, in combination with Tin, initially activate the Toll gene in all cardioblast progenitors. After stage 12, Doc and Tin (perhaps in cooperation with H15 and/or Mid) activate Toll in two complementary subsets of cardioblasts of the dorsal vessel (Wang, 2005).

Unfortunately, a genetic requirement for these two factors in the regulation of the Toll enhancer cannot be proven at this time since Doc and tin mutant embryos fail to produce cardioblasts. Such an analysis could be attempted with the generation of specialized Doc or tin genetic backgrounds that allow for cardioblast specification early on, while lacking protein functions in later stages of dorsal vessel formation. However, forced-expression studies have demonstrated that individual expression of Tin or Doc2 leads to expanded enhancer activity, while simultaneous expression of the cardiac factors results in a robust activation of Toll transcription. These findings convincingly support the model of Doc and Tin being positive transcriptional regulators of the Toll dorsal vessel enhancer (Wang, 2005).

In addition to the demonstration of Doc and Tin as activators of Toll expression in the dorsal vessel, the regulatory analysis has generated important reagents that should facilitate the discovery of novel cardiac-functioning genes of Drosophila. That is, the Toll-cGFP and Toll-nGFP transgenes serve as sensitive markers for assessing distinct aspects of dorsal vessel morphogenesis in living animals. In stage 16 to 17 embryos and thereafter, Toll-cGFP expression can be used to monitor the formation and function of the three pairs of valvelike ostia within the heart region of the dorsal vessel. Likewise, Toll-nGFP can be used to determine the exact number and diversification status of cardioblasts, as larger nuclei are present within Tin-determined cells while smaller nuclei are found in Svp/Doc-determined cells. Such sensitive and easy-to-use reagents will be valuable in genomewide screens to discover new genes involved in Drosophila heart development (Wang, 2005).

Hand is a direct target of Tinman and GATA factors during Drosophila cardiogenesis and hematopoiesis

The Hand gene family encodes highly conserved basic helix-loop-helix (bHLH) transcription factors that play crucial roles in cardiac and vascular development in vertebrates. In Drosophila, a single Hand gene is expressed in the three major cell types that comprise the circulatory system: cardioblasts, pericardial nephrocytes and lymph gland hematopoietic progenitors. Drosophila Hand functions as a potent transcriptional activator, and converting it into a repressor blocks heart and lymph gland formation. Disruption of Hand function by homologous recombination also results in profound cardiac defects that include hypoplastic myocardium and a deficiency of pericardial and lymph gland hematopoietic cells, accompanied by cardiac apoptosis. Targeted expression of Hand in the heart completely rescues the lethality of Hand mutants, and cardiac expression of a human HAND gene, or the caspase inhibitor P35, partially rescues the cardiac and lymph gland phenotypes. These findings demonstrate evolutionarily conserved functions of HAND transcription factors in Drosophila and mammalian cardiogenesis, and reveal a previously unrecognized requirement of Hand genes in hematopoiesis (Han, 2006).

The existence of hemangioblasts, which serve as common progenitors for hematopoietic cells and cardioblasts, has suggested a molecular link between cardiogenesis and hematopoiesis in Drosophila. However, the molecular mediators that might link hematopoiesis and cardiogenesis remain unknown. This study shows that the highly conserved bHLH transcription factor Hand is expressed in cardioblasts, pericardial nephrocytes and hematopoietic progenitors. The homeodomain protein Tinman and the GATA factors Pannier and Serpent directly activate Hand in these cell types through a minimal enhancer, which is necessary and sufficient to drive Hand expression in these different cell types. Hand is activated by Tinman and Pannier in cardioblasts and pericardial nephrocytes, and by Serpent in hematopoietic progenitors in the lymph gland. These findings place Hand at a nexus of the transcriptional networks that govern cardiogenesis and hematopoiesis, and indicate that the transcriptional pathways involved in development of the cardiovascular, excretory and hematopoietic systems may be more closely related than previously appreciated (Han, 2005).

To search for cis-regulatory elements capable of conferring the specific expression pattern of Hand in cardioblasts, pericardial nephrocytes and lymph gland hematopoietic progenitors, a series of reporter genes was generated containing lacZ and the hsp70 basal promoter linked to genomic fragments within a 13 kb genomic region encompassing the gene, and reporter gene expression was examined in transgenic embryos. A 513 bp minimal enhancer, referred to as the Hand cardiac and hematopoietic (HCH) enhancer, was identified between exons 3 and 4 of the Hand gene. HCH is both necessary and sufficient to direct lacZ expression in the entire embryonic heart and lymph gland in a pattern identical to that of the endogenous Hand gene. Further deletions of this enhancer caused either a partial or complete loss of activity. The 513 bp HCH enhancer showed the same expression pattern in the heart and lymph gland as larger genomic fragments that were positive for enhancer activity. It is concluded that this enhancer fully recapitulates the temporal and spatial expression pattern of Hand transcription in the distinct cell types derived from the cardiogenic region (Han, 2005).

The homeobox protein Tinman is essential for the formation of the cardiac mesoderm, from which the heart and blood progenitors arise. However, its potential late functions remain unknown. It is believed that Tinman is not required for the entirety of heart development in flies, because it is not maintained in all the cardiac cells at late stages. The data reveal at least one function for the late-embryonic Tinman expression, which is to maintain Hand expression. The fact that ectopic Tinman can turn on Hand expression dramatically in the somatic muscles is striking and suggests the existence of a Tinman co-factor in muscle cells that can cooperate with Tinman to activate Hand expression; this co-factor would not be expected to be expressed in pericardial cells or the lymph gland. This co-factor should also be expressed in Drosophila S2 cells, since transfected Tinman can increase activity of the HCH enhancer in S2 cells by more than 100-fold. The generally reduced activity of the HCH enhancer that results from mutation of the Tinman-binding sites also suggests that Tinman activity is required to fully activate the Hand enhancer (Han, 2005).

Although Pannier and Serpent bind to the same consensus sites, these GATA factors produce distinct phenotypes when overexpressed in the mesoderm. Ectopic Pannier induces cardiogenesis, shown by the extra number of cardioblasts and pericardial nephrocytes, but does not affect the lymph gland hematopoietic progenitors. Ectopic Serpent, however, induces ectopic lymph gland hematopoietic progenitors, but reduces the number of cardioblasts and pericardial cells. Interestingly, pericardial cells with ectopic Serpent expression have a tendency to form cell clusters resembling those of the lymph gland progenitors, suggesting a partial cell fate transformation. These results suggest that Pannier functions as a cardiogenic factor, whereas Serpent functions as a hematopoietic factor. Although both can activate Hand expression, Pannier and Serpent activate the HCH enhancer in different cell types. This assumption is also supported by the specific expression pattern of Serpent and Pannier in late embryos. Serpent is detected specifically in the lymph gland hematopoietic progenitors but not in any cardiac cells. Pannier expression in the cardiogenic region of late embryos is not clear because of the interference by the high level Pannier expression from the overlying ectoderm. However, the lymph gland was examined in late stage embryos and no Pannier expression was detected in these cells. Together with the evidence from loss-of-function and gain-of-function experiments with Serpent, it is concluded that the HCH-5G-GFP transgene is not expressed in the lymph gland because Serpent could not bind to the mutant enhancer in the lymph gland cells; whereas the lack of HCH-5G-GFP expression in cardiac cells is due to the inability of Pannier to bind the mutant enhancer in these cardiac cells (Han, 2005).

Since tin and pnr are not expressed in all the cardiac cells of late stage embryos but the Hand-GFP transgene is expressed in these cells, it is likely that additional factors control Hand expression in the heart. One group of candidates is the T-box family. Since Doc1, Doc2 and Doc3 genes (Drosophila orthologs to vertebrate Tbx5) are expressed in the Svp-positive cardioblasts where tin is not expressed, and H15 and midline (Drosophila orthologs to vertebrate Tbx20) are expressed in most of the cardiac cells in late embryos, it is likely that the T-box genes activate Hand expression in cells that do not express tin and pannier. However, the enhancer lacking GATA and Tinman sites has no activity, indicating that the additional factors that may activate Hand expression in the heart and lymph gland also require these crucial Tinman and GATA sites, probably through protein interaction between Tinman and the GATA factors (Han, 2005).

In mammals, the adult hematopoietic system originates from the yolk sac and the intra-embryonic aorta-gonad-mesonephros (AGM) region. The AGM region is derived from the mesodermal germ layer of the embryo in close association with the vasculature. Indeed, the idea of the hemangioblast, a common mesodermal precursor cell for the hematopoietic and endothelial lineages, was proposed nearly 100 years ago without clear in vivo evidence. Recently, this idea was substantiated by the identification of a single progenitor cell that can divide into a hematopoietic progenitor cell in the lymph gland and a cardioblast cell in the dorsal vessel in Drosophila (Mandal, 2004). In addition to providing the first evidence for the existence of the hemangioblast, this finding also suggested a close relationship between the Drosophila cardiac mesoderm, which gives rise to cardioblasts, pericardial nephrocytes and pre-hemocytes, and the mammalian cardiogenic and AGM region, which gives rise to the vasculature (including cardiomyocytes), the excretory systems (including nephrocytes) as well as adult hematopoietic stem cells. In fact, in both Drosophila and mammals, the specification of the cardiogenic and AGM region requires the input of Bmp, Wnt and Fgf signaling. In addition to the conserved role of the NK and GATA factors, GATA co-factors (U-shaped in Drosophila and Fog in mice) also play important roles in cardiogenesis and hematopoiesis in both Drosophila and mammals. Recent studies have shown that the Notch pathway is required for both cardiogenic and hematopoietic progenitor specification in Drosophila. It is likely that Notch also plays an important role in mammalian hematopoiesis (Han, 2005).

This study found that Drosophila Hand is expressed in cardioblasts, pericardial nephrocytes and pre-hemocytes, and is directly regulated by conserved transcription factors (NK and GATA factors) that control both cardiogenesis and hematopoiesis. The bHLH transcription factor Hand is highly conserved in both protein sequence and expression pattern in almost all organisms that have a cardiovascular system. In mammals, Hand1 is expressed at high levels in the lateral plate mesoderm, from which the cardiogenic region and the AGM region arise, in E9.5 mouse embryos. Functional studies of Hand1 and Hand2 using knockout mice have demonstrated the essential role of Hand genes during cardiogenesis, whereas the function of Hand genes during vertebrate hematopoiesis has not yet been explored. It will be interesting to determine whether mammalian Hand genes are also regulated in the AGM region by GATA1, GATA2 and GATA3 (vertebrate orthologs to Drosophila Serpent), and whether they play a role in mammalian hematopoiesis (Han, 2005).

In summary, this study places Hand at a pivotal point to link the transcriptional networks that govern cardiogenesis and hematopoiesis. Since the Hand gene family encodes highly conserved bHLH transcription factors expressed in the cardiogenic region of widely divergent vertebrates and probably in the AGM region in mouse, these findings open an avenue for further exploration of the conserved transcriptional networks that govern both cardiogenesis and hematopoiesis, by studying the regulation and functions of Hand genes in vertebrate model systems (Han, 2005).

U-shaped protein domains required for repression of cardiac gene expression in Drosophila

U-shaped is a zinc finger protein that functions predominantly as a negative transcriptional regulator of cell fate determination during Drosophila development. In the early stages of dorsal vessel formation, the protein acts to control cardioblast specification, working as a negative attenuator of the cardiogenic GATA factor Pannier. Pannier and the homeodomain protein Tinman normally work together to specify heart cells and activate cardioblast gene expression. One target of this positive regulation is a heart enhancer of the Drosophila mef2 gene and U-shaped has been shown to antagonize enhancer activation by Pannier and Tinman. Protein domains of U-shaped required for its repression of cardioblast gene expression were mapped. Such studies showed GATA factor interacting zinc fingers of U-shaped are required for enhancer repression, as well as three small motifs that are likely needed for co-factor binding and/or protein modification. These analyses have also allowed for the definition of a 253 amino acid interval of U-shaped that is essential for its nuclear localization. Together, these findings provide molecular insights into the function of U-shaped as a negative regulator of heart development in Drosophila (Tokusumi, 2007).

Through the use of an established assay to monitor Pannier-dependent cardioblast gene activity, and the generation and analysis of 20 different versions of the U-shaped protein, six U-shaped domains required for its repression of mef2 gene expression were identified. Three previously identified GATA-interacting zinc fingers of U-shaped are critical for this inhibitory property, which likely reflects the necessity of multiple zinc fingers forming a strong and stable interaction with the Pannier GATA factor. Whether Pannier-U-shaped complex formation interferes with the physical interaction of Pannier and Tinman in the synergistic activation of D-mef2 target sequences remains to be determined (Tokusumi, 2007).

U-shaped may also directly antagonize Pannier function as has been shown in the process of sensory bristle formation. Heterodimerization of U-shaped with Pannier converts the GATA transcriptional activator into a transcriptional repressor, an event that leads to the non-activation of target genes such as ac, sc, and wg in the dorsal notum of the wing disc. It is noteworthy that the results demonstrated the requirement of a binding site for the CtBP transcriptional co-repressor protein. In the context of the cardiogenic mesoderm, the combination of Pannier, U-shaped, and CtBP may prevent mesodermal cells from initiating gene expression programs needed for the specification of the cardioblast fate. In contrast, the combination of Pannier, Dorsocross, and Tinman is known to activate a regulatory network programming heart cell specification and cardioblast differentiation. Additional studies will be needed to elucidate the potential role of CtBP as an antagonist of cardiac gene expression and heart development. If U-shaped-CtBP interaction plays a crucial inhibitory role, then one would predict comparable dorsal vessel phenotypes for CtBP and U-shaped in loss- and gain-of-function genetic backgrounds (Tokusumi, 2007).

Finally, these studies have defined a 253 amino acid region required for nuclear localization of U-shaped. Within this interval, two highly basic amino acid sequences have been defined as being essential for the ability of U-shaped to inhibit Pannier-mediated cardiac gene expression. Perhaps these motifs are required to facilitate the binding and stable interaction of co-repressor proteins with U-shaped. Another possibility is that these sequences serve as sites for post-translational modification, such as acetylation and/or methylation. Selective protein modification(s) may be a requisite for U-shaped to act as a negative modulator of Pannier transcription factor function during cardiogenesis in Drosophila (Tokusumi, 2007).

The ATP-sensitive potassium (K_ATP) channel-encoded dSUR gene is required for Drosophila heart function and is regulated by tinman

The homeobox transcription factor Tinman plays an important role in the initiation of heart development. Later functions of Tinman, including the target genes involved in cardiac physiology, are less well studied. Focus was placed on the dSUR gene, which encodes an ATP-binding cassette transmembrane protein that is expressed in the heart. Mammalian SUR genes are associated with K_ATP (ATP-sensitive potassium) channels, which are involved in metabolic homeostasis. Experimental evidence is provided that Tinman directly regulates dSUR expression in the developing heart. A cis-regulatory element was identified in the first intron of dSUR that contains Tinman consensus binding sites and is sufficient for faithful dSUR expression in the fly’s myocardium. Site-directed mutagenesis of this element shows that these Tinman sites are critical to dSUR expression, and further genetic manipulations suggest that the GATA transcription factor Pannier is synergistically involved in cardiac-restricted dSUR expression in vivo. Physiological analysis of dSUR knock-down flies supports the idea that dSUR plays a protective role against hypoxic stress and pacing-induced heart failure. Because dSUR expression dramatically decreases with age, it is likely to be a factor involved in the cardiac aging phenotype of Drosophila. dSUR provides a model for addressing how embryonic regulators of myocardial cell commitment can contribute to the establishment and maintenance of cardiac performance (Akasaka, 2006; full text of article).

Because cardiac dSUR expression depends on Tin, 40 kb of the dSUR locus was scanned for Tin consensus binding sites (TNAAGTG). Three large genomic fragments were chosen based on the high density of potential Tin-binding sites (En1, 4,095 bp; En2, 2,151 bp; and En3, 2,291 bp). The enhancer activity of these En fragments was then examined in transgenic flies. Two fragments located upstream of the ATG start (En1 and En2) do not show any reporter activity in the embryonic heart. In contrast, En3 exhibits a pattern of reporter gene expression identical to the endogenous dSUR pattern. This En3 fragment is downstream of the ATG start and contains six Tin sites. To determine whether these Tin sites are required for cardiac expression, they were mutated. Of the mutated Tin sites, only a mutated T3 site reduced the enhancer’s transcriptional activity. Mutations in both T2 and T3 (241 bp apart) abolished reporter gene expression in the cardiac progenitor cells, suggesting that Tin is capable of directly activating dSUR expression in the appropriate myocardial cells. Shorter fragments (S, 890 bp; SS, 359 bp; and SSS, 297 bp) containing both T2- and T3-binding sites were tested for enhancer activity. These three fragments mimicked the cardiac dSUR expression and showed a similar expression level as seen with En3. Within the context of the short SSS fragment, the T3-binding site is absolutely essential for reporter gene activation. The En3 fragment was also scanned for Mad/Medea (Dpp pathway)-binding sites (GCCGCGACG). No Mad/Medea sites were found with appreciable conservation within this enhancer, which is consistent with Dpp signaling only indirectly regulating dSUR expression, possibly by means of tin. However, a direct regulation by Dpp by means of degenerate or not well conserved sites cannot be excluded (Akasaka, 2006).
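The consensus scan described above can be illustrated with a minimal sketch. It assumes that a site is any exact match to the IUPAC consensus on either strand; the function names and the toy sequence are hypothetical and are not taken from the study, which does not describe the software it used.

```python
import re

# IUPAC nucleotide codes expressed as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "N": "[ACGT]",
}
# Complement table that also covers the ambiguity codes above.
COMPLEMENT = str.maketrans("ACGTRYSWKMN", "TGCAYRSWMKN")


def reverse_complement(motif):
    """Reverse-complement a sequence or IUPAC motif."""
    return motif.translate(COMPLEMENT)[::-1]


def find_sites(sequence, motif):
    """Return sorted (start, strand) hits for an IUPAC motif on both strands.

    Starts are 0-based positions on the forward strand.
    """
    sequence = sequence.upper()
    hits = set()
    for strand, oriented in (("+", motif), ("-", reverse_complement(motif))):
        regex = "".join(IUPAC[base] for base in oriented)
        # The lookahead lets overlapping sites be reported.
        for match in re.finditer("(?=" + regex + ")", sequence):
            hits.add((match.start(), strand))
    return sorted(hits)


# Toy sequence (not from the dSUR locus) with one Tin-like site per strand.
toy = "GGTCAAGTGCCCACTTGAAA"
print(find_sites(toy, "TNAAGTG"))
```

The same function accepts any consensus written in IUPAC codes, so a GATA-type motif such as WGATAR could be scanned in the same way. An exact-match scan of this kind cannot detect degenerate or poorly conserved sites.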

An EMSA was performed to test whether Tin can directly bind to the T3 site. A DNA template (28 bp) composed of dSUR genomic sequence containing the T3-binding site produced a specific Tin-binding complex. Thus, Tin can directly associate with the T3 element in dSUR, which is consistent with the possibility that dSUR expression is directly controlled by Tin (Akasaka, 2006).

The Tin expression pattern varies by developmental stage, and, likewise, its downstream target genes may also change during development. In vertebrates, GATA-4 provides the binding efficiency to Nkx2.5 in cardiomyocytes; therefore, these two transcription factors can act cooperatively to activate cardiac genes. Similarly, the Drosophila counterparts Pnr and Tin physically interact and synergistically control cardiac gene expression of genes such as Dmef2. To further characterize the role of Pnr in dSUR activation, Pnr was expressed panmesodermally and the expression of dSUR was compared to that of dHand, which marks all cardiac lineages. Panmesodermally expressed Pnr activates both ectopic dHand and dSUR expression but only to a moderate extent. In contrast, a dominant-negative Pnr (Pnr-EnR) did not induce, and instead reduced, dSUR and dHand expression. Moreover, both dSUR and dHand were strongly activated when tin and pnr were coexpressed, suggesting that, like dHand, dSUR activation depends on genetic synergy between Tin and Pnr (Akasaka, 2006).

Next, whether a reduction of Pnr activity could be compensated for by overexpression of tin was examined. tin and pnr-EnR were co-expressed throughout the mesoderm and it was found that the reduced dSUR and dHand expression, which was due to Pnr-EnR, was not rescued by forced panmesodermal tin expression. This finding suggests that dSUR activation requires not only Tin but also Pnr activity (Akasaka, 2006).

Furthermore, Pnr consensus binding sites (WGATAR) were sought within the En3 fragment to explain synergistic activation by Tin and Pnr. There are two well conserved Pnr sites in the SS fragment. However, when both of these Pnr consensus sites were mutated [SS(P2P3)], the enhancer activity was equivalent to that of the WT SS fragment. This finding implies that Pnr could bind to Tin directly or to other nonconsensus Pnr-binding sites, such as TGATA (which exists in the SSS fragment), to activate dSUR expression in the embryonic heart (Akasaka, 2006).

To address the possibility that Tin and Pnr may be acting in a complex in regulating cardiac dSUR transcription, an in vitro reporter assay with a luciferase plasmid was used, in which expression was driven by six concurrent T3. Cotransfection of the T3 reporter plasmid with Tin but not the Pnr expression vector into Drosophila Schneider cells resulted in a 3-fold activation of luciferase activity compared with the reporter construct alone. In contrast, when Tin and Pnr were cotransfected, the luciferase activity was elevated 9-fold compared with the reporter construct alone (or with a mutant T3-binding site), suggesting that Pnr acts as a cofactor with Tin to synergistically activate dSUR transcription (Akasaka, 2006).

It has been shown that in corpora cardiaca (CC) cells of Drosophila, dSUR controls glucose homeostasis by increasing secretion of adipokinetic hormone (AKH) in response to low glucose concentration in the hemolymph. Evidence indicates that AKH release likely is increased by the SUR inhibitor sulfonylurea and is decreased by ectopic expression of constitutively active (and thus ATP-independent) ion channel Kir2.1, suggesting striking parallels between endocrine cells in Drosophila and mammals in controlling blood glucose. Therefore, the role of dSUR was examined in cardiac physiology and heart homeostasis in adults. The findings suggest that there are striking functional similarities between Drosophila and mammalian SUR in heart function. In the mammalian heart, there are two types of K_ATP channels, sarcK_ATP and mitochondrial K_ATP, which are candidate regulators of acute hypoxia and ischemic preconditioning (IPC). Impairing sarcK_ATP channel activity by genetic manipulation of mouse Kir6.2 results in compromised recovery of contractile function after hypoxia. The data are consistent with dSUR in Drosophila providing a similar protective mechanism against hypoxia. Moreover, a recent study of goldfish K_ATP channel function revealed that the involvement of K_ATP in IPC is widely conserved, including in highly hypoxia-tolerant species (Akasaka, 2006).

To further address dSUR function, external electrical pacing of the heart was performed in dSUR knock-down mutants. Rapid electrical pacing per se is a nonhypoxic stimulus that may induce an IPC effect in mammals by activating K_ATP channels. Indeed, Kir6.2 mutant hearts exhibit diminished electrical tolerance against catecholamine-induced ventricular arrhythmia because of a failure to achieve action potential shortening and by causing early after-depolarization. Thus, the elevated heart failure rate in dSUR knock-down hearts may be due to K_ATP channel insufficiency. Interestingly, IPC is no longer observed in older human patients, and in female guinea pigs, SUR2A expression is reduced in old ventricular tissue compared with young ventricular tissue. Moreover, human SUR2 mutations found in two independent families were recently shown to cause dilated cardiomyopathy, with an onset around middle age. These mutations result in the structural abnormalities of the K_ATP channel and impair the ATP-dependent channel gating. Patients carrying these mutations showed ventricular tachycardia with normal coronary angiography, suggesting that human cardiomyocyte K_ATP channels play a role in maintaining membrane electrical stability and that the reduction of the K_ATP channel activity causes electrical disturbance, especially in older hearts. In this study it was observed that pacing-induced heart failure steeply increases in aging flies, which can be reversed by exposure to the K_ATP channel activator pinacidil. These observations, together with the drastically reduced dSUR expression in old flies, suggest that dSUR serves as an indicator of cardiac aging. Given the experimental advantages of Drosophila, such as a small genome size and short life span, dSUR and cardiac aging provide a unique model not only for assessing the control of physiological heart functions, such as the response to hypoxia, but also for the analysis of age-related human diseases (Akasaka, 2006).

The NK homeodomain transcription factor Tinman is a direct activator of seven-up in the Drosophila dorsal vessel

A complex regulatory cascade is required for normal cardiac development, and many aspects of this network are conserved from Drosophila to mammals. In Drosophila, the seven-up (svp) gene, an ortholog of the vertebrate chick ovalbumin upstream promoter transcription factors (COUP-TFI and II), is initially activated in the cardiac mesoderm and is subsequently restricted to cells forming the cardiac inflow tracts. This study investigated svp regulation in the developing cardiac tube. Using bioinformatics, a 1007-bp enhancer of svp was identified which recapitulates its entire expression in the embryonic heart and other mesodermal derivatives; this enhancer is initially activated by the NK homeodomain factor Tinman (Tin) via two conserved Tin binding sites. Mutation of the Tin binding sites significantly reduces enhancer activity both during normal development and in response to ectopic Tin. This is the first identification of an enhancer for the complex svp gene, demonstrating the effectiveness of bioinformatics tools in assisting in unraveling transcriptional regulatory networks. These studies define a critical component of the svp regulatory cascade and place gene regulatory events in direct apposition to the formation of critical cardiac structures (Ryan, 2007).

In order to identify the svp cardiac enhancer (SCE), an initial goal was to identify a transcription factor whose function was required for svp expression in the dorsal vessel. Published data demonstrated that tinman (tin) and seven-up (svp) are co-expressed at stage 11, and become mutually exclusive shortly thereafter. Given the temporal and spatial coincidence of tin and svp expression during early cardiogenesis, and given the role that Tin plays in the expression of other important cardiac genes, it was hypothesized that tin function might be required for svp expression. Thus, Svp accumulation was evaluated in heterozygotes and homozygotes for the tin null allele tin^EC40. svp expression was monitored in this experiment using the svp-lacZ enhancer trap line. The normal complement of seven bilateral pairs of Svp cells was visible in heterozygous embryos, whereas homozygous tin^EC40 sibling embryos lacked cardiac svp expression, although Svp was still present in the ring gland. This result supported Tin as an upstream regulator of svp; however, whether Tin itself bound to the svp enhancer remained to be determined. The possibility was considered that svp expression was not initiated in tin mutants simply because cardiac specification had failed to occur. However, since svp expression is initiated relatively early during cardiac specification, it was felt that Tin would still be a strong candidate activator (Ryan, 2007).

Given the dependence of svp expression on the presence of Tin, the svp gene was examined for consensus Tin binding sites. Seven sites, all located within the very large first intron of the svp-RC transcript variant, were identified. Since Tin targets have been shown to contain two closely apposed Tin binding sites, focus was placed upon the genomic regions where at least two sites lay within 300 bp of each other. Six of these putative Tin binding sites were present as three such pairs (Ryan, 2007).
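The pairing criterion above (at least two sites within 300 bp of each other) can be sketched as a simple filter over sorted site coordinates. This is only an illustration: the function name and the coordinates below are hypothetical placeholders, chosen so that seven sites yield three pairs, and they are not the positions reported in the study.

```python
def close_pairs(positions, max_gap=300):
    """Return neighboring site pairs separated by at most max_gap bp.

    Only adjacent sites (after sorting) are compared, which is enough to
    flag every region that holds at least two sites within max_gap.
    """
    ordered = sorted(positions)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a <= max_gap]


# Hypothetical coordinates for seven predicted sites (not real svp positions).
toy_sites = [100, 250, 1000, 5000, 5200, 9000, 9100]
print(close_pairs(toy_sites))
```

With these placeholder coordinates, six of the seven sites fall into three pairs and one site is left unpaired, which mirrors the situation described for the svp first intron.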

It was then determined if any of the three genomic regions containing putative Tin binding sites were conserved in other Drosophila species. Only one region, encompassing the Tin sites at 8092711/8092853, showed strong sequence similarity between species. When compared with five additional Drosophila species, the putative Tin binding sites within this region were 100% conserved. To determine if Tin protein can bind to the putative Tin sites, an electrophoretic mobility shift assay, using Tin protein and radioactively labeled double-stranded oligonucleotides corresponding to each of the two putative Tin sites, was performed. Tin protein, generated in vitro, bound to both of the radioactive DNA sequences, more strongly in the case of the Tin1 site. Binding was effectively competed with identical respective unlabeled sequences, but not with respective mutated sequences. The higher affinity of the Tin1 binding was further reflected in the corresponding competition assay, in which a light band was still evident when competed with wild-type probe. These results further supported the notion that Tin might directly regulate svp gene expression via this genomic region (Ryan, 2007).

This study showed that Tin is an essential regulator of svp gene expression in the cardiac mesoderm, via activation of an ~1 kb enhancer (termed the SCE) located in the first intron of the svp gene. Thus, it appears that Tin mediates this 'cardiac context' (Ryan, 2007).

The role of Tin in the initial activation of svp reflects the critical role tin plays in cardiac development in Drosophila. tin function is essential for cardiac specification, and a number of genes expressed in the dorsal vessel have been identified as direct transcriptional targets of Tin. It is anticipated that the SCE will ultimately provide greater insight into developmental patterning processes, since further analysis of the enhancer should identify how both Hox and Hh signals impact svp gene expression, as a model of how they impact cardiac patterning in general (Ryan, 2007).

Once svp expression is initiated, it soon becomes mutually exclusive with tin expression, and a svp-lacZ enhancer trap line is active in the Svp cells all through development to adulthood. In situ hybridization of SCE-lacZ embryos showed that the enhancer is active during embryogenesis through stage 14, although the activity had waned by stage 16. Thus, the SCE is responsible for the initial activation of cardiac svp gene expression at stages 12 to 14, yet other regulatory sequences mediate subsequent sustained svp expression. Since SCE activity is strong at stage 14, a time at which tin and svp expression do not overlap, it is reasonable to suggest that additional enhancer sequences must contribute to this period of expression. In contrast, mutation of the Tin sites also affects enhancer activity at stage 14, when Tin is absent from Svp cells. How can the integrity of Tin sites be required for enhancer activity at a stage when Tin is no longer present? One possibility is that initial binding of Tin to the enhancer might induce epigenetic changes to the genomic region, which can facilitate subsequent enhancer activity. In support of this notion, Tin has been shown to interact directly with the transcription cofactor P300 (Ryan, 2007).

Previous studies have demonstrated the importance of both the svp gene and its vertebrate ortholog, COUP-TFII, in cardiac development. Each factor is expressed in the cardiac inflow tracts: in the case of Drosophila, the inflow tracts are represented by the ostia, which form from the Svp cells and which require svp function for their formation; in vertebrates, COUP-TFII is expressed in and required for the formation of the atria. While neither upstream regulatory factors nor downstream targets of svp and COUP-TFII have been characterized to date, it is reasonable to speculate that such genes showing conserved functions might also share common upstream regulators. Thus, it is predicted that activation of COUP-TFII in vertebrate atrial cells might be mediated at least in part by NKX2.5, although no studies have directly assessed the expression of COUP-TFII in NKX2.5 mutants. Since this study used the dependence of svp expression upon tin function to predict the location of the SCE, such an approach might also be used to identify the COUP-TFII cardiac enhancer based upon the presence of conserved binding sites for NKX2.5. Given that COUP-TFII lies within a large, gene-poor genomic region in both mice and humans, this approach may facilitate the still significant task of illuminating the genetic regulation of COUP-TFII (Ryan, 2007).

A core transcriptional network for early mesoderm development in Drosophila consists of Twist, Mef2, Tinman and Dorsal

Embryogenesis is controlled by large gene-regulatory networks, which generate spatially and temporally refined patterns of gene expression. This study reports the characteristics of the regulatory network orchestrating early mesodermal development in the fruitfly, where the transcription factor Twist is both necessary and sufficient to drive development. Through the integration of chromatin immunoprecipitation followed by microarray analysis (ChIP-on-chip) experiments during discrete time periods with computational approaches, >2000 Twist-bound cis-regulatory modules (CRMs) and almost 500 direct target genes were identified. Unexpectedly, Twist regulates an almost complete cassette of genes required for cell proliferation in addition to genes essential for morphogenesis and cell migration. Twist targets almost 25% of all annotated Drosophila transcription factors, which may represent the entire set of regulators necessary for the early development of this system. By combining in vivo binding data from Twist, Mef2, Tinman, and Dorsal, an initial transcriptional network of early mesoderm development was constructed. The network topology reveals extensive combinatorial binding, feed-forward regulation, and complex logical outputs as prevalent features. In addition to binary activation and repression, it is suggested that Twist binds to almost all mesodermal CRMs to provide the competence to integrate inputs from more specialized transcription factors (Sandmann, 2007).

Twist-bound enhancers and direct Twist target genes

ChIP-on-chip was performed at two consecutive developmental time periods: 2-4 h (stages 5-7) and 4-6 h (stages 8-9), covering the stages of gastrulation, mesoderm expansion, migration, and early subdivision into different primordia. For each time period, four independent ChIPs were performed using two different anti-Twist antibodies to reduce possible off-target effects (Sandmann, 2007).

To systematically identify Twist-bound regions in an unbiased, global manner, a high-density microarray tiling across the Drosophila melanogaster genome was designed with ~380,000 60mer oligonucleotide probes. Twist binds to E-box motifs: As a degenerate E-box (CANNTG) is expected to occur every ~256 base pairs (bp) in the Drosophila genome, a 60mer oligonucleotide was designed for each E-box motif within the nonrepetitive, noncoding regions of the genome. This design made no assumptions about the specificity of the E-box bound by Twist, yet ensured all putative E-boxes were covered and that each Twist-bound sequence was detected by at least two neighboring 60mers (Sandmann, 2007).
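The ~256 bp figure follows from the four fixed positions in CANNTG: under equal base frequencies each fixed base matches with probability 1/4, so a site is expected once per 4^4 = 256 positions. The sketch below makes that arithmetic explicit. It assumes uniform, independent base composition and counts one strand only; the function name is hypothetical, and real genomes deviate from these assumptions.

```python
from fractions import Fraction

# Number of bases (out of four) that each IUPAC code can match.
DEGENERACY = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "N": 4,
}


def expected_spacing(motif):
    """Expected number of bp between matches to an IUPAC motif.

    Assumes independent positions with equal base frequencies (1/4 each)
    and considers a single strand.
    """
    probability = Fraction(1)
    for code in motif:
        probability *= Fraction(DEGENERACY[code], 4)
    return 1 / probability


print(expected_spacing("CANNTG"))   # degenerate E-box: 256
print(expected_spacing("TNAAGTG"))  # a longer consensus is rarer: 4096
```

Because the degenerate E-box is so short, chance matches are frequent, which is consistent with the design decision to tile every E-box rather than to assume a particular E-box specificity for Twist.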

These experiments identified 2096 nonoverlapping genomic regions significantly bound by Twist within one or both developmental time periods. This set includes all known Twist-bound enhancers tested, except the eve-cardiac enhancer that is regulated outside the period of development assayed. The majority of Twist-bound regions are found within introns of gene loci, rather than noncoding 5' and 3' regions. A similar positional bias was also observed for p53 and Krüppel, suggesting that introns close to the transcriptional start site represent hotspots for active CRMs. Intronic binding of Twist correlates significantly with the misregulation of these genes' expression in twist loss-of-function mutant embryos and their expression within the ventral blastoderm and mesoderm (Sandmann, 2007).

One of the major challenges for ChIP-on-chip studies is to accurately link the TF-bound enhancers to their appropriate target gene. Rather than simply taking the closest 5' or 3' gene, a more stringent approach was taken and a Twist-bound region was not assigned to a gene based on proximity alone. The results demonstrate that Twist binds more frequently to gene loci genetically downstream from the TF and/or expressed in the same cells as the TF. These criteria were used to systematically match all 2096 Twist-bound regions (intronic or intergenic) to their likely targets, leading to a high-confidence gene assignment for 854 Twist-bound sequences. This increased the number of Twist direct targets from the previously known 11 to 494 genes. All Twist-bound regions and surrounding genes can be visualized and searched at http://furlonglab.embl.de (Sandmann, 2007).

The RedFly database contains a comprehensive collection of previously described Drosophila enhancers, mainly characterized through single gene studies. Of the 2096 Twist-bound regions, 143 overlap with known enhancers for 62 genes, confirming that these regions have regulatory potential in vivo. Twist was not known to bind to many of these enhancers; this overlap therefore provides strong evidence for a regulatory link between Twist and the 62 target genes (e.g., Abd-A, Abd-B, aop, Brd, slp1, and bap). To further examine the regulatory potential of Twist-bound regions, reporter constructs of new putative enhancer sequences were tested in transgenic animals. Six Twist-bound regions within or close to the following gene loci were assayed: T48, trbl, retn, CG4221, CG8788, and CG32372. All regions proved sufficient to function as enhancers in vivo and could reproduce all or part of the endogenous spatio-temporal gene-expression pattern (Sandmann, 2007).
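The comparison between bound regions and known enhancers amounts to an interval-overlap count. The following is a minimal sketch of that idea, assuming half-open (chromosome, start, end) intervals and counting any shared base as an overlap; the function names and all coordinates are hypothetical, and the study's actual overlap criteria are not stated here.

```python
def overlaps(a, b):
    """True if two half-open intervals (chrom, start, end) share a base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def count_overlapping(regions, enhancers):
    """Count regions that overlap at least one known enhancer."""
    return sum(any(overlaps(r, e) for e in enhancers) for r in regions)


# Toy intervals only; these are not real Twist-bound regions or enhancers.
bound_regions = [("2L", 100, 400), ("2L", 1000, 1200), ("3R", 100, 400)]
known_enhancers = [("2L", 350, 600), ("3R", 500, 900)]
print(count_overlapping(bound_regions, known_enhancers))
```

The nested comparison is quadratic in the number of intervals, which is adequate for a few thousand regions; larger sets would normally be sorted first or handled with an interval tree.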

The T48, tribbles, retained, CG4221, and CG8788 enhancers initiate expression within the early blastoderm. The T48 module mirrors the expression of the endogenous gene within the presumptive mesoderm. The zygotic expression of tribbles is highly dynamic, which is reflected by the assayed CRM. This enhancer drives expression very transiently in the ventral blastoderm and quickly becomes ubiquitously expressed. The relatively small enhancer region for retained is activated in the anterior and posterior ventral blastoderm, where it is coexpressed with Twist, and its expression extends into the dorsal blastoderm. The CRMs for CG4221 and CG8788 initiate expression in the presumptive mesoderm, and continue to drive expression throughout the trunk mesoderm at later stages. The expression of the CG32372 module initiates after gastrulation in the head mesoderm, a domain that overlaps with twist expression. It is interesting to note that Twist binds to multiple enhancer regions for many of these genes. This feature is also evident more globally: Almost 50% of Twist target genes have two or more Twist-bound enhancers, reflecting the complexity of their regulation (Sandmann, 2007).

In summary, these results demonstrate that ChIP-on-chip experiments provide a sensitive and accurate global map of Twist-bound regulatory regions during key stages of early mesoderm development (Sandmann, 2007).

Twist activity is essential for target gene expression

To assay the requirement of Twist function for target gene expression, the expression of six novel direct targets was examined in twist mutant embryos. These genes are expressed in the presumptive mesoderm prior to gastrulation, and therefore at stages when the role of twist function can be assessed. Mesodermal cells are absent in twist mutant embryos later in development due to a block in gastrulation. Triple-fluorescent in situ hybridization was performed using probes directed against twist (blue channel; while twist¹ is a protein-null allele, twist RNA is still expressed), inflated (red channel; this gene is dependent on twist for its expression and was used as a marker to distinguish homozygous mutant embryos from their siblings), and a probe directed against one of the six direct target genes (green channel). The spatial expression of all six targets overlaps with twist within the presumptive mesoderm (Sandmann, 2007).

Importantly, twist activity is essential for the expression of five out of six genes examined. Note, for CG32982 and CG9005, residual expression remains outside the twist expression domain in the dorsal and posterior blastoderm, respectively. These results, in combination with in vivo binding data, indicate that Twist binding to a CRM is a prerequisite to activate target gene expression for a large percentage of its targets. The role of Twist binding to the NetA enhancer remains unclear. Twist may act redundantly with other TFs, or alternatively may function in a more subtle manner to modulate the levels of expression (Sandmann, 2007).

Twist and Dorsal collaborate much more extensively than previously predicted

One of the earliest functions of Twist within the pregastrula embryo is the coregulation of D-V patterning with the NFkappaB ortholog Dorsal. Dorsal acts as a morphogen by regulating its target genes at (at least) three threshold concentrations along the D-V axis. Type I-regulated Dorsal enhancers receive high levels of Dorsal, contain low-affinity Dorsal sites and drive expression in ventral mesodermal domains (e.g., sna, htl, twi). Type II enhancers receive intermediate levels of Dorsal and drive expression in mediolateral domains of different sizes (e.g., sim, brk, vn), while Type III enhancers receive low levels of Dorsal, contain high-affinity Dorsal sites, and can be either activated (sog, ths) or repressed (dpp, tld, zen) by Dorsal. This system has been studied so intensively that the level of knowledge is sufficient for quantitative modeling of cis-regulatory interactions. It was therefore of interest to determine whether global analysis could reveal new insights into this process. The data identified in vivo binding of Twist to both Type I and II Dorsal enhancers, as expected. The boundaries of Twist binding are in remarkable agreement with the limits of characterized minimal enhancers (e.g., htl, rho, and ths). More importantly, new CRMs were identified for several of these well-characterized genes (Sandmann, 2007).

Seven novel enhancers for D-V patterning genes reveal the regulatory complexity of Twist-bound CRMs: The cactus, stumps, and wntD enhancers drive expression in a domain overlapping Twist within the ventral blastoderm and likely represent Type I enhancers. Cactus, an IkappaB ortholog, is expressed both maternally and zygotically and sequesters Dorsal within the cytoplasm. While the regulation of zygotic cactus expression was previously not understood, these data reveal a Twist-bound CRM that is sufficient to drive expression in the presumptive mesoderm. Twist also binds to a CRM of Toll. Although the function of the zygotic regulation of cactus and Toll remains unclear in Drosophila, positive feedback regulation of zygotic Toll-receptor expression is required to refine the Dorsal nuclear gradient in the flour beetle Tribolium castaneum (Sandmann, 2007).

The stumps CRM is expressed in a subset of Twist-expressing cells, yielding a salt and pepper pattern that may reflect the requirement for a second, partially redundant enhancer (e.g., the 'stumps_early' enhancer) to give robust expression. The wntD CRM is highly expressed at the anterior and posterior poles of the ventral blastoderm, but is very weakly expressed within the central region. This mirrors the transient expression of the endogenous gene at this stage of development. This single enhancer reflects the regulatory logic deduced from genetic studies: The inputs from Twist and Dorsal activate WntD, while Snail represses its transcription within the presumptive mesoderm. The CRM for crumbs reproduces the endogenous gene's expression. This 480-bp region can function as an enhancer in the ectoderm while acting as a silencer within the ventral blastoderm. This ventral repression is most likely due to direct input from Snail on this CRM. Therefore, even at the same stage of development, these four Twist-bound CRMs drive expression in different spatial patterns within a small population of cells. This complexity is clearly mediated by context-dependent integration of additional inputs. Three additional CRMs for mir-1 (Type I), vn, and sim (Type II Dorsal targets) drive expression later in development, reproducing part of the endogenous gene's expression (Sandmann, 2007).

Unexpectedly, Twist also binds to characterized Dorsal Type III enhancers known to regulate dpp, ind, and ths. Dorsal and its associated corepressors Cut, Retained, and Capicua recruit Groucho to repress dpp, confining its expression to the dorsal blastoderm. The cobinding of Twist and Dorsal to Type III CRMs suggests that these factors may also collaborate in transcriptional repression. Interestingly, Twist binds to regulatory regions of all three Dorsal corepressors, providing another level at which Twist may modulate Dorsal-mediated repression. Overall, this exhaustive map of new CRMs for D-V patterning genes greatly extends the previous knowledge and will likely improve predictive models for this system (Sandmann, 2007).

Twist targets functional modules required for diverse aspects of mesoderm development

Twist is not only required for D-V patterning. The 494 direct target genes are significantly enriched in functional groups of genes involved in cell communication, signal transduction, cell motility, and cell adhesion. Genes in these categories are essential for multiple aspects of development, including gastrulation and directed migration of mesodermal cells. Genetic studies have demonstrated a requirement for twist in these processes; however, the molecular mechanism remained ill-defined. These data reveal Twist binding to CRMs for entire functional modules necessary for both gastrulation and migration (the FGF pathway) (Sandmann, 2007).

The present study highlights a new direct connection between Twist and many key components involved in cell cycle progression and cell growth. Members of both the Cdk2/CyclinA/B and Cdk2/CyclinE complexes are targeted, as well as modifiers of their activity and genes involved in cytokinesis and replication. In many cases, Twist binds to several CRMs of these genes (e.g., cyclinE and E2f), revealing the complexity of their regulation. This surprising link between Twist and the cell cycle is highly likely to be of regulatory significance; twist mutant embryos have proliferative defects that can be genetically separated from the block in mesoderm gastrulation (Sandmann, 2007).

These three functional groups of target genes (involved in morphogenesis, migration, and cell proliferation) have been defined as essential developmental network 'plug-ins.' Twist orchestrates early mesoderm development by binding to CRMs of virtually all genes within functional groups essential for gastrulation, mesoderm proliferation, migration, and specification. In contrast, few CRMs for genes involved in terminal differentiation (e.g., sarcomere structure) are targeted by Twist (Sandmann, 2007).

Twist is a highly connected hub targeting a large repertoire of TFs

This global map of Twist-bound CRMs provides a first glimpse of Twist’s connectivity to the rest of the regulatory genome. Remarkably, TFs represent the largest group of Twist targets: Twist binds to CRMs of a striking 25% (113/454) of all sequence-specific Drosophila TFs. Among these are TFs essential for mesoderm development, including gap (hb, hkb, kr, kni), pair rule (eve, slp, opa, odd, prd, run), and segmentation genes (en, hh, ptc, wg), as well as homeotic genes (pb, Scr, Antp, Abd-A, Abd-B, Ubx). These classes of target genes implicate a new role for Twist in the establishment or maintenance of anterior-posterior patterning within the mesoderm in addition to its known role in D-V axis formation. Although the function of many of the remaining TFs is unknown, these data link these regulators to mesoderm development. The sheer number of TFs regulated by Twist does not support a simple hierarchical network, where Twist regulates a small set of TFs, which in turn control another layer of regulators, and so forth. Rather, the data suggest a model in which Twist contributes to the regulation of the majority of TFs involved in every aspect of early mesoderm development (Sandmann, 2007).

Temporal enhancer occupancy by Twist reveals stage-specific coregulators

Although Twist is expressed during both developmental time periods assayed, it binds to CRMs in a temporally regulated manner. Approximately half of the enhancer regions are detected at both time periods, indicating continuous binding of Twist throughout these developmental stages. In contrast, 23% of Twist-CRMs are only bound in early development (2-4 h), while 26% are specific to later time periods (4-6 h). This dynamic occupancy reveals that the ability of Twist to bind to CRMs is tightly controlled beyond the mere presence of a suitable binding site, and is likely regulated by other TFs that aid or inhibit binding. To identify additional regulators that could differentiate between temporally bound CRMs, a search was performed for overrepresented sequence motifs, using two complementary computational approaches: statistical enrichment of position weight matrices (PWMs) for characterized TFs, and the de novo detection of overrepresented motifs (Sandmann, 2007).
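The three temporal classes described above (early, continuous, and late) follow directly from the binding calls at the two time windows. The following is a minimal sketch of that bookkeeping in Python; the CRM names and binding calls are hypothetical placeholders and are not data from Sandmann (2007).

```python
from collections import Counter

def classify_crm(bound_early: bool, bound_late: bool) -> str:
    """Assign a CRM to a temporal class from binding calls at 2-4 h and 4-6 h."""
    if bound_early and bound_late:
        return "continuous"
    if bound_early:
        return "early"
    if bound_late:
        return "late"
    return "unbound"

# Hypothetical binding calls: name -> (bound at 2-4 h, bound at 4-6 h).
crms = {
    "crm_A": (True, True),
    "crm_B": (True, False),
    "crm_C": (False, True),
    "crm_D": (True, True),
}

classes = {name: classify_crm(early, late) for name, (early, late) in crms.items()}
counts = Counter(classes.values())
# Fraction of CRMs in each class, analogous to the percentages quoted in the text.
fractions = {cls: n / len(crms) for cls, n in counts.items()}
```

With real data, the binding calls would come from the thresholded ChIP-on-chip signal at each time window, and the choice of threshold would directly affect the class sizes.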

Twist and Snail consensus motifs are significantly overrepresented in all three groups of CRMs, indicating a potential for extensive cobinding between these two TFs. In contrast, Dorsal motifs are exclusively enriched in the early-bound CRMs, and not in the late group, while Tinman motifs are specifically overrepresented in the continuous and late-bound CRMs. A number of other motifs were also uncovered, including sites for potential Twist/Daughterless heterodimers, suggesting additional mechanisms to generate diverse outputs from Twist-CRMs (Sandmann, 2007).
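To illustrate what a PWM-based enrichment test involves in general, the sketch below scores sequences against a position weight matrix and then asks whether motif-containing sequences are overrepresented in a foreground set using a hypergeometric tail. This is an assumed, simplified stand-in and not the statistical procedure used in the study: the matrix, the score threshold, and the sequences are toy values, only the forward strand is scanned, and sequences are assumed to contain only uppercase A, C, G, and T.

```python
import math

def pwm_best_score(seq, pwm, background=0.25):
    """Best log-odds score of the PWM over all windows of seq (forward strand only)."""
    width = len(pwm)
    best = float("-inf")
    for i in range(len(seq) - width + 1):
        score = sum(math.log2(pwm[j][seq[i + j]] / background) for j in range(width))
        best = max(best, score)
    return best

def hypergeom_tail(k, n, K, N):
    """P(X >= k) when n sequences are drawn from N, of which K contain the motif."""
    total = sum(math.comb(K, i) * math.comb(N - K, n - i)
                for i in range(k, min(n, K) + 1))
    return total / math.comb(N, n)

def column(preferred):
    """Toy PWM column: 0.85 for the preferred base, 0.05 for each other base."""
    return {base: (0.85 if base == preferred else 0.05) for base in "ACGT"}

# Toy matrix preferring the word CATG; not a real Twist, Snail, or Dorsal matrix.
toy_pwm = [column(base) for base in "CATG"]
THRESHOLD = 5.0  # assumed cutoff; only a perfect match (about 7.06) exceeds it

foreground = ["GGCATGAA", "TTCATGCC", "AAAAAAAA"]              # e.g. one CRM class
background = ["ACGTACGT", "GGGGCCCC", "TTTTAAAA", "CCATGTTT"]  # all other CRMs

def has_motif(seq):
    return pwm_best_score(seq, toy_pwm) >= THRESHOLD

k = sum(has_motif(s) for s in foreground)               # motif hits in the foreground
K = k + sum(has_motif(s) for s in background)           # motif hits overall
p_value = hypergeom_tail(k, len(foreground), K, len(foreground) + len(background))
```

A small p-value would indicate that the motif is found in the foreground class more often than expected by chance; with these toy sequences the sets are far too small for the result to be meaningful.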

These data reveal Twist binding to almost all previously characterized Dorsal enhancers. Twist and Dorsal are known to interact physically and to coregulate enhancers in the early, but not the late, time window of this experiment. It is therefore hypothesized that Dorsal may be coregulating many of the newly discovered Twist CRMs, in keeping with the specific enrichment of Dorsal consensus motifs within these enhancers. To experimentally test Dorsal's presence on predicted sites in vivo, ChIP experiments were performed at 2-4 h of development. Significant binding of Dorsal was detected by quantitative real-time PCR to all seven predicted sites tested. Similarly, since Tinman consensus sites were significantly enriched in 4-6-h CRMs, the in vivo occupancy of predicted sites by Tinman was tested at this stage of development. ChIP experiments detected significant binding of Tinman to 10 of 11 sites tested. Given the large number of early and late CRMs, the enrichment of these motifs highlights extensive combinatorial binding of Dorsal and Twist at 2-4 h, and Tinman and Twist at 4-6 h. A substantial part of Twist's temporal specificity likely stems from its association with these upstream and downstream coregulators (Sandmann, 2007).

A core transcriptional network for early mesoderm development

To delineate the combinatorial relationships between Twist and other TFs, an initial transcriptional network was generated for early mesoderm development. The temporal binding map for Twist was integrated with in vivo binding data for Mef2, Dorsal, and Tinman. A previous study of Mef2-bound enhancers offers the largest collection of regulatory regions bound at this stage of development to date. As it is difficult to visualize all 494 Twist target genes, focus was placed on TFs whose CRMs are cobound by two or more regulators during these stages of development. Therefore, all links in this network represent direct connections to the same CRM at the same stages of development (Sandmann, 2007).
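The filtering rule described above (keep only CRMs cobound by two or more regulators, with every link referring to the same CRM at the same stage) can be sketched as a simple grouping step. The code below is an illustrative assumption about how such a filter might be written; the CRM identifiers, target names, and binding events are hypothetical.

```python
from collections import defaultdict

def core_network(binding_events, min_regulators=2):
    """Return (regulator, target, crm, stage) links for CRMs cobound at one stage."""
    regulators_at = defaultdict(set)  # (crm, stage) -> regulators bound there
    target_of = {}                    # crm -> assigned target gene
    for crm, stage, target, regulator in binding_events:
        regulators_at[(crm, stage)].add(regulator)
        target_of[crm] = target
    links = set()
    for (crm, stage), regulators in regulators_at.items():
        if len(regulators) >= min_regulators:
            for regulator in regulators:
                links.add((regulator, target_of[crm], crm, stage))
    return links

# Hypothetical binding events: (CRM, stage, target gene, bound regulator).
events = [
    ("crm_1", "2-4h", "gene_X", "Twist"),
    ("crm_1", "2-4h", "gene_X", "Dorsal"),
    ("crm_2", "2-4h", "gene_Y", "Twist"),
    ("crm_2", "4-6h", "gene_Y", "Tinman"),   # different stage: not cobinding
    ("crm_3", "4-6h", "gene_Z", "Twist"),
    ("crm_3", "4-6h", "gene_Z", "Mef2"),
    ("crm_3", "4-6h", "gene_Z", "Tinman"),
]

links = core_network(events)
```

In this toy input, crm_2 is dropped because its two regulators bind at different stages, which mirrors the requirement that links represent cobinding to the same CRM at the same stage.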

The resulting core network of 51 TFs is already relatively complex, with nine genes [nau, E(spl), eve, bap, Ubx, lbe, odd, hth, and Ptx1] being targeted by three out of the four examined regulators. The topology of the network provides several insights into how Twist functions to regulate multiple aspects of early mesoderm development. Extensive combinatorial binding and feed-forward regulation are abundant features. Dorsal activates twist, which in turn coregulates the majority of known direct Dorsal targets. This network motif is even more prominent within the mesoderm: Twist regulates the expression of Mef2 and tinman, and cobinds with these TFs to many of their targets' enhancers. In fact, Twist co-occupies 42% of all Mef2-bound enhancers during early mesoderm development. Depending on the logical inputs from the two upstream regulators (transcriptional repression or activation), feed-forward loops can aid in cellular decision making by filtering out noisy regulatory inputs or control the timing of a transcriptional response. For example, early gene expression in the mesoderm (e.g., activation of tin) depends on Twist alone, while transcription of other genes initiated at a later stage may require the input from both Twist and Tinman proteins (Sandmann, 2007).
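A feed-forward loop in this sense is a three-node pattern in which regulator A targets regulator B, and both A and B target a common gene C. The sketch below enumerates such patterns in a small directed graph. The links Dorsal to Twist, Twist to Tinman, and Twist to Mef2 follow the relationships stated in the text; the downstream genes target_1 to target_3 are hypothetical placeholders.

```python
def feed_forward_loops(edges):
    """List (A, B, C) triples where A -> B, A -> C, and B -> C are all edges."""
    targets = {}
    for source, target in edges:
        targets.setdefault(source, set()).add(target)
    loops = []
    for a, a_targets in targets.items():
        for b in a_targets:
            if b == a:
                continue
            for c in targets.get(b, ()):
                if c != a and c != b and c in a_targets:
                    loops.append((a, b, c))
    return sorted(loops)

edges = [
    ("Dorsal", "Twist"),      # Dorsal activates twist
    ("Twist", "Tinman"),      # Twist regulates tinman
    ("Twist", "Mef2"),        # Twist regulates Mef2
    ("Dorsal", "target_1"),   # hypothetical shared target of Dorsal and Twist
    ("Twist", "target_1"),
    ("Twist", "target_2"),    # hypothetical shared target of Twist and Tinman
    ("Tinman", "target_2"),
    ("Mef2", "target_3"),     # hypothetical target bound by Mef2 only
]

loops = feed_forward_loops(edges)
```

Here target_3 does not close a loop because Twist does not bind it directly, so the Twist to Mef2 to target_3 path remains a simple relay rather than a feed-forward loop.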

Twist-bound CRMs correspond to silencers as well as enhancers of transcription

Through the integration of ChIP-on-chip analysis with expression profiling data during early stages of Drosophila development, this study has identified >2000 Twist-bound regulatory regions and almost 500 direct target genes. These data, in combination with in vivo binding data for other TFs, lay the foundation of a transcriptional network describing early mesoderm development. The resulting network view reveals regulatory features that form the basis of Twist's functional versatility (Sandmann, 2007).

The data revealed extensive Twist binding to characterized Dorsal enhancers and also, surprisingly, to Dorsal-regulated silencers (e.g., dpp). Moreover, many of the new regulatory regions identified for D-V patterning genes can function either as enhancers or integrated enhancer-silencer modules (e.g., WntD and crumbs). This ability of Twist to act within the context of silencers, as well as enhancers, may partially explain the widespread recruitment of Twist to many regulatory regions and its ability to regulate diverse developmental processes (Sandmann, 2007).

An attractive molecular explanation for this bifunctionality is the potential of Twist to form both homodimers and heterodimers. Twist homodimers drive gene activation in Drosophila, while Twist-Daughterless heterodimers are associated with transcriptional repression. This model is supported by the significant overrepresentation of the Twist/Daughterless heterodimer consensus motif in both 2-4-h and 4-6-h CRMs. Direct protein-protein interactions between Twist and Dorsal are an alternative mechanism for Twist's incorporation into repressive complexes (Sandmann, 2007).

A network with unexpected topology governs early mesoderm development

Although the network generated in this study is far from complete, it represents the largest set of combinatorial-bound CRMs during these stages of development described to date, and therefore provides a comprehensive resource to decipher general regulatory principles. The resulting network topology was surprising. Instead of Twist regulating a restricted group of TFs, which in turn regulate a successive wave of transcription in a relay model, Twist directly impinges on CRMs for the vast majority of genes expressed in the early mesoderm (Sandmann, 2007).

The extent of combinatorial binding was also unanticipated. There is extensive cobinding of Twist and Dorsal to early 2-4-h CRMs. In fact, the presence of Dorsal binding may be a general prerequisite for Twist binding to enhancers specific for early development. The cooperative binding of Dorsal and Twist to the rho and sim CRMs supports this model. At 4-6 h of development, the composition of TFs impinging on Twist-bound CRMs changes. Although genome-wide ChIP-on-chip data are not currently available for Tinman, the significant overrepresentation of Tinman motifs in Twist-bound CRMs and the ability of Tinman to bind to the majority of sites tested indicate prevalent combinatorial binding between these two TFs during 4-6 h of development. Comparing Twist-bound CRMs with a previously generated data set for Mef2 revealed extensive cobinding to enhancers early in development. Converging regulatory connections through combinatorial binding can produce diverse logical outputs, depending on the nature of the TFs. The cobinding of several pan-mesodermal TFs (Twist, Tinman, and Mef2) may ensure robust gene expression, while in other contexts (for example, the WntD enhancer) the combined inputs of Twist and Snail allow for spatial fine-tuning of gene expression (Sandmann, 2007).

The core network also revealed an abundance of feed-forward loops, providing directionality during early mesoderm development. This is prevalent with both upstream regulators of Twist (Dorsal and Twist) and downstream regulators (Tinman and Twist, and Mef2 and Twist). This network motif will likely become even more widespread as additional ChIP-on-chip data become available. Twist targets an astounding number of TFs, which may represent an almost complete repertoire of TFs required for early mesoderm development. It is tempting to speculate that Twist participates in feed-forward regulation with many of these factors through combinatorial binding to different subsets of the ~2000 Twist-bound CRMs (Sandmann, 2007).

Temporal network dynamics reflect developmental progression

Both the composition and connectivity of regulatory networks describing developmental progression will naturally change over time. To capture dynamic changes within the early mesodermal network, these experiments were performed at consecutive time periods. The data reveal temporally regulated binding of Twist to three classes of CRMs: early, continuous, and late. Similar temporally restricted enhancer occupancy has also been observed for other regulators with broad temporal expression (e.g., MyoD, PHA-4, and Mef2), suggesting that this may be a general feature of developmental networks (Sandmann, 2007).

The temporal occupancy of specific CRMs by Twist reflects the development of this tissue. At 2-4 h of development, Twist and Dorsal coregulate genes essential for D-V patterning. Twist also targets an almost complete set of genes essential for gastrulation and is required to progress to the next phase of development, mesoderm maturation. During this developmental window, the predominant target genes are part of functional modules essential for the cell migration, proliferation, patterning, and specification events occurring within the mesoderm at these stages. As expected for a TF essential for early aspects of mesoderm development, Twist does not bind to significant numbers of CRMs for genes involved in terminal differentiation. This is in sharp contrast to Mef2, which first co-occupies CRMs involved in early mesoderm development with Twist, and later selectively regulates an alternative group of CRMs driving genes involved in later aspects of differentiation; e.g., sarcomere structure or muscle attachment (Sandmann, 2007).

Conserved regulation of functional classes of genes by Twist

Integrating these data with genetic evidence from other species suggests that the regulation of several functional gene cassettes by Twist is conserved throughout evolution, from flies to man. These include (1) the FGF signaling pathway: Mutations in human FGF receptors phenocopy mutants in human twist (Htwist). (2) Genes implicated in epithelial-mesenchymal transitions (EMTs): In mice and humans, Twist facilitates tumor metastasis through the promotion of EMTs. (3) Cell proliferation and apoptosis: Htwist has been classified as a potential oncogene, since it maintains tissue culture cells in a proliferative state. Interestingly, ectopic expression of Htwist in Drosophila also induces proliferation and inhibits p53-dependent apoptosis, indicating that the ability to regulate these processes is conserved. However, for each process, only a few Twist-regulated genes have been known. Extrapolating from the current findings in flies points toward a role for Twist in the direct regulation of entire gene modules required for each process in vertebrates (Sandmann, 2007).

An emerging model for Twist as a global competence factor for mesoderm development

The results provide an initial global view of the transcriptional network describing early mesoderm development within the metazoan Drosophila. Twist resides at the top of this network and binds to CRMs for the vast majority of genes that need to be expressed during these stages. In many cases, Twist is essential and sufficient to drive expression of the target gene. In other cases, however, the contribution of Twist remains unclear (e.g., crumbs and NetA). Rather than acting as a binary switch, Twist may act redundantly with other TFs. Alternatively, Twist may provide the competence for more specific TFs to bind to these CRMs; for example, by acting as a pioneer TF to facilitate chromatin remodeling (Sandmann, 2007).

In species as diverse as flies, jellyfish, and mice, Twist is only expressed in mesodermal cells when they are in an immature state, and loss of twist expression correlates with the initiation of differentiation. Moreover, overexpression of Twist-1 in mice is sufficient to block osteoblast differentiation. It is suggested that Twist provides the mesoderm with the competence to be pluripotent: first, by providing these cells with the components necessary to respond to inductive cues directing further specification; and second, by providing an almost universal repertoire of mesodermal CRMs with the competence to respond to other TFs. Once bound by Twist, these regulatory regions may be primed for activation by more specialized TFs, and thereby allow rapid developmental progression at the appropriate time (Sandmann, 2007).

Cardiac expression of the Drosophila Sulphonylurea receptor gene is regulated by an intron enhancer dependent upon the NK homeodomain factor Tinman

Cardiac development proceeds via the activation of a complex network of regulatory factors which both directly and indirectly impact downstream cardiac structural genes. In Drosophila, the NK homeodomain transcription factor Tinman is critical to cardiac specification and development via the activation of a number of key regulatory genes which mediate heart development. Tinman also functions in Drosophila to directly activate transcription of the ATP binding cassette gene Sulphonylurea receptor (Sur). Cardiac expression of Sur is regulated by Tinman via an intron enhancer which first becomes active at stage 12 of embryogenesis, and whose function is restricted to the Tin cardial cells by the end of embryogenesis. Cardiac Sur enhancer activity subsequently persists through larval and adult development, but interestingly becomes modulated in several unique subsets of Tin-expressing cardial cells. The cardiac enhancer contains four binding sites for Tinman protein (consensus 5'-TYAAGTG); mutation of two of these sites significantly reduces enhancer activity at all stages of development, and activation of the wild-type enhancer by ectopic Tinman protein confirms Sur is a direct target of Tinman transcriptional activation. These findings delineate at the molecular level specific sub-types of Tin cardial cells, and define an important regulatory pathway between two Drosophila genes for which mutations in human homologs have been shown to result in cardiac disease (Hendren, 2007).
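The Tinman consensus quoted above, 5'-TYAAGTG, uses the IUPAC code Y for a pyrimidine (C or T). As a minimal sketch of how candidate sites matching this consensus could be located on both strands of a sequence, the Python below converts the consensus into a regular expression. The example sequence is invented for illustration and is not the Sur intron enhancer, and a consensus match by itself does not demonstrate binding.

```python
import re

# Subset of IUPAC nucleotide codes needed here (Y = C or T, R = A or G).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "Y": "[CT]", "R": "[AG]", "N": "[ACGT]"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

def find_sites(seq, consensus="TYAAGTG"):
    """Return sorted (start, strand, matched site) tuples in forward-strand coordinates."""
    # A lookahead is used so that overlapping matches would also be reported.
    pattern = re.compile("(?=(" + "".join(IUPAC[base] for base in consensus) + "))")
    seq = seq.upper()
    hits = [(m.start(), "+", m.group(1)) for m in pattern.finditer(seq)]
    for m in pattern.finditer(reverse_complement(seq)):
        start = len(seq) - m.start() - len(consensus)  # map back to the forward strand
        hits.append((start, "-", m.group(1)))
    return sorted(hits)

# Invented sequence with one site on each strand.
example = "GGTCAAGTGAACACTTAACC"
sites = find_sites(example)
```

For the example sequence, one site is reported on the forward strand (TCAAGTG) and one on the reverse strand (TTAAGTG, which appears as CACTTAA in the forward orientation).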

A transcription factor collective defines cardiac cell fate and reflects lineage history

Cell fate decisions are driven through the integration of inductive signals and tissue-specific transcription factors (TFs), although the details on how this information converges in cis remain unclear. This study demonstrates that the five genetic components essential for cardiac specification in Drosophila, including the effectors of Wg and Dpp signaling, act as a collective unit to cooperatively regulate heart enhancer activity, both in vivo and in vitro. Their combinatorial binding does not require any specific motif orientation or spacing, suggesting an alternative mode of enhancer function whereby cooperative activity occurs with extensive motif flexibility. A fraction of enhancers co-occupied by cardiogenic TFs had unexpected activity in the neighboring visceral mesoderm but could be rendered active in heart through single-site mutations. Given that cardiac and visceral cells are both derived from the dorsal mesoderm, this 'dormant' TF binding signature may represent a molecular footprint of these cells' developmental lineage (Junion, 2012).

Dissecting transcriptional networks in the context of embryonic development is inherently difficult due to the multicellularity of the system and the fact that most essential developmental regulators have pleiotropic effects, acting in separate and sometimes interconnected networks. This study presents a comprehensive systematic dissection of the cis-regulatory properties leading to cardiac specification within the context of a developing embryo. The resulting compendium of TF binding signatures, in addition to extensive in vivo and in vitro analysis of enhancer activity, revealed a number of insights into the regulatory complexity of developmental programs (Junion, 2012).

Nkx (Tinman in Drosophila), GATA (Pannier in Drosophila), and T box factors (Doc in Drosophila) regulate each other's expression in both flies and mice, where they form a recursively wired transcriptional circuit that acts cooperatively at a genetic level to regulate heart development across a broad range of organisms. The data demonstrate that this cooperative regulation extends beyond the ability of these TFs to regulate each other's expression. All five cardiogenic TFs (including dTCF and pMad) converge as a collective unit on a very extensive set of mesodermal enhancer elements in vivo (Tin-bound regions) and also in vitro (in DmD8 cells). Importantly, this TF co-occupancy occurs in cis, rather than being mediated via crosslinking of DNA-looping interactions bringing together distant sites. Examining enhancer activity out of context, for example, in transgenic experiments and luciferase assays, revealed that the TF collective activity is preserved in situations in which these regions are removed from their native genomic 'looping' context (Junion, 2012).

In keeping with the conserved essential role of these factors for heart development, the integration of their activity at shared enhancer elements may also be conserved. Recent analyses of the mouse homologs of these TFs (with the exception of the inductive signals from Wg and Dpp signaling) in a cardiomyocyte cell line support this, revealing a significant overlap in their binding signatures (He, 2011; Schlesinger, 2011), although interestingly not in the collective 'all-or-none' fashion observed in Drosophila embryos. This difference may result from the partial overlap of the TFs examined, interspecies differences, or the inherent differences between the in vivo versus in vitro models. Examining enhancer output for a large number of regions indicates that this collective TF occupancy signature is generally predictive of enhancer activity in cardiac mesoderm or its neighboring cell population, the visceral mesoderm: expression patterns that cannot be obtained from any one of these TFs alone (Junion, 2012).

There are currently two prevailing models of how enhancers function. The enhanceosome model suggests that TFs bind to enhancers in a cooperative manner directed by a specific arrangement of motifs, often having a very rigid motif grammar. An alternative, the billboard model, suggests that each TF (or submodule) is recruited independently via its own sequence motif, and therefore the motif spacing and relative orientation have little importance. The results of this study indicate that cardiogenic TFs are corecruited and activate enhancers in a cooperative manner, but this cooperativity occurs with little or no apparent motif grammar to such an extent that the motifs for some factors do not always need to be present. This is at odds with either the enhanceosome (cooperative binding; rigid grammar) or billboard (independent binding; little grammar) models and represents an alternative mode of enhancer activity, which was termed a 'TF collective' (cooperative binding; no grammar), and likely constitutes a common principle in other systems (Junion, 2012).

The data suggest that the TF collective operates via the cooperative recruitment of a large number of TFs (in this case, at least five), which is mediated by the presence of high-affinity TF motifs for a subset of factors initiating the recruitment of all TFs. The occupancy of any remaining factor(s) is most likely facilitated via protein-protein interactions or cooperativity at a higher level such as, for example, via the chromatin activators CBP/p300, which interact with mammalian GATA and Mad homologs. This model allows for extensive motif turnover without any obvious effect on enhancer activity, consistent with what has been observed in vivo for the Drosophila spa enhancer and mouse heart enhancers (Junion, 2012).

Integrating the TF occupancy data for all seven major TFs involved in dorsal mesoderm specification (the five cardiogenic factors together with Biniou and Slp) revealed a very striking observation: the developmental history of cardiac cells is reflected in their TF occupancy patterns. Visceral mesoderm (VM) and cardiac mesoderm (CM) are both derived from precursor cells within the dorsal mesoderm. Once specified, these cell types express divergent sets of TFs: Slp, activated dTCF, Doc, and Pnr function in cardiac cells, whereas Biniou and Bagpipe are active in the VM. Despite these mutually exclusive expression patterns, the cardiogenic TFs are recruited to the same enhancers as VM TFs in the juxtaposed cardiac mesoderm. Moreover, dependent on the removal of a transcriptional repressor, these combined binding signatures have the capacity to drive expression in either cell type. This finding provides the exciting possibility that dormant TF occupancy could be used to trace the developmental origins of a cell lineage. It also explains why active repression in cis is required for correct lineage specification, which is a frequent observation from genetic studies. At the molecular level, it remains an open question why the VM-specific enhancers are occupied by the cardiac TF collective. It is hypothesized that this may occur through chromatin remodeling in the precursor cell population. An 'open' (accessible) chromatin state at these loci in dorsal mesoderm cells, which is most likely mediated or maintained by Tin binding prior to specification, could facilitate the occupancy of cell type-specific TFs in both CM and VM cells. Such early 'chromatin priming' of regulatory regions active at later stages has been observed during ES cell differentiation. The current data provide evidence that this also holds true for TF occupancy and not just chromatin marks. 
On a more speculative level, this developmental footprint of TF occupancy may reflect the evolutionary ancestry of these two organs. Visceral and cardiogenic tissues are derived from the splanchnic mesoderm in both flies and vertebrates. These complex VM-heart enhancers may represent evolutionary relics containing functional binding sites that reflect enhancer activity in an ancestral cell type (Junion, 2012).

Taken together, the collective TF occupancy on enhancers during dorsal mesoderm specification illustrates how the regulatory input of cooperative TFs is integrated in cis, in the absence of any strict motif grammar. This more flexible mode of cooperative cis regulation is expected to be present in many other complex developmental systems (Junion, 2012).

Org-1 is required for the diversification of circular visceral muscle founder cells and normal midgut morphogenesis

The T-Box family of transcription factors plays fundamental roles in the generation of appropriate spatial and temporal gene expression profiles during cellular differentiation and organogenesis in animals. This study reports that the Drosophila Tbx1 orthologue optomotor-blind-related-gene-1 (org-1) exerts a pivotal function in the diversification of circular visceral muscle founder cell identities in Drosophila. In embryos mutant for org-1, the specification of the midgut musculature per se is not affected, but the differentiating midgut fails to form the anterior and central midgut constrictions and lacks the gastric caeca. It was demonstrated that this phenotype results from the nearly complete loss of the founder cell-specific expression domains of several genes known to regulate midgut morphogenesis, including odd-paired (opa), teashirt (tsh), Ultrabithorax (Ubx), decapentaplegic (dpp) and wingless (wg). To address the mechanisms that mediate the regulatory inputs from org-1 towards Ubx, dpp, and wg in these founder cells, known visceral mesoderm-specific cis-regulatory modules (CRMs) of these genes were dissected. The analyses revealed that the activities of the dpp and wg CRMs depend on org-1, that the CRMs are bound by Org-1 in vivo, and that their T-Box binding sites are essential for their activation in the visceral muscle founder cells. It is concluded that Org-1 acts within a well-defined signaling and transcriptional network of the trunk visceral mesoderm as a crucial founder cell-specific competence factor, in concert with the general visceral mesodermal factor Biniou. As such, it directly regulates several key genes involved in the establishment of morphogenetic centers along the anteroposterior axis of the visceral mesoderm, which subsequently organize the formation of midgut constrictions and gastric caeca and thereby determine the morphology of the midgut (Schaub, 2013).

The analysis of org-1 expression and function during visceral mesoderm development defined this gene as a new and essential lineage specific regulator of circular visceral muscle founder cell identities and midgut patterning in Drosophila. The data add new insights into the developmental regulatory mechanisms responsible for the diversification of the circular visceral muscle founder cell lineage and midgut morphogenesis (Schaub, 2013).

The initial expression of org-1 occurs in the segmented trunk visceral mesoderm (TVM), where it is coexpressed with tin, bap, bin and Alk. It has been documented that the induction of tin and bap in the dorsal mesoderm involves the combined binding of Smad proteins (Medea and Mad) and Tin to Dpp-responsive enhancers of the tin and bap genes, whereas the segmental repression of bap is mediated by binding of the sloppy paired (slp) gene product. Genetic analysis of org-1 has shown that org-1 is activated downstream of tin but independently of bap and bin, and that dpp provides the key signals for its induction. This suggests a regulatory mechanism analogous to that of bap, in which the combined binding of Smads and Tin activates a Dpp-responsive org-1 enhancer, whereas Wg-activated Slp is required for its mutual segmental repression (Schaub, 2013).

The similarities in the early expression patterns of bap, bin, Alk and org-1 in the trunk visceral mesoderm primordia raise the question of the contribution of org-1 to the early development of the TVM as such. Whereas bap and bin are crucially required for the specification of the trunk visceral mesoderm and visceral musculature, loss of org-1 function, like the loss of Alk, has no obvious impact on the specification of the early TVM. Therefore, it is notable that during the subdivision of the visceral mesoderm primordia into founder and fusion-competent myoblasts (cFCs and FCMs), org-1 expression is extinguished in the FCMs and only sustained in the cFC lineage of the circular visceral musculature. This lineage-specific restriction and maintenance of org-1 expression crucially depends on Jeb mediated Alk/Ras/MAPK signaling and points toward a possible cFC lineage specific function of org-1. The genetic analysis demonstrates that org-1 is not required for cFC specification, but plays a decisive role in the induction of the visceral mesoderm specific expression of patterning genes in the founder cells of the circular musculature. Thus, org-1 is critical for the processes of cell fate diversification that provide individual fields of cells along the anteroposterior axis of the visceral mesoderm with their specific identities (Schaub, 2013).

Proper anteroposterior patterning of the trunk visceral mesoderm and the formation of localized organizer fields are prerequisites for eliciting the morphogenetic events that shape the midgut. The formation of these organizer fields depends on the appropriate spatial expression domains of the homeotic selectors Scr, Antp, Ubx and abd-A, the secreted factors dpp and wg, as well as the zinc finger proteins opa and tsh, which are required for the formation of the midgut constrictions as well as the gastric caeca. The regulatory mechanisms responsible for the establishment of the spatial, temporal and tissue-specific expression patterns of these genes in the TVM are only partially understood. Genetic and molecular analyses with the FoxF gene bin, which is expressed in all trunk visceral mesoderm precursors and their descendents, have demonstrated that bin is a direct upstream regulator of dpp in PS7 and is also required for the expression of wg in PS8 of the TVM. Thus, Bin serves as an essential TVM-specific competence factor in conjunction with the dpp/wg signaling feedback loop. The current findings have defined Org-1 as an additional tissue-specific regulator with an even broader range of downstream patterning genes in the TVM, but with a narrower spatial range of action. org-1 acts specifically within the visceral muscle founder cell lineage as a positive regulator upstream of opa, tsh, Ubx, dpp as well as wg (Schaub, 2013).

This combination of genetic data and functional enhancer analyses provides convincing evidence that both dpp and wg are direct transcriptional targets of Org-1 in the cFCs. Prior dissections of the dpp visceral mesoderm (VM) enhancer had shown that it is also regulated by the direct binding of Ubx, Exd, dTCF (a Wg effector) and Bin, and that minimal synthetic variants that contain only the binding motifs for Ubx, Exd, Bin, and dTCF within conserved sequence contexts (which happen to include the Org-1 motif) are active as VM enhancers. Likewise, the wgXC enhancer fragment integrates Org-1 with the direct regulatory inputs of Abd-A as well as CREB and Smad (Mad/Medea) proteins mediating Dpp signaling (Schaub, 2013).

Org-1 is the first transcription factor known to be required for Ubx expression in PS7 of the visceral musculature. Extensive work on an Ubx visceral mesoderm CRM (UbxRP) indicated that dpp and wg regulate Ubx through indirect autoregulation. Of note, in bin embryos, which also lack visceral mesodermal dpp and wg expression, Ubx is still expressed. Genetic data show that the UbxRP element, while requiring org-1, is not directly regulated by Org-1, since mutation of its four predicted T-Box binding sites did not have any effects. Taking into account that no UbxRP reporter activity was detected in the cFCs at pre-fusion stages, it is suggested that UbxRP represents a late enhancer element and responds to dpp and wg only after they are activated by Org-1 in the founder cells. To clarify whether the regulation of Ubx by Org-1 is direct or indirect, the identification and dissection of a founder cell specific CRM will be required (Schaub, 2013).

tsh and opa were described as homeotic target genes of Antp in PS4-6 (tsh) and PS4-5 (opa) as well as of abd-A in PS8 (tsh) and PS9-12 (opa) of the visceral musculature. The current data show that tsh and opa expression is already activated in the respective cFCs of the visceral parasegments where it requires org-1. The later activation of tsh in PS8 during muscle fusion follows the org-1 dependent founder cell specific initiation of wg in PS8, which acts upstream of tsh. Thus it was conceivable that the regulation of tsh by org-1 is indirect. However, ectopic activation of wg in an org-1 loss of function background is not able to rescue tsh expression and Antp and abd-A expression is not altered upon loss of org-1. These observations suggest that Org-1 acts directly on tsh and opa, e.g., via functional cooperation with Antp and Abd-A, respectively, during the early activation of tsh and opa in the founder cells (Schaub, 2013).

It was reported that the absence of Jeb/Alk signaling causes loss of dpp expression in the founder cells in PS7 of the visceral mesoderm. In light of the current findings that org-1 loss-of-function produces a similar phenotype, and of the previous demonstration that org-1 expression is downstream of Jeb/Alk, this observation could simply be explained by the action of a linear regulatory cascade from Jeb/Alk via org-1 towards dpp. Alternatively, Jeb/Alk may provide additional inputs towards dpp (and other patterning genes) in parallel to org-1, which could explain the slightly stronger phenotype of Alk as compared to org-1 mutations with respect to dpp. A possible candidate for an additional effector of Jeb/Alk signals in this pathway is extradenticle (exd), which is known to be required for normal dpp expression in PS7 of the visceral mesoderm, presumably through direct binding of Exd in a complex with Hox proteins and Homothorax (Hth) to a PS7-specific enhancer element (a derivative of which was used in this study). Like org-1, exd is also needed for the expression of tsh and wg in the visceral mesoderm (additionally, it represses dpp in PS4-6 through sequences not contained in the minimal PS7 enhancer). It is thought that Exd complexed with Hox proteins and Hth increases the binding preference of these Hox complexes for specific binding sites within visceral mesodermal enhancers of their target genes (Schaub, 2013).

Since exd is expressed in both founder and fusion-competent cells in the visceral mesoderm, it is unlikely that it fulfills its roles in the regulation of dpp, wg, and tsh in the founder cells as a downstream gene of org-1. However, it is known that Exd requires nucleocytoplasmic translocation for it to be functional and, interestingly, it has been shown that Jeb/Alk signals trigger nuclear localization of Exd specifically in the cFCs of the visceral mesoderm. Because nuclear Exd appears to be hyperphosphorylated as compared to cytoplasmic Exd, nuclear translocation of Exd may be triggered by Alk-mediated phosphorylation. Alternatively, Jeb/Alk signals may induce the expression of hth in the cFCs and Hth could then translocate Exd to the nuclei, as has been shown in other contexts. This would be compatible with the observation that Hth is upregulated in the founder cells in an org-1-independent manner (Schaub, 2013).

The combined data show that Jeb/Alk signals exert at least two parallel inputs towards patterning genes in the cFCs, which are the induction of org-1 and the nuclear translocation of Exd. Taken altogether, a model is suggested in which combinatorial binding of Org-1, nuclear Exd/Hth and the homeotic selector proteins to the corresponding visceral mesoderm specific CRMs is required for the initiation of lineage specific expression of opa, tsh, dpp, Ubx and wg in the founder cells of the respective parasegments. As shown in the examples of dpp (PS7) and wg (PS8), accessory Bin is required for the activation as a general visceral mesodermal competence factor, whereas Dpp and Wg effectors mediate autoregulatory stabilization of their expression (Schaub, 2013).
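The combinatorial model proposed above can be illustrated with a deliberately simplified Boolean sketch. This is a toy reading of the model, not an analysis from the study: the class name, the field names and the strict AND logic are assumptions made for illustration, and real enhancer output is quantitative rather than all-or-none.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CellState:
    """Hypothetical presence/absence flags for regulators in one cell."""
    org1: bool         # Org-1, maintained only in circular founder cells
    nuclear_exd: bool  # nuclear Exd/Hth, triggered by Jeb/Alk signals
    hox: bool          # homeotic selector appropriate for the parasegment
    biniou: bool       # Bin, general visceral mesoderm competence factor


def crm_initiated(cell: CellState) -> bool:
    """Toy AND gate: initiation needs every input (an assumption)."""
    return cell.org1 and cell.nuclear_exd and cell.hox and cell.biniou


# A founder cell with all inputs versus a fusion-competent myoblast,
# which lacks Org-1 and nuclear Exd in this simplified picture.
founder_cell = CellState(org1=True, nuclear_exd=True, hox=True, biniou=True)
fcm = CellState(org1=False, nuclear_exd=False, hox=True, biniou=True)

print(crm_initiated(founder_cell))  # True
print(crm_initiated(fcm))           # False
```

In this sketch, removing any single input abolishes initiation, which mirrors the loss of founder cell expression described for org-1 mutants but ignores the later autoregulatory stabilization by Dpp and Wg effectors.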

Extensive work has shown that during somatic muscle development individual founder myoblasts acquire distinct identities, which are adopted by the newly incorporated nuclei upon myoblast fusion, thus leading to the morphological and physiological diversification of the differentiating muscles. It is proposed that the same principle is active during visceral muscle development. In this view, Org-1 acts as a muscle identity factor in both the somatic and visceral mesoderm. In the visceral mesoderm, Org-1 helps to diversify founder cell identities and, after myoblast fusion, their differential identities are transmitted to the respective differentiating circular gut muscles. The activation of downstream targets of this identity factor in the developing muscles leads to the observed morphogenetic differentiation events of the midgut and the establishment of the signaling center in PS7/8 that is also required for Dpp and Wg mediated induction of labial in the endodermal germ layer. As is the case for identity factors in the somatic muscle founders, Org-1 in the visceral mesoderm acts in concert with other, spatially restricted activities such as Hox factors and signaling effectors to achieve region-specific outputs. The main difference is that, in the trunk visceral mesoderm, Org-1 is present in all founder cells whereas in the somatic mesoderm this identity factor (like others) is expressed in a particular subset of founder myoblasts. Thus, in contrast to the somatic mesoderm, the spatial expression of Org-1 does not contribute to its function in visceral muscle diversification and instead, it solely relies on spatially-restricted co-regulators during this process (Schaub, 2013).

The pool of trunk visceral mesodermal fusion-competent cells contributes to the formation of both circular and longitudinal midgut muscles, depending on whether they fuse with resident founder cells of the trunk visceral mesoderm or with founders that migrated in from the caudal visceral mesoderm. The restricted expression of the identity factor Org-1 in the founder myoblasts in the trunk visceral mesoderm and its exclusion from the FCMs represents an elegant mechanism to ensure that the respective patterning events only occur in the developing circular musculature but not in the longitudinal muscle fibers, which extend as multinucleate syncytia throughout the length of the midgut (Schaub, 2013).

Tup/Islet1 integrates time and position to specify muscle identity in Drosophila

The LIM-homeodomain transcription factor Tailup/Islet1 (Tup) is a key component of cardiogenesis in Drosophila and vertebrates. This study reports an additional major role for Drosophila Tup in specifying dorsal muscles. Tup is expressed in the four dorsal muscle progenitors (PCs) and tup-null embryos display a severely disorganized dorsal musculature, including a transformation of the dorsal DA2 into dorsolateral DA3 muscle. This transformation is reciprocal to the DA3 to DA2 transformation observed in collier (col) mutants. The DA2 PC, which gives rise to the DA2 muscle and to an adult muscle precursor, is selected from a cluster of myoblasts transiently expressing both Tinman (Tin) and Col. The activation of tup by Tin in the DA2 PC is required to repress col transcription and establish DA2 identity. The transient, partial overlap between Tin and Col expression provides a window of opportunity to distinguish between DA2 and DA3 muscle identities. The function of Tup in the DA2 PC illustrates how single cell precision can be reached in cell specification when temporal dynamics are combined with positional information. The contributions of Tin, Tup and Col to patterning Drosophila dorsal muscles bring novel parallels with chordate pharyngeal muscle development (Boukhatmi, 2012).

The pattern of rp298lacZ (a general marker of PCs/FCs) expression and the three-dimensional arrangement of founder cells (FCs) distinguished four groups: dorsal, dorsolateral, lateral and ventral. Whether this topology reflects specific genetic programs has remained unclear. Tup and Col are expressed in the four dorsal and three DL PCs, respectively, supporting the notion of DV regionalization of the somatic mesoderm. This notion was evoked by regional Pox meso (Poxm) expression in most ventral and lateral FCs. As other known iTFs are only expressed in subsets of dorsal PCs/FCs, this raised the possibility that Tup could reside at the top of dorsal 'identity' transcription factor (iTF) cascades. The data show that this is not the case, as Tup, although required for Kr expression in the DO1 PC and for Col repression in the DA2 PC, is not required for expression of Eve, Runt and Vg in the DA1, DO2 and DA2 and DA1 lineages, respectively. Likewise, Col is required for expression of some iTFs but not others in DL PCs. Together, the patterns of Col, Eve, Kr, Poxm, Runt, Tup and Vg expression in wild-type and tup or col mutant conditions underline the intertwined, combinatorial nature of transcriptional regulatory networks specifying muscle identity. The DA2 PC gives rise to the DA2 muscle/DL adult muscle precursor (AMP) mixed lineage. Each abdominal hemisegment features six AMPs at stereotypical positions. Other AMPs originate from mixed lineages, e.g. the ventral VA3/AMP and lateral SBM/AMP lineages. The VA3/AMP and SBM/AMP PCs express Poxm and S59, and Lb, respectively. Tup expression in the DA2/AMP lineage confirms that different AMPs express different iTFs at the time of specification. Whether, as for somatic muscles, the iTF code confers specific properties to each AMP remains an unresolved issue (Boukhatmi, 2012).

How PCs born at similar positions in the somatic mesoderm come to express different combinations of iTFs and acquire distinct identities has remained elusive. For example, what distinguishes the fate of the two Eve-expressing PCs, which are sequentially born from the same dorsal cluster, is unknown. One other example is the expression of S59 and Lb, each in one of two abutting ventrolateral PCs: the LO1/VT1 and SBM PCs. Activation of both Lb and Slo expression in the two PCs is controlled by the same upstream regulator, Org-1 (Schaub, 2012). Subsequent reciprocal secondary cross-repression results in exclusive S59 or Lb expression, but the nature of the presumed positional bias responsible for the oriented resolution of this cross-repression has not been explored. In the case of the adjacent DA2 and DA3 PCs, this study shows that Tup activation by Tin in the DA2 PC is instrumental in distinguishing between DA2 and DA3 identities. The DA2 PC is selected from a small group of cells at the intersection between Tin and Col expression domains. Thus, the relative registers of tin and col expression along the DV axis provide precise positional information. Another key is timing. The overlap between Tin and Col expression is only transient, such that only the earlier-born Col-expressing PC expresses Tin. This provides a unique temporal window for Tup activation and Col repression. The transient overlap is due to the dorsal restriction of Tin expression to cardial cells during stage 11. This dynamic process is controlled by JAK-STAT signalling activity in the mesoderm, which is itself modulated by Tin activity. The key function of Tup in the DA2 PC, which is to distinguish between two muscle identities, illustrates how cell identity can be specified with single-cell precision when temporal dynamics are combined with positional information (Boukhatmi, 2012).

Some iTFs are expressed during all steps of myogenesis, from promuscular stage to muscle attachment. Schematically, two major phases of expression can be distinguished: (1) PC specification when multiple iTFs are expressed in different PCs and extensive cross-regulation occurs, leading to FC-specific iTF patterns; and (2) muscle differentiation when the FC pattern is maintained and propagated into the syncytial fibre via transcriptional activation of the iTF code in newly fused FCMs. Analysis of col regulation in the DA3 lineage showed that these two phases rely on two separate, early (CRM276) and late (4_0.9) CRMs, the activity of the late CRM requiring Col provided under the control of the early CRM. This auto-regulatory loop has been termed a CRM handover mechanism. This study has provided evidence that tup transcriptional regulation in the DA2 muscle follows the same rule. On the one hand, tup activation by Tin is mediated by an early CRM, IsletH; on the other, tup expression is maintained in differentiating muscles via a late CRM, DME, the activity of which depends upon Tup. It is proposed that this handover relay mechanism could be a widespread mode of iTF regulation, as it efficiently links early steps of muscle specification in response to positional information with final muscle identity (Boukhatmi, 2012).
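The logic of the CRM handover described above can be sketched as a minimal discrete-time toy model. The function name, the Boolean on/off treatment and the one-step update rule are assumptions made for illustration; the sketch only shows why an auto-regulated late CRM can maintain tup expression after the early activator Tin is gone.

```python
def simulate_handover(tin_schedule):
    """Return the Tup on/off state after each time step.

    tin_schedule: iterable of booleans, True when Tin is present.
    Assumed logic: the early CRM (IsletH) is active when Tin is
    present; the late CRM (DME) is active when Tup itself is present.
    """
    tup = False
    history = []
    for tin in tin_schedule:
        early_crm_active = tin  # Tin-dependent initiation
        late_crm_active = tup   # Tup-dependent maintenance
        tup = early_crm_active or late_crm_active
        history.append(tup)
    return history


# Tin is present only transiently (steps 1 and 2), yet Tup persists.
history = simulate_handover([False, True, True, False, False])
print(history)  # [False, True, True, True, True]
```

Without any Tin input the late CRM is never engaged in this sketch, so expression stays off, which corresponds to the requirement for Tup provided under the control of the early CRM.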

Tup and Tin are key components of the transcriptional regulatory cascade that controls early cardiogenesis, with Tin acting to activate Tup, the expression of which then persists after Tin has ceased to be expressed. This study now shows that a similar cascade operates in the somatic muscle mesoderm. Tup and Tin expression in both the heart and dorsal somatic muscles recalls Nkx2.5 (Tin ortholog) and Islet1 expression in the pharyngeal mesoderm, which contributes to some head muscles and part of the vertebrate heart. Nkx2.5 is required for deployment of the second heart field (SHF) and Islet1 marks SHF progenitors that contribute both to the right ventricle and the arterial pole of the forming heart and a subset of skeletal pharyngeal muscles. Similarly, in the simple chordate Ciona intestinalis, Nk4 (Tin/Nkx2.5) marks the cardio-pharyngeal mesoderm at the origin of the heart and atrial siphon muscles (ASMs) that are evocative of vertebrate pharyngeal muscles. Islet-expressing cells also contribute to ASMs, suggesting an evolutionarily conserved link between cardiac and pharyngeal muscle development. Interestingly, the ascidian Col/EBF ortholog Ci-COE is expressed in ASM precursors and is a crucial determinant of the ASM fate, reminiscent of Xenopus XCoe2 expression and requirement in pharyngeal arches for aspects of jaw muscle development. It is now well established that distinct genetic networks govern skeletal myogenesis in the vertebrate head and trunk. The repertoire of TFs differentially deployed in the head mesoderm includes Tbx1, the Drosophila ortholog of which, Org-1, has recently been shown to act as a muscle iTF. Tin/Nkx2.5, Tup/Isl1, Org-1/Tbx1 and Col/EBF may thus be part of a repertoire of transcription factors co-opted and diversified to regulate muscle patterning in Drosophila trunk and head muscle patterning in chordates (Boukhatmi, 2012).

Subtle changes in motif positioning cause tissue-specific effects on robustness of an enhancer's activity

Deciphering the specific contribution of individual motifs within cis-regulatory modules (CRMs) is crucial to understanding how gene expression is regulated and how this process is affected by sequence variation. But despite vast improvements in the ability to identify where transcription factors (TFs) bind throughout the genome, the ability to relate information on motif occupancy to function from sequence alone is limited. This study engineered 63 synthetic CRMs to systematically assess the relationship between variation in the content and spacing of motifs within CRMs to CRM activity during development using Drosophila transgenic embryos. In over half the cases, very simple elements containing only one or two types of TF binding motifs were capable of driving specific spatio-temporal patterns during development. Different motif organizations provide different degrees of robustness to enhancer activity, ranging from binary on-off responses to more subtle effects including embryo-to-embryo and within-embryo variation. By quantifying the effects of subtle changes in motif organization, it was possible to model biophysical rules that explain CRM behavior and may contribute to the spatial positioning of CRM activity in vivo. For the same enhancer, the effects of small differences in motif positions varied in developmentally related tissues, suggesting that gene expression may be more susceptible to sequence variation in one tissue compared to another. This result has important implications for human eQTL studies in which many associated mutations are found in cis-regulatory regions, though the mechanism for how they affect tissue-specific gene expression is often not understood (Erceg, 2014).

While quantifying the activity of a simple 'two-TF motif' CRM (pMad-Tin), the results show that enhancer activity can exhibit very different sensitivity to motif organization in one tissue compared to another. Several mechanisms could account for this interesting effect, including different concentrations of the TF (i.e. pMad or Tin) in the different tissues, the availability of tissue-specific co-factors, or tissue-specific priming of the enhancer, which may increase the ease by which the enhancer is activated (Erceg, 2014).

An elegant dissection of the endogenous sparkling enhancer has demonstrated that completely rearranging the relative order and spacing of TF binding sites could switch its cell type-specific activity from cone cells to photoreceptors in the eye. In comparison, the changes in motif organisation introduced in the current study were much more subtle such that the relative order of motifs was completely preserved. Yet only changing the spacing or orientation of motifs altered the robustness of enhancer activity in a tissue-specific manner. This result indicates that small insertions or deletions in CRMs, that do not affect the TF motifs themselves, could still have significant effects on gene expression in one tissue while having no effect in another. A study examining the activity of neuroectoderm enhancers between Drosophila species supports this model, where reduced spacing between Dorsal and Twist sites results in broader neuroectodermal stripes of CRM activity, while increased motif spacing resulted in progressively narrower stripes. Studies of both endogenous enhancers and the synthetic CRMs described in this study provide compelling evidence that the exact positioning of motifs within CRMs is crucial for the robustness of their activity in one tissue, while it may be largely dispensable in another. Different cell types can therefore interpret the same motif content of a given enhancer in different manners (Erceg, 2014).

The Drosophila heart is composed of two cell types, cardioblasts and pericardial cells, each of which requires the integration of many regulatory proteins for proper specification and diversification. A characterized pericardial enhancer, eve MHE, for example, contains pMad and Tin binding sites in addition to sites for dTCF, Twi, Ets proteins, and Zfh1. Given this complexity, it was surprising that a simple element built from pMad and Tin sites alone was sufficient to drive expression in the heart, albeit at a later developmental stage. These analyses indicate that this activity is due to cooperative binding between Tin and pMad, facilitated by a very specific motif arrangement. Using crystal structure data from close homologues of pMad, the interaction of the two TFs on DNA was modelled, using a similar range of motif spacing. This 3D structural model indicates that it is possible for the DNA binding domains of these two proteins to both bind to DNA at a 2 bp spacing and to physically interact at a 2 bp and 4 bp spacing, but not at 6 bp spacing. Although done by homologue mapping, this structural data is consistent with the functional analyses of CRM activity, and further supports direct DNA binding cooperativity between these two TFs (Erceg, 2014).
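The idea of varying only the spacing between two motifs while keeping their order fixed can be sketched as follows. This is a minimal illustration and not the construct design used in the study: the motif strings are placeholders rather than real pMad or Tin binding sequences, and the helper names and the 'N' filler are assumptions.

```python
# Placeholder motif strings, NOT the real pMad or Tin binding sequences.
PMAD_SITE = "AAAAAA"
TIN_SITE = "CCCCCC"


def build_crm(first_motif, second_motif, spacing, filler="N"):
    """Join two motifs in fixed order with `spacing` filler bases between."""
    if spacing < 0:
        raise ValueError("spacing must be non-negative")
    return first_motif + filler * spacing + second_motif


def motif_gap(sequence, first_motif, second_motif):
    """Number of bases between the end of the first and start of the second motif."""
    start_first = sequence.find(first_motif)
    if start_first < 0:
        raise ValueError("first motif not found")
    end_first = start_first + len(first_motif)
    start_second = sequence.find(second_motif, end_first)
    if start_second < 0:
        raise ValueError("second motif not found after the first")
    return start_second - end_first


# The three spacings considered in the structural model: 2, 4 and 6 bp.
variants = {gap: build_crm(PMAD_SITE, TIN_SITE, gap) for gap in (2, 4, 6)}
for gap, sequence in variants.items():
    print(gap, sequence, motif_gap(sequence, PMAD_SITE, TIN_SITE))
```

Because the motif order and content are identical across the variants, any difference in activity between such constructs would be attributable to spacing alone, which is the comparison the paragraph describes.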

It is interesting to note that although pMad and Tin sites are sufficient to drive expression in the heart from stage 13 to 14 (when placed in a limited motif arrangement), nature appears to use other enhancer configurations to regulate this critical function. There are two important aspects to this finding. First, heart activity arising from CRMs containing pMad and Tin sites alone is not robust. The enhancers are on 'the edge' of activation, where subtle changes in motif positioning or enhancer location switch activity between embryos and within embryos. Second, endogenous enhancers that are bound only by pMad and Tin - with no known input from other factors - direct expression in the dorsal mesoderm and not in the heart, at stage 10. In the synthetic situation, pMad and Tin sites also drive robust expression in the dorsal mesoderm, in addition to variable weak expression in the heart. Therefore, although pMad and Tin sites alone are sufficient to drive heart activity in limited motif contexts, this mechanism is most likely not robust enough to be generally used to drive heart expression in vivo. This is consistent with recent studies showing that heart enhancer activity is elicited by the collective action of many TFs, which can occupy enhancers with considerable flexibility in terms of their motif content and configuration. The pMad-Tin synthetic elements uncovered a very simple, although not very robust, alternative mechanism to regulate heart activity, and represent a nice example of how combinatorial regulation can lead to emergent expression profiles that are more than the simple sum of their parts (Erceg, 2014).

The expression of key developmental genes is generally buffered against variation in genetic backgrounds and environmental conditions. This may occur at many levels including RNA polymerase II pausing and the presence of partially redundant enhancers. However, robust expression may also be buffered by the motif content within an enhancer to ensure a stable regulatory function. CRMs, for example, often include additional binding sites to those that are minimal and necessary. In the context of the pMad-Tin synthetic CRMs, the motif organization can also act to ensure robust activity. The results demonstrate that even in situations where the composition of motifs and their relative arrangement are maintained, subtle changes in the spacing between the motifs could have dramatic effects on enhancer output. Interestingly, this effect seems to be very tissue-specific, with some tissues maintaining robust activity whilst others lost all enhancer activity (Erceg, 2014).

Taken together, the data presented in this study demonstrate that subtle alterations in motif organization can affect the ability of different tissues to 'read' an enhancer, which in turn may allow each tissue to fine-tune enhancer activity based on fluctuations in its molecular components (Erceg, 2014).

Enhancer modeling uncovers transcriptional signatures of individual cardiac cell states in Drosophila

This study used discriminative training methods to uncover the chromatin, transcription factor (TF) binding and sequence features of enhancers underlying gene expression in individual cardiac cells. Machine learning with TF motifs and ChIP data for a core set of cardiogenic TFs and histone modifications were used to classify Drosophila cell-type-specific cardiac enhancer activity. The classifier models can be used to predict cardiac cell subtype cis-regulatory activities. Associating the predicted enhancers with an expression atlas of cardiac genes further uncovered clusters of genes with transcription and function limited to individual cardiac cell subtypes. Further, the cell-specific enhancer models revealed chromatin, TF binding and sequence features that distinguish enhancer activities in distinct subsets of heart cells. Collectively, these results show that computational modeling combined with empirical testing provides a powerful platform to uncover the enhancers, TF motifs and gene expression profiles which characterize individual cardiac cell fates (Busser, 2015).

A previous study designed and applied a meta-analysis of gene expression profiles derived from purified mesodermal cells obtained from wild-type (WT) and informative mutants to characterize and predict gene activity in the Drosophila heart. In addition, recent studies have used chromatin immunoprecipitation followed by massively parallel sequencing (ChIP-seq) of numerous cardiac TFs to uncover the cis regulatory elements and genes which characterize the cardiac lineage. In order to compile a more comprehensive list of genes with confirmed expression in the Drosophila heart, this analysis consisted of a large-scale validation of these predictions using whole embryo in situ hybridization. Out of 103 tested genes, an additional 50 genes were uncovered with previously uncharacterized expression in the cardiac mesoderm (CM) and/or mature heart. Combining these newly-identified cardiac genes with a complete curation of the literature reveals a total of 284 genes with verified expression in the heart (Busser, 2015).

To uncover the functions of this large battery of cardiac genes, a GO analysis was performed, and a condensed summary of the initially obtained list was assembled by removing redundant GO terms. Indeed, the non-redundant GO terms revealed a diversity of functions for these genes, identifying both upstream (signaling, transcription, etc.) and downstream (adhesion, chemotaxis, metabolic processes, etc.) components of the heart gene regulatory network. In fact, a more detailed categorization revealed that 165 of these 284 genes are upstream components, with 82 of these being sequence-specific TFs. As there are presently only eight described cardiac cell subtypes (five PC and three CC; referring to pericardial cells and cardial cells), this shows that there are at least 10 times as many TFs as previously characterized cell states, suggesting that there is more extensive diversity in the combinations of TFs utilized to achieve specificity of cardiac gene expression than had been appreciated in prior studies. The diversity of TFs required to achieve cellular specificity of gene expression seems to be mirrored in the enhancers they regulate, since similar diversity was found in the combinations of motifs regulating and TFs binding myogenic enhancers. In total, this work uncovers a large battery of cardiac genes, and both the diversity of their inferred functions and the large number of TFs identified suggest that these genes are under complex combinatorial transcriptional regulation (Busser, 2015).
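The proportions quoted above can be checked with a few lines of arithmetic. The counts are taken from the text; the variable names are chosen here for illustration only.

```python
# Counts reported in the text (Busser, 2015).
total_heart_genes = 284
upstream_components = 165
sequence_specific_tfs = 82
cardiac_cell_subtypes = 5 + 3  # five PC subtypes plus three CC subtypes

upstream_fraction = upstream_components / total_heart_genes
tfs_per_subtype = sequence_specific_tfs / cardiac_cell_subtypes

print(f"upstream components: {upstream_fraction:.1%} of verified heart genes")
print(f"TFs per described cell subtype: {tfs_per_subtype:.2f}")
```

With 82 TFs and eight described subtypes the ratio is slightly above ten, which is the basis for the statement that TFs outnumber the characterized cell states by at least an order of magnitude.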

The molecular mechanisms underlying the coordinate regulation of these heart genes were then examined. A previous study characterized the motifs, enhancers and TFs that discriminate the two broad populations of the Drosophila heart, PCs and CCs. This study sought to model enhancers with cardiac activity of individual cardiac cell states to gain insights into both the similarities and differences in sequence and chromatin features amongst the eight individual cardiac cell subtypes that are known to exist. To do so, a list of enhancers with previously reported activity in the Drosophila heart was compiled, including those from a preceding study, and transgenic reporter assays were performed to confirm and refine prior findings at the level of single cells of defined identities. To avoid the confounding effects of reporter variability due to insertion site, these reporters were inserted at a specific genomic locus that permits robust and reproducible activity in the mesoderm. In vivo transgenic reporter assays were performed with the 95 curated cardiac enhancer sequences and it was confirmed that 73 are active in the CM and/or heart, with the majority of the enhancer sequences with non-cardiac activity showing activity in the neighboring amnioserosa cells (Busser, 2015).

The activity of these 73 cardiac reporters was monitored in the differentiated heart to compile training sets of enhancers with activity in the different cardiac cell subtypes. As the cells of the heart can be subdivided into individual identities based on morphological differences and the expression pattern of distinct TFs, the expression of Tin, which marks a subset of CCs and PCs, and Zinc Finger Homeobox 1 (Zfh1), which labels all PCs, was used together with anatomical and morphological differences of the cells to identify every distinct cardiac cell type. Using these markers and monitoring reporter activity in the differentiated heart, a set of enhancers was uncovered with activity in all the PCs (22 total sequences, hereafter referred to as 'pan-PC') and/or all the CCs (33 total sequences, hereafter referred to as 'pan-CC'). Of these 73 cardiac reporters, 6 to 7 enhancers were identified with activity restricted to the subsets of the CCs (hereafter referred to as 'Tin-CC', 'Tin-Lb-CC' or 'Svp-CC'), which is an insufficient quantity to serve as a training set for a machine learning analysis without over-fitting the data. Many enhancer sequences with activity in the different PC subtypes of the heart were identified, including the Svp-PCs, Odd-PCs and Eve-PCs. However, it was not possible to individualize the activity of enhancer sequences in the Tin-alone or Tin-Lb-PCs, with only one enhancer sequence (that associated with the Lb genes) showing activity restricted to the Tin-Lb PCs but not the Tin-alone PCs. As enhancer sequences are active in both of these cell types, this class is referred to as the 'Tin-PC' enhancers. In total, these results identified sets of enhancers with activities in different subsets of cardiac cells, including pan-PC, pan-CC, Eve-PC, Tin-PC, Odd-PC and Svp-PC (Busser, 2015).

A machine learning approach was used to uncover associated regulatory elements and the discriminating characteristics (sequence motifs and epigenetic features) that differentiate these individual heart cells. Previous work has shown that the distribution of epigenetic modifications of the histone proteins and in vivo binding profiles of relevant TFs can be used to predict cis regulatory elements and gene activity. A recent study has described the distribution of a series of histone modifications in sorted mesodermal nuclei from Drosophila embryos at a developmental stage in which the cardiac precursor cells are being specified. In addition, another study examined the in vivo binding sites of a series of conserved cardiogenic TFs at different developmental time points. These include the T-box TFs (Doc), the GATA4 ortholog Pnr, the Nkx2.5 ortholog Tin and the TFs downstream of the signaling pathways for Wnt (dTCF in Drosophila) and Bmp (phosphorylated Mad (pMad) in Drosophila). In addition to the aforementioned TFs and histone marks, this study also included over 1000 binding motifs from available databases to identify sequence features critical for categorizing enhancer activities. The binding motifs and in vivo binding profiles for cardiogenic TFs and relevant histone modifications were mapped onto the training set and control sequences and a support vector machine (SVM) was used to discriminate the training set from controls. To model cell-type-specific cardiac enhancer activity, separate SVM models were built for pan-PC, pan-CC, Eve-PC, Tin-PC, Svp-PC and Odd-PC sequences (Busser, 2015).
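
The classification step described above can be sketched in a few lines. This is a minimal illustration, not the implementation used by Busser (2015): it assumes each sequence has already been reduced to a standardized numeric feature vector, fits no intercept, and trains a linear SVM by Pegasos-style subgradient descent. The function names, toy feature values and hyperparameters (lam, epochs) are all hypothetical.

```python
import random

def train_linear_svm(X, y, lam=0.01, epochs=50, seed=0):
    """Pegasos-style subgradient descent on the L2-regularised hinge loss.

    X: list of feature vectors (assumed standardised, so no intercept is fit).
    y: list of labels, +1 for training-set enhancers and -1 for controls.
    """
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    order = list(range(len(X)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            # L2 shrinkage is applied at every step
            w = [(1.0 - eta * lam) * wj for wj in w]
            # the hinge-loss subgradient only applies inside the margin
            if margin < 1.0:
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w

def predict(w, x):
    """Return +1 (enhancer-like) or -1 (control-like)."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) > 0 else -1

# Hypothetical standardised features, invented for illustration:
# [motif score, TF binding signal, uninformative feature]
X = [[1.0, 0.8, 0.0], [0.9, 1.1, 0.0], [1.2, 0.7, 0.0],
     [-1.0, -0.9, 0.0], [-0.8, -1.2, 0.0], [-1.1, -0.6, 0.0]]
y = [1, 1, 1, -1, -1, -1]
w = train_linear_svm(X, y)
predictions = [predict(w, x) for x in X]
```

In this toy case the two informative features end up with positive weights and the uninformative feature keeps a weight of exactly zero, since it never contributes to an update.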

Attempts were made to classify the different cell subtypes against each other. However, this approach failed to discriminate the training set sequences from controls, as the area under the receiver operating characteristic curve (AUC) values ranged from 0.46 to 0.67. This result is due to the overlap in the training set sequences, with most sequences showing activity in more than one cell type, which reflects a requirement for the gene products regulated by these enhancers in more than one cell type. To circumvent this issue, separate SVM models were built to discriminate each set of training sequences from GC- and length-matched background sequences. Here, reliable classification of cardiac cell subtype enhancers was observed, as the AUC for the separate classifiers varied from 0.96 to 0.99. In addition, enhancers predicted by these models are significantly associated with known heart genes. Finally, it was shown that the enhancer predictions of cardiac cell classifications are cell-type-specific. In total, these results confirm the generation of cardiac cell subtype-specific cardiac classifiers that can reliably discriminate the training set from controls (Busser, 2015).
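
Two ideas in this paragraph lend themselves to a short sketch: choosing background sequences matched for length and GC content, and scoring a classifier by AUC. The code below is an assumption-laden illustration rather than the procedure of Busser (2015): the matching tolerance (gc_tol), the greedy selection strategy, the function names and the toy sequences and scores are all hypothetical. The AUC is computed as the fraction of positive/negative pairs ranked correctly, with ties counted as half.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def matched_background(training, pool, gc_tol=0.05):
    """Greedily pick, for each training sequence, an unused pool sequence
    of equal length whose GC fraction lies within gc_tol."""
    used = set()
    chosen = []
    for seq in training:
        target = gc_fraction(seq)
        best = None
        for idx, cand in enumerate(pool):
            if idx in used or len(cand) != len(seq):
                continue
            diff = abs(gc_fraction(cand) - target)
            if diff <= gc_tol and (best is None or diff < best[0]):
                best = (diff, idx)
        if best is not None:
            used.add(best[1])
            chosen.append(pool[best[1]])
    return chosen

def roc_auc(scores_pos, scores_neg):
    """Probability that a positive outscores a negative (ties count 0.5)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Toy inputs, invented for illustration only
background = matched_background(["GGCCAT"], ["ATATAT", "GCGCAT", "GGCC"])
auc_perfect = roc_auc([0.9, 0.8, 0.7], [0.1, 0.2, 0.3])
auc_partial = roc_auc([0.9, 0.4], [0.6, 0.1])
```

An AUC near 0.5, as in the subtype-against-subtype comparison, means the ranking is no better than chance, whereas values close to 1 mean nearly every training sequence outscores nearly every control.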

It was next asked if the enhancer predictions from the individual cell-specific classifications could be used to predict expression patterns of known cardiac genes, and to use these annotated gene expression patterns to uncover the functions of individual heart cells. To do so, the top-scoring cardiac cell subtype enhancer prediction was isolated from each classification for each gene with known heart expression. By focusing this analysis on genes with validated cardiac expression, it was possible to confidently associate a predicted enhancer with bona fide transcriptional targets, findings that are not always available or included in such studies, often due to the lack of known expression patterns for candidate target genes. Underscoring the utility of this approach, 278 out of 284 heart genes (97.9%) were associated with a top-scoring predicted cell-specific cardiac enhancer. Of these 278 heart genes associated with a predicted enhancer, 196 (70.5%) had predictions located within the introns of the heart gene itself, increasing the confidence in the association with this transcriptional target. Hierarchical clustering of the prediction scores was used to group related expression patterns, which uncovered distinct clusters of cell-specific cardiac gene expression. This analysis revealed gene expression clusters specific for the individual cardiac cell subtypes and also for the pan-PC, pan-CC and all cardiac cell expression patterns (Busser, 2015).
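
The selection of one top-scoring prediction per gene, and the percentages quoted above, can be illustrated as follows. This is a sketch under stated assumptions: the record layout (gene, enhancer identifier, score, intronic flag), the function names and the toy records are hypothetical, and only the counts 278, 284 and 196 come from the text.

```python
def top_prediction_per_gene(predictions):
    """Keep the highest-scoring predicted enhancer for each gene.

    predictions: iterable of (gene, enhancer_id, score, is_intronic) tuples.
    Returns a dict mapping gene -> (enhancer_id, score, is_intronic).
    """
    best = {}
    for gene, enhancer_id, score, is_intronic in predictions:
        if gene not in best or score > best[gene][1]:
            best[gene] = (enhancer_id, score, is_intronic)
    return best

def percent(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)

# Toy records, invented for illustration only
toy = [("geneA", "e1", 0.4, False), ("geneA", "e2", 0.9, True),
       ("geneB", "e3", 0.7, False)]
best = top_prediction_per_gene(toy)

# Counts reported in the text
genes_with_prediction = percent(278, 284)
intronic_predictions = percent(196, 278)
```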

With these expression clusters, it was asked if functions associated with these individual cardiac cell subtypes could be inferred. GO analysis for the genes within these expression clusters, followed by the removal of redundant terms, revealed functions for these gene expression clusters. Genes associated with enhancers predicted to be active in all heart cells (pan-PC/pan-CC) were associated with developmental, signaling and transcriptional functions. This result is consistent with these genes playing a role in the upstream regulatory network that specifies the cardiac lineage. Furthermore, genes with predicted expression in all CCs (pan-CC) were enriched for myogenic functions including cell adhesion and the actin cytoskeleton which are expected functions for contractile cells. Interestingly, genes associated with pan-PC enhancers were associated with renal system development, which further supports their proposed role as insect nephrocytes (Busser, 2015).

This analysis also uncovered specialized functions for individual cardiac cell subtypes. For example, the Odd-PCs were enriched for chemotaxis and locomotion functions, suggesting these cells are responsive to migratory cues. Alternatively, in the anterior segments of the embryo, Odd is expressed in the PCs of the neighboring lymph gland which forms the adult blood cells, and it is this population of cells which are responsive to migratory cues. Interestingly, the genes associated with enhancers with predicted activity in Tin-PCs are associated with development of endocrine functions (the ring gland in Drosophila is an endocrine organ). Since the physiological processes of filtration, secretion and reabsorption must be coordinated, this specialized endocrine role for Tin-PCs suggests these cells may act as a cellular relay mechanism between these components of the insect excretory system. Lastly, genes associated with enhancers with predicted activity in Eve-PCs and Svp-PCs specialize in the production of extracellular matrix components, which is an essential aspect of proper filtration of the haemolymph (Drosophila blood). In total, these results confirm that modeling cell-type-specific enhancer activities can be used to both confirm and identify previously uncharacterized functions of individual cardiac cells (Busser, 2015).

To test the in vivo transcriptional activities of the predicted enhancers, transgenic reporter assays inserted at specific genomic loci were used to test 47 enhancer predictions of varying scores in the cell-specific classifications. These results revealed that 46 of these 47 candidate enhancers were active reporters in the Drosophila embryo, with 19 of these 46 active reporters (41.3%) showing activity in the differentiated heart. Analyses of cell-type-specific reporter activity uncovered a concordance between predicted and confirmed activity. For example, a predicted enhancer located within the first intron of CG5522 scores well in the pan-PC and pan-CC classifications and poorly in the classifications of individual cardiac cell subtypes. Transgenic reporter assays confirm this result as this genomic region activates reporter expression in all PCs and CCs of the differentiated heart. The distribution of prediction scores was used to reveal enhancers that are active in individual cardiac cells. For example, another enhancer prediction located within the first intron of the Dscam gene scores very well in the Eve-PC and Odd-PC classifications. In agreement with these cell-specific predictions, this enhancer prediction was shown to be active in these two cell types with additional activity in the Svp-PCs, thereby confirming the significant but slightly less robust Svp-PC prediction score. Some successful enhancer predictions scored well in a cellular subtype classification as well as in the pan-PC and pan-CC classifications. It is possible that such regulatory elements may be composed of overlapping enhancer signatures, with one DNA segment regulating pan-PC and pan-CC activity while another DNA segment enhances transcription in a different cellular subtype. The transgenic reporter assays used to assay enhancer activity would be insensitive to detecting such minor differences in reporter activity due to in vivo perdurance of the reporter RNA and/or protein.
In agreement with this possibility, previous studies uncovered multiple signatures in the enhancers regulating muscle founder cell gene expression. Taken together, these results show that the distribution of prediction scores for individual cardiac cell classifications can be used to predict enhancer activity in individual cardiac cell subtypes (Busser, 2015).

To gain an understanding of the regulatory network required for specifying individual cardiac cell fates, the sequence, TF binding and chromatin features critical for the classification of each subtype of heart cell included in this analysis were assessed. As features in the training set receive positive weights, those in the control set receive negative weights, and irrelevant features receive zero weight in linear SVMs, the classification weights associated with the histone marks, TF binding and sequence features relevant to the previously delineated cell-specific regulatory models were examined (Busser, 2015).
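
The sign convention for linear SVM weights described above can be made concrete with a small helper that sorts features into enriched, depleted and irrelevant groups. This is only a sketch: the function name, the tolerance eps used to treat near-zero weights as zero, and the toy weights are hypothetical and are not values from Busser (2015).

```python
def split_by_weight(weights, eps=1e-9):
    """Group feature names by the sign of their linear SVM weight.

    weights: dict mapping feature name -> weight.
    Returns (enriched, depleted, irrelevant), strongest features first.
    """
    enriched = sorted((n for n, w in weights.items() if w > eps),
                      key=lambda n: weights[n], reverse=True)
    depleted = sorted((n for n, w in weights.items() if w < -eps),
                      key=lambda n: weights[n])
    irrelevant = sorted(n for n, w in weights.items() if abs(w) <= eps)
    return enriched, depleted, irrelevant

# Hypothetical weights, invented for illustration only
toy_weights = {"feature_a": 0.42, "feature_b": 0.10,
               "feature_c": 0.0, "feature_d": -0.31}
enriched, depleted, irrelevant = split_by_weight(toy_weights)
```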

The in vivo binding of cardiogenic TFs was next examined as a feature at two developmental time points: (1) 4-6 h after egg laying, a time point at which the dorsal mesodermal derivatives, which include the precursors of the CM, are specified; and (2) 6-8 h after egg laying, a time point during which the more differentiated CM is specified. Tin, the Nkx2.5 ortholog in Drosophila, is first expressed in and required to specify the dorsal mesodermal derivatives; its expression and function then become restricted to the CM, and later Tin is confined to subsets of cells comprising the mature heart. Pnr (the Gata4 ortholog in Drosophila) and Doc (Tbx4 ortholog in Drosophila) expression intersect with Tin in the CM, and both of these TFs are required for the differentiation of most cardiac cells. Finally, the overlap of signaling by Wnt (whose downstream effector in Drosophila is dTCF) and Bmp (whose downstream effector in Drosophila is phosphorylated Mad, pMad) is critical for specification of the CM (Busser, 2015).

Among the TFs examined, the greatest enrichment was seen with Tin at 6-8 h, which is consistent with the central role played by Tin in the cardiac transcriptional network in Drosophila. However, this interpretation should be considered with caution as the majority of heart enhancers in the training sets were identified based on the presence of Tin binding sites or in vivo binding. The larger positive classification weight at 6-8 h than at 4-6 h for Tin supports a more critical role for Tin binding to cardiac enhancers when the CM is specified (Busser, 2015).

Surprisingly, since Pnr has previously been shown to be a key regulator of cardiogenesis, the SVM weights reveal a minor role for the GATA TF Pnr binding in regulating cardiac enhancer activity. However, this finding is consistent with a recent report which failed to identify cardiac enhancers due to Pnr binding and suggests either a non-enhancer role for such binding or an inability to accurately assess such enhancers with the transgenic reporter assays used in these studies. For example, as minimal promoters are used in transgenic reporter assays, this result could reflect a requirement for a certain promoter in vivo for enhancer activity driven by Pnr-dependent enhancers (Busser, 2015).

Positive classification weights associated with pMad, Tcf and Doc were noted among the different cell types. Interestingly, it was found that differential SVM weights are associated with these TFs in the various cardiac subtype classifications. For example, Doc shows the greatest positive weight for the Eve-PC classification, and every newly-identified enhancer with Eve-PC activity is bound by Doc. Furthermore, pMad demonstrates a greater SVM weight amongst the classifications of individual cardiac cell subtypes than amongst the pan-PC or pan-CC classifications. This outcome suggests that differential utilization of this signaling pathway may play a role in specifying individual cardiac cell fates. As 7 out of 11 pan-PC enhancers (63.6%) and 6 out of 8 individual cardiac cell subtype enhancers (75%) of newly-identified cardiac enhancers are bound by pMad, validation of this hypothesis requires further testing. In conclusion, these data show that differential SVM weights of in vivo TF binding can be used to model cell-specific enhancer activities (Busser, 2015).

As numerous studies have shown that the epigenetic modifications of the histone proteins can be used as predictors of cis regulatory element activity, the SVM weights were examined for multiple histone mark modifications for each cardiac cell subtype classification identified in this analysis. These histone modifications were examined at the 6-8 h developmental time point (a time at which the cardiac precursors are specified) from sorted mesodermal nuclei. Surprisingly, the strongest enrichment of any modification is tri-methylation of lysine 27 on histone 3 (H3K27me3) for all cardiac cell subtypes. An enrichment of H3K27me3 on active mesodermal enhancers was shown previously; this was in disagreement with another study that revealed a depletion of H3K27me3 on active mesodermal enhancers. As the polycomb complex, which is associated with silent chromatin, primarily trimethylates lysine 27 on histone 3, the most likely explanation for these data is that they reflect the overall enhancer activity in a heterogeneous rather than pure population of cells. Since the cells of the Drosophila heart only correspond to a tiny population of the entire mesoderm, and whole mesoderm was previously studied, the apparently inconsistent observation noted in this study suggests that the enhancer is repressed in the majority of the cells (non-heart mesodermal cells) and is active in the minority of cells examined (the fraction of the mesoderm which comprises the heart and its precursors). In agreement with this interpretation, the SVM weights for H3K27me3 are greater for the cardiac subpopulations than those with activity in all PCs or CCs in which a larger population of total cells would show signs of repression. Furthermore, the enrichment for acetylation of lysine 27 on histone 3 (H3K27ac) on these same enhancers suggests that they are active in a subset of cells.
These results argue that an accurate interrogation of the epigenetic signatures of individual genomic loci requires isolating homogenous subpopulations of cells. This point is especially relevant when describing bivalent chromatin signatures which may reflect the presence of either a bivalent locus in a single cell or different epigenetic modifications in some but not all members of a more diverse cell population (Busser, 2015).

Monomethylation of lysine 4 on histone 3 (H3K4me1) is positively weighted amongst all classifications, consistent with its description as an enhancer mark. In contrast, trimethylation of lysine 4 on histone 3 (H3K4me3) and trimethylation of lysine 36 on histone 3 (H3K36me3) received either no weight or negative weights for all classifications, consistent with their description as marks of promoters and gene bodies, respectively. Surprisingly, the SVM weight for the active enhancer mark H3K27ac received no weight among Tin-PC enhancers, which may be due to the fact that H3K27ac was seen to only mark two out of nine training set sequences. This suggests that H3K27ac may not always associate with active enhancers in certain cell types. However, this interpretation should be regarded with caution as the training set was small for these cell types and two out of two newly-identified Tin-PC enhancers were marked by H3K27ac. Trimethylation of lysine 79 on histone 3 (H3K79me3) was positively associated with each cardiac cell subtype classification, a result that is in agreement with a recent study which observed H3K79me3 on a subset of developmental enhancers. Interestingly, H3K79me3 showed greater SVM weights associated with Svp-PC and Odd-PC classifications than with the other models, suggesting that these modifications may be differentially utilized amongst cardiac cell subtypes. A large-scale validation of enhancer activities will be required to test this hypothesis, although six out of seven (85.7%) newly-discovered enhancers with activity in Svp-PCs and/or Odd-PCs are marked by H3K79me3 while 7 out of 11 (63.6%) with pan-PC activity are marked by H3K79me3. In any event, such differential utilization of histone marks amongst cell types and regulatory elements may explain the incomplete association between a particular mark and a class of regulatory element. 
Furthermore, such a cell- or tissue-specific role for histone modifications likely explains the tissue-specific effects of loss-of-function mutations in histone-modifying enzymes. In total, these results uncover chromatin features that are enriched and that potentially discriminate among cardiac cell subtypes (Busser, 2015).

In order to identify DNA sequence similarities and differences amongst the cardiac cell subtype classifications, this study examined the top 500 scoring sequence motifs amongst all classifications and used hierarchical clustering of their SVM weights to reveal clusters of similarly-acting regulatory motifs. Similar to the clustering of enhancer activities, this analysis revealed motif clusters enriched amongst each cardiac cell subtype classification and depleted or irrelevant to the classification of the other cardiac cells. In addition, this analysis revealed motifs relevant for activity in all cardiac cells. The identification of cell-type-specific clusters suggests a role for these motifs in mediating particular patterns of gene expression that are specific for different subsets of cardiac cells (Busser, 2015).
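
Hierarchical clustering of motif weight profiles can be sketched as below. The text does not state which distance or linkage was used, so Euclidean distance with average linkage is an assumption made here purely for illustration; the function names, the motif names and the toy weights (one value per classification) are likewise hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length weight profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_linkage(profiles, n_clusters):
    """Agglomerative clustering with average linkage.

    profiles: dict mapping motif name -> list of SVM weights,
              one weight per classification.
    Returns a sorted list of clusters, each a sorted list of names.
    """
    clusters = [[name] for name in sorted(profiles)]

    def dist(c1, c2):
        # mean pairwise distance between members of the two clusters
        total = sum(euclidean(profiles[a], profiles[b])
                    for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = dist(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = sorted(clusters[i] + clusters[j])
        clusters = [c for k, c in enumerate(clusters)
                    if k not in (i, j)] + [merged]
    return sorted(clusters)

# Toy weight profiles over three classifications, invented for illustration
toy_profiles = {
    "motif_A": [0.9, 0.0, 0.0],
    "motif_B": [0.8, 0.1, 0.0],
    "motif_C": [0.0, 0.9, 0.1],
    "motif_D": [0.1, 0.8, 0.0],
    "motif_E": [0.5, 0.5, 0.9],
}
motif_clusters = average_linkage(toy_profiles, n_clusters=3)
```

Motifs whose weights peak in the same classification fall into the same cluster, which is the pattern the paragraph describes for subtype-specific motif groups.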

The preceding section identified sequence features that potentially discriminate enhancer activity in individual cardiac cells. In order to test this hypothesis, sequence features were identified that were positively weighted within a cell subtype classification(s) and that were depleted or irrelevant for the other cardiac subtype models. Cis mutagenesis of a selected fraction of these sequence motifs was then used in transgenic reporter assays to monitor the effects of their targeted removal from otherwise WT enhancers. For this purpose, the activity of five separate motifs, each of which is predicted to discriminate regulatory element activity within subtypes of cardiac cells, was analyzed: V$ZF5_01, V$ETS_Q4, V$TEF_01, V$EVI1_06 and V$MTF1_01 (Busser, 2015).
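
The selection rule described above (positive weight in the classification or classifications of interest, zero or negative weight elsewhere) can be written as a simple filter. This is a sketch, not the authors' procedure: the function name, the strict thresholds at zero, the motif names and the toy weights are hypothetical.

```python
def discriminating_motifs(weights, targets):
    """Return motifs weighted > 0 in every target classification
    and <= 0 in all other classifications.

    weights: dict mapping motif name -> dict of classification -> SVM weight.
    targets: set of classification names the motif should be specific to.
    """
    selected = []
    for motif, by_class in weights.items():
        positive_in_targets = all(by_class.get(c, 0.0) > 0.0
                                  for c in targets)
        absent_elsewhere = all(w <= 0.0 for c, w in by_class.items()
                               if c not in targets)
        if positive_in_targets and absent_elsewhere:
            selected.append(motif)
    return sorted(selected)

# Toy weights, invented for illustration only
toy = {
    "motif_1": {"Odd-PC": 0.8, "Eve-PC": 0.0, "pan-CC": -0.1},
    "motif_2": {"Odd-PC": 0.5, "Eve-PC": 0.6, "pan-CC": 0.0},
    "motif_3": {"Odd-PC": 0.3, "Eve-PC": 0.2, "pan-CC": 0.4},
}
odd_specific = discriminating_motifs(toy, {"Odd-PC"})
eve_and_odd = discriminating_motifs(toy, {"Eve-PC", "Odd-PC"})
```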

The WT mib1 enhancer (mib1^WT) is active in the Odd-PCs and contains two V$ZF5_01 motifs. This motif has a high positive weight within the Odd-PC classification, suggesting that it plays a critical role in Odd-PC enhancer activity. In agreement with this hypothesis, mutagenesis of the V$ZF5_01 motifs in the mib1 enhancer (mib1^ZF5) leads to a loss of reporter expression in Odd-PCs (Busser, 2015).

Previous studies have documented an essential role for Ets binding sites in enhancers with activity in Eve-PCs. This observation is now extended by showing that V$ETS_Q4 motifs are heavily weighted in the Eve-PC classification, and that the two V$ETS_Q4 motifs in the Doc1 enhancer are critical for activity in Eve-PCs. Interestingly, the V$ETS_Q4 motif is derived from binding sites for the ETS1 TF, whose ortholog in Drosophila is Pointed (Pnt). In prior studies it was also shown that Pnt was critical in trans for enhancer activity in Eve-PCs, a finding which further establishes that motif enrichment in enhancers can be used to reveal cell-type-specific TFs (Busser, 2015).

The V$TEF_01 motif is positively weighted amongst the Eve-PC and Odd-PC classifications, suggesting that it contributes a critical function to Eve-PC and Odd-PC enhancer activities. This study now shows that mutagenesis of the two V$TEF_01 motifs in the CG13822 enhancer (CG13822^TEF) leads to a loss of reporter expression in Odd-PCs and de-repression into Eve-PCs. The V$TEF_01 motif is recognized by thyrotroph embryonic factor, which is a member of the proline and acidic amino acid-rich (PAR) subfamily of basic region/leucine zipper TFs, whose closest Drosophila ortholog is Par domain protein 1 (Pdp1). The functional role of V$TEF_01 motifs in the CG13822 enhancer suggests a role for Pdp1 in cardiogenesis. In support of this hypothesis, a previous functional genomic screen uncovered a role for Pdp1 in patterning the fly heart. Thus, both cis and trans tests of Pdp1 function are consistent with each other in establishing a key role for this TF in Drosophila cardiogenesis (Busser, 2015).

Finally, the SVM weights enriched amongst pan-PC and pan-CC classifications were used to uncover features that are essential for activity in all heart cells. The SVM weights for V$MTF1_01 and V$EVI1_06 motifs are positive amongst classifications of pan-PC and pan-CC enhancers. The WT sty enhancer (sty^WT) is active in all PCs and CCs. Mutagenesis of the one V$EVI1_06 motif (sty^EVI) or the one V$MTF1_01 motif (sty^MTF) in the sty enhancer abrogates enhancer activity in the majority of PCs and CCs, suggesting a critical role for these motifs in regulating enhancer activity in all heart cells. V$MTF1_01 is recognized by Metal regulatory factor 1 (MTF1) in vertebrates and V$EVI1_06 is recognized by EVI-1 (also known as MECOM and PRDM3) whose Drosophila orthologs correspond to MTF1 and hamlet (ham), respectively. The present identification and characterization of these TFs makes them excellent candidates for regulating cardiogenesis in Drosophila. In support of this model, targeted depletion of ham in the dorsal mesoderm using RNAi causes abnormalities in cardiogenesis (Busser, 2015).

The distribution of histone marks, in vivo TF binding, and the presence of TF binding motifs have all been exploited to reveal the enhancers that govern gene expression. This study has combined all three of these approaches using discriminative machine learning methods on a training set of enhancers with activity in distinct subtypes of cardiac cells to model cell-type-specific enhancer activity in the Drosophila heart. Using this approach, sequence, chromatin and TF binding features were uncovered that appear to underlie enhancer activity in individual cardiac cells. From these findings, it is hypothesized that such features potentially discriminate the unique enhancer specificities of single cardiac cells, which was empirically confirmed for a series of sequence motifs in regulating appropriate patterns of cardiac enhancer activity. Finally, by associating a cardiac gene expression atlas with the predicted enhancers from each cell subtype classification, this study uncovered previously unknown functions of individual cells of the Drosophila heart. Collectively, these results document the utility of computational modeling of enhancers to uncover the sequence motifs, chromatin and TF binding patterns as well as the gene expression profiles and functions of individual cells within the overall cardiac lineage (Busser, 2015).

The Interactive Fly resides on the Society for Developmental Biology's Web server.