InteractiveFly: GeneBrief


Gene name - GATAe

Synonyms - dGATAe

Cytological map position - 89A12

Function - transcription factor

Keywords - endoderm development

Symbol - GATAe

FlyBase ID: FBgn0038391

Genetic map position - 3R

Classification - Zn-finger, GATA type

Cellular location - nuclear

NCBI link: EntrezGene

GATAe orthologs: Biolitmine
Recent literature
Okumura, T., Takeda, K., Kuchiki, M., Akaishi, M., Taniguchi, K. and Takashi, A. Y. (2015). GATAe regulates intestinal stem cell maintenance and differentiation in Drosophila adult midgut. Dev Biol [Epub ahead of print]. PubMed ID: 26719127
Adult intestinal tissues, exposed to the external environment, play important roles including barrier and nutrient-absorption functions. These functions are ensured by adequately controlled rapid-cell metabolism. GATA transcription factors play essential roles in the development and maintenance of adult intestinal tissues both in vertebrates and invertebrates. This study investigated the roles of GATAe, the Drosophila intestinal GATA factor, in adult midgut homeostasis with its first-generated knock-out mutant as well as cell type-specific RNAi and overexpression experiments. These results indicate that GATAe is essential for proliferation and maintenance of intestinal stem cells (ISCs). Also, GATAe is involved in the differentiation of enterocyte (EC) and enteroendocrine (ee) cells in both Notch (N)-dependent and -independent manners. The results also indicate that GATAe has pivotal roles in maintaining normal epithelial homeostasis of the Drosophila adult midgut through interaction with N signaling. Since recent reports showed that mammalian GATA-6 regulates normal and cancer stem cells in the adult intestinal tract, these data also provide information on the evolutionarily conserved roles of GATA factors in stem-cell regulation.
Hernandez de Madrid, B. and Casanova, J. (2018). GATA factor genes in the Drosophila midgut embryo. PLoS One 13(3): e0193612. PubMed ID: 29518114
The Drosophila GATA factor gene serpent (srp) is required for the early differentiation of the anterior and posterior midgut primordia. In particular, srp is sufficient and necessary for the primordial gut cells to undertake an epithelial-to-mesenchymal transition (EMT). Two other GATA factor genes, dGATAe and grain (grn), are also specifically expressed in the midgut. On the one hand, dGATAe expression is activated by srp. Embryos homozygous for a deficiency uncovering dGATAe were shown to lack the expression of some differentiated midgut genes. Moreover, ectopic expression of dGATAe was sufficient to drive the expression of some of these differentiation marker genes, thus establishing the role of dGATAe in the regulation of their expression. However, due to the gross abnormalities associated with this deficiency, it was not possible to assess whether, similarly to srp, dGATAe might play a role in setting the midgut morphology. To further investigate this role, a dGATAe mutant was generated. On the other hand, grn is expressed in the midgut primordia around stage 11 and remains expressed until the end of embryogenesis. Yet, no midgut function has been described for grn. As for dGATAe, midgut grn expression is dependent on srp; conversely, dGATAe and grn expression are independent of each other. These results also indicate that, unlike srp, dGATAe and grn are not responsible for setting the general embryonic midgut morphology. Analysed midgut genes whose expression is lacking in embryos homozygous for a deficiency uncovering dGATAe are indeed dGATAe-dependent genes. Conversely, no midgut gene was found to be grn-dependent, with the exception of midgut repression of the proventriculus iroquois (iro) gene. In conclusion, these results clarify the expression patterns and function of the GATA factor genes expressed in the embryonic midgut.
Martinez-Corrales, G., Cabrero, P., Dow, J. A. T., Terhzaz, S. and Davies, S. A. (2019). Novel roles for GATAe in growth, maintenance and proliferation of cell populations in the Drosophila renal tubule. Development 146(9). PubMed ID: 31036543
The GATA family of transcription factors is implicated in numerous developmental and physiological processes in metazoans. In Drosophila melanogaster, five different GATA factor genes (pannier, serpent, grain, GATAd and GATAe) have been reported as essential in the development and identity of multiple tissues, including the midgut, heart and brain. This study presents a novel role for GATAe in the function and homeostasis of the Drosophila renal (Malpighian) tubule. Reduced levels of GATAe gene expression in tubule principal cells induce uncontrolled cell proliferation, resulting in tumorous growth with associated altered expression of apoptotic and carcinogenic key genes. Furthermore, the involvement of GATAe in the maintenance of stellate cells and migration of renal and nephric stem cells into the tubule was uncovered. These findings of GATAe as a potential master regulator in the events of growth control and cell survival required for the maintenance of the Drosophila renal tubule could provide new insights into the molecular pathways involved in the formation and maintenance of a functional tissue and kidney disease.

GATA factors play an essential role in endodermal specification in both protostomes and deuterostomes. In Drosophila, the GATA factor gene serpent (srp) is critical for differentiation of the endoderm. However, the expression of srp disappears around stage 11, which is much earlier than overt differentiation occurs in the midgut, an entirely endodermal organ. Another endoderm-specific Drosophila GATA factor gene, GATAe, has been identified. Expression of GATAe is first detected at stage 8 in the endoderm, and its expression continues in the endodermal midgut throughout the life cycle. srp is required for expression of GATAe, and misexpression of srp resulted in ectopic GATAe expression. Embryos that either lack GATAe or have been injected with double-stranded RNA (dsRNA) corresponding to GATAe fail to express marker genes that are characteristic of differentiated midgut. Conversely, overexpression of GATAe induces ectopic expression of endodermal markers even in the absence of srp activity. Transfection of the GATAe cDNA also induces endodermal markers in Drosophila S2 cells. These studies provide an outline of the genetic pathway that establishes the endoderm in Drosophila. This pathway is triggered by sequential signaling through the maternal torso gene, a terminal gap gene, huckebein (hkb), and finally, two GATA factor genes, srp and GATAe (Okumura, 2005).

The endoderm gives rise to major parts of the gut tube of multicellular organisms. Regulatory mechanisms that establish the endoderm have recently received considerable attention with respect to the development of protostomes and deuterostomes. Certain important components of this genetic regulatory pathway/network have now been identified. It remains unknown whether protostomes and deuterostomes share a common genetic mechanism of endoderm specification. However, GATA factor genes are expressed throughout endodermal development in both animal groups. GATA factors have one or two characteristic zinc-finger motifs corresponding to CXNCX17CXNC, and act as transcription factors that bind to a consensus DNA sequence WGATAR of specific target genes (Okumura, 2005).
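The WGATAR consensus mentioned above can be made concrete with a short scan for candidate sites. The following is a minimal sketch, not a method from the cited study: it assumes the standard IUPAC meanings W = A or T and R = A or G, and the function names (`find_gata_sites`, `consensus_to_regex`, `reverse_complement`) and the example sequences are hypothetical. A consensus match only flags a candidate site; it says nothing about whether a GATA factor binds there in vivo.

```python
import re

# IUPAC codes needed for the consensus: W = A or T, R = A or G.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "W": "[AT]", "R": "[AG]"}


def consensus_to_regex(consensus):
    """Translate an IUPAC consensus string into a regular expression."""
    return "".join(IUPAC[c] for c in consensus)


def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_gata_sites(seq, consensus="WGATAR"):
    """Return sorted (start, strand, site) tuples for consensus matches.

    start is 0-based on the forward strand; site is always reported
    5'->3' on the strand where the consensus matched.
    """
    seq = seq.upper()
    k = len(consensus)
    # A lookahead lets overlapping matches be reported as well.
    pattern = re.compile("(?=(" + consensus_to_regex(consensus) + "))")
    hits = [(m.start(), "+", m.group(1)) for m in pattern.finditer(seq)]
    for m in pattern.finditer(reverse_complement(seq)):
        # Convert the reverse-strand coordinate back to the forward strand.
        hits.append((len(seq) - m.start() - k, "-", m.group(1)))
    return sorted(hits)


if __name__ == "__main__":
    print(find_gata_sites("CCAGATAAGG"))
    print(find_gata_sites("CCTTATCTGG"))
```

Scanning both strands matters because a site on the reverse strand appears as YTATCW when read on the forward strand.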

In vertebrates, GATA factor genes are classified into two groups, GATA-1/-2/-3 and GATA-4/-5/-6. While GATA-1, -2, and -3 are involved in hematopoiesis, GATA-4, -5, and -6 are essential for the development of endoderm-derived tissues, in that they activate endoderm-specific genes such as IFABP, gastric H+/K+-ATPase, HNF4, and albumin. The GATA factor genes are also essential for endodermal development in the protostomes Caenorhabditis elegans and Drosophila melanogaster. C. elegans has eleven GATA factor genes, and seven of these are known to be related to endodermal development, resulting in redundant regulatory pathways. end-1 is the earliest GATA factor gene that is expressed specifically in the endoderm lineage. end-1 mutant embryos fail to form the endoderm, whereas overexpression of end-1 can induce non-endodermal cells to switch to an endodermal fate. end-1 can also induce endoderm formation when it is expressed in Xenopus embryos, suggesting that the genetic mechanisms underlying endodermal development are at least partially shared between protostomes and deuterostomes (Okumura, 2005 and references therein).

In C. elegans, end-1 activates another GATA gene, elt-2, and end-1 expression ceases prior to overt differentiation of the endoderm. elt-2 continues to be expressed in the gut throughout life, and activates genes that are required for various gut functions (Fukushige, 1998). In the protostome D. melanogaster, serpent (srp) is a GATA gene that is essential for the specification of endoderm. srp is expressed in the anterior and posterior terminal regions of the blastoderm, which both give rise to the endoderm. In the srp mutant embryo, the prospective endodermal region differentiates into the ectodermal hindgut. In normal embryos, the prospective hindgut region abuts the prospective endoderm of the posterior terminal, and the hindgut is specified by the Brachyury ortholog, brachyenteron (byn). The initial area in which byn is expressed in the cellular blastoderm includes the prospective posterior endoderm. However, byn expression in the posterior half of this region is soon repressed by srp and the region develops into the endoderm. In the srp embryo, byn expression expands to the prospective endodermal regions. Activation of srp depends on a zygotic gap gene, huckebein (hkb), which is triggered by maternal Torso activity at the anterior and posterior terminal regions of the fertilized egg, followed by activation of Ras signaling. These events outline the genetic pathway that defines endodermal development in Drosophila to date, and represent one of the best examples of a genetic pathway that specifies organogenesis (Okumura, 2005).

Despite the significant progress that has been made in characterizing the pathway described above, another gene, as yet unknown, seems to be involved, since srp ceases to be expressed after stages 10–11 in the prospective endoderm, long before overt differentiation of midgut. Since several GATA factor genes are sequentially expressed during the development of the C. elegans endoderm, identification of a novel GATA gene in Drosophila is expected. Thus far, three GATA factor genes have been reported in Drosophila: pannier (pnr, also known as dGATAa), srp (also known as dGATAb), and grain (also known as dGATAc). By searching the Drosophila genome sequence, several sequences were found containing novel GATA factor motifs, and two GATA factor genes, dGATAd and GATAe, were identified. While dGATAd is not expressed in the embryo, GATAe is specifically expressed in the endoderm after stage 8, and it continues to be expressed in the endodermal midgut of larvae and adult flies. In this study, the regulation and function of GATAe were studied. GATAe, upon activation by srp, induces overt differentiation of the Drosophila endoderm. This finding has enabled the delineation of almost the entire genetic pathway of Drosophila endodermal development. This pathway is initiated by early maternal signals and results in the terminal differentiation of the midgut (Okumura, 2005).

srp is the first GATA factor gene to be expressed within the Drosophila endoderm, and it is essential for its endodermal specification. srp is expressed in the prospective endoderm in the cellular blastoderm stages, but its expression disappears by stages 10–11. This study shows that srp activates GATAe, and that GATAe is required for expression of specific genes in the differentiated midgut. GATAe induces the expression of late endodermal marker genes even in the absence of srp activity. Since GATAe expression in the endodermal midgut persists throughout the embryonic, larval, and adult stages of Drosophila, it seems likely that GATAe is also necessary for maintaining gene expression in the differentiated midgut. Inactivation of GATAe transcripts with dsRNA does not cause any marked morphological defects, but most of these embryos fail to hatch, suggesting that GATAe is essential for differentiated midgut and for viability of the larva. It should be noted that the endodermal midgut is subdivided into four chambers, and further arranged into 13 subdomains with distinct gene expression patterns. Homeotic genes expressed in the visceral muscle were shown to cause subdivision of the midgut into the four chambers, but the mechanisms that generate the various subdomains are still unknown (Okumura, 2005).

The sequential activation of srp and GATAe during endodermal development in Drosophila is analogous to the gene regulatory cascade that occurs during endodermal development in C. elegans. The earliest endodermal GATA factor expressed in C. elegans is end-1, which is expressed in the endoderm. end-1 then activates the subordinate GATA factor gene, elt-2, which activates late endodermal genes (Fukushige, 1998). These results suggest that the genetic mechanism underlying endodermal development is at least partially conserved between Drosophila and C. elegans. However, the molecular phylogenetic relationship of these endoderm-specific GATA factor genes has not yet been established (Okumura, 2005).

GATA factors are also essential for endodermal development in vertebrates. In the Xenopus embryo, GATA-4 and GATA-5 are expressed in the prospective endoderm, and both genes can induce formation of the endoderm. GATA-5 also plays an important role in endodermal development in zebrafish, where it functions as an upstream regulator of Sox17a, which is essential for endodermal specification. In Drosophila and C. elegans, the GATA factor genes GATAe and elt-2, respectively, continue to be expressed in the differentiated gut, as well as in the early stages of endodermal specification. In mammals, the GATA-4 and -6 proteins are expressed in the differentiated stomach and intestine, and bind to the gastric H+/K+-ATPase gene. Taken together, these findings suggest that vertebrate GATA factor genes also function in endodermal tissues after terminal differentiation (Okumura, 2005).

The Drosophila endoderm arises at the anterior and posterior terminal regions of the early blastoderm. Previous studies have revealed the gene regulatory pathway leading to endodermal development in some detail. A Torso-like protein secreted by the follicle cells covering both the anterior and posterior terminal regions of the egg triggers activation of the receptor tyrosine kinase, Torso, resulting in a graded Ras signal that peaks at both terminals. A zygotic gap gene, huckebein (hkb) is activated by high levels of the Ras signal, and hkb, in turn, activates srp. The present study revealed that srp activates another GATA factor gene, GATAe, and the latter leads to terminal differentiation of the midgut. GATAe may be necessary to maintain gene expression in the terminally differentiated midgut, since GATAe expression persists in the midgut throughout life. In addition to playing essential roles in the development of the endoderm, srp and GATAe also act to restrict the area of the adjacent hindgut. The ectodermal hindgut is specified by a Brachyury ortholog, brachyenteron (byn). The gene regulatory pathway leading to the activation of byn is closely linked with that of srp. Ras signaling in the posterior terminal region of the fertilized egg activates another gap gene, tailless (tll). tll is required for the activation of byn. The byn-positive domain in the early cellular blastoderm stages includes the prospective endoderm domain, but the expression in the prospective endoderm soon disappears in response to the repressive activity of srp. byn expression in the hindgut persists throughout life, as does GATAe expression in the midgut. These rather simple regulatory pathways lead to the activation of GATAe and byn, and consequently, to the terminal differentiation of the midgut and hindgut. 
It should be noted that these pathways not only delineate the process of endodermal development in Drosophila, but also highlight conserved genetic components underlying the endodermal development of multicellular animals (Okumura, 2005).
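The regulatory relationships summarized in this overview can be written down as a small signed graph, which makes it easy to ask what lies downstream of a given gene. This is only an illustrative sketch: the edge list is a reading of the paragraph above (activation as +1, repression as -1), the node labels and function names are hypothetical, and the graph encodes none of the timing, dosage or spatial information that the text describes.

```python
# Signed edges as read from the overview: +1 = activates, -1 = represses.
EDGES = [
    ("Torso-like", "Torso", +1),
    ("Torso", "Ras", +1),
    ("Ras", "hkb", +1),
    ("hkb", "srp", +1),
    ("srp", "GATAe", +1),
    ("GATAe", "midgut differentiation", +1),
    ("Ras", "tll", +1),
    ("tll", "byn", +1),
    ("srp", "byn", -1),
    ("byn", "hindgut differentiation", +1),
]


def activation_targets(source, edges=EDGES):
    """Return every node reachable from source through activating edges only."""
    graph = {}
    for upstream, downstream, sign in edges:
        if sign > 0:
            graph.setdefault(upstream, []).append(downstream)
    seen, stack = set(), [source]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def repressors_of(target, edges=EDGES):
    """Return the sorted list of direct repressors of target."""
    return sorted(up for up, down, sign in edges if down == target and sign < 0)


if __name__ == "__main__":
    print(sorted(activation_targets("hkb")))
    print(repressors_of("byn"))
```

In this representation, byn is both an indirect activation target of Ras (through tll) and a direct repression target of srp, which mirrors the description of byn expression being cleared from the prospective endoderm.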

A genetic framework controlling the differentiation of intestinal stem cells during regeneration in Drosophila

The speed of stem cell differentiation has to be properly coupled with self-renewal, both under basal conditions for tissue maintenance and during regeneration for tissue repair. Using the Drosophila midgut model, this study analyzed at the cellular and molecular levels the differentiation program required for robust regeneration. The intestinal stem cell (ISC) and its differentiating daughter, the enteroblast (EB), were observed to form extended cell-cell contacts in regenerating intestines. The contact between progenitors is stabilized by cell adhesion molecules, and can be dynamically remodeled to elicit optimal juxtacrine Notch signaling to determine the speed of progenitor differentiation. Notably, increasing the adhesion property of progenitors by expressing Connectin is sufficient to induce rapid progenitor differentiation. It was further demonstrated that JAK/STAT signaling, Sox21a and GATAe form a functional relay to orchestrate EB differentiation. Thus, this study provides new insights into the complex and sequential events that are required for rapid differentiation following stem cell division during tissue replenishment (Zhai, 2017).

Key questions in stem cell biology are how the pool of stem cells can be robustly expanded yet also contracted in a timely manner through differentiation to generate mature cells according to the needs of a tissue, and what mechanisms couple stem cell proliferation and differentiation. In recent years, the mechanisms underlying intestinal stem cell activation have been extensively studied in both flies and mammals, while the genetic control of progenitor differentiation, especially during regeneration, has only recently begun to be understood (Zhai, 2017).

The transcription factor Sox21a has recently been the focus of studies in fly intestines. Using a Sox21a-sGFP transgene, this study uncovered its dynamic expression pattern in intestinal progenitors. Higher levels of Sox21a were found in ISC during homeostatic conditions but in EB during regeneration, supporting the roles of Sox21a in both ISC maintenance and EB differentiation under different conditions. The highly dynamic expression pattern of Sox21a revealed by this sGFP-tagged transgene per se argues against accumulation and perdurance of GFP fusion protein. Indeed, immunostaining using an antibody against Sox21a also indicated stronger Sox21a expression in ISC under homeostatic conditions and global activation of Sox21a in progenitors under DSS-induced regeneration. However, Chen (2016) suggested that Sox21a levels are always higher in EB than in ISC by applying another antibody against Sox21a. The inconsistency between these studies may have arisen from the differences in EB stages examined or the sensitivity of the respective detection approaches (Zhai, 2017).

This study has analyzed the cellular processes required for efficient progenitor differentiation during regeneration. Three main findings are reported revealing: i) the importance of extended contact between a stem cell and its differentiating daughter, ii) the existence of specific mechanisms allowing fast differentiation during regeneration, and iii) the characterization of a genetic program instructing the transition from EB to EC. These results together led to a proposal of a molecular framework underlying intestinal regeneration that is discussed below step by step (Zhai, 2017).

By studying the mechanisms of Sox21a-induced differentiation, this study found that ISC establishes extended contact with its differentiating daughter within a progenitor pair. Increased interface contact was not only observed upon Sox21a expression but also during regeneration after bacterial infection and DSS-feeding. Since the presence of extended contact is rare in intestinal progenitors under homeostatic conditions, it is hypothesized that extended contact between progenitors is related to increased epithelial renewal as a mechanism to elicit optimal juxtacrine Notch signaling to accelerate the speed of progenitor differentiation. The observations that down-regulation of the cell adhesion molecules E-Cadherin or Connectin suppresses rapid progenitor differentiation upon regeneration, and that overexpression of Connectin is sufficient to promote differentiation, underline the importance of increased cell-cell contact in rapid differentiation. This study shows that one early role of Sox21a is to promote the formation of this contact zone, possibly through transcriptional regulation of Connectin. Further studies should identify the signals and pathways leading to the change of contact between progenitors to adjust the rate of differentiation (Zhai, 2017).

Intestinal progenitors with extended contact in non-homeostatic midguts have been observed in some studies, but their role and significance have not been analyzed. Previous studies have also shown that progenitor nests are outlined by E-Cadherin/β-Catenin complexes, yet it was not known whether different degrees of progenitor contact are associated with their ISC versus EB fate. Consistent with these results, recent modeling analyses suggested a positive correlation between the contact area of progenitor pairs and the activation of Notch signaling. Thus, it seems that an increase in the contact area between intestinal progenitors is a hallmark of progenitors that are undergoing accelerated differentiation towards ECs. Another study has suggested an inhibitory role of prolonged ISC-EB contact to restrict ISC proliferation. Collectively, these studies and the current findings suggest that the strong contact between ISC and EB promotes on one hand the efficient differentiation of EBs into mature intestinal cells while on the other hand preventing stem cells from over-dividing. Thus, it is hypothesized that alteration in the contact zone provides a mechanism for ensuring both the appropriate speed of differentiation and the timely resolution of stem cell proliferative capacity (Zhai, 2017).

A second finding of this study consists in revealing the existence of specific mechanisms accelerating differentiation for tissue replenishment. In addition to the extended contact discussed above, a difference was observed in the pattern of ISC division between homeostatic and highly regenerative intestines. The modes of ISC division in Drosophila have been the topic of intense discussion, and the general consensus is that it is associated with an asymmetric cell fate outcome, in which one cell remains an ISC and the other engages in differentiation. In line with these previous studies, the results support the notion that asymmetric cell division is the most prevalent mode of ISC division under homeostatic conditions, where the rate of epithelial renewal is low. However, use of ISC- and EB-specific markers shows that upon rapid regeneration an ISC divides into two cells both expressing the ISC marker Dl-GFP but with one cell showing weak Notch activity. Similarly to other Notch-mediated cell-fate decision systems, this study suggests that the two resulting Dl-GFP+ cells from a symmetric division stay in close contact and compete for the stem cell fate. While this study is not the first to postulate the existence of symmetric ISC division, the use of reliable ISC- and EB-specific markers allows better visualization of this process. Applying a dual-color lineage tracing system to unravel the final fate of respective cells in a Dl+-Dl+ pair could reinforce the existence of symmetric stem cell division. This is nevertheless technically challenging to apply here since all the current available lineage-tracing settings require a heat shock to initiate the labeling, which affects intestinal homeostasis (Zhai, 2017).

Importantly, this study shows that the genetic program required for fast intestinal regeneration differs from the one involved in basal intestinal maintenance. This study indicates that GATAe, Dpp signaling, and the cell adhesion molecules E-cadherin and Connectin are not critical for progenitor differentiation when the rate of epithelial renewal is low, whereas their roles become crucial upon active regeneration. It is speculated that many discrepancies in the literature can be reconciled by taking into consideration that some factors are required only for rapid differentiation but not in basal conditions. For instance, the implication of Dpp signaling in differentiation has been disputed, since one study focused on bacterial infection-induced regeneration while two other studies dealt with basal conditions. The current study points to a clear role of Dpp signaling in the differentiation process upon regeneration. Therefore, better defining the genetic program that allows adjusting the speed of differentiation would be of great interest (Zhai, 2017).

Cell fate determination and differentiation involve extensive changes in gene expression and possibly also gradual change of cell morphology. The EB to EC differentiation in the adult Drosophila intestine provides a model of choice to study this process. This transition includes changes in cell shape, an increase in cell size, DNA endoreplication leading to polyploidy and the activation of the set of genes required for EC function. This study has integrated a number of pathways (Notch, JAK/STAT and Dpp/BMP) and transcription factors (Sox21a and GATAe) into a sequential framework. It was further shown that Sox21a contributes to the EB-EC transition downstream of JAK/STAT but upstream of Dpp signaling and GATAe. The recurrent use of several factors, namely JAK/STAT, Sox21a and GATAe at different processes including ISC self-renewal and EB-EC differentiation is likely to be a general feature during cell fate determination, and somehow also complicates the study of differentiation. Future work should analyze how each of the factors interacts with the other in a direct or indirect manner. It would be interesting as well to further study how these factors shape intestinal regionalization as the gut exhibits conspicuous morphological changes along the length of the digestive tract (Zhai, 2017).

Several of the findings described in this study are likely to apply to the differentiation program that takes place in mammals. Since Notch signaling plays major roles in stem cell proliferation and cell fate specification from flies to mammals, it would be interesting to decipher whether in mammals changes in progenitor contact also impact differentiation speed and whether a specific machinery can accelerate progenitor differentiation when tissue replenishment is required (Zhai, 2017).

Transcription factor binding affinities and DNA shape readout

An essential event in gene regulation is the binding of a transcription factor (TF) to its target DNA. Models considering the interactions between the TF and the DNA geometry proved to be successful approaches to describe this binding event, while conserving data interpretability. However, a direct characterization of the DNA shape contribution to binding is still missing due to the lack of accurate and large-scale binding affinity data. This study uses a recently established binding assay to measure with high sensitivity the binding specificities of 13 Drosophila TFs, including dinucleotide dependencies to capture non-independent amino acid-base interactions. Correlating the binding affinities with all DNA shape features, this study found that shape readout is widely used by these factors. A shape readout/TF-DNA complex structure analysis validates this approach while providing biological insights, such as the observation that positively charged or highly polar amino acids often contact nucleotides that exhibit strong shape readout (Schnepf, 2020).

The binding of transcription factors (TFs) to specific DNA sequences is a key event for the regulation of gene expression. The features defining a binding site have been the focus of several decades of research starting from simple consensus motif binding sites, later replaced by probabilistic models of TF binding assuming that each base contributes independently to the overall affinity, the so-called position-specific weight matrices (PWMs). With the advent of high-throughput methods, binding specificities became available for thousands of TFs and it has become clear that more complex models for binding sites using non-independent nucleotide interactions lead to more accurate predictions than PWMs. Nucleotide correlations can originate from amino acids that contact multiple bases simultaneously or from stacking interactions that determine binding through DNA shape readout. Hence, although determining binding specificities is crucial to predict binding sites in the genome, such data alone are not sufficient to fully describe TF-DNA binding interactions as they do not provide insights about the mechanism the TF employs to bind to different DNA sequences. To elucidate how the TF 'reads' the DNA is of paramount importance not only to improve algorithms predicting binding sites but also to refine fundamental understanding of how TFs are recruited to specific DNA regulatory sequences. To date, two distinct modes of protein-DNA recognition are known: base readout, which reflects the interplay at nucleobase-amino acid contacts mainly driven by the formation of hydrogen bonds, and shape readout, dominated by van der Waals interactions and electrostatic potentials (EPs), that recognizes the 3D structure of the DNA double helix. As a consequence, one can assume that, if the TF uses the shape readout, models incorporating DNA structural information should improve prediction of TF-DNA binding specificities. 
To test this hypothesis and thereby help model development, it would thus be highly desirable to (1) determine accurately TF-DNA binding specificities, including non-independent nucleotide interactions since deviations from linear binding can carry information about the influence of DNA shape, and (2) use these data to assess the contribution of DNA shape readout to the binding interaction. Despite the availability of techniques able to measure protein-DNA interactions at high throughput such as protein binding microarray (PBM), SELEX-seq, and SMiLE-seq, the accurate measurement of binding affinities remains problematic. Moreover, these methods require a resin- or filter-based selection step that introduces bias and/or use stringent washing protocols resulting in the loss of weak binders, which can lead to erroneously over-specific binding specificities. These limitations are critical, especially to determine higher-order binding interactions, which are intrinsically weak (Schnepf, 2020).
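The independence assumption behind a PWM, as described above, amounts to summing one contribution per position. The sketch below illustrates that additive model only; the matrix values are hypothetical numbers in arbitrary energy units (lower meaning tighter binding) chosen for illustration, not measurements from the cited study, and the function names `pwm_energy` and `best_window` are likewise assumptions.

```python
# Hypothetical per-position energy contributions for a 4-bp site
# (arbitrary units, lower = tighter binding). Illustrative values only.
PWM = [
    {"A": 1.5, "C": 2.0, "G": 0.0, "T": 1.0},
    {"A": 0.0, "C": 2.0, "G": 1.5, "T": 1.0},
    {"A": 1.0, "C": 1.5, "G": 2.0, "T": 0.0},
    {"A": 0.0, "C": 1.5, "G": 1.0, "T": 0.5},
]


def pwm_energy(site, pwm=PWM):
    """Sum independent per-base contributions: the PWM (0th-order) model."""
    if len(site) != len(pwm):
        raise ValueError("site length must match the number of PWM columns")
    return sum(column[base] for column, base in zip(pwm, site.upper()))


def best_window(seq, pwm=PWM):
    """Return (energy, start) of the lowest-energy window in seq."""
    k = len(pwm)
    if len(seq) < k:
        raise ValueError("sequence is shorter than the PWM")
    return min((pwm_energy(seq[i:i + k], pwm), i) for i in range(len(seq) - k + 1))


if __name__ == "__main__":
    print(pwm_energy("GATA"))
    print(best_window("CCGATAGG"))
```

Because every position is scored separately, this model cannot represent a case where a base at one position is favorable only in combination with a particular neighbor, which is the limitation that motivates models with non-independent nucleotide interactions.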

Evaluating the contribution to binding of DNA shape readout also poses challenges. First, although it had been known for a long time from crystal structures that TFs read out the DNA shape, it is still not possible to determine experimentally the DNA shape features at a large scale for any given DNA sequence. However, this would be necessary to quantitatively assess DNA shape influence on TF-DNA binding. This issue has been tackled by Zhou, who introduced 'DNAShape' (Zhou, 2013), an algorithm that predicts structural DNA features from nucleotide sequences, considering at each DNA position a local 5-mer nucleotide environment. The original set of four geometric shape features was later completed by Li (2017), who made tables available to calculate an expanded repertoire of 13 DNA shape features in total. Finally, Chiu (2017) added in a comparable fashion the EP, which approximates the minor-groove EPs. The EP reflects the mean charge density of the DNA backbone sensed by positively charged amino acid residues of the binding protein. Another difficulty in analyzing the influence of DNA shape on binding is that, in spite of all the advances made possible by 'DNAShape' and the succeeding studies, it is still not clear to what degree shape readout can be described as a function of the underlying DNA sequence. It is indeed very difficult to tease apart whether a binding protein favors a given nucleotide sequence because it recognizes certain bases of this sequence or rather certain shape features of the DNA helix. An important step was made with homeodomain TFs by Abe (2015), who was able to specifically remove the ability of the binding proteins to read a certain structural feature of DNA and to switch between different modes of DNA shape readouts. Another approach computationally dissects TF binding specificity in terms of base and shape readout (Rube, 2018).
Remarkably, that study determined that 92-99% of the variance in the shape features can be explained with a model considering only dinucleotide dependencies. That study also found that interactions were much stronger between neighboring nucleotides than between non-adjacent positions, indicating that these dinucleotide features are the most important for binding. Hence, determining neighboring dinucleotide dependencies should be enough to capture most of the higher-order binding interactions. Unfortunately, although these studies shed new light on the role of DNA shape in TF-DNA recognition, they were limited to the analysis of only a few factors and used only four different shape features. This was due to the lack of quantitative data on higher-order binding specificities and to the lack of tables to calculate other shape features. Thus, a more comprehensive analysis of TF-DNA binding, especially one including higher-order dependencies, is urgently needed to better understand TF-DNA binding in general and to what extent DNA shape features are recognized by TFs in particular. Recently, high-performance fluorescence anisotropy (HiP-FA) (Jung, 2018; Jung, 2019) was presented as a method that determines TF-DNA binding energies directly in solution with high sensitivity and at a large scale, and that allows the affinity of a TF for any given DNA sequence to be measured. These features make HiP-FA an ideal tool to measure TF-DNA binding specificities, in particular the higher-order dependencies, since these interactions are generally weak and their accurate measurement is both difficult and indispensable. This study used HiP-FA to measure binding energies for 13 TFs of the Drosophila segmentation gene network belonging to 8 different binding domain families.
Their zeroth-order binding specificities were determined by taking into account only independent base contributions (PWM), and their first-order binding specificities by accounting for dinucleotide dependencies, represented by dinucleotide position weight matrices (DPWMs). This work defines DPWMs as the scoring matrices that characterize the deviations in the dinucleotide binding energies compared to pure PWMs (Schnepf, 2020).
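
The relationship between the two orders of specificity can be illustrated with a minimal Python sketch. It assumes, for illustration only, that a PWM stores a per-position energy contribution for each base and that a DPWM stores a correction for each adjacent dinucleotide, so that the score of a sequence is the additive PWM term plus the dinucleotide deviations. The function name, the data layout, and all numerical values are hypothetical and are not taken from Schnepf (2020).

```python
def binding_energy(seq, pwm, dpwm):
    """Score seq as the additive PWM term plus adjacent-dinucleotide corrections."""
    # Zeroth order: independent contribution of each base at each position.
    energy = sum(pwm[i][base] for i, base in enumerate(seq))
    # First order: deviation from additivity for each pair of neighboring bases.
    for i in range(len(seq) - 1):
        energy += dpwm[i].get(seq[i:i + 2], 0.0)
    return energy


# Toy 3-bp motif whose consensus is "GAT"; values are arbitrary illustration units.
pwm = [
    {"A": 1.0, "C": 1.5, "G": 0.0, "T": 2.0},
    {"A": 0.0, "C": 2.0, "G": 1.0, "T": 1.5},
    {"A": 1.0, "C": 0.5, "G": 2.0, "T": 0.0},
]
# One correction table per pair of neighboring positions (hypothetical values).
dpwm = [
    {"CA": -0.25},
    {"AC": 0.5},
]

additive = binding_energy("CAC", pwm, [{}, {}])  # PWM only
corrected = binding_energy("CAC", pwm, dpwm)     # PWM plus DPWM deviations
```

In this toy layout, a DPWM entry of zero means that the dinucleotide behaves exactly as the independent PWM predicts, which is why missing entries default to 0.0.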

Correlating the affinity data with the 13 known DNA shape features and the EP, it was found that nearly all the factors extensively use shape readout for DNA recognition, independently of the binding domain family. For 11 TFs for which structural information is available, the correlations were examined between their nuclear magnetic resonance (NMR)/co-crystal structures or structures of analog proteins obtained by homology-based modeling and the shape attributes obtained from this analysis. Finally, a cluster analysis was run to test if certain shape features tend to co-occur in the DNA shape readout used by these TFs (Schnepf, 2020).
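
As a rough illustration of what correlating affinity data with a shape feature can mean in practice, the following Python sketch computes a Pearson correlation coefficient between a set of binding energies and the predicted value of one shape feature for the same sequences. This is only one simple way to quantify such a relationship and is not claimed to be the procedure used by Schnepf (2020); the function name and all numbers are made up for the example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical data: one entry per mutated sequence.
energies = [0.0, 0.4, 0.9, 1.3, 2.1]      # binding energy changes, arbitrary units
minor_groove = [5.8, 5.4, 5.0, 4.6, 3.9]  # made-up predicted shape feature values

r = pearson(energies, minor_groove)
```

A coefficient close to -1 or +1 would indicate that the feature varies together with the binding energy across the tested sequences; on its own it does not separate shape readout from base readout.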

Correlation between DNA shape readout and structural information is presented for homeodomain proteins Bicoid, Goosecoid and Ocelliless, for the bZip transcription factor Giant, and for the zinc finger transcription factor GATAe (see Correlation between DNA shape readout and structural information) (Schnepf, 2020).

HiP-FA constitutes a powerful tool to quantify TF-DNA binding specificity, especially the non-independent interactions that need to be determined with high accuracy. The throughput of the method is not sufficient to discover de novo shape motifs or to explore the large sequence space accessible to sequencing-based methods like HT-SELEX or SMiLE-seq. However, this is not a major limitation, since the prior knowledge that HiP-FA requires (some information about the TF's binding preferences) is available for many TFs, and dinucleotide mutations are sufficient to cover most of the non-independent amino acid-nucleotide interactions. It would also be straightforward to extend the measurements to the flanking regions of the core binding motif (Schnepf, 2020).

By directly combining TF-DNA binding affinities, DNA shape features, and structural information, this study gained insights into their correlation, a debated topic due to their intrinsic covariation. Importantly, the results suggest that DNA shape readout is widespread among the TFs. The extended use of DNA shape readout by TFs has become increasingly apparent over the past years, which comes as no surprise considering that the van der Waals interactions enabling shape readout account for two-thirds of the protein-DNA interactions (Rube, 2018). The correlation analysis of the shape readout values with protein-DNA complex structures leads to a generalization of the influence of the charged amino acids on the shape readout that has been described so far only for homeodomains in the minor groove region of the DNA. This effect is attributed to other protein secondary structures (such as α-helices) and to other binding domains. In addition, for the POU domain Nub, non-charged but polar residues are described that can also lead to a strong DNA shape readout. These effects on DNA shape readout have not been reported previously. The difficulty in detecting the effects of charged and non-charged residues, especially in the major groove, is that they are obscured by the interactions involved in the base readout. This analysis was able to resolve even subtle effects due to the high sensitivity of the binding affinity measurements, and the shape analysis was able to deconvolve, to some extent, shape from base readout. In summary, this study determined the binding specificities of 13 Drosophila TFs including first-order dependencies, provided insights into the correlation between their binding affinities to DNA and the shape features of the DNA helix, and gave structural insights into the shape readout. This method could easily be extended to more factors and to different organisms to provide a refined catalog of TF-DNA shape readout landscapes (Schnepf, 2020).

Although the HiP-FA assay allows determination of accurate binding affinities at a relatively large scale, the whole sequence space cannot be covered as it is by high-throughput methods. To restrict the number of measurements, this study thus focused on the core binding motif of the TFs and on all mononucleotide and dinucleotide mutations of the consensus sequence rather than all possible mutations. This should, however, cover most of the TF-DNA interactions, since it has been shown that dinucleotide models explain >92% of the variance for the MGW, ProT, Roll, and HelT shape features (Rube, 2018). In addition, this analysis, based on the direct correlation between binding affinities and shape features, can only indirectly and partially tease apart the respective contributions of base and DNA shape readouts. Note that how to achieve the deconvolution between base and shape readouts is a longstanding issue in the field (Schnepf, 2020).
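
The reduction in the number of measurements can be made concrete with a short Python sketch that enumerates all single-base mutants and all double mutants at neighboring positions of a consensus sequence. It assumes that "dinucleotide mutations" refers to adjacent positions, which the text above suggests but does not state explicitly. The 6-bp consensus used here is a placeholder and not a motif reported in the study, and the function names are hypothetical.

```python
from itertools import product

BASES = "ACGT"


def single_mutants(consensus):
    """All sequences that differ from the consensus at exactly one position."""
    mutants = []
    for i, ref in enumerate(consensus):
        for base in BASES:
            if base != ref:
                mutants.append(consensus[:i] + base + consensus[i + 1:])
    return mutants


def adjacent_double_mutants(consensus):
    """All sequences in which two neighboring positions both differ from the consensus."""
    mutants = []
    for i in range(len(consensus) - 1):
        for b1, b2 in product(BASES, repeat=2):
            if b1 != consensus[i] and b2 != consensus[i + 1]:
                mutants.append(consensus[:i] + b1 + b2 + consensus[i + 2:])
    return mutants


consensus = "AGATAA"  # placeholder 6-bp core motif
singles = single_mutants(consensus)
doubles = adjacent_double_mutants(consensus)
total_sequences = 1 + len(singles) + len(doubles)  # consensus plus its mutants
```

For a motif of length n this gives 3n single mutants and 9(n - 1) adjacent double mutants, so the number of sequences grows linearly with n, whereas the full sequence space grows as 4^n.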


serpent is required for endodermal development in Drosophila. srp is first expressed at the cellular blastoderm stage in yolk cells, and in prospective regions of the endoderm, amnioserosa, and hemocyte primordium. Endodermal expression of srp disappears by stages 10-11. Embryos lacking srp activity fail to develop endoderm; instead, the prospective endodermal region develops into the ectodermal hindgut. Since srp is expressed in the endoderm earlier than GATAe is expressed, activation of GATAe expression by srp was examined. Expression of GATAe in the prospective endoderm region is abolished in the srp mutant (srp2/srp2), whereas GATAe expression in the Malpighian tubules is not affected. Conversely, ubiquitously misexpressed srp causes strong ectopic expression of GATAe in the foregut, and in the hindgut. Note that the foregut and hindgut arise immediately anterior and posterior to the endoderm, respectively. This ectopic expression pattern is transient. During embryogenesis, GATAe is also induced ectopically in the salivary gland and segmentally in the ventral nerve cord. These results strongly suggest that srp activates GATAe in the endoderm. fork head (fkh) is expressed throughout the prospective gut, and the gut primordia degenerate during germband retraction in fkh mutants. However, GATAe expression is not affected in the fkh mutant (fkhXT6/fkhXT6) (Okumura, 2005).

GATAe expression in the Malpighian tubules is not affected in the srp mutant, indicating that this expression depends on some other gene. Krüppel (Kr) is known to be required for the development of Malpighian tubules, so whether the GATAe expression in this organ depends on Kr was investigated. GATAe expression is completely abolished in the prospective region of the Malpighian tubules of Kr mutant embryos, indicating that Kr is required for GATAe expression in the Malpighian tubules (Okumura, 2005).

Targets of Activity

Whether GATAe plays a role in the expression of an early marker gene in the endoderm was examined. The Race gene is an early marker for endoderm and amnioserosa. Race expression begins in the invaginating endoderm slightly before GATAe expression begins, and Race expression persists throughout embryogenesis. Race expression in the endoderm is not affected in either Df(3R)sbd45 embryos or in embryos treated with GATAe dsRNA. These results show that GATAe is not essential for the expression of Race during normal development. Nevertheless, misexpression of GATAe in the present study induces ectopic expression of Race in a portion of the hindgut. Since Race is a target of srp, it is likely that misexpressed GATAe activates the srp target gene because of the possible structural similarities between the protein products of GATAe and srp (Okumura, 2005).

In the posterior terminal region of the blastoderm, prospective regions of the posterior endoderm and hindgut abut each other. byn is responsible for determination of the prospective hindgut. byn is expressed throughout the prospective posterior endoderm during early stage 5, but soon disappears in this region. It is the srp gene that represses byn in the prospective endoderm. Thus, the boundary between the endoderm and hindgut is established by the repressive activity of srp on byn. Whether GATAe also represses byn was examined, since GATAe continues to be expressed after endodermal srp expression ceases. Ectopic expression of byn in the prospective endodermal domain is observed in embryos that lack GATAe, although the area of ectopic expression is much smaller than that observed in the srp mutant, in which ectopic expression of byn is observed throughout the entire prospective endoderm. Moreover, when misexpressed in the prospective hindgut domain, both GATAe and srp strongly repress byn expression. Thus, GATAe is required to maintain the endodermal identity that is initially established by srp (Okumura, 2005).



In situ hybridization on whole-mount embryos was used to determine the expression pattern of GATAe during embryogenesis. GATAe mRNA is first detected in the posterior endoderm at stages 7-8, and in the anterior endoderm at stage 8. Malpighian tubule primordia also express GATAe from stage 10 onwards. Expression of GATAe in the endoderm, in the midgut (which is exclusively derived from the endoderm) and within Malpighian tubules continues throughout embryonic development. GATAe expression is not detected in any embryonic tissues other than endoderm and Malpighian tubules (Okumura, 2005).

RT-PCR was used to examine whether GATAe expression in the endodermal midgut continues in post-embryonic stages. The midguts of third instar larvae were dissected into anterior, middle, and posterior segments, whereas adult fly midguts were dissected into anterior and posterior segments. Each of these midgut segments exhibits GATAe mRNA expression, indicating that GATAe is expressed in the midgut throughout life (Okumura, 2005).


To test whether GATAe is required for the later stages of endodermal development, leading to terminal differentiation of the midgut, the role of GATAe in the expression of midgut-specific integrin βν, and the midgut-specific gap junction gene, inx7, was examined. integrin βν and inx7 are both expressed throughout the endoderm from embryonic stage 11 onwards. RT-PCR analyses indicate that integrin βν and inx7 are expressed in the midgut of third instar larvae and adult flies. Thus, these genes can serve as markers of the differentiated midgut (Okumura, 2005).

The Df(3R)sbd45 embryo lacks the GATAe and pnr loci but retains the srp locus. In contrast to the srp mutant, in which the prospective endodermal region develops into ectodermal tissues, including a portion of the hindgut, the Df(3R)sbd45 embryo forms an apparently normal midgut primordium surrounding the yolk, with constrictions that are characteristic of the late stages of normal midgut. However, the midgut primordium fails to express integrin βν and inx7. Another midgut-specific gene, midgut expression 1 (mex1), is also not expressed in this embryo. Since the Df(3R)sbd45 strain also lacks the pnr locus, the hypothesis was tested that this phenotype is caused by the lack of GATAe, but not by the lack of pnr. Both integrin βν and inx7 were normally expressed in both pnrVX6/Df(3R)sbd45 heterozygotes and pnrVX6 (null) homozygotes. Unlike the misexpression of GATAe, misexpressed pnr does not induce integrin βν or inx7 expression in the hindgut. When GATAe is ubiquitously misexpressed in the Df(3R)sbd45 homozygotes, expression of integrin βν is restored, although inx7 expression is not restored under these conditions. To further confirm that the loss of integrin βν and inx7 expression is due to the lack of GATAe activity, GATAe transcripts were inactivated with RNA interference (RNAi) using dsRNA. The gross morphology of the midgut primordium of dsRNA-injected embryos appears to be normal, forming normal constrictions in late stages. However, embryos injected with dsRNA corresponding to GATAe fail to express either integrin βν or inx7. The loss of midgut markers in response to dsRNA treatment is not a secondary effect of the loss of srp activity, since Srp protein is detected at normal levels in embryos injected with dsRNA. When GATAe is ubiquitously misexpressed in the srp mutant embryos, expression is restored. In addition, misexpression of GATAe in the hindgut of wild-type embryos strongly induces ectopic expression of integrin βν and inx7.
Misexpressed pnr does not induce these marker genes. Furthermore, both integrin βν and inx7 are also induced in S2 cells transfected with GATAe cDNA. These results clearly demonstrate that GATAe induces gene expression in the differentiated midgut (Okumura, 2005).

GATA factors participate in tissue-specific immune responses in Drosophila larvae

Drosophila responds to infection by producing a broad range of antimicrobial agents in the fat body and more restricted responses in tissues such as the gut, trachea, and Malpighian tubules. The regulation of antimicrobial genes in larval fat depends on linked Rel/NF-kappaB and GATA binding sites. Serpent functions as the major GATA transcription factor in the larval fat body. However, the transcriptional regulation of other tissue-specific responses is less well understood. This study presents evidence that dGATAe regulates antimicrobial gene expression in the midgut. Regulatory regions for antimicrobial genes Diptericin and Metchnikowin require GATA sites for activation in the midgut, where Grain (dGATAc), dGATAd, and dGATAe are expressed in overlapping domains. Ectopic expression of dGATAe in the larval fat body, where it is normally absent, causes dramatic up-regulation of numerous innate immunity and gut genes, as judged by microarray analysis and in situ hybridization. Ectopic dGATAe also causes a host of symptoms reminiscent of hyperactive Toll (Toll10b) mutants, but without apparent activation of Toll signaling. Based on this evidence it is proposed that dGATAe mediates a Toll-independent immune response in the midgut, providing a window into the first and perhaps most ancient line of animal defense (Senger, 2006).

Previous studies have established the importance of Serpent in mediating the systemic immune response in the larval fat body. Serpent is also essential for the differentiation of the fat body during embryogenesis; srp mutants are lethal and lack fat cell differentiation. Serpent is not merely a transient determinant of fat body development. It is proposed that dGATAe plays an analogous role in the anterior midgut: it functions as a tissue determinant in the embryo but mediates immunity in larvae (Senger, 2006).

dGATAe is expressed throughout the developing midgut during embryogenesis. As seen for srp in the fat body, dGATAe expression persists in the definitive midgut of feeding larvae. Thus, both srp and dGATAe might have dual roles in development and physiology. Early expression is required for tissue differentiation, and late expression is required for the immune response. srp mediates immunity in the fat body, whereas dGATAe mediates expression of specific immunity genes in the midgut (Senger, 2006).

Evidence that dGATAe functions in the early development of the midgut stems from microarray assays. Misexpression of dGATAe in the fat body leads to ectopic induction of a number of genes required for digestion, including trypsin-like serine proteases, a sugar transporter, and genes involved in lipid metabolism. All of these genes display restricted expression in the midgut of developing embryos, in regions where dGATAe is also expressed. It is proposed that at least some of these genes are immediate and direct targets of dGATAe in the developing gut, and consequently, they are efficiently activated by dGATAe in the fat body (Senger, 2006).

dGATAe might also activate genes required for immunity gene expression in the anterior midgut of feeding larvae. By analogy to Serpent, dGATAe might activate different components of signaling pathways required for immunity. When larvae ingest pathogenic bacteria such as E. carotovora, these pathways are induced to trigger expression of Drs, Dpt, Mtk, and other immunity genes in the anterior midgut. When misexpressed in the fat body, dGATAe only moderately affects the levels of Dpt or Mtk, although the regulatory regions of these genes showed GATA-dependent activity in the midgut. This low activity is attributed to the presence of repressors in the fat body that cannot be overcome by ectopic dGATAe (Senger, 2006).

It is possible that dGATAe-mediated immunity in the anterior midgut does not depend on the Toll signaling pathway, because none of the signaling components of this pathway are up-regulated in the fat body on misexpression of dGATAe. This misexpression is nonetheless sufficient to induce Drs and dro5 in the absence of infection or injury. It is proposed that dGATAe plays two roles in the differentiated midgut. One role is activating 'housekeeping' genes that are required for digestion. The second role of dGATAe in the midgut is triggering unknown signaling pathways that lead to the activation of immunity genes such as Drs, Dpt, and Mtk. Drs may be especially sensitive to dGATAe in the fat body because it is “poised” for induction (Senger, 2006).

In summary, it has been argued that dGATAe is critical for anterior midgut formation and function in a manner analogous to Serpent in the fat body. It is possible that the dGATAe immunity pathway is an evolutionarily ancient form of innate immunity. Under typical living conditions, the gut is the first line of defense, because ingestion is the most likely basis for contact with pathogens. The immunity signaling pathway(s) governing dGATAe activity is not yet known. However, ectopic expression of dGATAe in the fat body leads to the activation of a number of signaling components, including RhoL, Takl2, and Tetraspanins. The latter are integral membrane proteins that have been implicated in the immune responses of higher organisms, including antigen presentation. Future studies will assess the role, if any, of these genes in the gut-specific immune response (Senger, 2006).


Abe, N., Dror, I., Yang, L., Slattery, M., Zhou, T., Bussemaker, H. J., Rohs, R. and Mann, R. S. (2015). Deconvolving the recognition of DNA shape from sequence. Cell 161(2): 307-318. PubMed ID: 25843630

Fukushige, T., Hawkins, M. G. and McGhee, J. D. (1998). The GATA-factor elt-2 is essential for formation of the Caenorhabditis elegans intestine. Dev. Biol. 198(2): 286-302. PubMed ID: 9659934

Jung, C., Schnepf, M., Bandilla, P., Unnerstall, U. and Gaul, U. (2019). High sensitivity measurement of transcription factor-DNA binding affinities by competitive titration using fluorescence microscopy. J Vis Exp(144). PubMed ID: 30799844

Jung, C., Bandilla, P., von Reutern, M., Schnepf, M., Rieder, S., Unnerstall, U. and Gaul, U. (2018). True equilibrium measurement of transcription factor-DNA binding affinities using automated polarization microscopy. Nat Commun 9(1): 1605. PubMed ID: 29686282

Li, J., Sagendorf, J. M., Chiu, T. P., Pasi, M., Perez, A. and Rohs, R. (2017). Expanding the repertoire of DNA shape features for genome-scale studies of transcription factor binding. Nucleic Acids Res 45(22): 12877-12887. PubMed ID: 29165643

Okumura, T., Matsumoto, A., Tanimura, T. and Murakami, R. (2005). An endoderm-specific GATA factor gene, dGATAe, is required for the terminal differentiation of the Drosophila endoderm. Dev. Biol. 278(2): 576-86. PubMed ID: 15680371

Rube, H. T., Rastogi, C., Kribelbauer, J. F. and Bussemaker, H. J. (2018). A unified approach for quantifying and interpreting DNA shape readout by transcription factors. Mol Syst Biol 14(2): e7902. PubMed ID: 29472273


Senger, K., Harris, K. and Levine, M. (2006). GATA factors participate in tissue-specific immune responses in Drosophila larvae. Proc. Natl. Acad. Sci. 103(43): 15957-62. PubMed ID: 17032752

Zhai, Z., Boquete, J. P. and Lemaitre, B. (2017). A genetic framework controlling the differentiation of intestinal stem cells during regeneration in Drosophila. PLoS Genet 13(6): e1006854. PubMed ID: 28662029

Zhou, T., Yang, L., Lu, Y., Dror, I., Dantas Machado, A. C., Ghane, T., Di Felice, R. and Rohs, R. (2013). DNAshape: a method for the high-throughput prediction of DNA structural features on a genomic scale. Nucleic Acids Res 41(Web Server issue): W56-62. PubMed ID: 23703209

date revised: 20 August 2012

Home page: The Interactive Fly © 2017 Thomas Brody, Ph.D.