InteractiveFly: GeneBrief

zelda: Biological Overview | References

Gene name - zelda

Synonyms - CG12701, vielfaltig

Cytological map position - 18F-18F2

Function - transcription factor

Keywords - cellular blastoderm, maternal-zygotic transition, mitotic cell division, maternal

Symbol - zld

FlyBase ID: FBgn0259789

Genetic map position - X: 19,670,020..19,676,961 [-]

Classification - Zinc-finger protein

Cellular location - nuclear

NCBI links: EntrezGene

Zelda orthologs: Biolitmine

Recent literature
Pires, C. V., Freitas, F. C., Cristino, A. S., Dearden, P. K. and Simoes, Z. L. (2016). Transcriptome analysis of honeybee (Apis mellifera) haploid and diploid embryos reveals early zygotic transcription during cleavage. PLoS One 11: e0146447. PubMed ID: 26751956
In honeybees, the haplodiploid sex determination system promotes a unique embryogenesis process wherein females develop from fertilized eggs and males develop from unfertilized eggs. However, the developmental strategies of honeybees during early embryogenesis are virtually unknown. Similar to most animals, the honeybee oocytes are supplied with proteins and regulatory elements that support early embryogenesis. As the embryo develops, the zygotic genome is activated and zygotic products gradually replace the preloaded maternal material. The analysis of small RNA and mRNA libraries of mature oocytes and embryos originated from fertilized and unfertilized eggs has allowed exploration of the gene expression dynamics in the first steps of development and during the maternal-to-zygotic transition (MZT). A short sequence motif, the TAGteam motif, was identified and was hypothesized to play a similar role in honeybees as in fruit flies, which includes the timing of early zygotic expression (MZT), a function sustained by the presence of the zelda ortholog, which is the main regulator of genome activation. Predicted microRNA (miRNA)-target interactions indicated that there were specific regulators of haploid and diploid embryonic development and an overlap of maternal and zygotic gene expression during the early steps of embryogenesis. Although a number of functions are highly conserved during the early steps of honeybee embryogenesis, the results showed that zygotic genome activation occurs earlier in honeybees than in Drosophila based on the presence of three primary miRNAs (pri-miRNAs) (ame-mir-375, ame-mir-34 and ame-mir-263b) during the cleavage stage in haploid and diploid embryonic development.
Blythe, S. A. and Wieschaus, E. F. (2016). Establishment and maintenance of heritable chromatin structure during early Drosophila embryogenesis. Elife 5. PubMed ID: 27879204
During embryogenesis, the initial chromatin state is established during a period of rapid proliferative activity. This study measured with three-minute time resolution how heritable patterns of chromatin structure are initially established and maintained during the midblastula transition (MBT). Regions of accessibility are established sequentially, where enhancers are opened in advance of promoters and insulators. These open states are stably maintained in highly condensed mitotic chromatin to ensure faithful inheritance of prior accessibility status across cell divisions. The temporal progression of establishment is controlled by the biological timers that control the onset of the MBT. In general, acquisition of promoter accessibility is controlled by the biological timer that measures the nucleo-cytoplasmic (N:C) ratio whereas timing of enhancer accessibility is regulated independently of the N:C ratio. These different timing classes each associate with binding sites for two transcription factors, GAGA-factor and Zelda, previously implicated in controlling chromatin accessibility at ZGA.
Crocker, J., Tsai, A. and Stern, D. L. (2017). A fully synthetic transcriptional platform for a multicellular eukaryote. Cell Rep 18(1): 287-296. PubMed ID: 28052257
Regions of genomic DNA called enhancers encode binding sites for transcription factor proteins. Binding of activators and repressors increases and reduces transcription, respectively, but it is not understood how combinations of activators and repressors generate precise patterns of transcription during development. This problem was explored using a fully synthetic transcriptional platform in Drosophila consisting of engineered transcription factor gradients and artificial enhancers. First, a gradient of a transcription activator-like effector (TALE) fused to a VP16 activator (TALEA) was engineered. The gradient of TALEA protein was generated by driving TALEA expression with the hunchback promoter (hb-TALEA), resulting in a smooth anterior-to-posterior RNA gradient. The binding site for this TALEA, 5′-CCGGATGCTCCTCTT, is not present in the Drosophila genome and allowed construction of enhancers that would respond only to the TALEA. Binding sites for a transcription factor that makes DNA accessible, Zelda, were found to be required together with binding sites for transcriptional activators to produce a functional enhancer. Only in this context can changes in the number of activator binding sites mediate quantitative control of transcription. Using an engineered transcriptional repressor gradient, it was demonstrated that overlapping repressor and activator binding sites provide more robust repression and sharper expression boundaries than non-overlapping sites. This may explain why this common motif is observed in many developmental enhancers.
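The statement that the TALEA binding site is absent from the genome comes down to a search on both DNA strands. The following is a minimal Python sketch of such a check, assuming the sequence to be searched is available as a plain string of A/C/G/T. The function names `reverse_complement` and `find_site` are illustrative and are not taken from the study, and the toy sequence is invented for the example.

```python
# Binding site quoted in the abstract above, written 5' to 3'.
TALEA_SITE = "CCGGATGCTCCTCTT"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_site(sequence: str, site: str = TALEA_SITE) -> list[tuple[int, str]]:
    """Return (0-based start, strand) for every occurrence of `site`.

    The forward strand is searched for `site` itself and the reverse
    strand for its reverse complement; overlapping hits are included.
    A palindromic site would be reported once per strand.
    """
    sequence = sequence.upper()
    hits = []
    for strand, pattern in (("+", site.upper()), ("-", reverse_complement(site))):
        start = sequence.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = sequence.find(pattern, start + 1)
    return sorted(hits)


# Hypothetical toy sequence containing one site on each strand.
toy = "AAAA" + TALEA_SITE + "TTTT" + reverse_complement(TALEA_SITE) + "GG"
print(find_site(toy))       # [(4, '+'), (23, '-')]
print(find_site("ACGT" * 10))  # [] : site absent
```

An empty result for every chromosome would correspond to the "not present in the genome" condition; a real analysis would read the sequences from a genome file rather than a string literal.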
Ribeiro, L., Tobias-Santos, V., Santos, D., Antunes, F., Feltran, G., de Souza Menezes, J., Aravind, L., Venancio, T. M. and Nunes da Fonseca, R. (2017). Evolution and multiple roles of the Pancrustacea specific transcription factor Zelda in insects. PLoS Genet 13(7): e1006868. PubMed ID: 28671979
Gene regulatory networks (GRNs) evolve as a result of the coevolutionary processes acting on transcription factors (TFs) and the cis-regulatory modules they bind. The zinc-finger TF zelda (zld) is essential for the maternal-to-zygotic transition (MZT) in Drosophila melanogaster, where it directly binds over a thousand cis-regulatory modules to regulate chromatin accessibility. D. melanogaster displays a long germ type of embryonic development, where all segments are simultaneously generated along the whole egg. However, it remains unclear if zld is also involved in the MZT of short-germ insects (including those from basal lineages) or in other biological processes. This study shows that zld is an innovation of the Pancrustacea lineage, being absent in more distant arthropods (e.g. chelicerates) and other organisms. To better understand the zld ancestral function, its roles were examined in a short-germ beetle, Tribolium castaneum, using molecular biology and computational approaches. The results demonstrate roles for zld not only during the MZT, but also in posterior segmentation and patterning of imaginal disc derived structures. Further, it was also demonstrated that zld is critical for posterior segmentation in the hemipteran Rhodnius prolixus, indicating this function predates the origin of holometabolous insects and was subsequently lost in long-germ insects. These results unveil new roles of zld in different biological contexts and suggest that changes in expression of zld (and probably other major TFs) are critical in the evolution of insect GRNs (Ribeiro, 2017).
Moshe, A. and Kaplan, T. (2017). Genome-wide search for Zelda-like chromatin signatures identifies GAF as a pioneer factor in early fly development. Epigenetics Chromatin 10(1): 33. PubMed ID: 28676122
The protein Zelda was shown to play a key role in early Drosophila development, binding thousands of promoters and enhancers prior to the maternal-to-zygotic transition (MZT), and marking them for transcriptional activation. Recent studies have shown that Zelda acts through specific chromatin patterns of histone modifications to mark developmental enhancers and active promoters. Intriguingly, some Zelda sites still maintain these chromatin patterns in Drosophila embryos lacking maternal Zelda protein. A computational method was developed to analyze and refine the chromatin landscape surrounding early Zelda peaks, using a multichannel spectral clustering. This allowed characterization of their chromatin patterns through MZT (mitotic cycles 8-14). Specifically, this study focused on H3K4me1, H3K4me3, H3K18ac, H3K27ac, and H3K27me3 and identified three different classes of chromatin signatures, matching "promoters," "enhancers" and "transiently bound" Zelda peaks. The genome was then further scanned using these chromatin patterns, and additional loci - with no Zelda binding - were identified that show similar chromatin patterns, resulting in hundreds of Zelda-independent putative enhancers. These regions were found to be enriched with GAGA factor (GAF, Trl) and are typically located near early developmental zygotic genes. Overall this analysis suggests that GAF, together with Zelda, plays an important role in activating the zygotic genome. This computational approach offers an efficient algorithm for characterizing chromatin signatures around some loci of interest and allows a genome-wide identification of additional loci with similar chromatin patterns.
Hamm, D. C., Larson, E. D., Nevil, M., Marshall, K. E., Bondra, E. R. and Harrison, M. M. (2017). A conserved maternal-specific repressive domain in Zelda revealed by Cas9-mediated mutagenesis in Drosophila melanogaster. PLoS Genet 13(12): e1007120. PubMed ID: 29261646
In nearly all metazoans, the earliest stages of development are controlled by maternally deposited mRNAs and proteins. The zygotic genome becomes transcriptionally active hours after fertilization. Transcriptional activation during this maternal-to-zygotic transition (MZT) is tightly coordinated with the degradation of maternally provided mRNAs. In Drosophila melanogaster, the transcription factor Zelda plays an essential role in widespread activation of the zygotic genome. While Zelda expression is required both maternally and zygotically, the mechanisms by which it functions to remodel the embryonic genome and prepare the embryo for development remain unclear. Using Cas9-mediated genome editing to generate targeted mutations in the endogenous zelda locus, this study determined the functional relevance of protein domains conserved amongst Zelda orthologs. A highly conserved zinc-finger domain was identified that is essential for the maternal, but not zygotic functions of Zelda. Animals homozygous for mutations in this domain survived to adulthood, but embryos inheriting these loss-of-function alleles from their mothers died late in embryogenesis. These mutations did not interfere with the capacity of Zelda to activate transcription in cell culture. Unexpectedly, these mutations generated a hyperactive form of the protein and enhanced Zelda-dependent gene expression. These data have defined a protein domain critical for controlling Zelda activity during the MZT, but dispensable for its roles later in development, for the first time separating the maternal and zygotic requirements for Zelda. This demonstrates that highly regulated levels of Zelda activity are required for establishing the developmental program during the MZT. It is proposed that tightly regulated gene expression is essential to navigate the MZT and that failure to precisely execute this developmental program leads to embryonic lethality.
Dufourt, J., Trullo, A., Hunter, J., Fernandez, C., Lazaro, J., Dejean, M., Morales, L., Nait-Amer, S., Schulz, K. N., Harrison, M. M., Favard, C., Radulescu, O. and Lagha, M. (2018). Temporal control of gene expression by the pioneer factor Zelda through transient interactions in hubs. Nat Commun 9(1): 5194. PubMed ID: 30518940
Pioneer transcription factors can engage nucleosomal DNA, which leads to local chromatin remodeling and to the establishment of transcriptional competence. However, the impact of enhancer priming by pioneer factors on the temporal control of gene expression and on mitotic memory remains unclear. This study employs quantitative live imaging methods and mathematical modeling to test the effect of the pioneer factor Zelda on transcriptional dynamics and memory in Drosophila embryos. Increasing the number of Zelda binding sites accelerates the kinetics of nuclei transcriptional activation regardless of their transcriptional past. Despite its known pioneering activities, Zelda does not remain detectably associated with mitotic chromosomes and is neither necessary nor sufficient to foster memory. It was further revealed that Zelda forms sub-nuclear dynamic hubs where Zelda binding events are transient. It is proposed that Zelda facilitates transcriptional activation by accumulating in microenvironments where it could accelerate the duration of multiple pre-initiation steps.
Mir, M., Stadler, M. R., Ortiz, S. A., Hannon, C. E., Harrison, M. M., Darzacq, X. and Eisen, M. B. (2018). Dynamic multifactor hubs interact transiently with sites of active transcription in Drosophila embryos. Elife 7. PubMed ID: 30589412
The regulation of transcription requires the coordination of numerous activities on DNA, yet how transcription factors mediate these activities remains poorly understood. This study used lattice light-sheet microscopy to integrate single-molecule and high-speed 4D imaging in developing Drosophila embryos to study the nuclear organization and interactions of the key transcription factors Zelda and Bicoid. In contrast to previous studies suggesting stable, cooperative binding, this study shows that both factors interact with DNA with surprisingly high off-rates. Both factors form dynamic subnuclear hubs, and Bicoid binding is enriched within Zelda hubs. Remarkably, these hubs are both short lived and interact only transiently with sites of active Bicoid-dependent transcription. Based on these observations, it is hypothesized that, beyond simply forming bridges between DNA and the transcription machinery, transcription factors can organize other proteins into hubs that transiently drive multiple activities at their gene targets.
Yamada, S., Whitney, P. H., Huang, S. K., Eck, E. C., Garcia, H. G. and Rushlow, C. A. (2019). The Drosophila pioneer factor Zelda modulates the nuclear microenvironment of a Dorsal target enhancer to potentiate transcriptional output. Curr Biol 29(8): 1387-1393.e1385. PubMed ID: 30982648
Connecting the developmental patterning of tissues to the mechanistic control of RNA polymerase II remains a long-term goal of developmental biology. The dorsal-ventral axis of the Drosophila embryo is determined by the graded distribution of Dorsal (Dl), a homolog of the nuclear factor kappaB (NF-kappaB) family of transcriptional activators found in humans. A second maternally deposited factor, Zelda (Zld), is uniformly distributed in the embryo and is thought to act as a pioneer factor, increasing enhancer accessibility for transcription factors, such as Dl. This study utilized the MS2 live imaging system to evaluate the expression of the Dl target gene short gastrulation (sog) to better understand how a pioneer factor affects the kinetic parameters of transcription. These experiments indicate that Zld modifies the probability of activation, the timing of this activation, and the rate at which transcription occurs. The results further show that this effective rate increase is due to an increased accumulation of Dl at the site of transcription, suggesting that transcription factor "hubs" induced by Zld functionally regulate transcription.
McDaniel, S. L., Gibson, T. J., Schulz, K. N., Fernandez Garcia, M., Nevil, M., Jain, S. U., Lewis, P. W., Zaret, K. S. and Harrison, M. M. (2019). Continued activity of the pioneer factor Zelda is required to drive zygotic genome activation. Mol Cell 74(1): 185-195. PubMed ID: 30797686
Reprogramming cell fate during the first stages of embryogenesis requires that transcriptional activators gain access to the genome and remodel the zygotic transcriptome. Nonetheless, it is not clear whether the continued activity of these pioneering factors is required throughout zygotic genome activation or whether they are only required early to establish cis-regulatory regions. To address this question, an optogenetic strategy was developed to rapidly and reversibly inactivate the master regulator of genome activation in Drosophila, Zelda. Using this strategy, continued Zelda activity was shown to be required throughout genome activation. Zelda was shown to bind DNA in the context of nucleosomes; this might allow Zelda to occupy the genome despite the rapid division cycles in the early embryo. These data identify a powerful strategy to inactivate transcription factor function during development and suggest that reprogramming in the embryo may require specific, continuous pioneering functions to activate the genome.
Mahmud, A., Yang, D., Stenberg, P., Ioshikhes, I. and Nandi, S. (2019). Exploring a Drosophila transcription factor interaction network to identify cis-regulatory modules. J Comput Biol. PubMed ID: 31855461
Multiple transcription factors (TFs) bind to specific sites in the genome and interact among themselves to form the cis-regulatory modules (CRMs). They are essential in modulating the expression of genes, and it is important to study this interplay to understand gene regulation. This study integrated experimentally identified TF binding sites collected from published studies with computationally predicted TF binding sites to identify Drosophila CRMs. Along with the detection of the previously known CRMs, this approach identified novel protein combinations. High-occupancy target sites were detected, where a large number of TFs bind. Investigating these sites revealed that Giant, Dichaete, and Knirps are highly enriched in these locations. A common TAGteam motif was observed at these sites, which might play a role in recruiting other TFs. While comparing the binding sites at distal and proximal promoters, it was found that certain regulatory TFs, such as Zelda, were highly enriched in enhancers. This study has shown that, from the information available concerning the TF binding sites, the real CRMs could be predicted accurately and efficiently. Although it is only possible to claim co-occurrence of these proteins in this study, it may actually point to their interaction (as known interacting proteins typically co-occur). Such an integrative approach can, therefore, help us to provide a better understanding of the interplay among the factors, even though further experimental verification is required.
Keller, S. H., Jena, S. G., Yamazaki, Y. and Lim, B. (2020). Regulation of spatiotemporal limits of developmental gene expression via enhancer grammar. Proc Natl Acad Sci U S A 117(26): 15096-15103. PubMed ID: 32541043
The regulatory specificity of a gene is determined by the structure of its enhancers, which contain multiple transcription factor binding sites. A unique combination of transcription factor binding sites in an enhancer determines the boundary of target gene expression, and their disruption often leads to developmental defects. Despite extensive characterization of binding motifs in an enhancer, it is still unclear how each binding site contributes to overall transcriptional activity. Using live imaging, quantitative analysis, and mathematical modeling, this study measured the contribution of individual binding sites in transcriptional regulation. Binding site arrangement within the Rho-GTPase component t48 enhancer mediates the expression boundary by mainly regulating the timing of transcriptional activation along the dorsoventral axis of Drosophila embryos. By tuning the binding affinity of the Dorsal (Dl) and Zelda (Zld) sites, this study shows that single site modulations are sufficient to induce significant changes in transcription. Yet, no one site seems to have a dominant role; rather, multiple sites synergistically drive increases in transcriptional activity. Interestingly, Dl and Zld demonstrate distinct roles in transcriptional regulation. Dl site modulations change spatial boundaries of t48, mostly by affecting the timing of activation and bursting frequency rather than transcriptional amplitude or bursting duration. However, modulating the binding site for the pioneer factor Zld affects both the timing of activation and amplitude, suggesting that Zld may potentiate higher Dl recruitment to target DNAs. It is proposed that such fine-tuning of dynamic gene control via enhancer structure may play an important role in ensuring normal development.
Ceolin, S., Hanf, M., Bozek, M., Storti, A. E., Gompel, N., Unnerstall, U., Jung, C. and Gaul, U. (2020). A sensitive mNeonGreen reporter system to measure transcriptional dynamics in Drosophila development. Commun Biol 3(1): 663. PubMed ID: 33184447
The gene regulatory network governing anterior-posterior axis formation in Drosophila is a well-established paradigm to study transcription in developmental biology. The rapid temporal dynamics of gene expression during early stages of development, however, are difficult to track with standard techniques. This study optimized the bright and fast-maturing fluorescent protein mNeonGreen as a real-time, quantitative reporter of enhancer expression. Enhancer activity is derived from the reporter fluorescence dynamics with high spatial and temporal resolution, using a robust reconstruction algorithm. By comparing these results with data obtained with the established MS2-MCP system, the higher detection sensitivity of this reporter is demonstrated. The reporter activity was used to quantify the activity of variants of a simple synthetic enhancer, and observe increased activity upon reduction of enhancer-promoter distance or addition of binding sites for the pioneer transcription factor Zelda. This reporter system constitutes a powerful tool to study spatio-temporal gene expression dynamics in live embryos.
Eck, E., Liu, J., Kazemzadeh-Atoufi, M., Ghoreishi, S., Blythe, S. A. and Garcia, H. G. (2020). Quantitative dissection of transcription in development yields evidence for transcription factor-driven chromatin accessibility. Elife 9. PubMed ID: 33074101
Thermodynamic models of gene regulation can predict transcriptional regulation in bacteria, but in eukaryotes chromatin accessibility and energy expenditure may call for a different framework. This study systematically tested the predictive power of models of DNA accessibility based on the Monod-Wyman-Changeux (MWC) model of allostery, which posits that chromatin fluctuates between accessible and inaccessible states. The regulatory dynamics of hunchback by the activator Bicoid and the pioneer-like transcription factor Zelda were dissected in living Drosophila embryos; no thermodynamic or non-equilibrium MWC model could recapitulate hunchback transcription. Therefore, a model was explored where DNA accessibility is not the result of thermal fluctuations but is catalyzed by Bicoid and Zelda, possibly through histone acetylation; this model did predict hunchback dynamics. Thus, this theory-experiment dialogue uncovered potential molecular mechanisms of transcriptional regulatory dynamics, a key step toward reaching a predictive understanding of developmental decision-making.
Espinola, S. M., Gotz, M., Bellec, M., Messina, O., Fiche, J. B., Houbron, C., Dejean, M., Reim, I., Cardozo Gizzi, A. M., Lagha, M. and Nollmann, M. (2021). Cis-regulatory chromatin loops arise before TADs and gene activation, and are independent of cell fate during early Drosophila development. Nat Genet 53(4): 477-486. PubMed ID: 33795867
This study employed Hi-M, a single-cell spatial genomics approach, to detect CRM-promoter looping interactions within topologically associating domains (TADs) during early Drosophila development. By comparing cis-regulatory loops in alternate cell types, it was shown that physical proximity does not necessarily instruct transcriptional states. Moreover, multi-way analyses reveal that multiple CRMs spatially coalesce to form hubs. Loops and CRM hubs are established early during development, before the emergence of TADs. Moreover, CRM hubs are formed, in part, via the action of the pioneer transcription factor Zelda and precede transcriptional activation. This approach provides insight into the role of CRM-promoter interactions in defining transcriptional states, as well as distinct cell types.
Duan, J., Rieder, L., Colonnetta, M. M., Huang, A., McKenney, M., Watters, S., Deshpande, G., Jordan, W., Fawzi, N. and Larschan, E. (2021). CLAMP and Zelda function together to promote Drosophila zygotic genome activation. Elife 10. PubMed ID: 34342574
During the essential and conserved process of zygotic genome activation (ZGA), chromatin accessibility must increase to promote transcription. Drosophila is a well-established model for defining mechanisms that drive ZGA. Zelda (ZLD) is a key pioneer transcription factor (TF) that promotes ZGA in the Drosophila embryo. However, many genomic loci that contain GA-rich motifs become accessible during ZGA independent of ZLD. Therefore, it was hypothesized that other early TFs that function with ZLD have not yet been identified, especially those that are capable of binding to GA-rich motifs such as CLAMP. This study demonstrated that Drosophila embryonic development requires maternal CLAMP to: 1) activate zygotic transcription; 2) increase chromatin accessibility at promoters of specific genes that often encode other essential TFs; 3) enhance chromatin accessibility and facilitate ZLD occupancy at a subset of key embryonic promoters. Thus, CLAMP functions as a pioneer factor which plays a targeted yet essential role in ZGA.
Huang, S. K., Whitney, P. H., Dutta, S., Shvartsman, S. Y. and Rushlow, C. A. (2021). Spatial organization of transcribing loci during early genome activation in Drosophila. Curr Biol. PubMed ID: 34614388
The early Drosophila embryo provides unique experimental advantages for addressing fundamental questions of gene regulation at multiple levels of organization, from individual gene loci to the entire genome. Using 1.5-h-old Drosophila embryos undergoing the first wave of genome activation, this study detected ~110 discrete "speckles" of RNA polymerase II (RNA Pol II) per nucleus, two of which were larger and localized to the histone locus bodies (HLBs). In the absence of the primary driver of Drosophila genome activation, the pioneer factor Zelda (Zld), 70% fewer speckles were present; however, the HLBs tended to be larger than wild-type (WT) HLBs, indicating that RNA Pol II accumulates at the HLBs in the absence of robust early-gene transcription. This study observed a uniform distribution of distances between active genes in the nuclei of both WT and zld mutant embryos, indicating that early co-regulated genes do not cluster into nuclear sub-domains. However, in instances whereby transcribing genes did come into close 3D proximity (within 400 nm), they were found to have distinct RNA Pol II speckles. In contrast to the emerging model whereby active genes are clustered to facilitate co-regulation and sharing of transcriptional resources, the data support an "individualist" model of gene control at early genome activation in Drosophila. This model is in contrast to a "collectivist" model, where active genes are spatially clustered and share transcriptional resources, motivating rigorous tests of both models in other experimental systems (Huang, 2021).
Theodorou, V., Stefanaki, A., Drakos, M., Triantafyllou, D. and Delidakis, C. (2022). ASC proneural factors are necessary for chromatin remodeling during neuroectodermal to neuroblast fate transition to ensure the timely initiation of the neural stem cell program. BMC Biol 20(1): 107. PubMed ID: 35549704
In both Drosophila and mammals, the achaete-scute (ASC/ASCL) proneural bHLH transcription factors are expressed in the developing central and peripheral nervous systems, where they function during specification and maintenance of the neural stem cells in opposition to Notch signaling. However, the impact of ASC on chromatin dynamics during neural stem cell generation remains elusive. This study investigated the chromatin changes accompanying neural commitment using an integrative genetics and genomics methodology. ASC factors were found to bind equally strongly to two distinct classes of cis-regulatory elements: open regions remodeled earlier during maternal to zygotic transition by Zelda and less accessible, Zelda-independent regions. Both classes of cis-elements exhibit enhanced chromatin accessibility during neural specification and correlate with transcriptional regulation of genes involved in a variety of biological processes necessary for neuroblast function/homeostasis. This study identified an ASC-Notch regulated TF network that includes likely prime regulators of neuroblast function. Using a cohort of ASC target genes, it is reported that ASC null neuroblasts are defectively specified, remaining initially stalled, unable to divide, and lacking expression of many proneural targets. When mutant neuroblasts eventually start proliferating, they produce compromised progeny. Reporter lines driven by proneural-bound enhancers display ASC dependency, suggesting that the partial neuroblast identity seen in the absence of ASC genes is likely driven by other, proneural-independent, cis-elements. Neuroblast impairment and the late differentiation defects of ASC mutants are corrected by ectodermal induction of individual ASC genes but not by individual members of the TF network downstream of ASC. 
However, in wild-type embryos, the induction of individual members of this network induces CNS hyperplasia, suggesting that they synergize with the activating function of ASC to consolidate the chromatin dynamics that promote neural specification. This study has demonstrated that ASC proneural transcription factors are indispensable for the timely initiation of the neural stem cell program at the chromatin level by regulating a large number of enhancers in the vicinity of neural genes. This early chromatin remodeling is crucial for both neuroblast homeostasis and future progeny fidelity.
Harden, T. T., Vincent, B. J. and DePace, A. H. (2023). Transcriptional activators in the early Drosophila embryo perform different kinetic roles. Cell Syst 14(4): 258-272. PubMed ID: 37080162
Combinatorial regulation of gene expression by transcription factors (TFs) may in part arise from kinetic synergy, wherein TFs regulate different steps in the transcription cycle. Kinetic synergy requires that TFs play distinguishable kinetic roles. This study used live imaging to determine the kinetic roles of three TFs that activate transcription in the Drosophila embryo (Zelda, Bicoid, and Stat92E) by introducing their binding sites into the even-skipped stripe 2 enhancer. These TFs influence different sets of kinetic parameters, and their influence can change over time. All three TFs increased the fraction of transcriptionally active nuclei; Zelda also shortened the first-passage time into transcription and regulated the interval between transcription events. Stat92E also increased the lifetimes of active transcription. Different TFs can therefore play distinct kinetic roles in activating transcription. This has consequences for understanding the composition and flexibility of regulatory DNA sequences and the biochemical function of TFs.
Gibson, T. J. and Harrison, M. M. (2023). Protein-intrinsic properties and context-dependent effects regulate pioneer-factor binding and function. bioRxiv. PubMed ID: 37066406
Chromatin is a barrier to the binding of many transcription factors. By contrast, pioneer factors access nucleosomal targets and promote chromatin opening. Despite binding to target motifs in closed chromatin, many pioneer factors display cell-type specific binding and activity. The mechanisms governing pioneer-factor occupancy and the relationship between chromatin occupancy and opening remain unclear. This work studied three Drosophila transcription factors with distinct DNA-binding domains and biological functions: Zelda, Grainy head, and Twist. It was demonstrated that the level of chromatin occupancy is a key determinant of pioneering activity. Multiple factors regulate occupancy, including motif content, local chromatin, and protein concentration. Regions outside the DNA-binding domain are required for binding and chromatin opening. These results show that pioneering activity is not a binary feature intrinsic to a protein but occurs on a spectrum and is regulated by a variety of protein-intrinsic and cell-type-specific features.
Fenelon, K. D., Gao, F., Borad, P., Abbasi, S., Pachter, L. and Koromila, T. (2023). Cell-specific occupancy dynamics between the pioneer-like factor Opa/ZIC and Ocelliless/OTX regulate early head development in embryos. Front Cell Dev Biol 11: 1126507. PubMed ID: 37051467
During development, embryonic patterning systems direct a set of initially uncommitted pluripotent cells to differentiate into a variety of cell types and tissues. A core network of transcription factors, such as Zelda/POU5F1, Odd-paired (Opa)/ZIC3 and Ocelliless (Oc)/OTX2, is conserved across animals. While Opa is essential for a second wave of zygotic activation after Zelda, it is unclear whether Opa drives head cell specification in the Drosophila embryo. It is hypothesized that Opa and Oc interact with distinct cis-regulatory regions to shape cell fates in the embryonic head. Super-resolution microscopy and meta-analysis of single-cell RNAseq datasets show that the overlapping expression domains of opa and oc are dynamic in the head region, with both factors being simultaneously transcribed at the blastula stage. Additionally, analysis of single-embryo RNAseq data reveals a subgroup of Opa-bound genes to be Opa-independent in the cellularized embryo. Interrogation of these genes against Oc ChIPseq combined with in situ data suggests that Opa is competing with Oc for the regulation of a subgroup of genes later in gastrulation. Specifically, it was found that Oc binds to late, head-specific enhancers independently and activates them in a head-specific wave of zygotic transcription, suggesting distinct roles for Oc in the blastula and gastrula stages.
Colonnetta, M. M., Schedl, P. and Deshpande, G. (2023). Germline/soma distinction in Drosophila embryos requires regulators of zygotic genome activation. Elife 12. PubMed ID: 36598809
In Drosophila melanogaster embryos, somatic versus germline identity is the first cell fate decision. Zygotic genome activation (ZGA) orchestrates regionalized gene expression, imparting specific identity on somatic cells. ZGA begins with a minor wave that commences at nuclear cycle (NC)8 under the guidance of chromatin accessibility factors (Zelda, CLAMP, GAF), followed by the major wave during NC14. By contrast, primordial germ cell (PGC) specification requires maternally deposited and posteriorly anchored germline determinants. This is accomplished by a centrosome coordinated release and sequestration of germ plasm during the precocious cellularization of PGCs in NC10. This study reports a novel requirement for Zelda and CLAMP during the establishment of the germline/soma distinction. When their activity is compromised, PGC determinants are not properly sequestered, and specification is disrupted. Conversely, the spreading of PGC determinants from the posterior pole adversely influences transcription in the neighboring somatic nuclei. These reciprocal aberrations can be correlated with defects in centrosome duplication/separation that are known to induce inappropriate transmission of the germ plasm. Interestingly, consistent with the ability of bone morphogenetic protein (BMP) signaling to influence specification of embryonic PGCs, reduction in the transcript levels of a BMP family ligand, decapentaplegic (dpp), is exacerbated at the posterior pole.
Brennan, K. J., Weilert, M., Krueger, S., Pampari, A., Liu, H. Y., Yang, A. W. H., Morrison, J. A., Hughes, T. R., Rushlow, C. A., Kundaje, A. and Zeitlinger, J. (2023). Chromatin accessibility in the Drosophila embryo is determined by transcription factor pioneering and enhancer activation. Dev Cell. PubMed ID: 37557175
Chromatin accessibility is integral to the process by which transcription factors (TFs) read out cis-regulatory DNA sequences, but it is difficult to differentiate between TFs that drive accessibility and those that do not. Deep learning models that learn complex sequence rules provide an unprecedented opportunity to dissect this problem. Using zygotic genome activation in Drosophila as a model, this study analyzed high-resolution TF binding and chromatin accessibility data with interpretable deep learning and performed genetic validation experiments. A hierarchical relationship was identified between the pioneer TF Zelda and the TFs involved in axis patterning. Zelda consistently pioneers chromatin accessibility proportional to motif affinity, whereas patterning TFs augment chromatin accessibility in sequence contexts where they mediate enhancer activation. It is concluded that chromatin accessibility occurs in two tiers: one through pioneering, which makes enhancers accessible but not necessarily active, and the second when the correct combination of TFs leads to enhancer activation.


In all animals, the initial events of embryogenesis are controlled by maternal gene products that are deposited into the developing oocyte. At some point after fertilization, control of embryogenesis is transferred to the zygotic genome in a process called the maternal-to-zygotic transition. During this time, many maternal RNAs are degraded and transcription of zygotic RNAs ensues. There is a long-standing question as to which factors regulate these events. The recent findings that microRNAs and Smaug mediate maternal transcript degradation have shed new light on this aspect of the problem. However, the transcription factor(s) that activate the zygotic genome remain elusive. The discovery that many of the early transcribed genes in Drosophila share a cis-regulatory heptamer motif, CAGGTAG and related sequences, collectively referred to as TAGteam sites, raised the possibility that a dedicated transcription factor could interact with these sites to activate transcription. This study reports that the zinc-finger protein Zelda (Zld; Zinc-finger early Drosophila activator) binds specifically to these sites and is capable of activating transcription in transient transfection assays. Mutant embryos lacking zld are defective in cellular blastoderm formation, and fail to activate many genes essential for cellularization, sex determination and pattern formation. Global expression profiling confirmed that Zld has an important role in the activation of the early zygotic genome and suggests that Zld may also regulate maternal RNA degradation during the maternal-to-zygotic transition (Liang, 2008).

In Drosophila, an initial wave of zygotic gene transcription occurs between 1 and 2 h of development during the mitotic cleavage cycles 8-13. This is followed by a major burst of activity between 2 and 3 h of development (cycle 14) when the embryo is undergoing cellular blastoderm formation. Many pre-cellular genes contain TAGteam sites in their upstream regulatory regions including several direct targets of Bicoid, Dorsal and other key regulators of patterning (ten Bosch, 2006; De Renzis, 2007; Li, 2008). It has been previously demonstrated (ten Bosch, 2006) that TAGteam sites are required for the early expression of the dorsoventral gene zen, and the sex determination genes sisB (also known as sc) and Sxl. To isolate the TAGteam binding factor, a yeast one-hybrid screen was performed with a 91 base-pair (bp) fragment from the zen cis-regulatory region (zen(91)), which contains four TAGteam sites. zld, encoded by the X chromosomal gene CG12701 (also known as vfl), was selected as the only candidate of the 11 recovered that had the potential to bind specific DNA sequences because it encoded a protein with six C2H2 zinc fingers. Oligonucleotides with different TAGteam sites were tested in gel shift assays with the 357 amino acid carboxy-terminal region of Zld fused to glutathione S-transferase (GST-ZldC). All oligonucleotides tested formed complexes with GST-ZldC, although with different affinities, whereas mutations in the heptanucleotide sequence abolished binding. Notably, the site with the strongest affinity, CAGGTAG, is the site most over-represented in regulatory elements of pre-blastoderm genes versus post-blastoderm genes (ten Bosch, 2006). A plasmid expressing full-length Zld protein promoted transcriptional activation of a zen(91)-lacZ reporter but not a mutated zen(91m)-lacZ reporter after co-transfection in Drosophila S2 cells. Taken together, these data strongly suggest that Zld activates transcription of zen and probably other TAGteam-containing genes (Liang, 2008).

zld transcripts were detected in the germline cells of the ovary, in unfertilized eggs, and throughout early development. Later, zld becomes restricted to the nervous system and specific head regions, as previously shown (Staudt, 2006). To analyse zld function, deletion alleles of zld were generated by imprecise excision. Hemizygous embryos showed abnormal central nervous system and head development, consistent with previous reports of CG12701 lethal P-insertion phenotypes (Staudt, 2006). zld transcripts were not observed in these embryos after cycle 14. However, younger embryos had high levels of maternal zld transcripts, indicating that maternally loaded zld transcripts are degraded during cellularization and replaced with zygotic zld (Liang, 2008).

To eliminate maternal zld from embryos, clones of zld294 mutant germ cells were induced in the adult female. All resulting embryos were null for maternal zld (M- zld), and the male embryos were also null for zygotic zld (M-Z- zld). All early M- zld embryos lacked zld transcripts but had normal patterns of other maternally deposited factors such as bicoid RNAs and the Dorsal protein gradient. Unlike M-Z+ zld embryos, which began to express zld ubiquitously in cycle 14, M-Z- zld embryos never expressed zld. However, regardless of their zygotic genotypes, all M- zld embryos showed a severely abnormal morphology after cycle 14 and did not survive to make cuticle (Liang, 2008).

Before cycle 14, M- zld embryos are similar to wild type, except for sporadic nuclear fallout. However, at early cycle 14 the hexagonal-actin network becomes disorganized and begins to degenerate, resulting in a multinucleated phenotype resembling nullo and Serendipity α (Sry-α) mutants. Cellularization does not proceed because furrow canals never move inward as they do in the wild type, and Neurotactin (Nrt) accumulates abnormally in the apical cytoplasm, reminiscent of the slow as molasses (slam) mutant phenotype. Staining with anti-Slam antibody confirmed that Slam protein is mostly absent by mid-cycle 14, whereas Slam has moved basally in wild type. In addition, nuclei do not elongate but instead become rounded, enlarged and clump together. Regions of higher nuclear density were observed, a phenotype similar to that obtained by injection of CG12701 double-stranded RNA (Staudt, 2006), which resembles a frühstart (frs, also known as Z600) phenotype (Grosshans, 2003). Despite their aberrant morphology, M- zld embryos attempt to form a ventral furrow but soon become highly disorganized with only pole cells recognizable. The M- zld cellularization defects were rescued by driving a wild-type copy of zld into the germ line using the ovarian tumour (otu) promoter. The cytoskeleton becomes well structured and furrow canal ingression is normal as Slam protein is restored (Liang, 2008).

The broad range of phenotypes strongly indicated that M- zld embryos do not express genes essential for cellular blastoderm formation. Expression was examined of Sry-α, slam and nullo, as well as sisA, sisB, sisC (also known as os), Sxl, zen and dpp. None of these genes was activated in M- zld embryos, except at the poles in some cases. However, sna and sog (which are activated by Dorsal) were not absent but were delayed in expression by at least two cycles, suggesting that Zld facilitates the onset of early gene transcription. Furthermore, the lateral stripes of sog were greatly reduced in width, indicating that in regions in which Dorsal protein amounts are low, a combinatorial mechanism involving both Dorsal and Zld establishes the broad sog domain. Notably, there are two TAGteam sites in the 393 bp sog enhancer that lie close to Dorsal binding sites (Liang, 2008).

The results indicated that Zld is a global activator of early genes. To test this directly, the expression profiles of wild-type and M- zld embryos were compared in mitotic cycles 8-13, a time point presumably enriched in genes that are direct Zld targets. Notably, 120 genes were downregulated and, surprisingly, 176 genes were upregulated at least twofold in the absence of Zld. The downregulated set was strongly enriched in genes that are zygotically expressed and involved in early developmental processes, including most of the genes assayed in situ. For example, 12 genes involved in cellular blastoderm formation (nullo, slam, Sry-α, bnk, frs, btsz, halo and 5 halo-like genes; Gross, 2003), 6 sex-determination genes (sisA, sisB, sisC, run, Sxl and dpn), and 8 dorsoventral genes (dpp, tld, tok, tsg, tsg-like, scw, zen and zen2) are in the downregulated data set. Overall, 75% of the early genes previously described as pre-cellular are included. This number may be an underestimate because there could be many genes such as sna and sog that did not make the twofold cutoff but are indeed regulated by Zld (Liang, 2008).
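The twofold cutoff described above can be illustrated with a minimal sketch. The function name, gene names and expression values below are hypothetical and are not taken from the study; the sketch only assumes that each gene has a positive expression value in both wild-type and mutant embryos.

```python
import math

def classify_fold_change(wild_type, mutant, cutoff=2.0):
    """Label each gene 'down', 'up' or 'unchanged' in the mutant relative to wild type.

    Both arguments map gene name -> positive expression value (arbitrary units).
    A gene is called only if it changes by at least `cutoff`-fold.
    """
    log_cutoff = math.log2(cutoff)
    calls = {}
    for gene, wt_level in wild_type.items():
        log2_ratio = math.log2(mutant[gene] / wt_level)
        if log2_ratio <= -log_cutoff:
            calls[gene] = "down"
        elif log2_ratio >= log_cutoff:
            calls[gene] = "up"
        else:
            calls[gene] = "unchanged"
    return calls

# Hypothetical values for illustration only, not data from the study.
wild_type = {"geneA": 100.0, "geneB": 80.0, "geneC": 50.0}
mutant = {"geneA": 10.0, "geneB": 60.0, "geneC": 150.0}
calls = classify_fold_change(wild_type, mutant)
print(calls)
```

In this toy example geneB is reduced in the mutant but by less than twofold, so it is labeled unchanged. This is the situation suggested above for genes such as sna and sog, which may be regulated by Zld without passing the cutoff.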

About 80% of the downregulated genes have TAGteam sites within 2 kilobases (kb) upstream of the transcription start site, and another 10% have TAGteam sites in introns, such as slam with two sites in its first intron, supporting the idea that most of the downregulated genes are direct Zld targets. In addition, the TAGteam sites upstream of the downregulated genes tend to be located very close to the transcription start site (within 200 bp), consistent with the previous finding that early zygotic genes have a statistical over-representation of TAGteam sites close to the start site (Liang, 2008).
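A search for TAGteam sites upstream of a transcription start site could be sketched as follows. This is an illustration only and not the method used in the study: the function names and the example sequence are hypothetical, and only the canonical heptamer CAGGTAG is searched by default because the related TAGteam sequences are not listed here. Both DNA strands are scanned.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq.upper()))

def find_sites(upstream, motifs=("CAGGTAG",)):
    """Return sorted (distance_to_tss, strand, motif) tuples for every match.

    `upstream` is read 5'->3' and ends at the base just before the
    transcription start site, so distance is the number of bases between
    the end of the match and the end of the sequence.
    """
    upstream = upstream.upper()
    hits = []
    for motif in motifs:
        for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
            start = upstream.find(pattern)
            while start != -1:
                distance = len(upstream) - (start + len(pattern))
                hits.append((distance, strand, motif))
                start = upstream.find(pattern, start + 1)
    return sorted(hits)

# Hypothetical 25 bp upstream sequence with one site on each strand.
example = "TTTTCAGGTAGAAAAACTACCTGTT"
sites = find_sites(example)
near_tss = [hit for hit in sites if hit[0] <= 200]
print(sites)
```

Passing a real 2 kb upstream sequence and filtering on the distance, as in `near_tss`, would give the sites within 200 bp of the start site. Additional TAGteam variants can be supplied through the `motifs` argument once their sequences are known.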

In contrast to the downregulated genes, the upregulated set is strongly enriched in genes that are maternally expressed. The possibility was considered that Zld activates components of the RNA degradation machinery that in turn destabilize maternal RNAs. Because the microRNA (miRNA) miR-309 enhancer contains two TAGteam sites, miR-309 primary transcripts were assayed in M- zld embryos, and indeed they were absent. It was recently shown that mature miR-309 miRNAs become abundant during cycle 14, and are involved in maternal transcript turnover in 2-4 h embryos. Not surprisingly, the 1-2 h (cycles 8-13) data set had no overlap with the 44 published miR-309 targets; however, 2-4 h profiling experiments should demonstrate whether they are upregulated in the absence of zld. The upregulated genes were compared to those affected by smaug, another gene required for the removal of maternally supplied RNAs. There was little overlap with the published Smaug targets, suggesting that Zld is involved in a parallel pathway of maternal RNA degradation (Liang, 2008).

This study has demonstrated that Zld functions as a key transcriptional activator during the maternal-to-zygotic transition (MZT) in Drosophila. This is the first demonstration of such an activator in any organism. It is proposed that the biological role of Zld in the pre-blastoderm embryo is to set the stage for vital processes such as cellular blastoderm formation, counting of X chromosomes for dosage compensation and sex determination, and pattern formation, by ensuring the coordinated accumulation of batteries of gene products during the MZT. This early preparedness should allow sufficient time for the formation of the molecular machines involved in these processes, so that they are ready to spring into action during the prolonged interphase of cycle 14 (Liang, 2008).

Chromatin architecture emerges during zygotic genome activation independent of transcription

Chromatin architecture is fundamental in regulating gene expression. To investigate when spatial genome organization is first established during development, this study examined chromatin conformation during Drosophila embryogenesis and observed the emergence of chromatin architecture within a tight time window that coincides with the onset of transcription activation in the zygote. Prior to zygotic genome activation (ZGA), the genome is mostly unstructured. Early expressed genes serve as nucleation sites for topologically associating domain (TAD) boundaries. Activation of gene expression coincides with the establishment of TADs throughout the genome and co-localization of housekeeping gene clusters, which remain stable in subsequent stages of development. However, the appearance of TAD boundaries is independent of transcription and requires the transcription factor Zelda for locus-specific TAD boundary insulation (Hug, 2017).

These results provide strong evidence that chromatin conformation in Drosophila is established during zygotic genome activation in a transcription-independent manner. Before the main wave of ZGA, the genome is mostly unstructured, except for ~180 regions enriched in RNA Pol II binding and housekeeping genes that display TAD boundary-like insulation. These early transcribed loci serve as nucleation sites for the establishment of TAD boundaries, and inter-TAD insulation is progressively gained at loci marked for transcription. These observations have implications for the understanding of how chromatin conformation is regulated, which are discussed below (Hug, 2017).
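To make the notion of boundary insulation concrete, the following is a minimal sketch of one sliding-window way to score it from a Hi-C contact matrix. How insulation was actually computed in the study is not stated here, so the function name, the window size and the toy matrix are all assumptions made for illustration. For each bin, the score averages the contacts that cross that position; a local minimum indicates a boundary-like site.

```python
def insulation_scores(contacts, window):
    """Mean contact frequency crossing each bin position.

    For bin i, averages contacts[a][b] with a in the `window` bins before i
    and b in the `window` bins starting at i. Positions too close to either
    end of the matrix are left as None.
    """
    n = len(contacts)
    scores = [None] * n
    for i in range(window, n - window + 1):
        total = 0.0
        for a in range(i - window, i):
            for b in range(i, i + window):
                total += contacts[a][b]
        scores[i] = total / (window * window)
    return scores

# Toy 6-bin matrix (hypothetical): two 3-bin domains with strong contacts
# inside each domain (10) and weak contacts between them (1).
toy = [[10 if (a < 3) == (b < 3) else 1 for b in range(6)] for a in range(6)]
scores = insulation_scores(toy, window=2)
print(scores)
```

In the toy matrix the lowest score falls at bin 3, the position separating the two domains. In a largely unstructured genome, as described above for embryos before the main wave of ZGA, such minima would be shallow or absent at most positions.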

Previous studies have examined chromatin conformation in a range of cell lines and tissues across different species, leading to the observation of TADs, which seem to be mostly invariant between cell lines and developmental stages and across evolutionarily conserved loci. This organization remains prevalent even when the presence of proteins associated with chromatin organization such as CTCF or cohesin is perturbed. However, a common feature of these studies is that they were performed in cells actively engaged in transcribing their genomes (Hug, 2017).

In contrast, using Drosophila embryos allowed probing of chromatin conformation in vivo in a unique nuclear setting, before the activation of the zygotic genome. The resulting in situ Hi-C maps display a dramatic remodelling of the three-dimensional organization of the chromatin, coinciding with the main activation of the zygotic genome. Furthermore, TAD boundaries throughout development were found to be enriched in housekeeping genes, which have been previously found to be arranged in clusters in the genome. Once established, TAD boundaries are maintained in vivo throughout development, in agreement with previous observations in cell lines. Given that these boundaries are enriched in housekeeping genes, which are actively transcribed in most cellular types in an organism, the results provide further evidence that the genomic arrangement of housekeeping genes in the genome is a major determinant for the conserved TAD organization found in most tissues. In line with this suggestion, previous studies have found an association between TAD boundaries and constitutively expressed genes that seems to be conserved throughout eukaryotic genomes ranging from yeast to mammals. Interestingly, a recent examination of the relationship between dosage compensation and chromatin conformation in mice has also highlighted a strong functional link between transcription and the establishment of local chromatin architecture, including TAD-based conformation at escapee loci on the inactive X chromosome (Giorgetti, 2016). These results suggest that while genomes of different sizes have evolved specific mechanisms to ensure the right three-dimensional organization of chromatin in the nuclear space, the insulating properties found at transcribed loci are a conserved feature across evolution (Hug, 2017).

A notable feature of the chromatin organization maps is the presence of contacts between TAD boundary regions at large genomic distances, which resemble the interactions of 'A' compartments previously described for mammals (Lieberman-Aiden, 2009). These enriched contacts suggest that boundary regions rich in housekeeping genes cluster together in the nucleus. These observations are reminiscent of transcription foci that have been observed previously by microscopy and are in agreement with the extensive promoter-promoter contacts between housekeeping genes found in mammals using ChIA-PET. The long-range contacts between housekeeping genes are weaker in embryos before nc14. It is worth noting that, prior to nc14, nuclear divisions take place consecutively in a short period of time (~10-19 min each), which might preclude the formation of those long-range interactions. Examinations of interaction dynamics in embryos with extended nuclear division timing will help to determine the impact of short nuclear cycle divisions (Hug, 2017).

The precise mechanisms by which chromatin conformation emerges during development remain to be fully unveiled. The relationship between transcription and chromatin organization has been examined before, suggesting different levels of interdependency between the two. These studies, however, were unable to examine the role of transcription in establishing TADs and TAD boundaries since they were performed in cells with an already established chromatin conformation. By inhibiting transcription in embryos without prior apparent chromatin organization, this study demonstrated that the establishment of TADs and TAD boundaries is independent of transcription and does not require zygotic factors. However, inhibition of RNA Pol II function reduced inter-TAD insulation and spatial clustering of housekeeping genes. ChIP data for RNA Pol II showed an almost complete absence of binding on gene bodies in both transcription inhibition experiments and reduced binding at promoters after treatment, which argues for a quantitative lack of RNA Pol II binding in these embryos. Given that both treatments can lead to degradation of RNA Pol II, the loss of insulation therefore might arise from a dosage-dependent lack of RNA Pol II occupancy at TAD boundaries. Together these data suggest that while the emergence of chromatin architecture is independent of transcription, mechanisms associated with this process, such as chromatin opening, the assembly of the pre-initiation complex (PIC), or recruitment of RNA Pol II or other basic transcriptional machinery, might play a role in establishing and maintaining chromatin conformation. A small number of boundaries were observed that are not associated with RNA Pol II, which indicates that other mechanisms might be involved in the formation of those boundaries. It should be noted that global inhibition of transcription might lead to secondary effects that affect the organization of chromatin. 
Intriguingly, binding of dCTCF in Drosophila does not seem to be as predictive of TAD boundaries as in mammals. Therefore, in agreement with previous observations identifying insulation properties for RNA Pol II binding, the results suggest RNA Pol II as a possible major determinant of inter-TAD insulation in Drosophila (Hug, 2017).

The analyses of the temporal dynamics of TAD boundary formation and the open chromatin state of TAD boundaries suggest that events preceding the recruitment of RNA Pol II might be important for the establishment of TAD boundaries. Of interest is the NSL complex, which has been previously associated with the eviction of nucleosomes and assembly of the PIC at promoters of housekeeping genes (Lam, 2012). Among the marks for active chromatin that are enriched at TAD boundaries, distinct temporal dynamics were found, suggesting that different mechanisms might be associated with the establishment and maintenance of these boundaries. The association with some of these marks might only be consequential and not causal (for example H3K36me3) given that chromatin organization still occurs upon inhibition of transcription, which impairs the establishment of this chromatin mark. Little is known about the role of two of the associated histone marks, namely H3K18ac and H4K8ac. Interestingly, H3K18ac is also enriched in yeast micro TAD boundaries, suggesting an evolutionarily conserved role. H3K18ac has been associated with the regulation of double-strand breaks and genome integrity in mammals, which together with the recent association of topoisomerase (DNA) II beta (TOP2B) at TAD boundaries provide further evidence that TAD boundaries might be involved in the regulation of replication timing. Current models of TAD formation include the 'loop-extrusion' model, based on the presence of cohesin-mediated loops and CTCF anchors, and the 'self-assembly' model, based on the aggregation ability of nucleosomes from inactive chromatin, among others. So far the current analyses do not allow determining which model would be more plausible. The architectural proteins involved in the establishment of such conformation in the loop extrusion model are maternally provided and ubiquitously expressed thereafter. 
Furthermore, according to computational simulations, the rapid succession of nuclear divisions might not allow for the necessary time to form TAD-like structures at early stages of development. Irrespective of how chromatin is compacted into TADs, these analyses suggest that the genome is partitioned in the three-dimensional nuclear space into specific functional units consisting of TADs and inter-TAD regions, which might allow for fine-tuned regulation of developmentally regulated and housekeeping genes separately (Hug, 2017).

Finally, genetic analyses on the role of Zelda in establishing chromatin conformation demonstrate that Zelda is necessary for the formation of Zelda-specific TAD boundaries. Given the ability of Zelda to open chromatin, it is plausible that generating a relaxed local chromatin environment could be one of the initial steps in the establishment of chromatin conformation. Zelda binding is necessary for chromatin accessibility and transcription factor binding during ZGA, and hence boundaries that lose insulation upon Zelda depletion might do so because of a combination of lack of architectural protein binding, binding of other transcription factors, or binding of RNA Pol II. Analyses of open chromatin at TAD boundaries identify other enriched DNA motifs such as BEAF-32 and GAF, which, together with the observation of Zelda-independent TAD boundaries, suggest that other DNA binding proteins might act in a coordinated fashion to orchestrate the emergence of chromatin architecture. Of interest will be the novel DNA binding motifs identified in this study, as well as factors such as GAF, which has been shown to be able to open chromatin in a Zelda-independent manner (Hug, 2017).

Overall, these data provide new insight regarding when and how chromatin conformation is established during development. The powerful tools of Drosophila genetics such as maternal knockdowns and forward genetic screens combined with the recent development of genome editing tools such as CRISPR-Cas9 and targeted approaches to modify chromatin environments will allow for precise examinations of other factors implicated in the establishment of chromatin conformation. Therefore, explorations of the ZGA in Drosophila will constitute an ideal platform for examining the molecular mechanisms that determine how chromatin organization is regulated and how it affects gene expression during development and disease (Hug, 2017).

These results offer insight into when spatial organization of the genome emerges and identify a key factor that helps trigger this architecture (Hug, 2017).

Polycomb-dependent chromatin looping contributes to gene silencing during Drosophila development

Interphase chromatin is organized into topologically associating domains (TADs). Within TADs, chromatin looping interactions are formed between DNA regulatory elements, but their functional importance for the establishment of the 3D genome organization and gene regulation during development is unclear. Using high-resolution Hi-C experiments, this study analyzed higher order 3D chromatin organization during Drosophila embryogenesis and identified active and repressive chromatin loops that are established with different kinetics and depend on distinct factors: Zelda-dependent active loops are formed before the midblastula transition between transcribed genes over long distances. Repressive loops within polycomb domains are formed after the midblastula transition between polycomb response elements by the action of GAGA factor and polycomb proteins. Perturbation of PRE function by CRISPR/Cas9 genome engineering affects polycomb domain formation and destabilizes polycomb-mediated silencing. Preventing loop formation without removal of polycomb components also decreases silencing efficiency, suggesting that chromatin architecture can play instructive roles in gene regulation during development (Ogiyama, 2018).

This study shows that the 3D organization of the Drosophila genome is established during embryogenesis through the stepwise formation of TADs and chromatin compartments as well as of active and repressive chromatin loops anchored by Zelda and GAGA factor-dependent PREs. Moreover, this study dissected the function of PRE sequences and PRE looping interactions for the formation of polycomb domains and PcG-mediated repression (Ogiyama, 2018).

Developmental profiling shows that 3D genome folding involves multiple rapid processes, some of which occur simultaneously and others in a stepwise manner, and involves contacts at different scales of linear distances. Consistent with recent studies in mammals and Drosophila, this study observed that chromatin is largely unstructured in early embryos before ZGA (Hug, 2017; Ke, 2017). A few minutes later, at cycles 9-13, early boundaries corresponding to the Pol-II- and Zld-bound sites of genes of the first midblastula transcriptional wave and TADs begin to demarcate, albeit poorly, as previously described (Hug, 2017). Another rapid structural transition occurs within less than 20 min after mitosis 13 when early TADs become much more demarcated. This rapid structural transition is remarkable and unexpected. Physical modeling considers chromosome dynamics an important parameter in 3D genome folding regulation, and it has been proposed that chromosomes never reach equilibrium in interphase because of their relatively slow dynamic behavior in the confined nuclear space. Whereas most models used FISH data and in vivo chromatin tracking to calibrate their parameters to simulate dynamic behaviors, it will be interesting to take into account the rapid changes detected in the Hi-C data for model refinement (Ogiyama, 2018).

Another unprecedented finding is that genome compartmentalization of chromatin sharing distinct epigenetic marks occurs at different developmental stages and with different dynamics during Drosophila embryogenesis. Although active and PcG-repressed compartments are established soon after the major wave of ZGA, these two compartments are initially not well separated. This might reflect a higher degree of the genome switching compartments at early embryonic stages, as has been previously observed during human ESC differentiation, and a higher degree of chromatin mobility during early development. In contrast, active and PcG-repressed compartments are well separated at the end of embryogenesis, correlating with a decrease in chromatin mobility, which might help to stabilize gene-regulatory states (Ogiyama, 2018).

These results extend previous findings (Blythe, 2016) and suggest that Zld and GAGA factor play a key role in chromatin organization during embryogenesis, not only locally but also by regulating 3D genome folding. In addition to defining a subset of TAD boundaries (Hug, 2017), Zld is required to induce long-range active gene contacts during the MBT. A second wave of active chromatin loops appears between Pol-II-bound sites after the MBT. These loops are independent of Zld but associated with classical insulator proteins, such as CTCF and GAGA factor. After MBT and the major wave of ZGA, GAGA factor acts together with Zld to determine the open chromatin structure of active gene promoters, and insulators. Another major role for GAGA factor is to set up repressive chromatin loops formed over shorter distances within polycomb domains. These loops are first observed at early cleavage cycle 14, at the end of MBT, and well ahead of the time at which Hox phenotypes can be detected in PcG mutants. Because previous studies suggested that GAGA factor can contribute to the formation of chromatin loops, this protein might directly mediate chromatin contacts or might recruit other chromatin-associated factors that mediate looping interactions. One such factor could be cohesin, which physically interacts with PcG proteins, associates with repressive looping anchor points (Eagen, 2017), and mediates looping within the engrailed and invected polycomb domain. However, the fact that GAGA factor-dependent loops are particularly strong in polycomb domains supports the hypothesis that GAGA factor might induce loops via recruitment of PRC1 proteins, such as Ph, which is able to mediate looping contacts via the oligomerization of its SAM domain. 
Because mutation of GAGA motifs within the PRE sequence induces not only loss of GAGA factor but also a loss of PcG recruitment, it was not possible to discriminate whether loss of PRE looping is the direct consequence of loss of GAGA factor binding or the consequence of loss of PcG binding in this case (Ogiyama, 2018).

A recent study showed that deletion of the four major PREs at the invected/engrailed gene locus, which are also involved in a repressive chromatin loop, does not substantially affect the 3D structure of this polycomb domain or the deposition of H3K27me3 (De, 2016). The authors concluded that these and other PREs in this highly complex locus might act in a redundant manner to establish repressive polycomb domains (De, 2016). In contrast, the current analysis of the simpler dac gene locus shows that disruption of a single PRE induces the loss of PRE looping interactions and that PREs act cooperatively to set up a repressive chromatin environment. Mutation of both PREs simultaneously at the dac gene locus induces the loss of all tested repressive chromatin marks, indicating that no additional PRE sequences are present at this polycomb domain. However, the two PREs are not of equivalent importance for the formation of the polycomb domain and target gene repression, and it will be exciting to investigate in the future whether the differential function of the PREs is encoded in their DNA sequence or whether it is determined by their genomic position (Ogiyama, 2018).

Unexpectedly, deletion of individual or both PREs is not sufficient to globally activate target gene expression in embryogenesis, which might be due to the absence of an essential activator outside of the dac expression domain. In contrast, changes in chromatin structure and contacts were detected at this stage, suggesting that these changes play a causal role in the observed dac gain-of-function phenotype observed at later developmental stages (Ogiyama, 2018).

The Hi-C time course data indicated that polycomb domains form gradually and are correlated with the formation of polycomb bodies and the deposition of H3K27me3. The current data also indicate that PRE looping interactions occur at the beginning of polycomb domain formation and are maintained during development. Two non-exclusive hypotheses for the importance of repressive chromatin loops can be postulated: initial PRE looping interactions might help PRE-anchored PcG complexes to contact neighboring nucleosomes and facilitate the deposition of H3K27me3 along polycomb domains. Alternatively, PRE looping contacts might contribute to establish a particular chromatin nano-compartment, enriched in PcG proteins and refractory to illegitimate activation by transcriptional components (Ogiyama, 2018).

Because deletion or mutations of PRE sequences lead to the loss of PRE looping interactions but also to the loss of PRE function, it is impossible to uncouple the importance of chromatin looping interactions from the role of PcG proteins in genome regulation. By inserting a functional gypsy insulator element between the two PREs of the dac gene locus, it was possible to block PRE interactions without affecting PRE function, allowing the demonstration that, although PRE looping contacts do not play a deterministic role in the establishment of polycomb domains and gene silencing, they are necessary to lock genes in a repressed state during development. This was, however, not due to loss of PcG proteins or of H3K27me3, suggesting that looping is rather important for the formation of a repressive nano-compartment. This function might be key to prevent improper gene activation frequently observed in cancer or other diseases (Ogiyama, 2018).

In summary, this work identified a multistep phenomenon leading to 3D genome organization in Drosophila as well as the importance and regulators of repressive chromatin loops. Future work combining epigenomics and microscopy with genome engineering should allow identifying the mechanisms by which 3D chromatin compartments fine-tune gene expression in this and other cases of gene regulation and 3D genome reprogramming (Ogiyama, 2018).

6mA-DNA-binding factor Jumu controls maternal-to-zygotic transition upstream of Zelda

A long-standing question in the field of embryogenesis is how the zygotic genome is precisely activated by maternal factors, allowing normal early embryonic development. Previous work has shown that N6-methyladenine (6mA) DNA modification is highly dynamic in early Drosophila embryos and forms an epigenetic mark. However, little is known about how 6mA-formed epigenetic information is decoded. This study reports that the Fox-family protein Jumu binds 6mA-marked DNA and acts as a maternal factor to regulate the maternal-to-zygotic transition. The zelda gene, which encodes the pioneer factor Zelda, was found to be marked by 6mA. Genetic assays suggest that Jumu controls proper zygotic genome activation (ZGA) in early embryos, at least in part, by regulating zelda expression. Thus, these findings not only support the idea that 6mA-formed epigenetic marks can be read by specific transcription factors, but also uncover a mechanism by which Jumu regulates ZGA partially through Zelda in early embryos (He, 2019).

It has been suggested that Zelda and Zelda-like proteins function as pioneer transcription factors to initiate zygotic genome activation. Interestingly, this study found that zelda was marked with 6mA and regulated by Jumu. Bioinformatics analysis showed that 78% (1,715/2,198) of the GSRJ were also Zelda targets. Consistently, genetic experiments showed that maternal overexpression of Zelda caused an embryonic lethal phenotype that mimics the one observed in Jumu maternal mutant embryos. Importantly, this study found that partial knockdown of Zelda significantly suppressed the embryonic lethal phenotype induced by loss of maternal Jumu. These findings strongly suggest that Zelda is one of the critical target genes of Jumu, and that Jumu regulates proper zygotic genome activation, at least in part, through regulating Zelda. Previous studies have suggested that loss of maternal Zelda leads to either downregulation or upregulation of target genes in Drosophila embryos, suggesting that Zelda regulates gene expression in either a direct or an indirect manner, although Zelda is a positive regulator of gene expression. In the case of indirect targets, Zelda could activate a set of miRNAs, which inhibit sets of downstream target genes. In addition to zelda, the results suggest that other important genes, such as lilli and lola, were marked with 6mA and regulated by Jumu. Based on data analysis, a model is proposed in which the maternal factor Jumu specifically reads a 6mA-based epigenetic code to control other transcription factors including Zelda, thereby ensuring proper embryogenesis. Of note, Jumu is required for both germline development and embryogenesis; however, the actions of Jumu appear to differ between the two biological contexts. In the context of early embryos, loss of maternal Jumu led to upregulation of Zelda, and loss of Jumu and overexpression of maternal Zelda caused similar embryonic lethal phenotypes. 
Importantly, the transcriptomic profiles of GSRJ in maternal Jumu mutants were very similar to those in maternal Zelda overexpression samples. By contrast, in the context of the germline, knockdown or overexpression of Zelda in germ cells had no apparent effect on normal oogenesis, suggesting that Zelda is dispensable for germline development (He, 2019).

This study has identified a Fox family protein, Jumu, that functions as a 6mA reader to regulate gene expression. However, it remains unknown whether 6mA modification can also inhibit protein-DNA binding. Fox family proteins have conserved roles in controlling embryonic development and tissue homeostasis, and misregulation of some Fox family proteins has been implicated in many human diseases, including cancers. It will be interesting to investigate how Fox family proteins function in concert with the 6mA epigenetic mark to regulate development in mammals and in human diseases (He, 2019).

Bicoid-dependent activation of the target gene Hunchback requires a two-motif sequence code in a specific basal promoter

In complex genetic loci, individual enhancers interact most often with specific basal promoters. This study investigated the activation of the Bicoid target gene hunchback (hb), which contains two basal promoters (P1 and P2). Early in embryogenesis, P1 is silent, while P2 is strongly activated. In vivo deletion of P2 does not cause activation of P1, suggesting that P2 contains intrinsic sequence motifs required for activation. This study shows that a two-motif code (a Zelda binding site plus TATA) is required and sufficient for P2 activation. Zelda sites are present in the promoters of many embryonically expressed genes, but the combination of Zelda plus TATA does not seem to be a general code for early activation or Bicoid-specific activation per se. Because Zelda sites are also found in Bicoid-dependent enhancers, it is proposed that simultaneous binding to both enhancers and promoters independently synchronizes chromatin accessibility and facilitates correct enhancer-promoter interactions (Ling, 2019).

The promoter deletion and insertion experiments in this paper show that Bcd-dependent promoter usage at the hb locus is controlled by intrinsic DNA sequences that lie in the interval between -51 and +69 with respect to the P2 TSS. Two sequence motifs (a strong Zld site plus TATATAAA) are critical for the efficient activation of the P2 promoter, and inserting them together into the inactive P1 promoter is sufficient to convert it to a partially active Bcd-dependent promoter. Because deletion of the P2 promoter does not result in the activation of the normally inactive P1 promoter, these motifs appear to function by actively and specifically promoting transcription, and there is little competition between P2 and P1 for Bcd-dependent activation (Ling, 2019).

Understanding P2 regulation is complicated by the Zld site's position immediately downstream of the Bcd-dependent Prox enhancer and by both the enhancer and the promoter being contained in a contiguous 390 bp fragment. One specific issue is whether the Zld site upstream of the TATATAAA sequence should be considered part of the Prox enhancer or part of the P2 promoter. Three considerations suggest that it is an integral part of the promoter. First, the Zld site extends from position -41 to -35 with respect to the hb TSS and lies only 5 bp upstream of the TATA sequence at position -30. Second, the Prox enhancer deletion experiments suggest that the Zld site is required for strong activation by the Dist enhancer. Third, a study showed that at least 55 developmentally regulated promoters in Drosophila contain consensus Zld motifs that form a meta-peak ~50 bp upstream of the TSS. Altogether, it is proposed that Zld binding sites should be considered core promoter motifs for a subset of genes that are activated during the mid-blastula transition in Drosophila (Ling, 2019).

Because Zld may function as a pioneer factor, its binding to the P2 promoter might loosen chromatin by displacing nucleosomes. Such a mechanism has been proposed for Zld sites in enhancer elements. In particular, the hb gene contains Zld sites in both its Bcd-dependent enhancers and in the P2 promoter. It is proposed that binding Zld generates an open chromatin configuration at both types of elements, which would synchronize the binding of Bcd to the enhancers and the binding of TFIID and other basal transcription factors to the P2 promoter. Because of the prevalence of Zld sites in the enhancers and promoters of embryonically expressed genes, this is likely to be a general mechanism that facilitates correct pairings between enhancers and promoters (Ling, 2019).

The P1 promoter does not contain either a strong Zld motif or a canonical TATA sequence, and it is in a closed chromatin configuration when Bcd-dependent activation of P2 occurs. This might suggest that P1 is immune to Bcd-dependent activation, but when placed adjacent to either the Prox or the Dist-Short enhancer, or inserted into the position of P2 in the dual reporter, this promoter is efficiently activated. One explanation is that all three of these experiments position strong Zld sites in the enhancers within 100 bp of the P1 promoter. It is possible that these sites help organize a region of accessible chromatin that spreads into the adjacent P1 promoter, facilitating its activation, even in the absence of a canonical TATA box. To test this, the distance between the nearest Zld site and the P1 promoter was increased to more than 300 bp (Dist-P1), which abolished expression. Perhaps this distance places the promoter beyond the range of spreading chromatin accessibility mediated by the Zld sites. In the endogenous hb gene, the P1 promoter is positioned more than 1 kb downstream of the nearest Zld site in the Dist enhancer and is inactive at this time (Ling, 2019).

Insertion of the 5' half of the P2 core, which contains the Zld site, the TATATAAA sequence, and the InrInr, causes significant activation of P1, but this activation is only about half that seen when the 120 bp P2 core sequence is inserted intact into the P1 position. This suggests that motifs downstream of the TSS are required for generating the transcription rates mediated by P2 in its normal position. Even the activation by the 120 bp P2 core sequence is less than two-thirds of that seen when the P2 is in its original position. It is possible that the Zld site in the Prox enhancer, which is not included in either P2 insertion experiment, augments P2 expression or that sequences between the Dist enhancer and P1 contribute to a region of compacted chromatin that represses the ability to activate at this stage. Future experiments will be required to test these hypotheses (Ling, 2019).

Several published studies suggested that promoters containing specific sequence motifs might attract interactions with enhancers bound by specific proteins, and it is tempting to speculate that the two-motif code discovered in this study is a common feature of promoters activated by Bcd-bound enhancers. This does not seem to be the case. For example, of the 24 embryonic promoters that contain Zld sites and TATA boxes mentioned earlier, only one is activated by a Bcd-dependent enhancer. To test this idea more rigorously, a survey was conducted of 25 well-annotated Bcd-dependent target promoters. About half of these target genes (11, including hb) were previously classified as pre-mid-blastula transition (MBT) genes because they rank among the first zygotically activated genes. Of these, seven contain TATA in their promoter sequences, and two of these contain Zld sites within 100 bp upstream of the TSS. A third promoter contains a single Zld site at -90 but no TATA. The other 14 Bcd target genes are activated slightly later and were classified as mid-blastula transition zygotic (MBT-Zyg) or mid-blastula transition maternal (MBT-Mat) genes. Of these, only two have TATA-containing promoters, and only one of these also contains a canonical Zld site close to the TATA box. In summary, these results suggest a bias toward having TATA sequences in the promoters of the earliest expressed Bcd target genes, but they do not support the idea that Zld sites or TATA elements (or the combination of both) mediate Bcd-dependent activation per se (Ling, 2019).
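The counts in this survey can be tallied in a few lines. The sketch below only restates the numbers given in the paragraph above; the dictionary layout and the names `survey` and `share` are illustrative choices and are not part of the original analysis.

```python
from fractions import Fraction

# Promoter counts as reported in the survey paragraph (Ling, 2019).
# The grouping keys and field names are illustrative, not from the paper.
survey = {
    "pre-MBT": {"promoters": 11, "tata": 7, "zld_and_tata": 2},
    "MBT (Zyg or Mat)": {"promoters": 14, "tata": 2, "zld_and_tata": 1},
}

def share(group, key):
    """Exact fraction of promoters in a group that carry a given feature."""
    return Fraction(survey[group][key], survey[group]["promoters"])

# The two groups together should account for all surveyed promoters.
total_promoters = sum(g["promoters"] for g in survey.values())

for group in survey:
    tata = float(share(group, "tata"))
    both = float(share(group, "zld_and_tata"))
    print(f"{group}: TATA {tata:.0%}, Zld+TATA {both:.0%}")
```

Laid out this way, the TATA share is clearly higher among the pre-MBT promoters than among the later ones, while the Zld+TATA combination is rare in both groups, which is the pattern the summary sentence describes.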

Previous studies suggested that the Prox and Dist hb enhancers work together to maximize expression levels of hb. The data presented in this paper substantially extend these studies. First, the data show that both enhancers make productive interactions with P2. Deleting the Prox enhancer alone in the context of the dual reporter caused a 44% reduction in P2 expression, and deleting both the Prox and the Dist enhancers virtually abolished expression, confirming that the Dist enhancer can contribute significantly in the absence of the Prox enhancer. In vivo, deletion of the Prox enhancer causes a strong reduction in hb expression, causing lethality and the loss of two thoracic segments from the larval cuticle. Thus, the amount of hb produced by the Dist enhancer alone is insufficient to provide in vivo hb function. In contrast to previous studies, no significant effect was detected on P2 expression levels when the Dist enhancer was deleted from the reporter gene. Furthermore, deleting the Dist enhancer in vivo did not lead to a mutant phenotype, suggesting that it is dispensable for development under normal laboratory conditions. Thus, the Prox enhancer is critical for hb function, and although the Dist enhancer can interact with P2 to some degree, the level produced by this enhancer alone cannot replace that normally provided by the Prox enhancer (Ling, 2019).

Finally, the results show that the Zld site and TATATAAA each contribute quantitatively to the level of transcription driven by the P2 promoter. Furthermore, attempts to convert P1 into a Bcd-responsive promoter resulted in many output levels. Constructs carrying the Zld+TATA code were expressed at higher levels than those containing a single Zld or TATA site. In addition, a construct carrying the TATATAAA motif and the double initiator was expressed at higher levels than one carrying a simple TATA sequence and an Inr. Finally, changing the spacing between the Zld site and the TATA motif strongly affected expression levels. Altogether, these experiments suggest that basal promoter sequences can play critical roles in precisely determining levels of transcription, in addition to mediating specific enhancer-promoter interactions (Ling, 2019).

Synthetic reconstruction of the hunchback promoter specifies the role of Bicoid, Zelda and Hunchback in the dynamics of its transcription

For over 40 years, the Bicoid-hunchback (Bcd-hb) system in the fruit fly embryo has been used as a model to study how positional information in morphogen concentration gradients is robustly translated into step-like responses. A body of quantitative comparisons between theory and experiment has since questioned the initial paradigm that the sharp hb transcription pattern emerges solely from diffusive biochemical interactions between the Bicoid transcription factor and the gene promoter region. Several alternative mechanisms have been proposed, such as additional sources of positional information, positive feedback from Hb proteins or out-of-equilibrium transcription activation. By using the MS2-MCP RNA-tagging system and analysing in real time the transcription dynamics of synthetic reporters for Bicoid and/or its two partners Zelda and Hunchback, this study showed that all the early hb expression pattern features and temporal dynamics are compatible with an equilibrium model with a short decay length Bicoid activity gradient as the sole source of positional information. Meanwhile, Bicoid's partners speed up the process by different means: Zelda lowers the Bicoid concentration threshold required for transcriptional activation while Hunchback reduces burstiness and increases the polymerase firing rate (Fernandes, 2022).

Recently, synthetic approaches have been used to understand how the details of gene regulation emerge from the plethora of binding sites for transcription factors buried in genomes. In developmental systems, these approaches are starting to help us unravel the evolution of gene regulatory modules. In many cases, using high-throughput analysis of systematically mutagenized regulatory sequences, expression was measured through synthesis of easily detectable fluorescent proteins, RNA sequencing or antibody or FISH staining on fixed samples. Even though these approaches allowed screening for a high number of mutated sequences with a very high resolution (single nucleotide level), the output measurements remained global and it was hard to capture the temporal dynamics of the transcription process itself. In addition, because effects of single mutations are frequently compensated by redundant sequences, it remained often difficult from these studies to highlight the mechanistic roles of the TF they bind to. This work combined the MS2 tagging system, which allows for a detailed measurement of the transcription process dynamics at high temporal resolution, with an orthogonal synthetic approach focusing on a few cis-regulatory elements with the aim of reconstructing from elementary blocks most features of hb regulation by Bcd. The number and placement of TF BS in the MS2 reporters are not identical to those found on the endogenous hb promoter and the number of combinations tested was very limited when compared to the high throughput approaches mentioned above. Nevertheless, this synthetic approach combined with quantitative analyses and modeling sheds light on the mechanistic steps of transcription dynamics (polymerase firing rate, bursting, licensing to be ON/OFF) involving each of the three TFs considered (Bcd, Hb, and Zld). 
Based on this knowledge from synthetic reporters and the known differences between them, an equilibrium model of transcription regulation was built that agrees with the data from the hb-P2 reporter expression (Fernandes, 2022).

Expression from the Bcd-only synthetic reporters indicates that increasing the number of Bcd BS from 6 to 9 shifts the transcription pattern boundary position toward the posterior region. This is expected, as an array with more BS will be occupied faster with the required number of Bcd molecules. Increasing the number of Bcd BS from 6 to 9 also strongly increases the steepness of the boundary, indicating that cooperativity of binding, or more explicitly a longer time to unbind as supported by the model fitting, is likely to be at work in this system. In contrast, adding three more BS to the 9 Bcd BS has very limited impact, indicating either that Bcd molecules bound to the more distal BS may be too far from the TSS to efficiently activate transcription or that the system is saturated with a binding site array occupied by 9 Bcd molecules. In the anterior, where Bcd is in excess, the fraction of time during which the loci are active at steady state also increases when 3 Bcd BS are added (from B6 to B9). Assuming a model of transcription activation by Bcd proteins bound to target sites, the activation rate increases by a much greater factor (~4.5 times) than the number of BS (1.5-2 times), suggesting a synergistic effect in transcription activation by Bcd (Fernandes, 2022).

The burstiness of the Bcd-only reporters in regions with saturating amounts of Bcd led to the construction of a two-step model. The first step of this model accounts for the binding/unbinding of Bcd molecules to the BS arrays. It is directly related to the positioning and the steepness of the expression boundary and thus to the measurement of positional information. The second step of this model accounts for the dialog between the bound Bcd molecules and the transcription machinery. It is directly related to the fluctuation of the MS2 signals, including the number of firing RNAP at a given time (intensity of the signal) and bursting (frequency and length of the signal). Interestingly, while the first step of the process is achieved with extreme precision (10% EL), the second step reflects the stochastic nature of transcription and is much noisier. This model therefore also helps to understand and reconcile this apparent contradiction in the Bcd system (Fernandes, 2022).
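The link between the first step and the shape of the expression boundary can be illustrated with a generic Hill-type readout of an exponential gradient. This is a minimal sketch, not the fitted model of Fernandes (2022): the Hill form, the function names and the parameter values (a boundary at 45% EL, a Hill coefficient of 5) are assumptions chosen only for illustration.

```python
import math

def bcd_concentration(x, decay_length):
    """Relative Bcd concentration at position x (% EL) for an exponential gradient."""
    return math.exp(-x / decay_length)

def fraction_active(x, decay_length, hill, boundary):
    """Hill-type readout: fraction of time a locus is active at position x.

    `boundary` is the position (% EL) at which activity is half-maximal.
    All parameter values used below are hypothetical.
    """
    c = bcd_concentration(x, decay_length)
    k = bcd_concentration(boundary, decay_length)  # concentration at the boundary
    return c**hill / (c**hill + k**hill)

# Activity profile across the boundary for one illustrative parameter set.
for x in (35, 45, 55):
    p = fraction_active(x, decay_length=15.0, hill=5.0, boundary=45.0)
    print(f"x = {x}% EL: fraction active = {p:.3f}")
```

In this simplified picture, a larger Hill coefficient or a shorter decay length both give a sharper transition around the boundary. The second, noisier step of the model (bursting and RNAP firing) is not represented here.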

As predicted by an original theoretical model, 9 Bcd BS in a synthetic reporter appear sufficient to reproduce experimentally almost all of the spatial features of the early hb expression pattern, i.e. measurements of positional information. This is unexpected, as the hb-P2 promoter is supposed to carry only 6 Bcd BS, and it leaves open the possibility that the number of Bcd BS in the hb promoter might be higher. Alternatively, it is also possible that, even though it contains 9 Bcd BS, the B9 reporter can only be bound simultaneously by fewer than 9 Bcd molecules. This possibility must be considered if, for instance, the binding of a Bcd molecule to one site prevents the binding of another Bcd molecule to a nearby site (direct competition or steric hindrance). Even though this possibility cannot be excluded, it is thought to be unlikely for several reasons: (i) some of the Bcd binding sites in the hb-P2 promoter are also very close to each other, and the synthetic constructs were designed by multimerizing a series of 3 Bcd binding sites with a spacing similar to that found for the closest sites in the hb-P2 promoter; (ii) the binding of Bcd or other homeodomain-containing proteins to two BS is generally increased by cooperativity when the sites are close to each other (as close as two base pairs for the paired homeodomain) compared to binding without cooperativity when they are separated by five base pairs or more (Fernandes, 2022).

Importantly, even though it is not really known if the B9 and the hb-P2 promoter contain the same number of effective Bcd BS, the B9 reporter which solely contains Bcd BS recapitulates most spatial features of the hb-P2 reporter, clearly arguing that Bcd on its own brings most of the spatial (positional) information to the process. Interestingly, the B9 reporter is however much slower (2-fold) to reach the final boundary position than the hb-P2 reporter. This suggested that other maternally provided TFs binding to the hb-P2 promoter contribute to fast dynamics of the hb pattern establishment. Among these TFs, this study focused on two known maternal partners of Bcd: Hb which acts in synergy with Bcd and Zld, the major regulator of early zygotic transcription in fruit fly. Interestingly, adding Zld or Hb sites next to the Bcd BS array reduces the time for the pattern to reach steady state and modifies the promoter activity in different ways: binding of Zld facilitates the recruitment of Bcd at low concentration, making transcription more sensitive to Bcd and initiate faster while the binding of Hb affects strongly both the activation/deactivation kinetics of transcription (burstiness) and the RNAP firing rate. Thus, these two partners of Bcd contribute differently to Bcd-dependent transcription. Consistent with an activation process in two steps as proposed in this model, Zld will contribute to the first step favoring the precise and rapid measurements of positional information by Bcd without bringing itself positional information. Meanwhile, Hb will mostly act through the second step by increasing the level of transcription through a reduction of its burstiness and an increase in the polymerase firing rate. Interestingly, both Hb and Zld binding to the Bcd-dependent promoter allow speeding-up the establishment of the boundary, a property that Bcd alone is not able to achieve. 
Of note, the hb-P2 and Z2B6 reporters contain the same number of BS for Bcd and Zld, but they have very different boundary positions and mean onset times of transcription T0 following mitosis when Bcd is limiting. This is likely because the two Zld BS in the hb-P2 promoter are not fully functional: one of the Zld BS is a weak BS, while the other has the sequence of a strong BS but is located too close to the TATA box (5 bp) to provide full activity (Fernandes, 2022).

Zld functions as a pioneer factor by potentiating chromatin accessibility, transcription factor binding and gene expression of the targeted promoter. Zld has recently been shown to bind nucleosomal DNA and proposed to help establish or maintain cis-regulatory sequences in an open chromatin state ready for transcriptional activation. In addition, Zld is distributed in nuclear hubs or microenvironments of high concentration. Interestingly, Bcd has been shown to be also distributed in hubs even at low concentration in the posterior of the embryo. These Bcd hubs are Zld-dependent and harbor a high fraction of slow moving Bcd molecules, presumably bound to DNA. Both properties of Zld, binding to nucleosomal DNA and/or the capacity to form hubs with increased local concentration of TFs can contribute to reducing the time required for the promoter to be occupied by enough Bcd molecules for activation. In contrast to Zld, knowledge on the mechanistic properties of the Hb protein in the transcription activation process is much more elusive. Hb synergizes with Bcd in the early embryo and the two TF contribute differently to the response with Bcd providing positional and Hb temporal information to the system. Hb also contributes to the determination of neuronal identity later during development. Interestingly, Hb is one of the first expressed members of a cascade of temporal TFs essential to determine the temporal identity of embryonic neurons in neural stem cells (neuroblasts) of the ventral nerve cord. In this system, the diversity of neuronal cell-types is determined by the combined activity of TFs specifying the temporal identity of the neuron and spatial patterning TFs, often homeotic proteins, specifying its segmental identity. How spatial and temporal transcription factors mechanistically cooperate for the expression of their target genes in this system is not known. 
The current work indicates that Hb is not able to activate transcription on its own but that it strongly increases RNAP firing probability and burst length of a locus licensed to be ON. Whether this capacity will be used in the ventral nerve cord and shared with other temporal TFs would be interesting to investigate (Fernandes, 2022).

The Bcd-only synthetic reporters also provided an opportunity to scrutinize the effect of Bcd concentration on the positioning of the expression domain boundaries. This question has been investigated with endogenous hb in the past, always giving a smaller shift than expected given the decay length of 20% EL for the Bcd protein gradient and arguing against the possibility that positional information in this system could depend solely on Bcd concentration. When comparing the transcription patterns of the B9 reporter in Bcd-2X flies and Bcd-1X flies, a shift of ~10.5 ± 1% EL in the boundary position was detected. This shift revealed a gradient of Bcd activity with an exponential decay length of ~15 ± 1.4% EL (~75 μm), significantly smaller than the value observed directly (20% EL, ~100 μm) with immuno-staining for the Bcd protein gradient but closer to the value of 16.4% EL obtained with immuno-staining for Bcd of the Bcd-GFP gradient. Given the discrepancies between previous studies concerning the measurements of the Bcd protein gradient decay length, this work calls for a better quantification to determine how close the decay length of the Bcd protein gradient is to the decay length of the Bcd activity gradient uncovered here. This work opens the possibility that the effective decay length of 15% EL corresponds to a population of 'active' or 'effective' Bcd distributed in a steeper gradient than the Bcd protein gradient observed by immunodetection, which would include all Bcd molecules. Bcd molecules have been shown to be heterogeneous in intranuclear motility, age and spatial distribution, but to date it is not known which population of Bcd can access the target gene and activate transcription. The existence of two (or more) Bicoid populations with different mobilities obviously raises the question of the underlying gradient for each of them. Also, the dense Bcd hubs persist even in the posterior region where the Bcd concentration is low. 
As the total Bcd concentration decreases along the AP axis, these hubs accumulate Bcd with increasing proportion in the posterior, resulting in a steeper gradient of free-diffusing Bcd molecules outside the hubs. In addition, the gradient of newly translated Bcd was also found to be steeper than the global gradient. Finally and most importantly, reducing the Bcd concentration in the embryo by half induced a shift in the position of the hb-P2 reporter boundary similar to that of the Bcd-only reporters. This further argues that this gradient of Bcd activity is the principal and direct source of positional information for hb expression (Fernandes, 2022).
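The decay length quoted above follows from the boundary shift if one assumes a purely exponential activity gradient read out at a fixed concentration threshold: halving the dose then moves the boundary by λ·ln 2, so λ = shift / ln 2. The sketch below applies this relation to the reported shift. The function name, the linear error propagation and the 500 μm embryo length (implied by 20% EL being about 100 μm) are assumptions made for illustration.

```python
import math

def decay_length_from_shift(shift, shift_err=0.0, fold_change=2.0):
    """Decay length of an exponential gradient from a boundary shift.

    Assumes a fixed concentration threshold at the boundary, so that a
    `fold_change` in dose shifts the boundary by lambda * ln(fold_change).
    The uncertainty is propagated linearly. Units follow the input (% EL).
    """
    ln_fold = math.log(fold_change)
    return shift / ln_fold, shift_err / ln_fold

# Reported shift between Bcd-2X and Bcd-1X embryos: ~10.5 +/- 1% EL.
lam, lam_err = decay_length_from_shift(10.5, 1.0)

# Assumed embryo length, inferred from "20% EL, ~100 um" in the text.
EMBRYO_LENGTH_UM = 500.0
lam_um = lam / 100.0 * EMBRYO_LENGTH_UM

print(f"decay length = {lam:.1f} +/- {lam_err:.1f} % EL (about {lam_um:.0f} um)")
```

Under these assumptions the result is close to the ~15 ± 1.4% EL stated in the text.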

The effective Bcd gradient found here rekindles the debate on how a steep hb pattern can be formed in the early nuclear cycles. With the previous value of λ=20% EL for the decay length of the Bcd protein gradient, the Hill coefficient inferred from the fraction of loci's active time at steady state PSpot is ~6.9, beyond the theoretical limit of the equilibrium model of Bcd interacting with six target BS of the hb promoter. This led to hypotheses of energy expenditure in Bcd binding and unbinding to the sites, out-of-equilibrium transcription activation, hb promoters containing more than 6 Bcd sites or additional sources of positional information to overcome this limit. The effective decay length λeff ~15% EL, found here with a Bcd-only reporter but also hb-P2, corresponds to a Hill coefficient of ~5.2, just below the physical limit of an equilibrium model of concentration sensing with 6 Bcd BS alone. Of note, a smaller decay length also means that the effective Bcd concentration decreases faster along the AP axis. In the Berg & Purcell limit (Biophys. J., 1977), the time needed to achieve a measurement error of 10% at the hb-P2 expression boundary with λ=15% EL is ~2.1 times longer than with λ=20% EL. This points again to the trade-off between reproducibility and steepness of the hb expression pattern and reinforces the importance of Hb and Zelda in speeding up the process (Fernandes, 2022).
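The two Hill coefficients quoted above can be related by a simple rescaling. The sketch below assumes, for illustration only, that the expression pattern follows a Hill function of an exponential gradient; the steepness of the pattern along the AP axis then depends only on the ratio H/λ, so the same observed pattern implies a Hill coefficient proportional to the assumed decay length. The function names are hypothetical and this is not claimed to be the inference procedure used by Fernandes (2022).

```python
import math

def rescale_hill(hill_old, lambda_old, lambda_new):
    """Hill coefficient that reproduces the same spatial steepness
    when the assumed decay length changes (H / lambda held constant)."""
    return hill_old * lambda_new / lambda_old

def boundary_slope(hill, decay_length_el):
    """Slope of the pattern at its mid-point (per % EL) for a Hill
    function of an exponential gradient: d f / d x = -H / (4 * lambda)."""
    return -hill / (4.0 * decay_length_el)

# Values quoted in the text: H ~ 6.9 inferred with lambda = 20 % EL.
hill_eff = rescale_hill(6.9, 20.0, 15.0)      # about 5.2 with lambda_eff = 15 % EL

# Both parameter pairs describe the same boundary steepness.
same_pattern = math.isclose(boundary_slope(6.9, 20.0),
                            boundary_slope(hill_eff, 15.0))

print(f"rescaled Hill coefficient: {hill_eff:.2f}")
print(f"below the 6-site equilibrium limit: {hill_eff < 6}")
```

With these inputs the rescaled coefficient is about 5.2, below the limit of 6 set by six Bcd binding sites, whereas 6.9 exceeds it.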

Cell-type-specific chromatin occupancy by the pioneer factor Zelda drives key developmental transitions in Drosophila

During Drosophila embryogenesis, the essential pioneer factor Zelda defines hundreds of cis-regulatory regions and in doing so reprograms the zygotic transcriptome. While Zelda is essential later in development, it is unclear how the ability of Zelda to define cis-regulatory regions is shaped by cell-type-specific chromatin architecture. Asymmetric division of neural stem cells (neuroblasts) in the fly brain provides an excellent paradigm for investigating the cell-type-specific functions of this pioneer factor. Zelda was shown to function synergistically with Notch to maintain neuroblasts in an undifferentiated state. Zelda misexpression reprograms progenitor cells to neuroblasts, but this capacity is limited by transcriptional repressors critical for progenitor commitment. Zelda genomic occupancy in neuroblasts is reorganized as compared to the embryo, and this reorganization is correlated with differences in chromatin accessibility and cofactor availability. It is proposed that Zelda regulates essential transitions in the neuroblasts and embryo through a shared gene-regulatory network driven by cell-type-specific enhancers (Larson, 2021).

The results demonstrate that Zld, an essential transcriptional activator of the zygotic genome, promotes the undifferentiated state in the neural stem-cell lineage of the larva and can revert partially differentiated cells to a stem cell. Other pioneer factors are known to have similar functions when misexpressed, and this can lead to disease. For example, expression of DUX4, an activator of the zygotic genome in humans, in muscle cells leads to Facioscapulohumeral muscular dystrophy (FSHD), and OCT4 and Nanog are overexpressed in undifferentiated tumors and their expression is associated with poor clinical outcomes. Despite the ability of Zld to promote the undifferentiated stem-cell fate, this capacity is limited as cells differentiate to INPs. This study showed that the ability of Zld to drive gene expression is limited by the repressors Erm and Ham. Because of the ability of pioneer factor misexpression to cause disease, understanding these cell-type-specific constraints on pioneer factors has important implications for understanding of development and disease (Larson, 2021).

Zld expression promotes the reversion of partially differentiated immature INPs to a stem-cell fate, resulting in supernumerary type II neuroblasts. Furthermore, failure to down-regulate zld in the newly generated INPs results in supernumerary type II neuroblasts. Thus, Zld levels must be precisely controlled to allow differentiation following asymmetric division of the type II neuroblasts. Zld promotes the undifferentiated state, at least in part, through the ability to drive expression of Dpn, a key transcription factor for driving type II neuroblast self-renewal. dpn is a target of the Notch pathway in type II neuroblasts and constitutively activated Notch signaling drives dpn expression. However, loss of Notch signaling does not completely abrogate expression of known target genes, including dpn, suggesting that additional activators can drive expression in the absence of Notch. Indeed, this study shows that Zld functions as such a factor in driving dpn expression, and loss of a single copy of dpn can suppress the ability of Zld to promote the reversion of immature INPs to type II neuroblasts. It is proposed that this redundancy with Notch is not limited to regulating dpn expression and that Zld and Notch may function together to regulate a number of genes required for type II neuroblast maintenance. Supporting this, Zld is bound to 49% of the identified direct Notch-target genes in neuroblasts. Although Zld is not required for type II neuroblast maintenance, loss of Zld can enhance knockdown of the Notch pathway, demonstrating a partially redundant requirement for these two pathways in maintaining type II neuroblast fate. Based on these data, it is proposed that Zld and Notch function in parallel to drive gene expression, and this redundancy robustly maintains the type II neuroblast pool (see Zld promotes the undifferentiated state through neuroblast-specific enhancers that become progressively silenced during differentiation) (Larson, 2021).

Zld functions in parallel to Notch to maintain type II neuroblasts in an undifferentiated state through activation of tll and dpn. Downregulation of Notch signaling in newly born INPs allows Erm and Ham to become sequentially activated during INP commitment. In the INPs, Erm- and Ham-mediated silencing of tll prevents reactivation of Notch signaling from inducing tll expression and driving reversion to neuroblasts. When misexpressed in immature INPs, Zld activates dpn expression and promotes reversion to a neuroblast. Changes to the chromatin structure mediated by Erm and Ham have been documented that limit the ability of Zld to drive tll expression and therefore reprogram mature INPs. Cell-type-specific binding by Zld correlates with chromatin accessibility and identifies tissue-specific enhancers. Type II neuroblast-specific Zld-binding sites are not enriched for the canonical Zld-binding motif and are instead enriched for sequences corresponding to additional co-factors that may stabilize Zld binding or promote chromatin accessibility at these loci (Larson, 2021).

Zld binding in the early embryo is distinctive as it is driven primarily by DNA sequence with a majority of the canonical-binding motifs occupied. This is in contrast to most other transcription factors, whose binding is influenced widely by chromatin accessibility and therefore bind only a small fraction of their canonical motifs. This study reports the genome-wide occupancy of Zld in a tissue apart from the early embryo and begins to identify important functions for zygotically expressed Zld in a stem-cell population. While thousands of loci have been identified that are occupied by Zld both in the embryo and in the larval type II neuroblasts, thousands more were unique to each cell type. This is in contrast to what has been shown for another pioneer-transcription factor, Grainy head (Grh), which has similar genomic occupancy in the embryo and larval imaginal discs. Thus, unlike for Grh, Zld binding is cell-type-specific and likely governed by changes to the chromatin structure along with the expression of cell-type-specific transcription factors (Larson, 2021).

Experiments, largely in cell culture, have demonstrated that most pioneer-transcription factors show cell-type-specific chromatin occupancy despite their ability to engage nucleosomal DNA. Both chromatin state and co-factors influence the binding of pioneer factors, like Oct. While most pioneer factors have been studied through misexpression in culture, the current data show that binding of an endogenously expressed pioneer factor within a developing organism is also cell-type specific. By analyzing genome occupancy and chromatin accessibility in two different cell types, the embryo and type II neuroblast lineage, it was demonstrated that binding is highly correlated with accessibility in both cell types. While it is not possible to determine whether chromatin accessibility regulates Zld binding or Zld binding drives accessibility, it is proposed that at sites with the Zld motif, Zld may be responsible for promoting accessibility, whereas at the majority of sites bound by Zld in the larval type II neuroblasts, accessibility influences Zld occupancy. In the early embryo, Zld binds when the chromatin is naive with relatively few chromatin marks and is rapidly replicated. This Zld binding is driven largely by DNA sequence and is required for chromatin accessibility at a subset of sites. Thus, in the early embryo, Zld binding can influence accessibility. However, Zld occupancy is reorganized in the type II neuroblasts such that only a fraction of the canonical-binding motifs is occupied, and those motifs that are not bound by Zld are not accessible. The small percentage of Zld-bound sites that contain the canonical Zld-binding motif in the neuroblasts rapidly lose accessibility following induced differentiation, and sites that lack a canonical-binding motif lose accessibility more gradually. This suggests that at sites with the Zld motif, Zld may be responsible for promoting accessibility. In contrast to this small subset of sites, there is a significant enrichment of type II neuroblast-specific binding sites at promoters, which are known to be generally accessible in a broad range of cell types. This suggests that Zld occupancy in the type II neuroblasts is likely shaped by chromatin accessibility. Together, these data support a model whereby in the early embryo Zld can bind broadly to the naive genome while in the neuroblasts Zld binding is limited by chromatin state. Future studies will enable the identification of what limits Zld binding and will allow for the definition of chromatin barriers to reprogramming within the context of a developing organism (Larson, 2021).

Along with chromatin structure, co-factors regulate mammalian pioneer-factor binding in culture, including binding by Oct4, Sox2 and FOXA2. In addition to chromatin accessibility, the data similarly support a role for specific transcription factors in regulating Zld binding in type II neuroblasts. Type II neuroblast-specific Zld-bound loci are not enriched for the canonical Zld motif, suggesting additional factors facilitate Zld binding to these regions. Supporting a functional role for this recruitment, expression of Zld with mutations in the zinc-finger DNA-binding domain, which abrogate sequence-specific binding, can still drive supernumerary neuroblasts. It has been previously shown that while this mutant protein lacks sequence-specific binding properties, the polypeptide retains an affinity for nucleosomal DNA. This nonspecific affinity may be stabilized by additional factors expressed in the neuroblasts. One such factor may be the GA-dinucleotide binding factor CLAMP that has recently been shown to promote Zld binding at promoters and whose binding motif is enriched at neuroblast-specific, Zld-binding sites (Larson, 2021).

The capacity of Zld to drive the undifferentiated state is limited along the type II neuroblast lineage. While expression of Zld in immature INPs results in supernumerary neuroblasts, Zld expression in mature INPs does not. Similarly, misexpression of the Zld-target gene dpn in immature INPs can drive their reversion to neuroblasts, and the data suggest that Zld-mediated reversion is caused, at least in part, by driving expression of dpn. By contrast, the endogenous expression of dpn in mature INPs does not cause their reversion to neuroblasts because the self-renewal program is decommissioned during INP maturation. This decommissioning is mediated by successive transcriptional repressor activity. It has been recently demonstrated that Erm and Ham function sequentially to repress expression of genes that promote neuroblast fate. The data suggest that changes to the chromatin structure mediated by these transcriptional repressors limit the ability of Zld to drive gene expression and therefore reprogram mature INPs (Larson, 2021).

An essential target of Erm and Ham repression is tll. In contrast to other stem-cell regulators like Notch and Dpn, Tll is expressed only in type II neuroblasts and not in the transit-amplifying INPs. Furthermore, expression of tll in mature INPs can robustly drive supernumerary neuroblasts. In the embryo, tll is a Zld-target gene, and ChIP-seq data identify Zld-binding sites in the type II neuroblasts. While Zld occupies the promoter of tll in both type II neuroblasts and the embryo, this study identified unique binding sites for Zld in upstream regions in both cell types and demonstrated that these likely denote cell-type-specific enhancers. A neuroblast-specific, Zld-bound enhancer was identified that drives expression specifically in the neuroblasts, and a model is supported whereby Zld activates expression from this enhancer in the type II neuroblasts. Erm and Ham mediate chromatin changes, likely through histone deacetylation, that progressively limit chromatin accessibility during INP maturation. This decrease in accessibility inhibits the ability of ectopically expressed Zld to activate expression from this enhancer, keeping tll repressed in the INPs. Gene expression profiling identified Six4 as a gene that, like tll, is expressed only in neuroblasts and not in INPs. This study identified a neuroblast-specific Zld-bound region that progressively loses accessibility during INP maturation. Thus, Erm and Ham likely silence multiple Zld-bound enhancers to allow for the transition from a self-renewing neuroblast to a transit-amplifying INP. It is proposed that while the Zld-bound neuroblast-specific enhancer is accessible in neuroblasts, following asymmetric division, changes to the chromatin state mediated by Erm and Ham and downregulation of Zld expression robustly decommission this enhancer, allowing for differentiation (Larson, 2021).

The data support a model in which Erm- and Ham-mediated changes to the chromatin inhibit binding by the pioneer factor Zld and, in so doing, limit the ability of Zld expression to reprogram cell fate. These studies in both the early embryo and in type II neuroblasts provide a powerful platform for identifying the barriers to pioneer-factor-mediated reprogramming within the context of development and support a role for both chromatin organization and cell-type-specific co-factors in determining Zld occupancy. Future studies will reveal these specific barriers and will help to identify fundamental processes that may limit reprogramming both in culture and in disease states (Larson, 2021).

Stepwise modifications of transcriptional hubs link pioneer factor activity to a burst of transcription

Binding of transcription factors (TFs) promotes the subsequent recruitment of coactivators and preinitiation complexes to initiate eukaryotic transcription, but this time course is usually not visualized. It is commonly assumed that recruited factors eventually co-reside in a higher-order structure, allowing distantly bound TFs to activate transcription at core promoters. This study used live imaging of endogenously tagged proteins, including the pioneer TF Zelda, the coactivator dBrd4 (Female sterile (1) homeotic), and RNA polymerase II (RNAPII), to define a cascade of events upstream of transcriptional initiation in early Drosophila embryos. These factors are sequentially and transiently recruited to discrete clusters during activation of non-histone genes. Zelda and the acetyltransferase dCBP (Nejire) nucleate dBrd4 clusters, which then trigger pre-transcriptional clustering of RNAPII. Subsequent transcriptional elongation disperses clusters of dBrd4 and RNAPII. These results suggest that activation of transcription by eukaryotic TFs involves a succession of distinct biomolecular condensates that culminates in a self-limiting burst of transcription (Cho, 2023).

In eukaryotes, the recruitment of RNA polymerase II (RNAPII) to transcription start sites on DNA depends on the assembly of the preinitiation complex (PIC) and is regulated by hundreds of trans-acting factors. In particular, transcription factors (TFs) recruit nucleosome remodelers, histone modifiers, and Mediator to promote the formation of PIC. How these numerous upstream inputs are integrated to give the extraordinary specificity and intricacy of transcriptional regulation remains incompletely understood. A common view suggested by biochemical studies is that these factors are progressively assembled into a single final complex through cooperative interactions. However, other sophisticated processes initiating DNA replication and promoting splicing of mRNAs are governed by a series of distinct and ephemeral complexes in which each complex promotes the next in energy-driven steps. This study examines the possibility that initiation of transcription similarly involves directional transformations of intermediate complexes that would provide additional opportunity for specificity and regulation (Cho, 2023).

Visualizing the composition of transcriptional machinery over time might detect intermediate complexes that integrate the multitude of regulatory inputs of transcriptional control. In recent years, advances in confocal and super-resolution imaging led to the discovery that a wide variety of transcriptional regulators are recruited to form clusters at active genes. These clusters are thought to function as 'transcriptional hubs' by locally enriching transcriptional machinery and enhancing their binding to target DNA sites. Transcriptional hubs are a type of membraneless compartment, whose formation typically involves multivalent interactions between intrinsically disordered regions (IDRs). Accordingly, IDRs are commonly found in the activation domains of TFs as well as the C-terminal domain (CTD) of Rpb1 in RNAPII. Similar to the idea that a single final complex is assembled on the DNA to initiate transcription, it has been proposed that the heterotypic interactions between IDRs can give rise to a compartment that simultaneously enriches TFs, coactivators, Mediator, and RNAPII at promoters. Nonetheless, how transcriptional hubs are regulated and whether they undergo compositional changes are still unclear (Cho, 2023).

Studying the dynamics of transcriptional hubs in living cells is complicated by the discontinuous and stochastic nature of eukaryotic transcription, a phenomenon also known as bursting. The Drosophila embryo provides a powerful context to study the timing of events upstream of transcriptional initiation. The early wave of transcription in Drosophila embryos is coupled to the rapid nuclear division cycles such that a few hundred genes initiate a burst of transcription about 3 min after each mitosis. The synchrony of early nuclear cycles and real-time localization of tagged proteins allow one to track activation events prior to the onset of transcription, and tools to knock down function are available to assess the contribution of events to gene activation. In a recent study, live imaging of endogenously tagged RNAPII revealed the abrupt appearance of RNAPII clusters 2–3 min after mitosis. Brief metabolic labelling revealed foci of nascent transcripts throughout the nuclei in fixed embryos; these foci broadly colocalized with RNAPII clusters, indicating that early-forming RNAPII clusters mark sites of active transcription. Importantly, as nascent transcript levels increased, RNAPII clusters declined and eventually dispersed. These findings are consistent with numerous other observations and support a model in which a large excess of RNAPII is recruited prior to initiation, which is then inefficiently converted to elongating RNAPII. What produces this pre-transcriptional RNAPII clustering and how it is coordinated with a burst of transcription are not yet fully understood. In this study, events are followed during the ~2.5 min between mitotic exit and the formation of RNAPII clusters, as is the fate of these clusters as transcription ensues at about 3 min after mitosis (Cho, 2023).

Zelda (Zld) is a pioneer TF that widely promotes the early wave of zygotic gene expression. Maternally supplied Zld binds to thousands of enhancers and promoters, and its binding sites exhibit increased chromatin accessibility and histone acetylation. Depletion of maternally expressed Zld curtails early zygotic transcription, and the embryos become highly defective at the mid-blastula transition (MBT). The transactivation domain of Zld has been mapped to an intrinsically disordered region. Moreover, fluorescently tagged Zld forms highly dynamic clusters in the nucleus, and previous studies suggest that Zld clusters increase the local concentration of other TFs and facilitate their binding to target DNA. Knockdown of Zld reduces RNAPII 'speckles' in fixed embryos. While these previous studies support a model in which Zld promotes the recruitment of additional components to form transcriptional hubs and facilitates the onset of zygotic transcription, the exact mechanism has not been determined (Cho, 2023).

This study combined genetic perturbation and real-time imaging to delineate a pathway that nucleates and serially transforms transcriptional hubs to trigger initiation of transcription in early Drosophila embryos. Zld is shown to act through transcription coactivators, including the lysine acetyltransferase dCBP and the BET protein dBrd4, to initiate RNAPII clustering at non-histone genes. Importantly, real-time imaging reveals only limited colocalization of these factors at transcriptional hubs, suggesting dynamic and directional changes in the composition such that upstream activators do not stably persist in the hubs with downstream effectors and RNAPII. A model is proposed in which Zld forms numerous largely unstable clusters, some of which trigger a dCBP-dependent step to build more stable dBrd4 clusters; a subset of these dBrd4 clusters then promotes RNAPII clustering near active promoters, and this pool of RNAPII fuels a burst of transcription. Inhibition of transcriptional elongation stabilizes some Zld and dBrd4 clusters, indicating that transcription directly or indirectly promotes their dispersal. Finally, while early inhibition of transcription inhibits RNAPII clustering, abrupt inhibition of transcript elongation after hub formation stabilizes RNAPII clusters. These findings indicate that transcription destabilizes hubs, a feedback that could lead to cycles of RNAPII accumulation and depletion, thereby contributing to the bursting feature of transcription. It is suggested that the onset of transcription, like the onset of replication, involves upstream events that directionally modify the machinery to precisely control the process (Cho, 2023).

It has long been recognized that the compartmentalization of transcriptional machinery is a fundamental aspect of eukaryotic gene control. Early cytological studies revealed discrete clusters of RNAPII and nascent transcripts, which were speculated to be stable "transcription factories". Subsequent studies show that rather than genes being recruited to stable factories, numerous factors form hubs or liquid-like condensates transiently at active genes. This leaves open the questions of what governs the dynamics of transcriptional hubs/condensates and how their emergence and dispersal are linked to transcript synthesis. This study used real-time approaches to dissect upstream events in transcriptional initiation whose timing is constrained and synchronized in early Drosophila embryos by coupling to the rapid cell cycles. A cascade of dependencies is documented, paralleled by a temporal cascade of cluster formation. The findings indicate that transcriptional hubs directionally pass through a series of intermediate states with different composition, rather than simply enriching all the factors involved in initiating transcription. Specifically, the pioneer TF Zelda acts through coactivators dCBP and dBrd4 to indirectly concentrate pools of RNAPII near promoters. Inhibition of transcription by α-amanitin stabilizes dBrd4 and RNAPII clusters, indicating that transcription directly or indirectly promotes dispersal of transcriptional hub components, resulting in negative feedback. It is suggested that the progressive maturation of transcriptional hubs coupled with a negative feedback loop stimulates a rapid but self-limiting burst of transcription in the early rapid embryonic cycles. These findings have striking parallels to the proposal that non-equilibrium dynamics of transcriptional condensates make direct contributions to sequential transcriptional bursts in the longer cell cycles of more mature cells (Cho, 2023).

The dynamic nature of transcriptional hubs described in this study is distinct from the well characterized transcriptional condensates at nucleoli or histone locus bodies, which are stable compartments and incorporate multiple functionally related components. The dynamic process with its multiple transitions might serve to add precision and sophistication to transcriptional control. First, transitions between discrete steps could provide proofreading steps that test the stability of intermediate complexes to filter out stochastic noise and increase regulatory specificity. Second, additional regulators might promote or prevent passage through the different transitions, thereby allowing the transcriptional hubs to integrate multiple inputs to generate the intricate spatiotemporal expression of developmental genes. In line with these ideas, these data show that the transitions from Zld clusters to dBrd4 and then to RNAPII are each associated with a decline in the number of clusters, suggesting that the maturation of transcription hubs is selective at successive steps. It will be important to learn how this feature contributes to the extraordinary accuracy with which the graded and combinatorial inputs generate transcriptional outputs (Cho, 2023).

The molecular mechanisms that drive the sequential transformation of transcriptional hubs remain to be fully determined. During the first step, Zld and dCBP might directly interact with each other or undergo co-condensation. Alternatively, open chromatin established by Zld could facilitate binding of additional TFs that interact with dCBP. However, it should also be kept in mind that TFs might inhibit deacetylation to indirectly enhance local dCBP-dependent acetylation. In any case, it seems likely, but not yet demonstrated, that dCBP acts by increasing local acetylation to recruit the reader dBrd4. Although dBrd4 might simply bind to histone marks such as H3K27ac, the acetylation of transcriptional machinery could also be involved in recruiting dBrd4. Upon crossing a concentration threshold, dBrd4 clustering might be promoted by multivalent interactions mediated by its own IDR. While the initial clustering of RNAPII appears to spatially coincide with dBrd4 clusters, the subsequent behavior is not consistent with stable partnership, as dBrd4 is lost from temporarily persisting RNAPII clusters. Imaging the period of loss of dBrd4 revealed accompanying features that varied between clusters: abrupt physical rearrangement of foci, simple gradual loss of dBrd4 from complexes, and apparent de-mixing of previously colocalized signals to form largely separate dBrd4 and RNAPII clusters. These behaviors may represent different manifestations of progressive modifications of the biomolecular condensate that reduce the interactions that previously stabilized co-residency of dBrd4 and RNAPII. Finally, both positive and negative effects of transcriptional elongation on the dynamics of transcriptional hubs were observed. The initial requirement of transcription for RNAPII clustering might involve the upstream roles of enhancer RNAs in nucleating RNAPII clusters. In contrast, a later sustained period of transcription of the gene body appears to mediate negative feedback to disperse dBrd4 and RNAPII clusters. This could be explained by a suggested disruption of multivalent interaction between IDRs by the negative charge of nascent RNA, but numerous other less direct mechanisms might be responsible. While the results reveal the timing and coordination of upstream events required for transcription, much more work is needed to provide a mechanistic understanding of the observed processes (Cho, 2023).

Regardless of the molecular details, it is expected that similar regulatory principles are employed by evolutionarily diverse transcription factors to mediate transcriptional activation. For example, in the zebrafish embryo, the pioneer factors Nanog, Pou5f3, and Sox19b similarly recruit CBP/p300 and Brd4 to establish transcriptional competence during early zygotic gene expression. Activation by estrogen receptor α (ERα) also involves histone acetylation and subsequent recruitment of Brd4. Notably, elegant work has shown that dozens of factors are recruited to the ERα target promoter in a cyclical and sequential fashion. It is envisioned that many of these factors are dynamically recruited to the hubs, and that the enzymatic reactions they carry out contribute to the speed and irreversibility of the transformation of transcriptional hubs. Lastly, it is suggested that the formation of transcriptional hubs in early embryos ensures the rapid initiation of a transcriptional burst within a short interphase window; in other biological contexts, the hubs might serve additional functions such as bridging enhancers and promoters or coordinating expression of multiple loci. The Drosophila embryo will provide a powerful system to dissect the relationship between transcriptional hubs, chromatin interactions, and transcription dynamics (Cho, 2023).

Zelda potentiates morphogen activity by increasing chromatin accessibility

Zygotic genome activation (ZGA) is a major genome programming event whereby the cells of the embryo begin to adopt specified fates. Experiments in Drosophila and zebrafish have revealed that ZGA depends on transcription factors that provide large-scale control of gene expression by direct and specific binding to gene regulatory sequences. Zelda (Zld) plays such a role in the Drosophila embryo, where it has been shown to control the action of patterning signals; however, the mechanisms underlying this effect remain largely unclear. A recent model proposed that Zld binding sites act as quantitative regulators of the spatiotemporal expression of genes activated by Dorsal (Dl), the morphogen that patterns the dorsoventral axis. This study tested this model experimentally, using enhancers of brinker (brk) and short gastrulation (sog), both of which are directly activated by Dl, but at different concentration thresholds. In agreement with the model, it was shown that there is a clear positive correlation between the number of Zld binding sites and the spatial domain of enhancer activity. Likewise, the timing of expression could be advanced or delayed. Evidence is presented that Zld facilitates binding of Dl to regulatory DNA, and that this is associated with increased chromatin accessibility. Importantly, the change in chromatin accessibility is strongly correlated with the change in Zld binding, but not Dl. It is proposed that the ability of genome activators to facilitate readout of transcriptional input is key to widespread transcriptional induction during ZGA (Foo, 2014).

In blastoderm embryos, brinker (brk) is activated in an eight- to ten-cell-wide domain that develops into the ventral neurogenic ectoderm (NE), whereas short gastrulation (sog) is expressed in a broader band of 16-18 cells encompassing the entire NE. Both genes have the same ventral expression boundary due to repression by Snail (Sna) in the presumptive mesoderm. The dorsal borders of their domains lie in regions of the Dorsal (Dl) gradient where amounts are low and change little, raising the question of how their enhancers can interpret small differences in Dl concentrations (Foo, 2014).

sog and brk each have two reported cis-regulatory modules (enhancers) that are active in early embryos. The sog intronic lateral stripe enhancer (LSE) is less well conserved and drives a slightly narrower stripe of expression relative to the sog shadow enhancer, also known as the neurogenic ectoderm enhancer (NEE), which recapitulates the broad endogenous sog pattern. The brk 5' and 3' enhancers both support lateral stripes similar to endogenous brk; however, the brk 3' enhancer drives a more dynamic pattern that broadens at cellularization. Thus, this study focused on the brk 5' enhancer to avoid confounding dynamic change of width (Foo, 2014).

The sog 426 bp NEE contains three CAGGTAG heptamer sites for optimal Zelda (Zld) binding. However, the brk 498 bp 5' enhancer does not have any canonical Zld binding sites (also known as TAGteam sites). To explain its Zld dependence, electrophoretic mobility shift assays were used to look for Zld binding sites in the brk 5' enhancer. Three CAGGTCA sequences and a tandem GAGGCACAGGCAC sequence were identified that promote very weak Zld binding, which was abolished upon mutation of the sites (Foo, 2014).
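
The kind of motif count described above can be sketched in a few lines. This is only an illustration of scanning both DNA strands for exact matches to CAGGTAG or to the weaker CAGGTCA variant; the function names and the toy sequence are hypothetical and are not the real sog or brk enhancer sequences, and the study itself identified the weak sites by mobility shift assays rather than by sequence search.

```python
# Toy motif scan. The sequence below is invented for illustration only.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_motif(seq: str, motif: str = "CAGGTAG") -> list[tuple[int, str]]:
    """Return (0-based start, strand) for each exact match on either strand."""
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            # Step by one base so overlapping matches are not missed.
            start = seq.find(pattern, start + 1)
    return sorted(hits)


# Hypothetical 30 bp sequence with one site per strand plus one weak variant.
toy_enhancer = "TTCAGGTAGAACTACCTGTTGCAGGTCAAA"
print(find_motif(toy_enhancer))             # canonical TAGteam heptamer
print(find_motif(toy_enhancer, "CAGGTCA"))  # weaker variant named in the text
```

An exact-match scan like this would miss degenerate sites, which is one reason binding assays were needed for the brk 5' enhancer.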

To test whether altering the number of Zld binding sites in the NE enhancers can affect the expression they drive, mutant forms of the brk and sog enhancers were created. The sog NEE drives a lacZ reporter expression pattern identical to endogenous sog. Mutation of all three CAGGTAG sites dramatically reduced the expression width (sog0). Similar changes were also observed by a previous study when the CAGGTAG sites were mutated in the sog LSE. Costaining of lacZ and endogenous sog illustrates that the narrowed lacZ domain resulted from a collapse of the dorsal, not the ventral, border. It is inferred that without Zld, sog is unable to be activated by the lower levels of Dl in the dorsal neuroectoderm region. In embryos lacking maternal Zld (referred to as zld-), both the endogenous sog and sog wt domains shrink and become sporadic. This is not due to an indirect effect on the Dl concentration gradient because it is unchanged in zld-. Thus, loss of Zld in trans, or Zld binding sites in cis, has the same effect on NEE activity, indicating a direct modulation of sog by Zld (Foo, 2014).

Next the opposite experiment was performed by introducing three CAGGTAG sites into the brk 5' enhancer. This modified enhancer (brk+3a) drives a considerably expanded expression domain compared to brkwt. A second form of the brk enhancer with CAGGTAG sites added to different locations (brk+3b) also drives the same expanded expression domain, arguing against the requirement of precise motif grammar in Zld's regulation of NE genes (Foo, 2014).

To rule out the possibility that the expansion in domain width of brk+3a is caused by inadvertent disruption of a repressor binding site rather than addition of Zld binding sites, the three added CAGGTAG sequences were mutated in brk+3a into 7-mers that are neither the original sequence nor Zld binding sites (brk+3m). Mutation of these sites reduced the expanded domain of brk+3a back to a width similar to brk wt. When each of the brk+3a, brk+3b, and brk+3m transgenic enhancers was placed into a zld- background, narrow and sporadic expression resulted, resembling that of endogenous brk in zld-, again supporting that the CAGGTAG-driven broadened expression is Zld dependent. Moreover, mutation of the newly found weak Zld binding sites led to a narrowed and weakened stripe of expression, identical to the pattern of brk wt in zld- (Foo, 2014).

To better correlate the number of Zld sites with the extent of reporter expression, six different forms of the sog NEE were constructed containing either one or two of the three CAGGTAG sites. The width of expression correlated moderately with the number of Zld sites in the enhancer. However, some sites appear to be more important than others in contributing to the expression width, indicating a context dependency for Zld binding sites. From these results and others' work demonstrating weakened NE gene expression upon removal of Zld or Zld sites, it is evident that Zld is indispensable for the proper expression of NE genes (Foo, 2014).

It was next asked whether the number of Zld binding sites also influences the timing of Dl target expression, since previous reports have implicated Zld as a developmental timer. A correlation has been observed between the onset of zygotic gene expression and strength of Zld binding at nuclear cycle 8. Besides that, when the enhancer region of zen, which contains four Zld binding sites, was multimerized, it drove precocious activation of reporter expression. And finally, it has been shown that the expression of many patterning genes is delayed in zld- embryos, including sog and brk. It was reasoned that since Dl nuclear concentrations increase from nuclear cycles 10 to 14, the lower levels of Dl present in earlier cycles would no longer be adequate to activate target genes without Zld's input, resulting in delayed activation of sog and brk (Foo, 2014).

To measure the onset of transcription, it was determined when the four transgenic enhancers (sogwt, sog0, brkwt, and brk+3a) could activate an intron-containing yellow reporter gene, which allows detection of nascent transcripts. Reporter expression driven by the sog wt enhancer was first detectable in nuclear cycle 10 embryos, whereas no reporter activity was observed for the sog0 enhancer until nuclear cycle 11. Even in nuclear cycle 12, the expression driven by sog0 is more sporadic compared to sog wt. Unlike in nuclear cycle 14 embryos, reporter expression can be seen in ventral nuclei of nuclear cycle 11 and nuclear cycle 12 embryos because the Sna repressor has not yet accumulated to high levels. Adding three Zld sites to the brk enhancer resulted in advanced initiation of reporter activity from nuclear cycle 11 to nuclear cycle 10, and reporter expression also became more robust, in terms of both the proportion of nuclei showing expression and the ratio of embryos with expressing nuclei. These results clearly illustrate that by manipulating Zld binding sites, the timing of NE gene activation can be altered. Temporal regulation by transcription factor binding sites has also been shown in Ciona where the number of Brachyury binding sites governs the timing of notochord gene expression (Foo, 2014).

It is believed that Zld regulates the temporal and spatial expression of NE genes by promoting Dl activity, rather than acting independently, because nuclear Dl is absolutely required for the activation of brk and sog, which exhibit no expression in genetic backgrounds lacking nuclear Dl. One possible mechanism may involve cooperativity at the level of DNA binding. To test the hypothesis that the extent of Zld binding impacts Dl binding at target enhancers, chromatin immunoprecipitation was performed followed by quantitative PCR (ChIP-qPCR) to measure Zld and Dl binding to the different transgenic enhancers (Foo, 2014).

The sog0 enhancer without Zld sites has diminished Zld binding when compared to sog wt. Dl binding is also much reduced. As an internal control, Zld and Dl binding to the endogenous sog locus showed no significant difference between the lines. On the other hand, introduction of Zld sites into the brk transgenic enhancer led to higher Zld binding and Dl binding, while Zld and Dl binding to the endogenous locus remained similar between lines. These results illustrate that changing the number of Zld sites, and therefore changing the amount of Zld binding to the NE enhancers, influences the level of Dl binding to its target sites in vivo (Foo, 2014).
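
For readers unfamiliar with how ChIP-qPCR comparisons of this kind are typically quantified, the sketch below uses the common percent-of-input calculation. This is an assumption for illustration: the summary does not state which normalization the study used, the function name is hypothetical, and the Ct values are arbitrary placeholders rather than measurements.

```python
import math


def percent_input(ct_ip: float, ct_input: float, input_fraction: float) -> float:
    """Percent-of-input signal from qPCR Ct values.

    input_fraction is the share of chromatin saved as input (0.01 for 1%).
    Assumes ideal amplification, i.e. product doubles every cycle.
    """
    # Shift the input Ct to the value expected for 100% of the chromatin.
    ct_input_adjusted = ct_input - math.log2(1.0 / input_fraction)
    return 100.0 * 2.0 ** (ct_input_adjusted - ct_ip)


# Placeholder Ct values (not data from the study).
wild_type = percent_input(ct_ip=24.0, ct_input=22.0, input_fraction=0.01)
mutant = percent_input(ct_ip=26.0, ct_input=22.0, input_fraction=0.01)

# A two-cycle delay in the IP sample corresponds to a four-fold lower signal.
print(mutant / wild_type)
```

Comparing the transgene signal with the signal at the endogenous locus in the same sample, as the study did, controls for differences in immunoprecipitation efficiency between lines.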

The results from reporter expression analyses and ChIP experiments suggest that Zld promotes transcriptional output by facilitating Dl DNA binding. Zld might directly interact with Dl, leading to cooperative DNA binding as in the Dl-Twist (Twi) interaction. Alternatively, Zld might assist factor binding by interacting with common coactivators or by changing the local chromatin accessibility. The latter possibility is favored for several reasons: (1) Zld binding greatly overlaps with that of many other transcription factors such as Bcd, Hunchback, Dl, Twi, Sna, and Mothers against Dpp (Mad); (2) Zld helps the binding of Twi and Bcd to target DNA; (3) the presence of Zld binding sites is associated with high levels of transcription factor binding; and (4) the Zld site (CAGGTA) is the most enriched motif in transcription factor binding 'HOT regions,' which were seen to correlate with decreased nucleosome density. Hence, it is more likely that Zld plays a more general role, such as 'opening' the underlying chromatin, than that it interacts specifically with multiple other factors (Foo, 2014).

The hypothesis was addressed that Zld facilitates the binding of Dl by making the local chromatin more accessible. DNase I's preferential digestion of nucleosome-depleted DNA in the genome can be used to map active regulatory regions accessible for transcription factor binding. DNase I hypersensitivity assays followed by qPCR (DNase I-qPCR) were performed to measure the chromatin 'openness' of transgenic enhancers carrying varying numbers of Zld sites. The sog transgenic enhancer region showed a significant reduction in chromatin accessibility when Zld sites were mutated, while adding Zld sites to the brk transgenic enhancer increased sensitivity to DNase I digestion. The DNase I hypersensitivity assessed on endogenous brk and sog loci was comparable between transgenic lines, serving as a control for embryo staging between transgenic lines and the DNase I digestion procedure (Foo, 2014).

These results suggest that the presence of Zld sites, and thus Zld binding, makes the local chromatin more accessible for Dl, and potentially other transcription factors. However, it is feasible that the total number of factor binding sites influences chromatin accessibility rather than the number of Zld sites in particular. Therefore, the DNase I hypersensitivity of a transgenic brk enhancer was assayed that lacks all Dl binding sites and shows no reporter expression. Dl binding decreased nearly to background levels compared to brk wt, but the Zld binding and DNase I hypersensitivity showed only slight decreases, which is not comparable to the effects seen upon manipulation of Zld sites on the brk and sog enhancers. It was reasoned that the binding of each transcription factor may contribute to the DNase I hypersensitivity to a certain extent but that the major influence comes from Zld binding. To further evaluate the contribution of Zld versus Dl sites to chromatin accessibility, the fold change in Zld and Dl binding for sog0, brk+3a, and brk0Dl was calculated relative to their corresponding wt transgenic enhancers and then correlated with the change in DNase I hypersensitivity. A strong correlation was found between the change in Zld binding and DNase I hypersensitivity, whereas the change in Dl binding and DNase I hypersensitivity do not correlate. These results support the idea that the number of Zld sites rather than Dl sites is important in determining chromatin accessibility (Foo, 2014).
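
The correlation step can be illustrated with a plain Pearson coefficient. This is a minimal sketch: the summary does not say which correlation statistic the study used, the function name is hypothetical, and the study's fold-change values are not reproduced here, so only synthetic vectors are used to check that the function behaves as expected.

```python
import math


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Synthetic sanity checks only; these are not fold changes from the study.
print(pearson_r([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))  # perfectly correlated
print(pearson_r([1.0, 2.0, 3.0], [3.0, 2.0, 1.0]))  # perfectly anti-correlated
```

In practice one would pass the per-construct fold changes in factor binding as `xs` and the matching changes in DNase I hypersensitivity as `ys`. With only three constructs (sog0, brk+3a, brk0Dl), any such coefficient is descriptive rather than statistically robust.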

Using Zld's coregulation of NE genes as a case in point, this study has shed light on how Zld functions as a zygotic genome activator. The data reveal that Zld works in combination with Dl and regulates Dl target genes by binding differentially to their regulatory sequences. Changing the number of Zld sites on Dl target gene enhancers has a pronounced effect on their expression both temporally and spatially. As a uniformly distributed factor, Zld supplies positional information by promoting Dl binding to target enhancers, thereby increasing the 'apparent dosage' of Dl. Zld's input is especially important where the level of morphogen is low and likely plays a similar role for other key factors in the blastoderm embryo, such as Twi, Bcd, and Mad. Uniform factors have been found to act in combination with Sonic Hedgehog in neural tube differentiation, and the current findings on how Zld potentiates morphogen activity will be relevant to vertebrate systems (Foo, 2014).

Although the results do not rule out other possible mechanisms, they strongly support the idea that Zld binding increases chromatin accessibility, which is thought to contribute greatly to how it activates such a wide range of targets. In this model, the amount of Zld binding on a region would determine how open and therefore how active it is. At the center of this property is Zld's ability to occupy a large fraction of its recognition sites in early embryos. Besides that, Zld is present in nuclei as early as nuclear cycle 2, which is considerably earlier than other factors. Therefore, Zld may act as a pioneer factor as previously suggested, but whether Zld binds to its sites in nucleosomes and repositions them, or whether it recruits histone modifiers that in turn affect binding of other factors like Dl, awaits further investigation. Interestingly, this idea may extend beyond flies, since newly discovered genome activators in zebrafish zygotic genome activation have been seen to cooperate with developmental regulators and prime the genome for subsequent activation. Thus, it seems that developmental control of zygotic genome activation is highly similar in flies and fish (Foo, 2014).

Nucleosome-mediated cooperativity between transcription factors

The Drosophila genome activator Vielfaltig (Vfl), also known as Zelda (Zld), is thought to prime enhancers for activation by patterning transcription factors (TFs). Such priming is accompanied by increased chromatin accessibility but the mechanisms by which this occurs are poorly understood. This study analyzed the effect of Zld on genome-wide nucleosome occupancy and binding of the patterning TF Dorsal (Dl). The results show that early enhancers are characterized by an intrinsically high nucleosome barrier. Zld tackles this nucleosome barrier through local depletion of nucleosomes with the effect being dependent on the number and position of Zld motifs. Without Zld, Dl binding decreases at enhancers and redistributes to open regions devoid of enhancer activity. It is proposed that Zld primes enhancers by lowering the high nucleosome barrier just enough to assist TFs in accessing their binding motifs and promoting spatially controlled enhancer activation if the right patterning TFs are present. It is envisioned that genome activators in general will utilize this mechanism to activate the zygotic genome in a robust and precise manner (Sun, 2015).

An important finding from this study is that early enhancers acquire high nucleosome occupancy about the length of typical enhancers in the absence of Zld. These regions have high predicted nucleosome occupancy based on underlying DNA sequences and acquire high nucleosome occupancy in wild-type embryonic tissues when Zld is no longer present during late embryogenesis. Taken together, these data show that early enhancers generally have a strong intrinsic nucleosome barrier (Sun, 2015).

Previous evidence on the intrinsic nucleosome occupancy at enhancers has been conflicting since it has been reported as either low or high. The current results unambiguously demonstrate high intrinsic nucleosome occupancy at early Drosophila enhancers since this study not only predicts intrinsic nucleosome occupancy but also demonstrates high nucleosome occupancy experimentally (as observed in zld minus embryos and in wild-type late muscle tissue). This has important implications for the well-studied function of early Drosophila enhancers (Sun, 2015).

The simplest model is that the high nucleosome occupancy in the absence of appropriate TFs protects enhancers from inappropriate binding and activation. However, a more intriguing model proposed by Mirny posits that high nucleosome occupancy promotes a specific type of TF cooperativity called cooperative nucleosome binding (Mirny, 2010). Experimental evidence showed that TFs can dramatically enhance each other's binding to nucleosomal DNA simply by competing against a common nucleosome. Thus, the higher the nucleosome barrier, the more TFs are required to break the histone-DNA contacts. This in turn makes the enhancer activity dependent on multiple TFs without requiring direct physical interactions between them. This model fits well for the current system since early Drosophila enhancers are strongly controlled by the combinatorial input of multiple TFs, and no strict motif grammar has been found between their binding motifs (Sun, 2015).

Since ChIP results show that Dl binding depends on Zld, but not the other way around, there is a hierarchy by which TFs activate enhancers in a combinatorial manner. It is proposed that Zld's pioneering role is its ability to lower (or prevent) the very high nucleosome barrier in each enhancer and that it does so just enough to allow patterning TFs to bind and to help antagonize the remaining nucleosome barrier. Such partial nucleosome depletion by Zld is supported by findings that binding of Zld only leads to a relatively local depletion of about 1-2 nucleosomes within an enhancer, that multiple Zld binding motifs lead to stronger depletion and that the position of the Zld motifs within the enhancer matters. The degree of nucleosome depletion by Zld thereby sets a threshold required for patterning TFs such as Dl to achieve robust transcriptional activation (Sun, 2015).

It should be noted that the mechanism by which Zld induces nucleosome depletion remains unknown. In the simplest scenario, Zld might bind to its targets very early during the rapid nuclear cycles, when the chromatin may not be as densely packed and thus more accessible, and then prevent nucleosomes from being assembled nearby. Alternatively, Zld may bind, destabilize and eject nucleosomes, thereby acting as a more classical pioneer factor. Regardless of whether Zld can bind its motifs embedded in nucleosomes, Zld's ability to reduce nucleosome occupancy and facilitate the binding of TFs certainly fulfills a pioneering role (Sun, 2015).

The pioneering role presented in this study for Zld during Drosophila ZGA may be a general feature of key zygotic genome activators. For example, Pou5f3, which controls zygotic genome activation (ZGA) together with Nanog and SoxB1 family proteins in zebrafish, also binds before ZGA. Interestingly, the mammalian homolog of Pou5f3, Oct4, is a pluripotency factor that, along with Sox2 and Klf4, gains initial access to closed chromatin at enhancers of genes promoting reprogramming from fibroblasts to induced pluripotent stem cells. This points to a mechanistic link between ZGA and cellular reprogramming, the center of which may be the pioneering activity to potentiate TF binding and gene expression as exemplified by Zld (Sun, 2015).

Taken together, the following temporal working model is proposed for how Zld primes early embryonic enhancers during ZGA. As the Zld protein level rises in the first hour of development, Zld begins to locally reduce nucleosome occupancy at target enhancers that normally have a high intrinsic nucleosome barrier. This is unlikely to be solely an effect of histone acetylation, which accompanies early Zld binding, since acetylated histones are more broadly found over Zld-bound regions (Sun, 2015).

Starting at 1-2h and peaking at 2-3h, patterning TFs such as Dl gain access to these enhancers. In certain embryonic regions, where the right combination of patterning TFs is present, Zld and these TFs then strongly bind through collaborative nucleosome binding and activate transcription in a distinct pattern in the embryo. In this process, some TFs such as Dl might be more strongly dependent on prior chromatin accessibility. A recent genome-wide analysis identified NF-κB, the mammalian homolog of Dl, as a 'settler' TF whose binding is strongly governed by the accessible chromatin created by 'pioneer' TFs (Sun, 2015).

In the absence of Zld, binding of Dl is severely diminished. This is accompanied by a redistribution of Dl to other regions in the genome that remain accessible. Such TF redistribution in the absence of a key activator has been observed previously in yeast, flies and mammalian systems. The simplest explanation for this phenomenon is the law of mass action, i.e. given that the nuclear Dl concentration remains the same, more unbound Dl is available to drive ectopic binding. A good candidate for facilitating ectopic Dl binding in the absence of Zld is GAF since this study found the ectopic Dl bound regions to be enriched for the GAGA motif (Sun, 2015).
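
The mass-action argument can be made concrete with a deliberately simple equilibrium model. The sketch below assumes two classes of independent binding sites (enhancer sites and other open regions), treats the loss of Zld as making the enhancer sites unavailable, and uses arbitrary site numbers, dissociation constants, and concentration units; none of these values come from the study, and the function names are hypothetical.

```python
def free_concentration(total, site_classes, iterations=100):
    """Solve total = free + sum(n * free / (kd + free)) for free by bisection.

    site_classes is a list of (number_of_sites, kd) pairs; units are arbitrary
    but must be the same for total, site numbers, and kd.
    """
    lo, hi = 0.0, total
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        bound = sum(n * mid / (kd + mid) for n, kd in site_classes)
        # free + bound grows with free, so bisection converges on the root.
        if mid + bound > total:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2.0


def occupancy(free, kd):
    """Fractional occupancy of a site with dissociation constant kd."""
    return free / (kd + free)


TOTAL_DL = 100.0          # hypothetical total nuclear Dl, held constant
ENHANCERS = (100.0, 1.0)  # hypothetical high-affinity enhancer sites
OTHER_OPEN = (100.0, 10.0)  # hypothetical lower-affinity open regions

free_with_zld = free_concentration(TOTAL_DL, [ENHANCERS, OTHER_OPEN])
free_without_zld = free_concentration(TOTAL_DL, [OTHER_OPEN])

# Occupancy of the non-enhancer sites before and after enhancers close.
print(occupancy(free_with_zld, OTHER_OPEN[1]))
print(occupancy(free_without_zld, OTHER_OPEN[1]))
```

With the enhancer sites removed, less Dl is sequestered, the free concentration rises, and occupancy at the remaining open sites increases, which is the redistribution the text describes. The model ignores cooperativity and any contribution from GAF.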

When this early pattern formation phase ends and Zld levels begin to decrease, the nucleosome-favoring sequences promote high nucleosome occupancy at these regions, closing enhancers and reducing transcriptional output. Thus, Zld acts as a timer of ZGA in that it controls the engagement and decommission of TFs at target enhancers by transiently reducing the nucleosome barrier. Since the Zld-mediated nucleosome depletion strongly correlates with early enhancer activity, it is likely a central mechanism by which Zld specifies and primes enhancers across the genome. It will be interesting to analyze whether this is a general property of zygotic genome activators and whether other pioneer factors play similar roles at later stages of development (Sun, 2015).

Zelda is differentially required for chromatin accessibility, transcription-factor binding and gene expression in the early Drosophila embryo

The transition from a specified germ cell to a population of pluripotent cells occurs rapidly following fertilization. During this developmental transition, the zygotic genome is largely transcriptionally quiescent and undergoes significant chromatin remodeling. In Drosophila, the DNA-binding protein Zelda (also known as Vielfaltig) is required for this transition and for transcriptional activation of the zygotic genome. Open chromatin is associated with Zelda-bound loci as well as more generally with regions of active transcription. Nonetheless, the extent to which Zelda influences chromatin accessibility across the genome is largely unknown. This study used Formaldehyde Assisted Isolation of Regulatory Elements (FAIRE) to determine the role of Zelda in regulating regions of open chromatin in the early embryo. Zelda was shown to be essential for hundreds of regions of open chromatin. This Zelda-mediated chromatin accessibility facilitates transcription-factor recruitment and early gene expression. Thus, Zelda possesses some key characteristics of a pioneer factor. Unexpectedly, chromatin at a large subset of Zelda-bound regions remains open even in the absence of Zelda. The GAGA factor-binding motif and embryonic GAGA factor binding are specifically enriched in these regions. It is proposed that both Zelda and GAGA factor function to specify sites of open chromatin and together facilitate the remodeling of the early embryonic genome (Schulz, 2015).

This study used FAIRE to identify regions of open chromatin in the early embryo and determine the role of ZLD in establishing or maintaining chromatin accessibility. It was demonstrated on a genome-wide level that ZLD is instrumental in defining specific regions of open chromatin. Furthermore, this ZLD-mediated chromatin accessibility dictates both transcription factor binding and early gene expression. Unexpectedly, most open chromatin regions to which ZLD is bound do not absolutely require ZLD for chromatin accessibility. At these regions ZLD may function redundantly with GAF to determine the chromatin state. It is suggested that ZLD directly mediates the very earliest gene expression by facilitating chromatin accessibility. At cycle 14, when thousands of genes are transcribed, ZLD and GAF may coordinate to determine both regions of open chromatin and levels of gene expression (Schulz, 2015).

ZLD is known to be instrumental in regulating expression of both the very first set of zygotic genes transcribed after fertilization and a large set of genes transcribed at cycle 14. ZLD is already bound to thousands of loci at cycle 10, including those that will not be activated until four nuclear cycles later during the major wave of genome activation. This suggests that early ZLD binding poises genes for later activation. Nonetheless, it remains unclear what differentiates the small subset of ZLD-bound loci that are transcribed early from the hundreds of ZLD-bound genes activated at cycle 14. This study demonstrates that regions that require ZLD for chromatin accessibility are correlated with the subset of genes transcribed prior to cycle 14 and with histone acetylation. However, not all ZLD-bound regions are equally dependent on ZLD for chromatin accessibility. It is therefore proposed that ZLD is essential for creating regions of open chromatin that drive expression of the subset of earliest expressed genes. This may be mediated, in part, by local histone acetylation. At cycle 14, other factors likely function with ZLD to determine chromatin accessibility (Schulz, 2015).

It has been shown that ZLD is required for the DNA binding of three different transcription factors: TWI, DL, and BCD. Additionally, transgenic versions of the brinker (brk) and sog enhancers show a correlation between the number of ZLD-binding sites and both DL binding and DNase I accessibility. Thus, prior work has clearly demonstrated a role for ZLD in mediating transcription factor binding, but the mechanism by which ZLD served this function has been unclear. This study demonstrates that BCD binding is lost in zld minus embryos preferentially at those regions that depend on ZLD for chromatin accessibility. The data show that ZLD potentiates transcription-factor binding through the establishment or maintenance of open chromatin, and this is likely to be important for ZLD-mediated transcriptional activation. The mechanism by which ZLD establishes or maintains chromatin accessibility remains unknown. Unlike the pioneer factor FoxA1, which can open chromatin by binding chromatin through a winged-helix domain, the ZLD DNA-binding domain does not resemble that of a linker histone. Instead, ZLD binds DNA through a cluster of four zinc fingers in the C-terminus. In addition, ZLD is a large protein with no recognizable enzymatic domains that activates transcription through a low-complexity protein domain. Thus, ZLD likely facilitates open chromatin through interactions with cofactors, and it is possible that recruitment of different cofactors to distinct ZLD-bound loci could partially explain the differential requirement on ZLD for chromatin accessibility in the early embryo (Schulz, 2015).

ZLD binding, but not ZLD-mediated chromatin accessibility, is a defining feature of HOT regions. HOT regions, loci that are bound by a large number of different transcription factors, have been identified in multiple organisms, including worms, flies and humans. Unexpectedly, these HOT regions are not strongly enriched for the DNA-sequence motifs bound by the transcription factors that define them. Instead, HOT regions are associated with open chromatin, suggesting that chromatin accessibility along with sequence motif enrichment drives the high transcription factor occupancy. In Drosophila, HOT regions are enriched for developmental enhancers that contain the canonical ZLD-binding site, CAGGTAG, as well as for in vivo ZLD binding. Early ZLD binding is a robust predictor of where multiple additional transcription factors will later bind (Schulz, 2015).

By analyzing the 5000 regions with the highest FAIRE signal, this study demonstrates that high transcription factor occupancy is correlated with ZLD-bound regions of accessible chromatin and not with open chromatin more generally. Furthermore, this association was not specific for those regions that require ZLD for accessibility. Thus, HOT regions overlap with ZLD-bound regions of open chromatin regardless of whether these loci require ZLD for accessibility. The data suggest that, while ZLD-mediated chromatin accessibility may facilitate gene expression, it is not this function of ZLD alone that defines HOT regions (Schulz, 2015).

The FAIRE data showed that more than 400 regions are bound by ZLD and require ZLD for chromatin accessibility. However, at least three times as many regions are bound by ZLD, but remain open even in its absence. The data predict that GAF functions at many of these constitutively open chromatin regions to maintain chromatin accessibility, even in the absence of ZLD. Along with the CAGGTAG element, GAF-binding motifs are enriched in HOT regions. Like ZLD, GAF is maternally deposited into embryos. Furthermore, GAF is known to facilitate nuclease-hypersensitive regions and interact with members of the NURF ATP-dependent chromatin-remodeling complex. The data show that at early expressed genes there is a correlation between regions that require ZLD for chromatin accessibility (differential, ZLD-bound) and ZLD-dependent gene expression. However, this association is not found for genes expressed during cycle. Instead, the data suggest that at loci associated with this later gene expression, GAF is functioning together with ZLD to regulate chromatin accessibility and gene expression. Maternally deposited GAF is required for robust transcription and nuclear divisions during the MZT. GAF is thought to mediate transcription, at least in part, through a role in the establishment of poised polymerase. The fact that poised polymerase is not established until cycle 13 supports the model that GAF is required specifically for gene expression at cycles 13-14. Thus, it is suggested that ZLD-dependent early embryonic enhancers may be unique in that they rely only on ZLD for chromatin accessibility. Although there are likely additional factors involved, the data demonstrate that later in development ZLD and GAF likely function together to define the chromatin landscape of the early embryo (Schulz, 2015).

Pioneer factors are a specialized class of transcription factors that bind nucleosomal DNA and initiate chromatin remodeling, allowing the recruitment of additional transcription factors. ZLD binding is strongly driven by DNA sequence, much more so than the binding of other transcription factors. This observation combined with the FAIRE data and analyses demonstrates that ZLD exhibits many of the characteristics of a pioneer factor: 1) engaging chromatin prior to gene activity; 2) establishing or maintaining chromatin accessibility to facilitate transcription factor binding; and 3) playing a primary role in cell reprogramming. Additional properties have been shown for classical pioneer factors, including remaining bound to the mitotic chromosomes (i.e. bookmarking) and binding to nucleosomal DNA. It will be important to determine whether ZLD shares these characteristics with other pioneer factors. Pioneer factors, such as FoxA1, can bind to closed chromatin and subsequently increase accessibility of the target site. However, the chromatin of the early embryo may provide a unique environment with little compacted chromatin. Heterochromatin formation is not observed until the 14th nuclear cycle. Chromatin-bound H3 levels increase through the MZT, and histone modifications indicative of silent genes, such as H3K27 trimethylation, are not evident until there is widespread activation of the zygotic genome (Schulz, 2015).

Thus, while ZLD binds to genes prior to zygotic genome activation, this activity may not require binding to compacted chromatin. It may be that ZLD is distinctive in the timing of its expression rather than in its chromatin-binding properties and that the sequence-driven binding of ZLD is a property of the open chromatin and rapid nuclear divisions that characterize the earliest stages of embryonic development. Despite the fact that this study has demonstrated a critical role for ZLD in determining chromatin accessibility at hundreds of genomic regions, the data show that this role is limited to specific regions associated with the earliest-expressed embryonic genes. Other factors, such as GAF, likely work redundantly with ZLD to define chromatin accessibility during the MZT (Schulz, 2015).

The coordinated function of multiple factors in determining chromatin structure and genome activation is not without precedent. It has recently been demonstrated that homologs of the core pluripotency factors, Nanog, Pou5f3 (also known as Pou5f1 and Oct4), and Sox19B (a member of the SoxB1 family), act analogously to ZLD during the zebrafish MZT to drive genome activation. Furthermore, Oct4 and Sox2 are known to be pioneer factors instrumental in reprogramming differentiated cells to a pluripotent state. Together, these data suggest that chromatin remodeling in the early embryo requires the function of multiple factors, and this activity facilitates the transition from the specified germ cells to the pluripotent cells of the early embryo (Schulz, 2015).

Number of nuclear divisions in the Drosophila blastoderm controlled by onset of zygotic transcription

The cell number of the early Drosophila embryo is determined by exactly 13 rounds of synchronous nuclear divisions, allowing cellularization and formation of the embryonic epithelium. The pause in G2 in cycle 14 is controlled by multiple pathways, such as activation of DNA repair checkpoint, progression through S phase, and inhibitory phosphorylation of Cdk1, involving the genes grapes, mei41, and wee1. In addition, degradation of maternal RNAs and zygotic gene expression are involved. The zinc finger Vielfaltig (Vfl) controls expression of many early zygotic genes, including the mitotic inhibitor fruhstart. The functional relationship of these pathways and the mechanism for triggering the cell-cycle pause have remained unclear. This study shows that a novel single-nucleotide mutation in the 3' UTR of the RNA polymerase II gene RPII215 leads to a reduced number of nuclear divisions that is accompanied by premature transcription of early zygotic genes and cellularization. The reduced number of nuclear divisions in mutant embryos depends on the transcription factor Vfl and on zygotic gene expression, but not on grapes, the mitotic inhibitor Fruhstart, and the nucleocytoplasmic ratio. It is proposed that activation of zygotic gene expression is the trigger that determines the timely and concerted cell-cycle pause and cellularization (Sung, 2012).

Embryos from germline clones of the lethal mutation X161 (in the following, designated as mutant embryos) showed a reduced cell number but otherwise developed apparently normally until at least gastrulation stage. Cell specification along the anterior-posterior and dorsoventral axes proceeded as in wild-type, as demonstrated by the seven stripes of eve expression, mesoderm invagination, and cephalic furrow formation. The reduced cell number can be due to a lower number of nuclear divisions prior to cellularization or to loss of nuclei in the blastoderm. To distinguish these possibilities, time-lapse recordings were performed of mutant embryos in comparison to wild-type. To measure the cell-cycle length, the nuclei in these embryos were fluorescently labeled. Three types of embryos were observed: (1) with 13 nuclear divisions with an extended interphase 13 (28 min versus 21 min in wild-type), (2) with 12 nuclear divisions, and (3) with partly 12 and partly 13 nuclear divisions with an extended interphase 13. Because a severe nuclear fallout phenotype was not observed, it is concluded that the reduced cell number in gastrulating embryos is due to the reduced number of nuclear divisions. Consistent with these observations, the number of centromeres and centrosomes was normal in mutant embryos (Sung, 2012).

In wild-type embryos, interphase 14 is different from the preceding interphases, in that the plasma membrane invaginates to enclose the individual nuclei into cells. In X161 embryos with patches in nuclear density, furrow markers showed more advanced furrows in the part with a lower number of divisions, indicating a premature onset of cellularization. Furthermore, in time-lapse recordings, the speed of membrane invagination was measured, with no obvious difference found between X161 and wild-type embryos. Additionally, cellularization was investigated by live imaging with moesin-GFP labeling F-actin. Clear accumulation of F-actin at the furrow canals was observed in wild-type embryos after about 20 min in interphase 14, but not in interphase 13. In X161 embryos with 12 nuclear divisions, a comparable reorganization was observed already in interphase 13 after about 25 min. This analysis shows that both the cell-cycle pause and cellularization are initiated in X161 embryos earlier than in wild-type embryos (Sung, 2012).

To identify the mutated gene in X161, the lethality and blastoderm phenotype were mapped. The X161 gene was separated from associated mutations on the chromosome by meiotic recombination and mapped to a region of four genes by complementation analysis with duplications and deficiencies. Sequencing of the mapped region and complementation tests with two independent RPII215 loss-of-function alleles, RPII215(1) and RPII215[G0040], and a transgene comprising the RPII215 locus revealed the large subunit of the RNA polymerase II as the mutated gene. A single point mutation was identified in the 3' UTR of RPII215 about 40 nt downstream of the stop codon. This region in the 3' UTR is not conserved and does not show any obvious motifs (Sung, 2012).

To test whether the mutation in the noncoding region affects transcript or protein expression, mRNA levels were quantified by reverse transcription and quantitative PCR and protein levels by whole-mount staining and immunoblotting with extracts of manually staged embryos. mRNA levels were found to be the same in wild-type and X161. In contrast, immunohistology and immunoblotting revealed reduced RPII215 protein levels. In summary, the analysis shows that the X161 point mutation within the 3' UTR affects mainly RPII215 protein levels. The precocious onset of cellularization raised the hypothesis that the timing of zygotic gene expression may be affected in the X161 embryos. To establish the expression profiles of selected maternal and zygotic genes, nCounter NanoString technology was used with embryos staged by the nuclear division cycle. Embryos expressing histone 2Av-RFP were manually selected 3 min after anaphase of the previous mitosis or at midcellularization (Sung, 2012).

Expression of ribosomal proteins was analyzed. They did not change much and were not different in wild-type and mutant embryos, confirming the robustness of the method. Zygotic genes, whose expression strongly increases during the syncytial cycles, showed an earlier upregulation in X161 than in wild-type embryos. Comparing the profiles by plotting the ratio of the expression levels, a clear difference was revealed in cycle 12, with a factor of up to ten, indicating that zygotic genes are precociously expressed in X161 embryos. The premature expression of early zygotic genes was confirmed by whole-mount in situ hybridization for slam and frs mRNA (Sung, 2012).

Next, expression profiles were analyzed of RNAs subject to RNA degradation. Transcripts representative of the two classes of degradation, one depending on zygotic gene expression and the other on egg activation, were selected. Degradation of string, twine, and smaug transcripts in interphase 14 depends on zygotic gene expression. In X161 mutants, the mRNA of these three genes was degraded already in cycle 13, slightly sooner than in wild-type. The profiles of string and twine RNA were confirmed by RNA in situ hybridization. Consistent with the precocious RNA degradation in X161, Twine and String protein levels decreased already in interphase 13 of X161 embryos. Finally, the profile was analyzed of mRNAs whose degradation depends on egg activation. No consistent pattern or clear difference was detected between the profiles of wild-type and X161 mutants. The data show that zygotic gene expression starts earlier in X161 than in wild-type and that degradation of mRNAs follows zygotic gene expression (Sung, 2012).

The cell cycle may be paused prematurely by altered levels of maternal factors, such as CyclinB, grapes, and twine, or by precociously expressed zygotic genes, such as frs and trbl. To distinguish these two options, mutant embryos with suppressed zygotic gene expression were analyzed. Embryos injected with the RNA polymerase II inhibitor α-amanitin develop until mitosis 13 but then fail to cellularize and may undergo an additional nuclear division, depending on injection conditions. This assay was used to test whether zygotic genes are required for the reduced number of nuclear divisions in X161 mutants. If the precocious cell-cycle pause were due, for example, to reduced levels of CyclinB mRNA, α-amanitin injection should not change the reduced number of divisions. All injected mutant embryos passed through at least 13 nuclear divisions, similar to injected wild-type embryos, whereas injection of water resulted in a mixed phenotype of 12 and 13 nuclear divisions, comparable to uninjected X161 embryos. This experiment demonstrates that the reduced division number in X161 embryos requires zygotic gene expression (Sung, 2012).

The expression of many early zygotic genes is controlled by the zinc-finger protein Vfl (also called Zelda). Tests were performed to see whether the precocious cell-cycle pause in X161 mutants is mediated by vfl-dependent genes. Analysis of X161 vfl double-mutant embryos revealed that, in contrast to X161 mutants, the cell cycle undergoes at least 13 divisions. Activation of zygotic gene expression was further analyzed by staining for Vfl and activated RPII215. Staining of both in presyncytial stages of X161 mutants was detected already in cycle 5. No specific staining for the activated RPII215 was detected in X161 vfl double-mutant embryos, and no difference in Vfl staining in syncytial embryos was detected in wild-type and X161 embryos. These findings show that the genes relevant for the precocious cell-cycle pause in X161 mutants are vfl target genes. A zygotic gene involved in cell-cycle control is frs, which is sufficient to induce a pause of the cell cycle. Analysis of X161 frs double-mutant embryos showed, however, that the number of nuclear divisions was not changed as compared to X161 single mutants. This indicates that frs is not the only cell-cycle inhibitor expressed in the early embryo. Proteins mediating the DNA repair checkpoint, such as Grapes/Chk1, are required for the cell-cycle pause. Passing normally through the nuclear division cycles, the cell cycle shows striking abnormalities in nuclear envelope formation and chromosome condensation in interphase 14 in embryos from grapes females. Tests were performed to see whether the timing of the transition in cell-cycle behavior in grapes embryos depends on the onset of zygotic transcription by analyzing X161 grapes double-mutant embryos. Some of the X161 grapes double mutants were found to show the defects in nuclear envelope formation and chromatin condensation already in interphase 13, indicating that the requirement of grapes for chromatin structure shifted from interphase 14 to 13. These data suggest that the activation of grapes and the DNA checkpoint depends on the onset of zygotic gene expression (Sung, 2012).

A factor controlling the number of nuclear divisions is the ploidy of the embryo, given that haploid embryos undergo 14 instead of 13 nuclear divisions prior to cellularization. Based on this and on related observations, it has been proposed that the nucleocytoplasmic (N/C) ratio controls the trigger for MBT. To address the functional relationship of X161 and the N/C ratio, haploid X161 embryos were analyzed. A mixture was observed in the number of nuclear divisions between 12 and 14 in fixed embryos. Embryos were even observed containing three patches with nuclear densities corresponding to 12, 13, and 14 nuclear divisions. About half of the embryos underwent 12 nuclear divisions, similar to X161 embryos. These data suggest that ploidy acts independently of general onset of zygotic transcription, which is consistent with the observation that only a subset of zygotic genes are expressed with a delay in haploid embryos. Consistent with this report, cellularization starts for a first time temporarily in interphase 14 in haploid embryos and for a second time in interphase 15. These observations suggest that the N/C ratio in Drosophila specifically affects cell-cycle regulators such as frs, for example, but not general zygotic genome activation and onset of cellularization (Sung, 2012).
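
The ploidy argument behind the N/C-ratio proposal can be made concrete with a small calculation. The sketch below is an idealized illustration, not an analysis from the study: it assumes every division doubles the nuclei, that the cytoplasmic volume is constant, and that the trigger is a fixed total amount of DNA. The function names are hypothetical. Under these assumptions, a haploid embryo needs one more division than a diploid embryo to reach the same DNA content.

```python
def dna_content(ploidy, divisions):
    """Genome copies after `divisions` synchronous doublings from one nucleus.

    Idealized: ignores yolk nuclei, pole cells, and nuclear fallout.
    """
    return ploidy * 2 ** divisions


def divisions_to_reach(ploidy, threshold):
    """Smallest number of divisions at which DNA content reaches `threshold`."""
    n = 0
    while dna_content(ploidy, n) < threshold:
        n += 1
    return n


# Assumption: the trigger level is the DNA content of a diploid embryo
# after 13 divisions (the wild-type number given in the text).
threshold = dna_content(ploidy=2, divisions=13)

diploid_divisions = divisions_to_reach(2, threshold)
haploid_divisions = divisions_to_reach(1, threshold)
print(diploid_divisions, haploid_divisions)
```

A pure N/C model of this kind predicts 14 divisions for every haploid embryo. The mixture of 12, 13, and 14 divisions reported for haploid X161 embryos is what such a model cannot account for on its own.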

In summary, the data support the model that activation of the zygotic genome controls the timing of the MBT. First, onset of MBT is sensitive to changes in RNA polymerase II activity. Second, the changes in zygotic gene expression in X161 embryos occur earlier than the changes in zygotic RNA degradation, Cdc25 protein destabilization, or activation of grapes. Third, the X161 mutant phenotype depends on zygotic transcription and on the transcription factor Vfl, showing that the precocious cell-cycle pause and onset of cellularization cannot be due to changes in maternal factors, such as higher expression of CyclinB. Although the altered levels of RNA polymerase II in X161 mutants probably affect expression of many genes during oogenesis, these changes seem not to matter in functional terms, given the overall normal morphology and specific mutant phenotype. It is conceivable that transcriptional repressors are expressed or translated in eggs in lower levels. In the embryo, such lower levels of repressors would allow the trigger for onset of zygotic gene expression to reach the threshold earlier than in wild-type embryos. The first signs of zygotic transcription are detected already during the presyncytial stages, before nuclear cycle 8/9. This may be the time when the trigger for MBT is activated (Sung, 2012).

Grainyhead and Zelda compete for binding to the promoters of the earliest-expressed Drosophila genes

Maternally contributed mRNAs and proteins control the initial stages of development following fertilization. During this time, most of the zygotic genome remains transcriptionally silent. The initiation of widespread zygotic transcription is coordinated with the degradation of maternally provided mRNAs at the maternal-to-zygotic transition (MZT). While most of the genome is silenced prior to the MZT, a small subset of zygotic genes essential for the future development of the organism is transcribed. Previous work has identified the TAGteam element, a set of related heptameric DNA-sequences in the promoters of many early-expressed Drosophila genes required to drive their unusually early transcription. To understand how this unique subset of genes is regulated, a TAGteam-binding factor, Grainyhead (Grh), was identified. Grh and the previously characterized transcriptional activator Zelda (Zld) bind to different TAGteam sequences with varying affinities, and Grh competes with Zld for TAGteam occupancy. Moreover, overexpression of Grh in the early embryo causes defects in cell division, phenocopying Zld depletion. These findings indicate that during early embryonic development the precise timing of gene expression is regulated by both the sequence of the TAGteam elements in the promoter and the relative levels of the transcription factors Grh and Zld (Harrison, 2010).

To understand how a subset of genes are uniquely transcribed in the pre-cellular blastoderm (pre-CB) Drosophila embryo when the remainder of the genome is not, attempts were made to identify proteins that bind to TAGteam elements in the regulatory regions of pre-CB-expressed genes. Nuclear extracts prepared from wild-type Drosophila embryos were fractionated and assayed for activity using DNase I protection of a portion of the early-expressed Sxl establishment promoter, SxlPe, containing two overlapping CAGGCAG sites. Partially purified protein(s) protected multiple regions of SxlPe from DNase I digestion, including the two TAGteam elements. As a final purification step, fractions were applied to a DNA-affinity column composed of oligonucleotides corresponding to four repeats of a portion of the zen ventral repression element (VRE), a TAGteam-containing sequence shown to regulate the pre-CB expression of zen. Using the zen VRE for the DNA-affinity column rather than SxlPe ensured that the purified protein(s) would bind to at least two sequences driving pre-CB gene expression. Two polypeptides of ~130 kD and 120 kD specifically eluted from the column. Mass spectrometry identified these two polypeptides as the products of two splice isoforms generated from the single gene, grainyhead (grh). This identification was confirmed by immunoblotting (Harrison, 2010).

Grh is a transcription factor conserved from worms to humans that acts in mediating both transcriptional repression and activation. Previous work has demonstrated that Drosophila Grh can bind to the promoters of three additional pre-CB expressed genes, fushi tarazu (ftz), tailless (tll), and decapentaplegic (dpp), and the evidence suggests that Grh binding to these promoters results in transcriptional repression. However, this study is the first to demonstrate that Grh binds to TAGteam sites, greatly increasing the number of pre-CB genes Grh may regulate. In the early embryo, Grh may act as a repressor, in part, through its interactions with Polycomb-group proteins. Later in both fly and mammalian embryonic development, Grh is expressed in the epidermis and functions as an important transcriptional activator during the wound-healing response. Thus, whether Grh binding results in transcriptional activation or repression depends on developmental context (Harrison, 2010).

While the immunoblots confirmed that Grh bound to the zen VRE DNA-affinity column, it was important to determine if Grh provided the DNA-binding activity present in embryonic nuclear extract. Anti-Grh antibodies raised against the DNA-binding domain disrupted the SxlPe DNA-binding activity present in nuclear extract, whereas non-specific IgG did not, confirming that Grh was responsible for the activity. In addition, purified full-length recombinant Grh (rGrh) provided DNase I protection of the zen VRE and SxlPe indistinguishable from that of the activity in nuclear extract (Harrison, 2010).

Because Grh bound to TAGteam elements in both SxlPe and the zen VRE, it was determined whether Grh specifically required TAGteam sequences for binding. Heparin-fractionated nuclear extract and rGrh were used for DNase I protection assays with a fragment of SxlPe identical to that used in the purification described above except that the overlapping CAGGCAG elements were mutated. Grh binding to the mutated CAGGCAG elements was severely inhibited, demonstrating that these TAGteam sequences are essential for Grh binding (Harrison, 2010).

Grh bound to additional sequences outside the TAGteam elements in SxlPe and the zen VRE as determined by DNase I protection assays. These additional binding sites do not contain sequences highly similar to the TAGteam sequences. While a consensus Grh binding site has been defined as ACYGGTT(T), there is considerable variability among previously defined Grh binding sites. MEME searches on the Grh binding sites defined by DNase I protection experiments failed to identify a strong consensus site, despite some similarity between the previously defined consensus site and TAGteam elements (Harrison, 2010).

The focus of these studies was placed on the TAGteam-binding activity of Grh as these sequences have been shown to have important functions in the pre-CB embryo. The TAGteam elements have been defined as a group of related sequences including CAGGTAG, CAGGCAG and TAGGTAG, with CAGGTAG being the most enriched in the promoters of pre-CB genes (De Renzis, 2007; Li, 2008; ten Bosch, 2006). To determine if Grh could bind to the prevalent CAGGTAG sequence, rGrh was used in protection assays on a region of the sc promoter containing three CAGGTAG elements and one CAGGCAG element. These experiments demonstrated protection of the CAGGCAG element as well as at least two of the three CAGGTAG elements. Electromobility shift assays (EMSAs) were used to test the affinity of Grh for different members of the TAGteam family. EMSAs with probes that only differed by the sequence of the TAGteam element showed that Grh binds strongly to CAGGTAG and CAGGCAG elements, but only weakly to TAGGTAG sequences, demonstrating the importance of the initial cytosine in Grh recognition. Previous work analyzing the ability of Grh to bind to the closely related sequence GCAGGTAA also showed the importance of the cytosine in Grh recognition. Furthermore, this cytosine was critical for the pre-CB ventral repression of a transgene reporter driven by the dpp ventral repression region (VRR). Together these data show that Grh specifically binds to TAGteam elements within the promoters of three genes expressed in the early embryo, although preferentially to specific TAGteam sequences (Harrison, 2010).

Given that Grh can bind to TAGteam elements, it was asked if Grh was present in the early embryo. Using RT-PCR it was shown that grh transcripts are present in early embryos as well as in egg chambers (ovaries). In agreement with these data, in situ hybridizations had previously identified grh mRNA in these tissues. There are two well-characterized examples of alternative splicing of the grh pre-mRNA, and it was determined that alternative splicing results in multiple mRNAs present in the early embryo. Immunoblots showed that these mRNAs are translated producing Grh proteins. Notably, Grh protein appears absent or at very low levels in late-stage egg chambers, suggesting that maternal grh mRNA, but not protein, is deposited into the embryo (Harrison, 2010).

While this study was characterizing the TAGteam-binding factor Grh, another TAGteam-binding protein, Zelda (Zld), was identified. Zld is a zinc-finger protein that binds to TAGteam elements in the zen VRE and is required for the proper activation of more than 100 genes in the pre-CB embryo. To test whether Grh and Zld have similar binding profiles for the zen VRE, full-length recombinant Zld was purified and used in DNase I protection assays. Interestingly, whereas Grh showed strong protection of the CAGGCAG element and little to no protection of the TAGGTAG element, Zld showed protection of the TAGGTAG and not the CAGGCAG element. To further determine if Zld and Grh had different binding affinities for TAGteam family members, the affinity of Zld for distinct TAGteam elements was tested using EMSAs. Similar to the binding profile for Grh, Zld bound most strongly to oligonucleotides containing the canonical CAGGTAG element. However, the affinity of Zld for the two additional TAGteam elements was reversed from that of Grh: Zld bound the TAGGTAG element more strongly than the CAGGCAG element. These data show that at least two TAGteam-binding factors are present in the early embryo, and that the affinities of these factors for various TAGteam elements differ. While it was previously unknown if all of the related TAGteam elements are equally effective in driving gene expression, the data demonstrating that the transcriptional activator Zld as well as the transcription factor Grh have different affinities for the related TAGteam sequences suggest that it is unlikely they are. Although all three TAGteam sequences are enriched in promoters of pre-CB expressed genes, their differential recognition by these two transcription factors may result in distinct effects on the levels or timing of gene expression (Harrison, 2010).
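
Because the two factors prefer different TAGteam variants, it can be useful to record which variant occurs where in a promoter. The following is a minimal sketch of such a scan, assuming plain exact matching of the three heptamers named above on both strands. The function names and the toy sequence are hypothetical and are not taken from SxlPe, the zen VRE, or the sc promoter.

```python
import re

# The three TAGteam heptamers named in the text.
TAGTEAM = ("CAGGTAG", "CAGGCAG", "TAGGTAG")


def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_tagteam(seq):
    """Return sorted (position, strand, motif) tuples for exact TAGteam matches.

    Positions are 0-based starts on the given strand. A lookahead pattern is
    used so that overlapping sites are all reported.
    """
    seq = seq.upper()
    hits = []
    for motif in TAGTEAM:
        for strand, pattern in (("+", motif), ("-", revcomp(motif))):
            for match in re.finditer("(?=%s)" % pattern, seq):
                hits.append((match.start(), strand, motif))
    return sorted(hits)


# Hypothetical toy sequence: one CAGGTAG, one TAGGTAG, and one CAGGCAG
# on the reverse strand (written as CTGCCTG).
toy = "AACAGGTAGTTTAGGTAGCCCTGCCTGAA"
for hit in find_tagteam(toy):
    print(hit)
```

Exact matching is the simplest possible choice. It says nothing about affinity, which in the experiments above also depended on which protein was tested.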

Because both Grh and Zld bind to TAGteam elements, tests were performed to see whether both proteins could bind these sequences simultaneously or whether instead they compete for binding. For these assays it was imperative that probe bound by Zld could be distinguished from that bound by Grh. While Grh is smaller than Zld, it binds DNA as a dimer, and binding of the Grh dimer in EMSAs resulted in shifted species that were difficult to distinguish from those shifted by Zld binding. Therefore a C-terminal portion of Grh containing the DNA-binding and dimerization domains (amino acids 603-1032) was expressed and purified. This truncated form of Grh binds to TAGteam-containing sequences, but its binding is easily distinguishable by EMSA from that of Zld. EMSAs performed with a probe containing two overlapping CAGGCAG elements from SxlPe and low amounts of Grh(603-1032) resulted in a single shifted species. Increasing amounts of protein produced a slower migrating species, likely due to the binding of a second Grh dimer, suggesting that Grh binds to each of the CAGGCAG elements in the probe. Probes corresponding to portions of the zen VRE or sc promoter containing TAGteam elements yielded similar results. Higher levels of Grh(603-1032) were required to bind both TAGteam elements in the zen VRE probe than for the SxlPe or sc probes, as expected from previous findings that Grh binds weakly to the TAGGTAG variant. The full-length rGrh protein showed similar binding behavior, suggesting that the DNA-binding and dimerization domains alone control binding-site specificity. Additionally, it is noted that attempts to co-immunoprecipitate full-length Grh and Zld from embryonic extracts or a mixture of purified epitope-tagged proteins were negative, indicating that direct protein/protein interactions between the two full-length proteins are not likely. Having shown that rGrh and Grh(603-1032) have similar binding profiles and that binding of the truncated protein to oligonucleotide probes is easily distinguished from Zld binding, Grh(603-1032) was used in EMSAs to test for cooperativity or competition (Harrison, 2010).

The minimal amount of rZld required to saturate binding and eliminate free probe was determined experimentally. Reactions supplemented with increasing quantities of Grh(603-1032) showed reduced amounts of probe complexed with Zld and a concomitant increase in the amounts of probe bound by two Grh dimers, demonstrating that Grh competes with Zld for TAGteam binding. As predicted from previous results, Grh competed most weakly with Zld for binding to the zen VRE probe, which contains a TAGGTAG site to which Grh binds more weakly than Zld. Therefore Grh is capable of competing with the transcriptional activator Zld for binding to TAGteam sites, and these data suggest that Grh acts to repress transcription from TAGteam-containing promoters in the pre-CB embryo. Similarly, Grh has been shown to compete with an unidentified activator for binding to a TAGteam-related sequence in the dpp VRR. Thus, one possible mechanism for the previously reported Grh repression in the pre-CB embryo is competition with an activator for DNA binding (Harrison, 2010).
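
The qualitative outcome of the competition EMSAs can be illustrated with a textbook equilibrium model. The sketch below assumes a single site bound by either Zld or a Grh dimer, never both, and protein in excess over probe. The dissociation constants and concentrations are arbitrary illustrative units, not values measured in the study, and the function name is hypothetical.

```python
def occupancy(zld, grh_dimer, kd_zld, kd_grh):
    """Fraction of sites bound by Zld and by Grh under simple competition.

    Standard competitive-binding expression for one site and two mutually
    exclusive ligands; concentrations and Kd values share the same units.
    """
    z = zld / kd_zld
    g = grh_dimer / kd_grh
    denom = 1.0 + z + g
    return z / denom, g / denom


# Illustrative only: hold Zld fixed and titrate in Grh, as in the EMSA design.
for grh in (0.0, 1.0, 4.0, 16.0):
    zld_bound, grh_bound = occupancy(zld=10.0, grh_dimer=grh, kd_zld=1.0, kd_grh=1.0)
    print(grh, round(zld_bound, 3), round(grh_bound, 3))
```

In this model, raising kd_grh for a given site (weaker Grh binding, as reported for TAGGTAG) reduces how effectively Grh displaces Zld, in line with the weak competition seen on the zen VRE probe.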

Given that at approximately equal molar amounts Grh and Zld compete for TAGteam binding, the relative levels of each protein were determined at different times during early embryonic development to learn whether they would have an opportunity to compete in the early embryo. Embryos were harvested at one-hour time intervals after egg laying (AEL), and levels of Grh and Zld were compared using quantitative immunoblots. Grh levels were constant in the early embryo. By contrast, levels of Zld were low in the 0-1 hour embryos and increased in the 1-2 hour embryos, when early gene expression initiates. To allow for a comparison between the relative amounts of each protein in the early embryo, approximate protein concentrations for Zld and Grh were determined by comparison of the immunoblot signals with the signal obtained from known amounts of recombinant protein. Each embryo contains approximately 1.8 × 10^9 molecules of Grh regardless of age, equating to ~9 × 10^8 molecules of Grh dimers with DNA-binding activity. This estimate is based on the fact that, as determined by gel filtration chromatography, little or no Grh protein exists as a monomer. Zld levels were ~7 × 10^8 molecules per embryo in the 0-1 hour embryo and increased to ~1.5 × 10^9 molecules per embryo in the 1-2 and 2-3 hour embryos. Thus, in the early embryo when there is no zygotic transcription, Grh levels are higher than Zld levels. These data, in combination with the fact that Grh may have a slightly higher affinity for CAGGTAG sites than Zld, suggest a model wherein Grh is likely bound to TAGteam elements and helps to maintain a transcriptionally silent state in the pre-CB embryo. At the time that early zygotic transcription initiates Zld levels have increased, raising the possibility that Zld now outcompetes Grh for TAGteam binding and thus helps drive gene expression (Harrison, 2010).
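
The shift in the balance between the two factors follows directly from the per-embryo estimates quoted above. The short calculation below uses only those approximate numbers; the variable names are illustrative.

```python
# Approximate per-embryo estimates quoted in the text (Harrison, 2010).
grh_monomers = 1.8e9          # Grh molecules, constant with age
zld_0_1h = 7e8                # Zld molecules, 0-1 hour embryos
zld_1_3h = 1.5e9              # Zld molecules, 1-2 and 2-3 hour embryos

# Grh binds DNA as a dimer, so the number of DNA-binding units is half.
grh_dimers = grh_monomers / 2

ratio_early = grh_dimers / zld_0_1h   # Grh dimers per Zld molecule, 0-1 h
ratio_late = grh_dimers / zld_1_3h    # Grh dimers per Zld molecule, 1-3 h

print(f"Grh dimers per embryo: {grh_dimers:.1e}")
print(f"Grh dimer : Zld, 0-1 h: {ratio_early:.2f}")
print(f"Grh dimer : Zld, 1-3 h: {ratio_late:.2f}")
```

The ratio falls from about 1.3 Grh dimers per Zld molecule in the first hour to about 0.6 afterwards, which is the reversal the competition model relies on.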

Validating the suggestion that Grh is not an essential activator of early gene expression, maternal depletion of grh does not result in obvious defects in cellular blastoderm formation or viability. Furthermore, in situ hybridizations have not shown any obvious effects of maternal depletion or overexpression of Grh on the expression patterns of zen or tll in the stage 5 embryo; it is unclear whether this is because grh expression is only being perturbed in the maternal germline. It is possible that premature expression of pre-CB genes resulting from the maternal depletion of grh will not result in a significant phenotypic consequence unless the embryo is subject to stress or Zld levels are perturbed, resulting in a failure to detect abnormal expression patterns for zen and tll. In addition, the extra maternally deposited Grh in the overexpression experiments may be overcome by the increase in Zld levels that occurs one hour after fertilization. Alternatively, the additional Grh binding sites in the pre-CB promoters might have other functions in regulating gene expression that confound these experiments where the expression of the native genes was observed. Importantly, no expansion of tll expression was observed in embryos maternally depleted for grh despite previously published reports to the contrary. The only difference between these experiments and the published experiments was that the FLP-FRT system was used to generate embryos lacking maternal grh, while the previous work relied on X-ray induced mitotic recombination. Thus, it is suggested that the expansion of tll observed in the previous work may have been due to an unrelated defect caused by the irradiation (Harrison, 2010).

It is noted that overexpression of grh in the maternal germline leads to defects in nuclear division in the blastoderm embryo reminiscent of the defects observed in zld mutant embryos or when Zld levels are decreased by RNAi, supporting the model that Grh acts as a transcriptional repressor by competing with Zld for DNA binding. Anaphase bridges between dividing nuclei, aberrant cell divisions perpendicular to the normal plane of division, and a lack of synchronicity in cell divisions were detected in about 50% of the blastoderm embryos generated from mothers of two different lines overexpressing grh in the maternal germline. None of these defects were noted in wild-type siblings. Interestingly, Grh overexpression in the pre-CB embryo resulted in ~50% reduction in hatching, indicating that these cell-division defects may ultimately decrease embryo viability. These observations are consistent with the competition model suggested by the in vitro experiments. When Grh is overexpressed in the maternal germline the resulting abnormally high levels of Grh in pre-CB embryos may disrupt the ability of Zld to function properly in the very early embryo by competing for TAGteam-binding sites (Harrison, 2010).

In summary, these data suggest that the concentrations of at least two TAGteam-binding factors (Grh and Zld), as well as the sequence variants of the TAGteam elements in the promoters, regulate gene expression in the pre-CB embryo, ensuring that transcription does not initiate prematurely. In its simplest form the model from existing data is that Grh acts to inhibit premature transcription in the pre-CB embryo during the first hour following fertilization by blocking the ability of Zld to bind to TAGteam sites and activate gene expression. As Zld levels increase during the second hour, Zld now successfully competes against the constant level of Grh for TAGteam binding and activates gene expression. This competition between Grh and Zld can ensure that despite minor fluctuations in Zld levels or other stochastic activating events, expression of pre-CB genes will not initiate prematurely. This model is supported by previous work showing that Grh binds to repressive elements in the tll and dpp promoters, mutation of the Grh binding site can cause an expansion of dpp expression, and Grh competes with an unidentified activator for binding to sites in the dpp promoter. Furthermore, as Zld and Grh bind differentially to discrete TAGteam variants, activation at different promoters can be fine-tuned by the combination of TAGteam sequences present. This differential binding preference may explain, in part, how different pre-CB genes initiate transcription at precise nuclear cycles. Thus Grh, Zld and the TAGteam elements could combinatorially regulate transcription in the pre-CB embryo, establishing the foundation for proper future embryonic development (Harrison, 2010).
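The competition model summarized above can be illustrated with a minimal sketch of two factors competing for a single site at equilibrium. This is not taken from Harrison (2010): the function name, the dissociation constants, and the concentration values are hypothetical and in arbitrary units, chosen only to show how a constant Grh level can hold Zld occupancy low until Zld rises.

```python
def zld_occupancy(zld, grh, kd_zld=1.0, kd_grh=1.0):
    """Fraction of one TAGteam site occupied by Zld under simple competitive binding.

    All concentrations and dissociation constants are hypothetical and share
    the same arbitrary units; they are not measured values.
    """
    z = zld / kd_zld  # Zld concentration scaled by its affinity
    g = grh / kd_grh  # Grh concentration scaled by its affinity
    return z / (1.0 + z + g)


if __name__ == "__main__":
    grh_level = 5.0  # assumed constant, as in the model described above
    for zld_level in (0.5, 2.0, 10.0, 50.0):
        occ = zld_occupancy(zld_level, grh_level)
        print(f"Zld={zld_level:5.1f}  Grh={grh_level:.1f}  Zld occupancy={occ:.2f}")
```

Changing `kd_zld` and `kd_grh` per site is one way to mimic the idea that different TAGteam variants are bound with different preference by the two factors.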

Zelda binding in the early Drosophila melanogaster embryo marks regions subsequently activated at the maternal-to-zygotic transition

The earliest stages of development in most metazoans are driven by maternally deposited proteins and mRNAs, with widespread transcriptional activation of the zygotic genome occurring hours after fertilization, at a period known as the maternal-to-zygotic transition (MZT). In Drosophila, the MZT is preceded by the transcription of a small number of genes that initiate sex determination, patterning, and other early developmental processes; and the zinc-finger protein Zelda (ZLD) plays a key role in their transcriptional activation. To better understand the mechanisms of ZLD activation and the range of its targets, chromatin immunoprecipitation coupled with high-throughput sequencing (ChIP-Seq) was used to map regions bound by ZLD before (mitotic cycle 8), during (mitotic cycle 13), and after (late mitotic cycle 14) the MZT. Although only a handful of genes are transcribed prior to mitotic cycle 10, thousands of regions bound by ZLD were identified in cycle 8 embryos, most of which remain bound through mitotic cycle 14. As expected, early ZLD-bound regions include the promoters and enhancers of genes transcribed at this early stage. However, ZLD was also observed bound at cycle 8 to the promoters of roughly a thousand genes whose first transcription does not occur until the MZT and to virtually all of the thousands of known and presumed enhancers bound at cycle 14 by transcription factors that regulate patterned gene activation during the MZT. The association between early ZLD binding and MZT activity is so strong that ZLD binding alone can be used to identify active promoters and regulatory sequences with high specificity and selectivity. This strong early association of ZLD with regions not active until the MZT suggests that ZLD is not only required for the earliest wave of transcription but also plays a major role in activating the genome at the MZT (Harrison, 2011).

ZLD and the TAGteam sequences to which it binds were originally identified as key regulators of the early wave of zygotic transcription that precedes the MZT, and genome-wide measurements of ZLD binding validate this activity. However, this study has demonstrated that ZLD is also bound to the promoters and enhancers of more than a thousand genes that are not transcribed until the MZT, and that early ZLD binding is strongly associated with open chromatin and transcription factor binding during the MZT. Strong Zld binding to many genes was observed in cycle 8 embryos, including the early-transcribed genes sc, zerknüllt (zen) and even-skipped (eve) (see Zld binds to TAGteam elements in promoters and regulatory elements prior to zygotic activation). Thus, rather than being specifically involved in the onset of zygotic transcription, the data indicate that ZLD has a much wider role in activating the zygotic genome, although its specific molecular mechanism remains elusive (Harrison, 2011).

The sequence of ZLD offers few clues to its function. Its roughly 1,600 amino acids contain no known domains besides C2H2 zinc-fingers, and none of its orthologs (found only in arthropods) have been experimentally characterized. That ZLD is important in both promoters and enhancers, and that its binding seems to affect the distribution of a diverse collection of transcription factors, argue against it directly recruiting polymerase and transcription factors. It is proposed instead that ZLD acts as a generic activator of the zygotic genome by controlling chromatin accessibility and/or histone modifications in the regions where it is bound (Harrison, 2011).

There is increasingly good evidence that differences in chromatin state across the genome at the MZT play a major role in determining which regions are active. Various studies have assayed the state of chromatin in cycle 14 embryos and have shown that regions of concentrated transcription factor binding are strongly associated with regions of 'open' chromatin, and that temporal changes in DNA accessibility and transcription factor binding are often coordinated. Furthermore, a recent computational analysis that dissected the factors that influence the ability to predict transcription factor binding offers compelling evidence that, at least in the D. melanogaster blastoderm, the state of chromatin shapes, and does not simply reflect, transcription factor binding. But one important question left unanswered by these studies is how differences in chromatin state are established. The current data and analyses clearly implicate CAGGTAG sites and ZLD (Harrison, 2011).

It was already known that CAGGTAG sites are enriched in active promoters and regions of transcription factor binding at the MZT, and that the gain and loss of CAGGTAG sites is a major driver of changes in transcription factor binding at the MZT between different Drosophila species. This study shows that ZLD binds to these CAGGTAG sites in vivo; that there is a tight connection between ZLD binding, chromatin state and MZT activity; and, crucially, that ZLD binding precedes, by at least several mitotic cycles, transcription factor binding and transcription at regions active at the MZT. Thus it is in precisely the right places at the right time to act as a generic activator of the MZT (Harrison, 2011).

Although little is known about the chromatin state in the early embryo, the data of this paper support a model in which the genome transitions from a fairly uniform open state (in which ZLD binds to 65% of CAGGTAG sites) to the mosaic of open and closed domains known to exist in cycle 14 (and in which ZLD binds to only 39% of CAGGTAG sites). If this is correct, ZLD likely plays a role in managing this transition, recruiting or repelling chromatin remodeling proteins to the regions where it is bound in uniformly open chromatin at cycle 8 and thereby ensuring they remain open at cycle 14. It is, however, also possible that early ZLD binding to its MZT targets may represent opportunistic binding of the protein to accessible regions containing CAGGTAG sites, with its MZT-specific activity arising from binding closer in time to the MZT (Harrison, 2011).
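The "fraction of CAGGTAG sites bound" statistic quoted above can be sketched as a two-strand motif scan followed by an overlap test against bound regions. This is a minimal illustration, not the pipeline used in the study: the function names, the toy sequence, and the half-open `(start, end)` region format are assumptions made for the example.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA string (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def find_motif(seq, motif="CAGGTAG"):
    """Return sorted (start, strand) pairs, 0-based, for hits on either strand."""
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)  # allow overlapping hits
    return sorted(hits)


def fraction_bound(hits, motif_len, bound_regions):
    """Fraction of hits lying entirely inside any half-open (start, end) region."""
    if not hits:
        return 0.0
    bound = 0
    for start, _strand in hits:
        end = start + motif_len
        if any(r_start <= start and end <= r_end for r_start, r_end in bound_regions):
            bound += 1
    return bound / len(hits)


if __name__ == "__main__":
    toy_seq = "AACAGGTAGTTCTACCTGAA"  # hypothetical sequence, one hit per strand
    hits = find_motif(toy_seq)
    print(hits)
    print(fraction_bound(hits, len("CAGGTAG"), [(0, 10)]))  # hypothetical bound region
```

In a real analysis the bound regions would come from ChIP-Seq peak calls and the sequence from the genome assembly; neither is provided here.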

ZLD shares some compelling similarities with Xenopus β-catenin, which is required for expression of a subset of genes prior to the MZT. At least two genes, siamois and xnr3, require β-catenin for expression, but are not expressed until the MZT. β-catenin is required at or before the 32 cell stage to poise siamois and xnr3 for activation and helps to establish this poised state by recruiting the histone methyltransferase Prmt2 to the promoters of these genes. Thus β-catenin and ZLD are similarly required to drive pre-MZT expression of a subset of genes and also to poise additional genes for activation at the MZT. But unlike the specialized function of β-catenin, the current data suggest that ZLD acts globally to activate the zygotic genome (Harrison, 2011).

The proposed function for ZLD is reminiscent of the so-called 'pioneer' transcription factors. This concept was introduced to describe the role of FoxA1 in regulating gene expression in the developing mammalian liver. In the undifferentiated endoderm, FoxA1 is bound to the enhancer of the hepatocyte-expressed albumin gene (Alb1) before Alb1 is expressed. FoxA1 binding mediates chromatin decondensation, and this modified chromatin environment allows for the subsequent binding of additional transcription factors that drive liver-specific gene expression (Harrison, 2011).

However, in contrast to chromatin in multipotent progenitor cells, the chromatin of the totipotent cells of the early embryo is likely to be in a relatively 'open' conformation. Thus, ZLD may not actively mediate chromatin decondensation but rather may act to maintain regions of accessible chromatin. There is precedent for chromatin remodeling being involved in the MZT. In mice, the chromatin-remodeling enzyme BRG1 is required for zygotic genome activation (Harrison, 2011).

Work in embryonic stem cells and in zebrafish embryos suggests that transcriptional activation at the MZT also involves specific histone modifications. In zebrafish, histones acquire modification patterns reminiscent of pluripotent embryonic stem cells as the embryo progresses through the MZT. Most notably, histone H3 acquires both marks of active transcription, tri-methylation on lysine four (H3K4me3), and of repression, tri-methylation on lysine 27 (H3K27me3). These bivalent histone marks were initially observed in embryonic stem cells and have been shown to poise the genomes of these cells for differentiation (Harrison, 2011).

Such bivalent marks have not been observed in Drosophila. However the earliest embryos examined were 4–12 hours, after the embryo has transitioned through the MZT and its cells are no longer fully pluripotent. Recently, it has been shown that in embryonic stem cells, bivalent domains are resolved as cells differentiate, raising the possibility that bivalent domains are present in Drosophila but no longer evident in the post-gastrulation embryos that have been examined. Perhaps ZLD works by recruiting or otherwise influencing the recruitment of proteins that modify chromatin, or by modifying chromatin itself. However, the fact that no bivalent domains have been observed in Drosophila or in Xenopus leaves open the possibility that ZLD is acting through a different mechanism. It is imperative that careful genome-wide analysis of histone modifications be performed in Drosophila and other species as they transition through the MZT to determine whether the formation of bivalent chromatin domains is a common characteristic of pluripotent cells (Harrison, 2011).

What differentiates ZLD target genes expressed prior to the MZT from those genes expressed only later? The genes most highly-bound by ZLD are transcribed by cycle 10. In one case, it has been shown that increased ZLD binding alone can lead to precocious activation, and it is possible that high levels of ZLD binding to promoters and proximal enhancers is sufficient to activate expression. However, most ZLD bound regions are not active until cycle 14. The generally lower levels of ZLD binding to these regions may necessitate the presence of other factors (such as patterning transcription factors or STAT92E) not expressed or activated until closer to the MZT. In this way ZLD would act indirectly to keep chromatin open at these regions until these other factors are able to exert their control. Alternatively, ZLD may act to directly recruit a zygotically expressed coactivator to the regulatory regions of genes expressed at the MZT. For example, ZLD could recruit factors, such as P-TEFb, that work to release stalled RNA polymerase II or, similar to β-catenin, recruit chromatin-modifying enzymes. The ability of ZLD to activate transcription could also be modulated by post-translational modifications to the protein itself (Harrison, 2011).

It is worth noting that before zygotic induction Drosophila embryos are undergoing rapid rounds of DNA replication and ORC, the replication initiator, does not bind to specific sequences, but rather depends upon access to open chromatin. Hence ZLD, with its potential role in shaping the chromatin landscape may also play a key role prior to transcription initiation in allowing for the proper assembly and spacing of pre-replication sites, and CAGGTAG may be a good predictor of origins. As the embryo progresses through the MZT, ORC binding becomes less closely spaced and origin firing becomes less synchronous suggesting that DNA replication reflects a changing chromatin environment (Harrison, 2011).

It is noteworthy that ZLD may activate distinct sets of genes by different mechanisms. TAGteam sites were first defined as sequence elements driving the expression of a small number of genes prior to the MZT (2006). It was therefore assumed that the TAGteam-binding protein, ZLD, might function specifically to activate this subset of genes. However, this study has shown that ZLD is marking the genome for widespread transcriptional activation of the zygotic genome at cycle 14. Perhaps ZLD is able to directly activate the small subset of genes expressed prior to the MZT, whereas ZLD-mediated gene activation at the MZT requires additional zygotically expressed cofactors or post-translational modifications (Harrison, 2011).

Given the ability of transcription factors such as β-catenin, FoxA1, and ZLD to mark genes for subsequent activation, and the recent evidence that chromatin remodeling, histone modifications and RNA polymerase II occupancy prepare developmental genes for later transcription, it is suggested that the poising of genomes for subsequent activation is likely to be a common feature of pluripotent cells. Determining the roles of these mechanisms in regulating gene expression at this important developmental timepoint will be crucial to understanding how these cells are poised for differentiation and how subsequent activation can be regulated to drive specific cell fates (Harrison, 2011).

HOT regions function as patterned developmental enhancers and have a distinct cis-regulatory signature

HOT (highly occupied target) regions bound by many transcription factors are considered to be one of the most intriguing findings of the recent modENCODE reports, yet their functions have remained unclear. This study tested 108 Drosophila melanogaster HOT regions in transgenic embryos with site-specifically integrated transcriptional reporters. In contrast to prior expectations, 102 (94%) were found to be active enhancers during embryogenesis and to display diverse spatial and temporal patterns, reminiscent of expression patterns for important developmental genes. Remarkably, HOT regions strongly activate nearby genes and are required for endogenous gene expression, as was shown using bacterial artificial chromosome (BAC) transgenesis. HOT enhancers have a distinct cis-regulatory signature with enriched sequence motifs for the global activators Vielfaltig, also known as Zelda, and Trithorax-like, also known as GAGA. This signature allows the prediction of HOT versus control regions from the DNA sequence alone (Kvon, 2012).

Taken together, these data show that Drosophila HOT regions function as cell type-specific transcriptional enhancers to up-regulate nearby genes during early embryo development. In contrast to prior expectations, HOT enhancers display diverse spatial and temporal activity patterns, which are reminiscent of expression patterns of important developmental genes. It was further found that the activity of many HOT enhancers appears to be unrelated to the expression of the bound transcriptional activators, suggesting that neutral TF binding to HOT regions is frequent. Interestingly, for Twi, Kr, and five additional TFs, it was found that HOT enhancers with functional footprints of the TFs are significantly enriched in the TFs' motifs compared with HOT enhancers to which the TFs seem to bind neutrally (e.g., 2.2-fold for Twi). This supports previous suggestions that the recruitment of TFs to HOT regions might be independent of the TFs' motifs and mediated by protein-protein interactions or nonspecific DNA binding. This seems to be particularly true for HOT regions to which the TFs bind neutrally without impact on the regions' transcriptional enhancer activity (Kvon, 2012).

By uncovering a distinct cis-regulatory signature that is characteristic and predictive of HOT regions, computational analysis establishes a link between HOT regions, early embryonic enhancers (EEEs), and maternal TFs that are ubiquitously present in the early Drosophila embryo. Specifically, the results suggest that ZLD might be more generally important for the establishment of regulatory elements in the early embryo, while GAGA appears to be a distinguishing feature of HOT regions. This is supported by an analysis of genome-wide data on ZLD and GAGA binding in early Drosophila embryos: While 71.4% of HOT regions and 75.0% of EEEs are bound by ZLD (compared with 42.2% and 13.0% of control WARM and COLD regions), GAGA binds to 53.4% of HOT regions but only 20.0% of EEEs (compared with 28.3% and 7.8% for WARM and COLD regions). Even when considering only regions that are functioning as transcriptional enhancers in the early embryo (all EEEs from CAD and this study combined), GAGA binds to significantly more HOT enhancers than to enhancers that are not HOT. An instructive role for ZLD in defining chromatin that is open and accessible to other factors is further supported by its unusual property to bind to the majority (64%) of all occurrences of its sequence motif in the Drosophila genome. ZLD might thus be a prerequisite for both HOT regions and EEEs more generally. Similarly, a role for GAGA in nucleating or promoting the formation of TF complexes is consistent with its ability to self-oligomerize via its BTB/POZ domain and also form heteromeric complexes with the TF Tramtrack and potentially other BTB/POZ domain-containing TFs (e.g., Abrupt, Bric-a-brac, Broad complex, and others). GAGA, with its ability to recruit other TFs by protein-protein interactions, might contribute to HOT regions independent of the specific cellular or developmental context. Interestingly, C. elegans HOT regions are also strongly enriched in the GAGA motifs, and the motif is the most important sequence feature when classifying C. elegans HOT versus control regions. GAGA-like factors or their putative homologs or functional analogs across species might be a conserved feature of metazoan HOT regions (Kvon, 2012).

Establishment of regions of genomic activity during the maternal to zygotic transition

This study describes the genome-wide distributions and temporal dynamics of nucleosomes and post-translational histone modifications throughout the maternal-to-zygotic transition in embryos of Drosophila melanogaster. At mitotic cycle 8, when few zygotic genes are being transcribed, embryonic chromatin is in a relatively simple state: there are few nucleosome free regions, undetectable levels of the histone methylation marks characteristic of mature chromatin, and low levels of histone acetylation at a relatively small number of loci. Histone acetylation increases by cycle 12, but it is not until cycle 14 that nucleosome free regions and domains of histone methylation become widespread. Early histone acetylation is strongly associated with regions that were previously shown to be bound in early embryos by the maternally deposited transcription factor Zelda, suggesting that Zelda triggers a cascade of events, including the accumulation of specific histone modifications, that plays a role in the subsequent activation of these sequences (Li, 2014).

STAT is an essential activator of the zygotic genome in the early Drosophila embryo

In many organisms, transcription of the zygotic genome begins during the maternal-to-zygotic transition (MZT), which is characterized by a dramatic increase in global transcriptional activities and coincides with embryonic stem cell differentiation. In Drosophila, it has been shown that maternal morphogen gradients and ubiquitously distributed general transcription factors may cooperate to upregulate zygotic genes that are essential for pattern formation in the early embryo. This study shows that Drosophila STAT (STAT92E) functions as a general transcription factor that, together with the transcription factor Zelda, induces transcription of a large number of early-transcribed zygotic genes during the MZT. STAT92E is present in the early embryo as a maternal product and is active around the MZT. DNA-binding motifs for STAT and Zelda are highly enriched in promoters of early zygotic genes but not in housekeeping genes. Loss of Stat92E in the early embryo, similarly to loss of zelda, preferentially down-regulates early zygotic genes important for pattern formation. STAT92E and Zelda synergistically regulate transcription. It is concluded that STAT92E, in conjunction with Zelda, plays an important role in transcription of the zygotic genome at the onset of embryonic development (Tsurumi, 2011).

This study describes a bioinformatics approach to investigating the mechanisms controlling transcription of the zygotic genome that occurs during the MZT; STAT92E was identified as an important general transcription factor essential for up-regulation of a large number of early 'zygotic genes'. The role of STAT92E was described in controlling transcription of a few representative early zygotic genes, such as dpp, Kr, and tll, that are important for pattern formation and/or cell fate specification in the early embryo. These studies suggest that STAT92E cooperates with Zelda to control transcription of many 'zygotic genes' expressed during the MZT. While STAT mainly regulates transcription levels, but not spatial patterns, of dpp, tll, and Kr, and possibly also other 'zygotic genes', Zelda is essential for both levels and expression patterns of these genes (Tsurumi, 2011).

The transcriptional network that controls the onset of zygotic gene expression during the MZT has remained incompletely understood. It has been proposed that transcription of the zygotic genome depends on the combined input from maternally derived morphogens and general transcription factors. The former are distributed in broad gradients in the early embryo and directly control positional information (e.g., Bicoid, Caudal, and Dorsal), whereas the latter are presumably uniformly distributed regulators that augment the upregulation of a large number of zygotic genes. Other than Zelda, which plays a key role as a general regulator of early zygotic expression, the identities of these general transcriptional activators have remained largely elusive. It has been shown that combining Dorsal with Zelda- or STAT-binding sites supports transcription in a broad domain in the embryo. The demonstration of STAT92E as another general transcription factor sheds light on the components and mechanisms of the controlling network in the early embryo. Moreover, STAT92E and Zelda may cooperate to synergistically regulate zygotic genes. The results thus validate the bioinformatics approach as useful in identifying ubiquitously expressed transcription factors that may play redundant roles with other factors and thus might otherwise be difficult to identify (Tsurumi, 2011).

The conclusion that STAT92E is important for the levels but not the spatial domains of target gene expression in the early embryo is consistent with several previous reports. It has been shown that in Stat92E or hop mutant embryos, expression of eve stripes 3 and 5 is significantly reduced but not completely abolished. In addition, JAK/STAT activation is required for the maintenance of high levels, but not initiation, of Sxl expression during the MZT. Moreover, it has previously been shown that STAT92E is particularly important for TorsoGOF-induced ectopic tll expression but not essential for the spatial domains of tll expression in wild-type embryos under normal conditions. On the other hand, Zelda may be important for both levels and spatial patterns of gene expression. This idea is consistent with the finding that Zelda-binding sites are enriched in both promoter and promoter-distal enhancer regions, whereas STAT-binding sites are enriched in promoter regions only. It has been reported that pausing of RNA polymerase II is prominently detected at promoters of highly regulated genes, but not in those of housekeeping genes. In light of these results that STAT and Zelda sites are highly enriched in the early zygotic gene promoters, it is suggested that these transcription factors might contribute to chromatin remodeling that favors RNA polymerase II pausing at these promoters (Tsurumi, 2011).

Finally, the MZT marks the transition from a totipotent state to that of differentiation of the early embryo. As a general transcription factor at this transition, STAT, together with additional factors (such as Zelda), is important for embryonic stem cell differentiation. Further investigation is required to understand the molecular mechanism by which STAT and Zelda cooperate in controlling zygotic transcription in the early Drosophila embryo. Moreover, it would be interesting to investigate whether STAT plays similar roles in embryonic stem cell differentiation in other animals (Tsurumi, 2011).

Temporal coordination of gene networks by Zelda in the early Drosophila embryo

In past years, much attention has focused on the gene networks that regulate early developmental processes, but less attention has been paid to how multiple networks and processes are temporally coordinated. Recently the discovery of the transcriptional activator Zelda (Zld), which binds to CAGGTAG and related sequences present in the enhancers of many early-activated genes in Drosophila, hinted at a mechanism for how batteries of genes could be simultaneously activated. This study used genome-wide binding and expression assays to identify Zld target genes in the early embryo with the goal of unraveling the gene circuitry regulated by Zld. Zld was found to bind to genes involved in early developmental processes such as cellularization, sex determination, neurogenesis, and pattern formation. In the absence of Zld, many target genes failed to be activated, while others, particularly the patterning genes, exhibited delayed transcriptional activation, some of which also showed weak and/or sporadic expression. These effects disrupted the normal sequence of patterning-gene interactions and resulted in highly altered spatial expression patterns, demonstrating the significance of a timing mechanism in early development. In addition, prevalent overlap was observed between Zld-bound regions and genomic 'hotspot' regions that are bound by many developmental transcription factors, especially the patterning factors. This, along with the finding that the most over-represented motif in hotspots, CAGGTA, is the Zld binding site, implicates Zld in promoting hotspot formation. It is proposed that Zld promotes timely and robust transcriptional activation of early-gene networks so that developmental events are coordinated and cell fates are established properly in the cellular blastoderm embryo (Nien, 2011).

A combined approach of Zld ChIP-chip profiling, expression profiling, and genetic analysis revealed a wide-ranging regulatory role for Zld, and provides new insights into how essential embryonic processes are coordinated during early development. The results demonstrate that Zld is required for timely and robust target-gene responses. The observed increase in Zld protein levels in the second hour of development raises the possibility that a 'temporal gradient' of ubiquitously distributed Zld functions together with the spatial gradients of the patterning morphogens to define spatiotemporal specificity of zygotic gene expression in the early embryo (Nien, 2011).

Zld binding analyses indicate that there are at least eight TAGteam sites. CAGGTAG and CAGGTAA were the most over-represented and the most highly conserved in the Zld-bound regions. About half of Zld binding is TAGteam site dependent, and all of the sex determination, cellularization, and patterning genes that were studied have TAGteam sites in their enhancers and in many cases near the TSS. Curiously, within the CAGGTAG site is CAGGTA, a motif found strongly enriched in hotspots. Likewise, TATCGAT, CT-repeat, and CAC-related sites are similar to additional motifs found in hotspots: GTATCGAT, CTCTCTCTCT, and CTCACACG, respectively, which were proposed by modENCODE to be 'candidate drivers' of hotspot formation. TATCGAT is contained within the DRE (DNA replication related element) octamer site, TATCGATA, which is found near the TSS of genes involved in DNA replication. Additionally, TATCGATA is similar to the BEAF-32 insulator site. The CT-repeat site is also associated with an insulator sequence, the Trl/GAF motif. It is unclear how Zld interacts with the non-TAGteam sequences since TATCGAT, for example, does not appear to bind Zld in vitro. It is possible that the enrichment of these sites in Zld-bound regions is due to recruitment of Zld by components of complexes that directly interact with these sequences, or to opportunistic Zld interactions. Thus, at least for hotspots with the CAGGTA motif, which was discovered in the hotspots with highest complexity (bound by 12-14 factors), it is possible that Zld binding is involved in their establishment (Nien, 2011).

The idea of an 'initial step in the cascade of zygotic gene interactions that control development' was first proposed by Edgar and Schubiger (1986), and the idea of a 'timer' in early development that functions alongside the spatially restricted morphogens has been proposed for CAGGTAG sites. The current combined results on Zld extend both of these ideas. Zld protein accumulates to high levels by one hour of development, which coincides with the onset of zygotic genome activation. Within a two-hour period, the embryo cellularizes, determines X-chromosome dosage, patterns its body plan, and gets ready for gastrulation. By virtue of a single factor these processes are coordinately activated (Nien, 2011).

One can predict that increasing Zld levels in early embryos would advance timing of activation. Initial attempts to increase Zld protein levels by adding copies of Zld rescue constructs did not yield higher Zld protein levels, indicating that Zld levels may be tightly regulated. However, it has been shown that doubling the number of TAGteam sites in the zen enhancer leads to precocious expression, supporting the idea that Zld acts to time zygotic gene activation (Nien, 2011).

In the absence of Zld, all direct targets are either: (1) not expressed, (2) delayed but recover, or (3) delayed but do not recover fully. For example, genes involved in sex determination, cellularization, dorsal patterning, and proneural development are strongly down-regulated in zld- and never recover. In contrast, genes involved in AP and ventral patterning were not significantly down-regulated, and how they recovered depended on how they responded to other factors. The high-level Dl targets sna and twi recovered by nuclear cycle 14, but the lower-level targets sog, brk, and rho did not recover their normal patterns in zld-; instead they were expressed sporadically in a narrow domain with great variability among embryos (see Zld potentiates Dl morphogenetic activity). It appears that intermediate levels of Dl are no longer sufficient for robust and faithful target-gene expression, and lower levels cannot activate them at all; thus, the Dl gradient cannot be interpreted without Zld. These effects are likely due to the lack of direct Zld input, as mutation of the TAGteam sites in the sog primary enhancer caused a similar narrowing of the reporter expression domain. Indirect effects of delayed twi expression may also contribute, since mutation of Twi binding sites in the rho enhancer also resulted in a narrower domain. These observations suggest that Zld not only acts as a timer for Dl target-gene activation, but also potentiates Dl morphogenetic activity over a broad range in the neuroectoderm in order to establish multiple threshold responses. Along the AP axis, Zld may function in a similar way with Bcd. In zld-, the hb border shifts anteriorly, indicating that in regions of low-level Bcd, Zld enhances the sensitivity of target genes to morphogen concentrations. These results imply that Zld may promote transcription by acting synergistically with the patterning morphogens. It is important to note that the observed delay in expression does not necessarily mean the gene is activated later, but that without the synergy factor, there are not enough detectable transcripts at the time when assayed. Sporadic expression may reflect a similar situation (Nien, 2011).

Beyond the role of Zld in timing transcriptional initiation is a more elaborate timing mechanism, exemplified by the sequential appearance of the gap genes. How does Zld achieve differential activation of target genes within a network? A simple model would suggest that the activation of Zld target genes correlates with the strength of Zld binding to their regulatory elements. It was noticed that the earlier activated genes in the segmentation network had higher binding scores than those activated later. gt, tll, and all of the primary pair-rule genes, which are abundantly expressed by nuclear cycle 10, had higher binding scores than kni and Kr, which become abundant later in nc 11 and nc 12, respectively (see Zld regulates timing within the gap gene network.). Later-acting genes such as secondary pair-rule genes, segment polarity genes, and the homeotic genes were bound, but had lower binding scores (see Pair-rule patterns are altered in zld- embryos). Such a mechanism where timing of activation is dependent on strength of binding was shown for the Pha-4 transcription factor in C. elegans pharyngeal development. Pha-4 regulates a wide array of genes expressed at different stages, and the onset of target-gene expression depends on the affinity of Pha-4 binding sites in the regulatory regions of those target genes. An intriguing possibility for the early Drosophila embryo is that as Zld levels rise in the first hour of development a 'temporal' concentration gradient is formed such that interaction with higher affinity binding sites would occur before that with lower affinity sites, thus differentially activating target genes (Nien, 2011).
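The 'temporal' concentration gradient idea can be illustrated with a minimal Python sketch. It is not taken from the cited study: the linear rise in activator level, the threshold values, and the target names are all hypothetical, chosen only to show that targets needing less activator (higher affinity sites) would switch on earlier.

```python
def activation_times(thresholds, rate=1.0):
    """Time at which a linearly rising activator level (rate * t) first
    reaches each target's activation threshold.

    thresholds: dict mapping a target name to the activator level it needs.
    Both the linear rise and the thresholds are illustrative assumptions.
    """
    return {target: level / rate for target, level in thresholds.items()}


# Hypothetical thresholds in arbitrary units: a high-affinity target is
# assumed to need less activator than a low-affinity one.
thresholds = {
    "high_affinity_target": 1.0,
    "medium_affinity_target": 2.0,
    "low_affinity_target": 4.0,
}

times = activation_times(thresholds, rate=0.5)
order = sorted(times, key=times.get)  # earliest-activated target first
```

Under these assumptions the activation order simply follows the threshold order; any monotonically rising activator level would give the same ranking.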

A second timing mechanism is provided by the intrinsic properties of the regulatory motifs established by Zld. The data revealed that Zld functions in several coherent feed forward loops, for example, binding both the XSE (X-chromosome signal element) genes (such as sisA) and Sxl, dpp and its targets, and twi and rho. Embedded in this type of motif is a mechanism of temporal control since a delay in the activation of the third gene in the loop occurs because of its dependence on accumulation of the second gene product. For example, the activation of Sxl is 2-3 nuclear cycles later than that of the XSE genes. In addition, experiments that abolished the TAGteam sites in the SxlPe enhancer caused a 3 nuclear cycle delay in reporter expression, demonstrating a direct role for these sites, and hence Zld, in timing transcriptional activation (Nien, 2011).
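The delay built into a coherent feed forward loop can be sketched with a toy discrete-time model. This is only an illustration under stated assumptions: the AND-gate logic, the accumulation rate, the threshold, and the function name `simulate_cffl` are hypothetical, and the time steps are not meant to correspond to nuclear cycles.

```python
def simulate_cffl(steps=10, rate=1.0, y_threshold=3.0):
    """Toy coherent feed forward loop: X -> Y, and (X AND Y) -> Z.

    X is present from step 0. Y starts accumulating at `rate` per step as
    soon as X is present. Z switches on at the first step where X is
    present and the Y product has reached `y_threshold`.
    Returns (onset step of Y, onset step of Z); Z onset is None if the
    threshold is never reached within `steps`.
    """
    y_level = 0.0
    y_onset = None
    z_onset = None
    for t in range(steps):
        x_present = True  # assumption: the upstream factor is always present
        if x_present:
            if y_onset is None:
                y_onset = t
            y_level += rate
        if z_onset is None and x_present and y_level >= y_threshold:
            z_onset = t
    return y_onset, z_onset


y_on, z_on = simulate_cffl()
delay = z_on - y_on  # steps by which the third gene lags the second
```

Lowering `y_threshold` or raising `rate` shortens the lag, which is the sense in which the loop architecture itself encodes timing.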

Zld also functions in an incoherent feed forward loop whereby one branch of the loop has the opposite sign. Zld promotes transcription of both the pair-rule and gap genes, while gap proteins repress pair-rule genes. The primary pair-rule gene transcripts are easily detectable by nuclear cycle 10, even before some of the gap genes, giving a new perspective on the canonical segmentation gene hierarchy in which the pair-rule genes are downstream of the gaps. Early strong activation of the pair-rule genes may be essential to guarantee transcriptional activation before repressor gradients overwhelm the AP axis (Nien, 2011).

Clues can be extracted from the results about how Zld may function on a mechanistic level. First, Zld appears in zygotic nuclei very early, before Bcd and Dl, possibly binding to target genes first. Second, loss of Zld results in delayed transcriptional activation and, in many cases, weak and/or sporadic expression. Third, Zld binding is frequently found at early enhancers (both primary and shadow), as well as close to the TSS of genes, hinting at a role in recruitment of the transcriptional machinery. Fourth, Zld binding coincides with hotspots, which were found to correlate with regions of nucleosome depletion. Together these observations suggest that Zld increases the transcriptional activity, or expressivity, of target genes. Mechanistically, Zld binding could facilitate either the access of other factors (both activators and repressors) to DNA or the interaction of these factors with the transcriptional machinery, an idea put forth after a correlation was observed between the evolutionary turnover of the CAGGTAG site and that of the patterning factor binding sites (Nien, 2011).

An alternative mechanism to ensure robust and coordinated early embryonic expression is pol II pausing. Many Zld target genes such as sog have been shown to exhibit polymerase pausing. The delayed and sporadic expression in zld- could be explained by lack of paused pol II (Nien, 2011).

It is evident from these results that Zld coordinates the onset of transcriptional activity of the early gene networks during the MZT. Considering that Zld is also expressed at later times in development, it is predicted that Zld will act similarly to increase expressivity of genes in networks that function, for example, in central nervous system development in mid-stage embryos and imaginal disc patterning in larval development. In these processes, similar to the MZT, a simple strategy may be used to collectively activate and temporally control batteries of genes required for establishing the proper gene circuitry (Nien, 2011).

The TAGteam motif facilitates binding of 21 sequence-specific transcription factors in the Drosophila embryo

Highly overlapping patterns of genome-wide binding of many distinct transcription factors have been observed in worms, insects, and mammals, but the origins and consequences of this overlapping binding remain unclear. While analyzing chromatin immunoprecipitation data sets from 21 sequence-specific transcription factors active in the Drosophila embryo, this study found that binding of all factors exhibits a dose-dependent relationship with 'TAGteam' sequence motifs bound by the zinc finger protein Vielfaltig, also known as Zelda, a recently discovered activator of the zygotic genome. TAGteam motifs are present and well conserved in highly bound regions, and are associated with transcription factor binding even in the absence of canonical recognition motifs for these factors. Furthermore, levels of binding in promoters and enhancers of zygotically transcribed genes are correlated with RNA polymerase II occupancy and gene expression levels. These results suggest that Vielfaltig acts as a master regulator of early development by facilitating the genome-wide establishment of overlapping patterns of binding of diverse transcription factors that drive global gene expression (Satija, 2012).

These results suggest that binding of Vielfaltig to TAGteam sites facilitates or stabilizes the nearby binding of many of the transcriptional regulators active in the early Drosophila embryo, thereby providing a potential mechanism for Vielfaltig’s activation of the zygotic genome. Most transcription factor binding site motifs have limited predictive value in the absence of external information such as chromatin accessibility, but a clear association was observed between TAGteam motifs and binding of many transcription factors without Vielfaltig ChIP data. Taken together with the finding that the number of nearby transcription factor binding sites globally correlates with levels of binding only if a TAGteam motif is present, this suggests that TAGteam motifs may distinguish functional cis-regulatory modules (CRMs) from binding site clusters with little activity in the early embryo (Satija, 2012).

In order to assess the extent to which Vielfaltig binding near clusters of motifs for other factors may be sufficient to facilitate binding of other transcription factors, all promoters of coding genes with clusters of transcription factor binding sites (greater than or equal to eight motifs) were enumerated. The fraction of such promoters with high levels of overlapping binding (≥95th percentile of promoters) exhibited a dose-dependent relationship with the number of nearby TAGteam sites; 7% of promoters with one nearby TAGteam motif had high levels of overlapping binding, whereas 85% of promoters with more than three nearby TAGteam motifs did. Requiring that promoters have at least one TAGteam site conserved in D. yakuba within 250 bp of the TSS resulted in a still-higher fraction of promoters with high levels of binding. These results suggest that the presence of nearby TAGteam motifs is frequently, although not always, sufficient to facilitate binding of many transcription factors (Satija, 2012).
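A tabulation of this kind can be sketched in Python as follows. This is a minimal illustration, not the pipeline used in the cited study: it scans only for the CAGGTAG heptamer (one member of the TAGteam motif set) on both strands, and the input format, function names, and any example sequences are hypothetical.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_motif(seq, motif="CAGGTAG"):
    """Count occurrences of `motif` on either strand of `seq`."""
    seq = seq.upper()
    patterns = {motif, revcomp(motif)}
    k = len(motif)
    return sum(seq[i:i + k] in patterns for i in range(len(seq) - k + 1))


def fraction_highly_bound(promoters):
    """Group promoters by motif count and return, for each count, the
    fraction flagged as highly bound.

    promoters: iterable of (sequence, is_highly_bound) pairs, where the
    flag would come from a binding threshold chosen by the analyst
    (assumed here to be supplied, not computed).
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for seq, highly_bound in promoters:
        n = count_motif(seq)
        totals[n] += 1
        hits[n] += bool(highly_bound)
    return {n: hits[n] / totals[n] for n in totals}
```

Called on a list of promoter sequences with a precomputed high-binding flag, `fraction_highly_bound` returns a dictionary keyed by motif count, which is the shape of comparison described above (fraction highly bound versus number of nearby sites).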

Locally compact chromatin structure may help to explain the existence of regions with TAGteam motifs, but little binding. Promoters with low levels of binding (≤50th percentile of promoters) despite the presence of nearby TAGteam motifs had less accessible chromatin (~10-fold) than did promoters with high levels of binding (≥95th percentile). These results are compatible with recent findings that many Ciona intestinalis and Drosophila CRMs contain sequence signatures, independent of transcription factor binding sites, predicted to promote nucleosome depletion (Satija, 2012).

The ranking of heptamer association with binding demonstrated the uniqueness of TAGteam sites, but revealed non-TAGteam motifs associated with overlapping patterns of binding as well. Of the 30 highest-ranked heptamers associated with overlapping binding, all of the non-TAGteam motifs contained GA or GAG repeats, suggesting a potential association between binding of many factors and GAGA factor, which is encoded by the gene Trithorax-like and implicated in chromatin remodeling. Intriguingly, a 1.5-fold enrichment for GAGA motifs was observed in promoters with high levels of binding when TAGteam motifs were not present (39% of highly bound promoters without TAGteam sites contained GAGA motifs versus 25% with TAGteam sites), suggesting that GAGA factor may help to facilitate binding of transcription factors at some regions not associated with Vielfaltig (Satija, 2012).

It is proposed that Vielfaltig facilitates the binding of additional transcription factors to nearby low-affinity sites, although the mechanisms by which it may do so remain unclear. Vielfaltig has six C2H2 zinc fingers, two N-terminal and four C-terminal, of which the C-terminal zinc fingers are sufficient to recapitulate its known DNA-binding specificity. Vielfaltig may help to maintain chromatin in an accessible state, participate in permissive protein-protein interactions, or have particular interaction partners. To investigate this last possibility, whether nearby binding of any factor could help to explain the relationship between TAGteam motifs and overlapping binding was assessed. Intriguingly, after controlling for levels of Medea binding, little difference was observed in binding of all factors between regions with and without TAGteam motifs. Furthermore, ~80% of regions where Medea is highly bound contain a TAGteam motif—the highest fraction for any of the 21 factors—even in the absence of strong overlapping binding, suggesting that Medea may frequently interact with Vielfaltig (Satija, 2012).

Highly conserved sequence homologs of vielfaltig have been identified in other insects, but not in vertebrates (Staudt et al. 2006). However, overlapping patterns of genome-wide binding of diverse transcription factors appear to be a common feature of undifferentiated animal cells. It is proposed that other animals may possess functional homologs that associate with hotspots of binding of multiple transcription factors, thereby influencing global patterns of gene expression (Satija, 2012).

Uncovering cis-regulatory sequence requirements for context specific transcription factor binding: The activity of Twist-bound enhancers and Twist binding itself depends on Vielfaltig motifs

The regulation of gene expression is mediated at the transcriptional level by enhancer regions that are bound by sequence-specific transcription factors (TFs). Recent studies have shown that the in vivo binding sites of single TFs differ between developmental or cellular contexts. How this context-specific binding is encoded in the cis-regulatory DNA sequence has, however, remained unclear. This study computationally dissected context-specific TF binding sites in Drosophila, C. elegans, mouse, and human and found distinct combinations of sequence motifs for partner factors, which are predictive and reveal specific motif requirements of individual binding sites. It is predicted that TF binding in the early Drosophila embryo depends on motifs for the early zygotic TFs Vielfaltig (also known as Zelda) and Tramtrack. The activity of Twist-bound enhancers and Twist binding itself depends on Vielfaltig motifs, suggesting that Vielfaltig is more generally important for early transcription. The finding that the motif-content can predict context-specific binding and that the predictions work across different Drosophila species suggests that characteristic motif combinations are shared between sites, revealing context-specific motif codes (cis-regulatory signatures), which appear to be conserved during evolution. Taken together, this study establishes a novel approach to derive predictive cis-regulatory motif requirements for individual TF binding sites and enhancers. Importantly, the method is generally applicable across different cell-types and organisms to elucidate cis-regulatory sequence determinants and the corresponding trans-acting factors from the increasing number of tissue- and cell-type specific TF binding studies (Yanez-Cuna, 2012).

Recent ChIP experiments revealed that in vivo TF binding sites differ between different cell-types (or more generally cellular contexts), consistent with the frequent re-use of TFs in different cellular or developmental contexts and their context-specific functions. However, whether and how context-specific TF binding is encoded in the cis-regulatory sequences and the relation between the DNA sequence and in vivo binding has remained unclear (Yanez-Cuna, 2012).

This study used binding sites of a single TF in different contexts as pivots to study the sequence determinants of in vivo binding. By systematically comparing the binding site sequences, it was shown that they contain motifs for other TFs that are characteristic for each context and allow the prediction of context-specific binding. The motif-based predictions were sufficiently strong to pinpoint cis-regulatory requirements for individual binding sites, providing specific testable hypotheses, which were validated experimentally (Yanez-Cuna, 2012).

This finding has important implications for transcriptional regulation: First, it argues that context-dependent TF binding is determined by the cis-regulatory sequence, consistent with the sufficiency of enhancer sequences to recapitulate their endogenous chromatin state (i.e. histone modifications and DNA methylation) and activity in different contexts. Second, in vivo binding appears to be determined by combinations of TF motifs rather than a single TF’s motif, therefore substantially increasing the information content and specificity of in vivo binding. Individual motifs are often only 4-6 nucleotides long and would therefore occur every 256 to 4096 nucleotides by chance (i.e. in random DNA sequences, even when motif degeneracy is not taken into account). Third, as different motif combinations are functional, a single TF can have context-specific binding sites and target genes depending on both the cis-regulatory sequence that contains a certain combination of motifs and the cell-type that expresses the corresponding TFs. For example, Twist motifs in the vicinity of motifs for Snail, Dorsal, or Vielfaltig are preferentially bound early while those near motifs for Tinman (TIN) or Chorion factor 2 (CF2) are preferentially bound late, when these TFs are present, respectively (Yanez-Cuna, 2012).
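The chance-occurrence figures follow from simple arithmetic: in random DNA with equal base frequencies, a fixed motif of length k is expected about once every 4^k positions on a given strand. The short Python sketch below reproduces that calculation; the function names and the 1,000,000 bp example length are illustrative assumptions, and motif degeneracy and strand orientation are ignored.

```python
def expected_spacing(k, alphabet_size=4):
    """Expected number of positions between chance occurrences of a fixed
    k-mer in random sequence with equal base frequencies (one strand)."""
    return alphabet_size ** k


def expected_count(sequence_length, k, alphabet_size=4):
    """Approximate expected number of chance occurrences of a fixed k-mer
    in a random sequence of the given length (one strand)."""
    return sequence_length / expected_spacing(k, alphabet_size)


short_motif = expected_spacing(4)   # 4-nucleotide motif
long_motif = expected_spacing(6)    # 6-nucleotide motif
hits_per_mb = expected_count(1_000_000, 6)  # hypothetical 1 Mb of random DNA
```

Because such short motifs recur so often by chance, requiring a combination of motifs at the same site sharply reduces the number of expected random matches, which is the point made above about information content.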

Twist binding correlates with the binding of other mesodermal TFs (e.g. early with Myocyte enhancer factor 2 (MEF2) and TIN with Pearson correlation coefficients of 0.2 and 0.4, respectively) and ChIP-chip data for other mesodermal TFs are predictive of Twist binding using cross-validation, suggesting that partner TFs might assist each other's binding in a correlated fashion (Yanez-Cuna, 2012).

In general, the action of partner TFs might be direct, e.g. mediated by direct protein-protein interactions (suggested e.g. for 'condition-altered binding'), or passive, e.g. by opening or otherwise preparing chromatin for TF binding. Some of the uncharacterized motifs might for example recruit chromatin remodeling factors and one of them indeed correlates with nucleosome-depleted open chromatin. It is conceivable that chromatin-mediated functions might be temporally decoupled such that partner TFs could act sequentially rather than simultaneously (Yanez-Cuna, 2012).

The TF Vielfaltig’s TAGteam motif appears to be a key determinant of early Twist binding: It is enriched in early binding sites, required to successfully classify them in a predictive framework, required for function of four early enhancers with diverse activity patterns, and necessary for Twist binding. Similarly, the early binding sites of other factors are enriched in TAGteam motifs (e.g. early MEF2 binding sites), suggesting that it is a general determinant of early binding and enhancer function. Interestingly, Vielfaltig is maternally deposited and has been shown to bind to the TAGteam motifs, a set of motifs that are enriched in regulatory regions of early blastodermal genes (Yanez-Cuna, 2012).

Vielfaltig is sufficient to activate enhancers that contain TAGteam motifs and required for early gene expression and cellularization. It has further been shown that Vielfaltig binds to about 60% of all genomic instances of the TAGteam motif. The finding that Vielfaltig is a key determinant of early binding is intriguing and suggests that Vielfaltig might help to open (or keep open) chromatin and allow TFs to access their binding sites on DNA thereby defining early enhancers (Yanez-Cuna, 2012).

The motif for Tramtrack is important for early Twist binding. Maternal Tramtrack has been proposed to repress zygotic transcription of early patterning genes in a concentration-dependent manner, thereby explaining the timing of zygotic activation. Due to the overlap of different motifs, the 38% and 52% of early Twist binding sites that depend on Vielfaltig and Tramtrack are conservative estimates, and both factors are likely important for additional binding sites. This study suggests that Vielfaltig and Tramtrack play an important regulatory role in the early embryo, preparing and/or regulating enhancers of a broad set of genes (Yanez-Cuna, 2012).

The finding that context-specific TF binding can be predicted using cross-validation indicates that the motif combinations extracted from training sequences are sufficiently general to correctly predict previously unseen test sequences. This means that different sites share characteristic sequence features and might function by similar means. In fact, similar patterns of motifs enriched in binding sites for different TFs are found in the same context (e.g. the early Drosophila mesoderm), suggesting that different cell-types have specific 'codes' that are indicative of binding for different TFs. This is generally true for all datasets studied in species as diverse as human, mouse, C. elegans, and Drosophila. In its ability to discover which cis-regulatory motifs (and the corresponding TFs) are relevant for different functionally defined sets of sequences (e.g. those active or bound in defined cellular contexts), this approach is similar to recent k-mer based enhancer predictions in mammals (Lee, 2011). It is complementary to recent thermodynamic models of gene expression in the early Drosophila embryo. Here, all relevant TFs, their motifs, and their cellular protein-concentrations are known, and the models predict enhancer activity for selected DNA sequences in order to gain insights into mechanistic aspects of transcriptional regulation, e.g. the importance of weak binding or homo- and heterotypic TF-TF interactions (Yanez-Cuna, 2012).

This study shows that motif-analyses of context-specific binding sites can identify the precise cis-regulatory sequence requirements and the trans-acting factors for individual genomic sites. This has important implications for the many TFs such as Hox factors or TFs downstream of signaling pathways, which are broadly expressed but regulate certain genes specifically in some tissues but not in others: it is foreseen that the recent increase in cell-type specific ChIP analyses will reveal specific cis-regulatory requirements and the corresponding trans-acting factors that define the regulatory state for many cell-types. As TF-binding has been shown to be predictive of cell-type specific enhancer activity, this will bridge the gap between the sequence, TF binding, and enhancer/CRM function and will ultimately reveal how cell-type specific regulatory information is encoded in the DNA sequence (Yanez-Cuna, 2012).

Mutations of the Drosophila zinc finger-encoding gene vielfältig impair mitotic cell divisions and cause improper chromosome segregation.

This study describes the molecular characterization and function of vielfältig (vfl), an X-chromosomal gene that encodes a nuclear protein with six Krüppel-like C2H2 zinc finger motifs. vfl transcripts are maternally contributed and ubiquitously distributed in eggs and preblastoderm embryos, excluding the germline precursor cells. Zygotically, vfl is expressed strongly in the developing nervous system, the brain, and in other mitotically active tissues. Vfl protein shows dynamic subcellular patterns during the cell cycle. In interphase nuclei, Vfl is associated with chromatin, whereas during mitosis, Vfl separates from chromatin and becomes distributed in a granular pattern in the nucleoplasm. Gain-of-function and lack-of-function studies show that vfl activity is necessary for normal mitotic cell divisions. Loss of vfl activity disrupts the pattern of mitotic waves in preblastoderm embryos, elicits asynchronous DNA replication, and causes improper chromosome segregation during mitosis (Staudt, 2006).

Evidence is provided that the C2H2 zinc finger protein Vfl participates in mitotic cell division and that loss of its activity eventually causes abnormal chromosome segregation during mitosis. This defect results in distinct organismal phenotypes, which are most prominently demonstrated by the lack of synchronous mitotic waves during early Drosophila embryogenesis, as reflected by uncoordinated division patterns and asynchronous DNA-replication cycles. As a result, preblastoderm nuclei are no longer arranged in a single layer at the periphery of the embryo and cause the formation of a number of extra folds during gastrulation. Subsequently, embryos develop an aberrant segmentation pattern, a defective nervous system, and an abnormal muscle pattern. The mitosis-related phenotype of the vfl mutants is consistent with the notion that vfl transcripts are highly enriched in mitotically active cells and that the protein exhibits a dynamic subcellular localization pattern during the cell cycle. From early telophase until the end of interphase, the Vfl protein is chromatin associated. During the DNA-condensation phase, it dissociates from chromatin, remains separated from DNA during metaphase and anaphase, and accumulates in a granular pattern in the area of the disintegrated nucleus once its envelope is dissolved. At this stage, Vfl is neither associated with lamin nor with microtubules or DNA. At the end of mitosis and upon entering the telophase, the granules vanish, and Vfl becomes instantly associated with interphase chromatin. This observation and the canonical DNA-binding domain of Vfl suggest that the protein is active when bound to DNA and inactive when dissociated from chromatin. This proposal implies that the low amounts of Vfl in the cytoplasm of mitotically inactive cells are likely to represent a storage of inactive protein and that Vfl functions once it is chromatin associated during interphase of dividing cells. Vfl may act as a DNA-binding factor that participates directly in chromatin dynamics and accessibility or act indirectly as a transcription factor that regulates the expression of genes involved in these processes. However, because the loss of vfl activity causes notable defects before mitosis 14, i.e., the time point when the embryo switches from maternally controlled nuclear divisions to the zygotic control of mitosis, it seems unlikely that Vfl acts as a transcription factor. Thus, the idea is favored that Vfl functions in DNA replication or participates in some aspects of chromatin dynamics required for proper mitosis (Staudt, 2006).

BrdU labeling experiments of vfl mutant and wild-type embryos indicate that the coordinated timing of mitosis, including the process of DNA replication, is strongly impaired by altering the normal dose of vfl activity. However, DNA replication is not blocked in the mutants, as indicated by the appearance of up to four nuclei that remain associated by chromatin bridges, indicating that the partially separated nuclei are capable of undergoing replication although sister chromatids have failed to separate during the previous anaphase. These observations and the early association of Vfl with early interphase chromatin suggest that the protein could participate in the temporal control of DNA replication. Alternatively, or in addition, the early chromatin association of Vfl could also reflect its requirement during DNA decondensation and/or cohesin loading, two processes that are prerequisite for the initiation of replication (Staudt, 2006).

An interesting aspect of the results concerns the regulation of Vfl localization during mitosis, i.e., association of Vfl with interphase chromatin and with granular structures during all other phases of mitosis. Chromatin condensation during prophase, when Vfl dissociates from chromatin, is accompanied by the cessation of transcription. This process correlates with inactivation of transcriptional regulators as well as other regulatory chromatin components by removing them from their DNA targets. Such an in vivo mechanism has been recently reported for a mammalian zinc finger protein Ikaros, which plays a key role in the development and the response of the immune system. Ikaros has also been implicated in the regulation of cell cycle progression and was found to dissociate from chromatin during early stages of mitosis due to a G2/M-specific phosphorylation event. It is not known how the dissociation of Vfl from chromatin and its accumulation in the granular structures are achieved mechanistically. It is speculated, however, that these structures represent the mitotic containment or a sequestering form for Vfl until cells enter the interphase again. In contrast, cells that do not continue mitosis nevertheless maintain comparatively low levels of Vfl in cytoplasmic granules, likely to represent a small pool of stored and inactive Vfl protein (Staudt, 2006).

Vfl has no direct vertebrate homologue that could be identified by sequence comparison. However, this result has to be taken with caution due to some specifics of zinc finger domains. The C2H2 zinc finger motif is an unusually small, self-folding domain of 25- to 30-amino acid residues. It includes paired cysteines and histidines as zinc-coordinating residues and possesses two short β-strands followed by an α-helix. The DNA-binding properties usually depend on no more than three amino acid residues of the zinc finger loop, the arrangement of C2H2 domains in proteins, and the higher order structure of proteins. This arrangement, and the fact that only a few conserved amino acid residues are required to ensure sequence-specific DNA binding, make it extremely difficult, if not impossible, to predict how the homologous vertebrate C2H2 finger would look. In contrast, the protein was found to be highly conserved in all insects analyzed, including the mosquito An. gambiae and the beetle T. castaneum, which diverged from Drosophila more than 250 and 300 million years ago, respectively. Thus, the function of Vfl might be still unrecognized among the more than 2000 C2H2 zinc finger proteins that were annotated in the mouse and human genomes. Alternatively, vfl may have a function for the insect-specific nuclear divisions during syncytial blastoderm stage and therefore may not be conserved in species other than insects (Staudt, 2006).

Loss of vfl activity causes abnormally shaped and enlarged nuclei that fail to become integrated into the cortical arrangement of preblastoderm nuclei. Obviously, these morphological features, in particular the enlargement of the nuclei, cause some space constrictions at the periphery of the embryo, and thus a significant portion of nuclei fail to align properly. The subsequent divisions, which occur in part perpendicular to the normal division axis as frequently observed in such embryos, cause an additional space limitation which forces the epithelial cell layer to form the irregular and variable patterns of folds that were consistently observed in gastrulating embryos. Notably, the positions of the folds vary significantly from embryo to embryo. This finding can be attributed to the fact that the embryos analyzed were most likely not lack-of-function ('null') mutants and always contained half the normal dose of maternal vfl activity. Furthermore, although the RNAi injections into embryos were done as early as possible after egg deposition, some undetected amounts of protein might have already accumulated in such embryos. This proposal is consistent with the notion of earlier and more severe effects in response to increasing amounts of injected RNAi. Irrespective of this speculation, the results presented in this study show that inadequate levels of Vfl interfere with the timing of mitosis and eventually result in impaired chromosome separation. The specific expression of vfl in mitotically active cells, the dose dependence of the protein as demonstrated by gain-of-function and loss-of-function experiments, and the shuttling of Vfl between chromatin of interphase nuclei and the granular structures during the other stages of the mitotic cycle suggest that protein function is tightly regulated. The mechanisms of the shuttling, the molecular pathways in which Vfl participates, and the link between Vfl activity and DNA replication remain to be further elucidated (Staudt, 2006).

Design flexibility in cis-regulatory control of gene expression: synthetic and comparative evidence

In early Drosophila embryos, the transcription factor Dorsal regulates patterns of gene expression and cell fate specification along the dorsal-ventral axis. How gene expression is produced within the broad lateral domain of the presumptive neurogenic ectoderm is not understood. To investigate transcriptional control during neurogenic ectoderm specification, divergence and function of an embryonic cis-regulatory element controlling the gene short gastrulation (sog) was studied. While transcription factor binding sites are not completely conserved, it has been demonstrated that these sequences are bona fide regulatory elements, despite variable regulatory architecture. Mutation of conserved sequences revealed that putative transcription factor binding sites for Dorsal and Zelda, a ubiquitous maternal transcription factor, are required for proper sog expression. When Zelda and Dorsal sites are paired in a synthetic regulatory element, broad lateral expression results. However, synthetic regulatory elements that contain Dorsal and an additional activator also drive expression throughout the neurogenic ectoderm. These results suggest that interaction between Dorsal and Zelda drives expression within the presumptive neurogenic ectoderm, but they also demonstrate that regulatory architecture directing expression in this domain is flexible. A model for neurogenic ectoderm specification is proposed in which gene regulation occurs at the intersection of temporal and spatial transcription factor inputs (Liberman, 2009).

Through a comparative analysis of orthologous sog cis-regulatory modules from twelve Drosophilid species, core regulatory elements conserved in these sequences were identified. Considerable binding site turnover has occurred during the approximately 40 million years of evolution, yet some sequences are conserved. This observation supported the two hypotheses investigated in this work: 1) that conserved sequences are functionally required, and 2) that variable architectures might generate the same or similar patterns of expression. Surprisingly, despite the opportunity for binding site turnover during the course of evolution, the sog regulatory regions from D. virilis can still be interpreted faithfully when used to drive reporter expression in D. melanogaster. It is concluded from these experiments that, despite flexibility in the cis-regulatory element structure, regulatory logic has been conserved during evolution of the cis-regulatory module sequences to support sog expression (Liberman, 2009).

Though this comparative analysis identified limited sequence homology, the sequence conservation that was present facilitated efforts to examine the core regulatory elements required for patterning the neurogenic ectoderm. Using site-directed mutagenesis to eliminate sites within the sog cis-regulatory sequence, results were obtained that suggest that Dorsal functions together with the ubiquitous activator Zelda to control sog expression within the neurogenic ectoderm. Furthermore, synthetic cis-regulatory elements were constructed, consisting of Dorsal and Zelda or Dorsal and D-STAT sites, which are both able to support expression in the broad lateral domain of the early Drosophila embryo. From these results it is concluded that broad lateral expression is achieved by a combination of Dorsal sites and sites for the ubiquitous activator Zelda, which suggests that a more general mechanism to create broad expression may involve interactions between Dorsal and other broadly expressed transcription factors (Liberman, 2009).

Mutagenesis and mutant analysis results demonstrate that Dorsal and Zelda support expression of sog along the dorsal-ventral axis. In the absence of Dorsal protein, expression of sog is lost; however, when Dorsal binding sites were mutagenized, weak ventral-lateral reporter expression remained, which could be due to unknown Dorsal binding sites that were not detected by PWM searches or to input from another transcription factor. In the absence of Zelda binding sites or in Zelda mutants, expression is slightly broader than when Dorsal sites are eliminated. This residual expression could be due to Dorsal and/or another transcription factor (e.g. bHLH) functioning to direct expression, in a Zelda-independent manner, within the ventral-neurogenic ectoderm; however, the data suggest that Twist is not likely involved, as the domain of sog expression along the dorsal-ventral axis is not severely affected in twist mutants (Liberman, 2009).

Previous genetic studies have demonstrated that Dorsal is required for specification of the presumptive neurogenic ectoderm, but binding sites for Dorsal alone are not sufficient to generate expression within the broad lateral domain of embryos. Dorsal has been shown to function synergistically with Twist to pattern the presumptive mesoderm and ventral neurogenic ectoderm. This study presents evidence that Dorsal and Zelda function synergistically to regulate expression that is able to encompass the entire presumptive neurogenic ectoderm domain. Some method of cooperativity likely exists between Dorsal and Zelda, at the level of DNA binding or downstream, and is responsible for extending the expression domain into dorsal-lateral regions of the embryos, where the levels of nuclear Dorsal are low (Liberman, 2009).

It is proposed that Dorsal functions as a spatial regulator in the neurogenic ectoderm and that additional transcription factors, like Zelda, act as co-activators to regulate the precise onset of expression. Furthermore, it is suggested that multiple ubiquitous or broadly expressed activators may function with Dorsal to support expression in a broad lateral domain (e.g. Zelda, STAT, and bHLH transcription factors such as Daughterless (Da)). This study has demonstrated that STAT binding sites can also function together with Dorsal to drive expression in a broad lateral domain. Further support for this idea includes the observation that sog as well as ths exhibit broad expression early. Sites for Zelda are also present in the ths cis-regulatory module, and these sites likely direct the almost-ubiquitous early expression of ths observed. Interaction of Dorsal with distinct co-activators may regulate not only the spatial domain of expression supported, but also the temporal output. Zelda along with Dorsal or a Dorsal target initiates the earliest zygotic expression detected; perhaps interactions between Dorsal and other activators facilitate expression within a broad lateral domain (or other defined pattern) at later time-points. It is asserted that gene expression is achieved at the intersection of the Dorsal nuclear gradient and the additional activator, which could either be ubiquitous in the case of Zelda or localized in the case of Twist (Liberman, 2009).

Even equipped with this new knowledge, other cis-regulatory modules that support co-expression of genes SoxN, pyramus and Neu3 have proven difficult to identify. To date, SoxN and pyramus regulatory elements remain unidentified. Flexible regulatory structures could account for some of the obscurity that has been encountered in the identification of cis-regulatory modules that support expression of genes within Drosophila early embryos. Flexibility in binding site composition, orientation and number of sites has also been demonstrated in the regulation of co-expressed genes in Ciona by extensive co-expression analyses. Possibly the observed flux in binding site composition and arrangement provides a mechanism that facilitates the introduction of mutations, which may be selected when a fitness advantage is provided to the developing embryo (Liberman, 2009).

Recently, a second regulatory element for sog located upstream of the gene was identified which also drives expression in a broad lateral stripe in the presumptive neurogenic ectoderm of cellularized embryos. This novel regulatory element as well as the known regulatory element, the intronic enhancer examined in this study, probably function together to control the full expression pattern of sog in the developing embryo. While both cis-regulatory sequences contain Dorsal and Zelda binding sites, the novel enhancer contains many more bHLH sites (L. Liberman, unpub. obs.), which is in stark contrast to the intronic sog regulatory element, which contains only one bHLH site and exhibits very little change of expression in twist mutant embryos. This new regulatory element presents further evidence that there exist multiple solutions for the developmental problem of producing spatially and temporally regulated expression. Future experiments will address whether these early embryonic enhancers controlling the expression of the sog gene within similar domains use the same mechanism (i.e. Dorsal + Zelda cooperativity) to support expression in a broad lateral stripe or whether different mechanisms are used (Liberman, 2009).

Evolutionary comparisons of sequences from diverged species can be very useful for the dissection of underlying cis-regulatory logic, as has been shown in this study; yet the important variable is that the proper comparisons of sequences must be made (i.e. species of appropriate evolutionary distance), and this is not always easy to define. In vertebrate systems, analyses of cis-regulatory modules usually focus on modules identified by methods that select for high degrees of conservation, which inherently have a low amount of flexibility. It has been argued that the underlying regulatory logic is hard to decipher from evolutionary comparisons of sequences when conservation is too high. However, it is contended that the relevant comparisons that provide insights into cis-regulatory logic are context-dependent. In analysis of the sog and Neu3 cis-regulatory modules, only limited sequence conservation was identified in comparisons of homologous sequences isolated from D. melanogaster and other Drosophilids. In the sog early embryonic regulatory element that was analyzed in this study, 71 (of 395) base-pairs of non-contiguous sequence exhibit conservation. The degree of conservation that was retained, however, was useful for dissecting the underlying regulatory logic (Liberman, 2009).
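To put the reported figure in proportion, 71 conserved positions out of 395 correspond to roughly 18% of the element. The following is a minimal sketch of how such a fraction could be computed from a gapped multiple alignment. The function name conserved_fraction and the toy alignment are illustrative assumptions and are not taken from the study, which may have scored conservation differently.

```python
def conserved_fraction(aligned):
    """Fraction of alignment columns in which every sequence carries
    the same base and no sequence has a gap ('-').

    `aligned` is a list of equal-length strings (hypothetical input format).
    """
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("all aligned sequences must have the same length")
    conserved = 0
    for column in zip(*aligned):
        bases = {b.upper() for b in column}
        # A column counts as conserved only if it holds one base and no gap.
        if len(bases) == 1 and "-" not in bases:
            conserved += 1
    return conserved / length


# Toy alignment (made up for illustration): 4 of 6 columns are conserved.
toy = ["ACGT-A", "ACCT-A", "ACGTTA"]
toy_fraction = conserved_fraction(toy)

# The figure quoted in the text: 71 conserved positions out of 395.
reported_percent = round(100 * 71 / 395, 1)
```

Requiring identity across all species is a strict criterion; a threshold-based or phylogenetically weighted score would give different numbers.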

Identifying regulatory regions with flexible structure is more challenging than scanning for a stringent set of binding sites, but it may also reveal alternative mechanisms for specification that were not previously considered. It is predicted that studies that dissect the flexibility of cis-regulatory modules may one day provide insights to facilitate dissection of vertebrate regulatory elements in general, including ones that exhibit flexibility of sequence. It seems plausible that stringently conserved regulatory elements control gene expression of certain classes of genes, like those required for certain essential processes. Flexible regulatory architectures may provide a mechanism for generating variability throughout evolution. Ultimately it will prove useful to make evolutionary comparisons with both highly conserved sites and flexible architectures to determine how each contributes to establishment or maintenance of gene regulation (Liberman, 2009).

Combinatorial activation and concentration-dependent repression of the Drosophila even skipped stripe 3+7 enhancer

Despite years of study, the precise mechanisms that control position-specific gene expression during development are not understood. This study analyzed an enhancer element from the even skipped (eve) gene, which activates and positions two stripes of expression (stripes 3 and 7) in blastoderm stage Drosophila embryos. Previous genetic studies showed that the JAK-STAT pathway is required for full activation of the enhancer, whereas the gap genes hunchback (hb) and knirps (kni) are required for placement of the boundaries of both stripes. The maternal zinc-finger protein Zelda (Zld) is absolutely required for activation, and evidence is presented that Zld binds to multiple non-canonical sites. A combination of in vitro binding experiments and bioinformatics analysis was used to redefine the Kni-binding motif, and mutational analysis and in vivo tests were used to show that Kni and Hb are dedicated repressors that function by direct DNA binding. These experiments significantly extend understanding of how the eve enhancer integrates positive and negative transcriptional activities to generate sharp boundaries in the early embryo (Struffi, 2011).

The experiments described in this study significantly refine understanding of how the eve 3+7 enhancer functions in the early embryo. In particular, it was shown that the maternal zinc-finger protein Zld is absolutely required for STAT-mediated enhancer activation, and that the gap proteins Kni and Hb establish stripe boundaries by directly binding to multiple sites within the enhancer (Struffi, 2011).

When first activated in late nuclear cycle 13, the minimal eve 3+7 enhancer drives weak stochastic expression in a broad central pattern, which refines in cycle 14 to a stripe that is about four nuclei wide. By contrast, stripe 7 expression, which is visible by enzymatic staining methods, is nearly undetectable using fluorescence in situ hybridization (Struffi, 2011).

Previous work showed that stripe 7 shares regulatory information with stripe 3 but is also controlled by sequences located between the minimal stripe 3+7 and stripe 2 enhancers, and possibly by sequences within and downstream of the stripe 2 enhancer. Thus, stripe 7 is unique among the eve stripes in that it is not regulated by a discrete modular element (Struffi, 2011).

Previous work showed that the terminal gap gene tailless (tll) is required for activation of eve 7. However, since the Tll protein probably functions as a dedicated repressor, it is likely that activation of eve 7 by Tll occurs indirectly, through repression of one or more repressors (Struffi, 2011).

The ubiquitous maternal protein Zld is required for the in vivo function of both the eve 3+7 and eve 2 enhancers, which are activated by the JAK-STAT pathway and Bicoid (Bcd), respectively. Zld was previously shown to bind to five sequence motifs that are over-represented in the regulatory regions of early developmental genes. Mutations of the single TAGteam site in the eve 3+7 enhancer caused a reduction in expression, but zld M- embryos, mutant for maternal zld expression, showed complete abolishment of eve 3+7-lacZ reporter gene expression. Also, the eve 2 enhancer, which does not contain any canonical TAGteam sites, is nonetheless inactive in zld M- embryos. This study showed that this enhancer contains at least four variants of the TAGteam sites, which suggests that Zld binding to non-canonical sites is crucial for its function in embryogenesis. ChIP-Chip data show that Zld binding extends throughout much of the eve 5' and 3' regulatory regions (Struffi, 2011).

The implication of such broad binding and the requirement for Zld for activation of two eve enhancers are consistent with its proposed role as a global activator of zygotic transcription. How might this work? One possibility is that there are cooperative interactions between Zld and the other activators of these stripes. A non-exclusive alternative is that Zld binding creates a permissive environment in broad regions of the genome, possibly by changing the chromatin configuration and making it more likely that the other activator proteins can bind. However, it is important to note that eve expression is not completely abolished in zld M- embryos, so at least some eve regulatory elements could function in the absence of Zld. Future experiments will be required to further characterize the role of Zld in the regulation of the entire eve locus (Struffi, 2011).

The genetic removal of kni causes a broad expansion of eve 3+7-lacZ expression in posterior regions of the embryo, and ectopic Kni causes a strong repression of both stripes. Interestingly, the posterior boundary of eve stripe 3 is positioned in regions with extremely low levels of Kni protein. If the stripe 3 posterior boundary is solely formed by Kni, the enhancer must be exquisitely sensitive to its repression, possibly through the high number of sites in the eve 3+7 enhancer. Previous attempts to mutate sites based on computational predictions failed to mimic the genetic loss of kni, so this study used a biochemical approach to identify Kni sites in an unbiased manner. EMSA analyses identified 11 Kni sites, and the PWM derived from these sites alone is very similar to the Kni matrix derived in a bacterial one-hybrid study. Thus, these studies provide biochemical support for the bacterial one-hybrid method as an accurate predictor of the DNA-binding activity of this particular protein (Struffi, 2011).
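For readers unfamiliar with how a position weight matrix (PWM) is derived from a set of experimentally identified binding sites, the following is a minimal sketch of one common approach: per-position base counts with a pseudocount, converted to log-odds against a uniform background. The example sites are made up for illustration and are not the Kni sites from the study; the function names, pseudocount and background value are likewise assumptions, and the study may have used a different procedure.

```python
import math

BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PWM from equal-length, pre-aligned binding sites.

    Returns a list with one dict per position mapping base -> log2 odds.
    """
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + 4 * pseudocount
    pwm = []
    for pos in range(length):
        column = [s[pos].upper() for s in sites]
        pwm.append({
            b: math.log2(((column.count(b) + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm


def score(pwm, seq):
    """Sum the log-odds of each base in `seq` (same length as the PWM)."""
    if len(seq) != len(pwm):
        raise ValueError("sequence length must match PWM length")
    return sum(col[b] for col, b in zip(pwm, seq.upper()))


# Hypothetical aligned sites, NOT real Kni binding sites.
example_sites = ["ACGT", "ACGT", "ACGA"]
example_pwm = build_pwm(example_sites)
match_score = score(example_pwm, "ACGT")
mismatch_score = score(example_pwm, "TTTT")
```

A sequence resembling the input sites scores higher than an unrelated one, which is the property that PWM searches of an enhancer rely on.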

It was further shown that specific point mutations abolish binding to nine of the 11 sites, and when these mutations were tested in a reporter gene they caused an expansion that is indistinguishable from that detected in kni mutants. This result strongly suggests that Kni-mediated repression involves direct binding to the eve 3+7 enhancer, and that Kni alone can account for all repressive activity in nuclei that lie in the region between stripes 3 and 7. However, this work does not address the exact mechanism of Kni-mediated repression. The simplest possibility is that Kni competes with activator proteins for binding to overlapping or adjacent sites. This mechanism is considered unlikely because only one of the 11 Kni sites overlaps with an activator site. Also, the in vivo misexpression of a truncated Kni protein (Kni 1-105) that contains only the DNA-binding domain and the nuclear localization signal has no discernible effect on the endogenous eve expression pattern, whereas a similar misexpression of Kni 1-330 or Kni 1-429 strongly represses eve 3+7 (Struffi et al., 2004) (Struffi, 2011).

Whereas Kni-mediated repression forms the inside boundaries of the eve 3+7 pattern, forming the outside boundaries is dependent on Hb, which abuts the anterior boundary of stripe 3 and overlaps with stripe 7. Both stripes expand towards the poles of the embryo in zygotic hb mutants, and these expansions are mimicked by mutations in four or all nine Hb sites within the eve 3+7 enhancer. Further anterior expansions of the pattern are prevented by an unknown Bcd-dependent repressor (X) and the Torso (Tor)-dependent terminal system. Indeed, eve 3+7-lacZ expression expands all the way to the anterior tip in mutants that remove bcd and the terminal system (Struffi, 2011).

The mutational analyses suggest that Hb is a dedicated repressor of the eve 3+7 enhancer, and argue against a dual role in which high Hb levels repress, whereas lower concentrations activate, transcription. One caveat is that activation of the stripe might occur via maternal Hb in the absence of zygotic expression. However, triple mutants that remove zygotic hb, kni and tor, a terminal system component, show eve 3+7 enhancer expression that extends from ~75% embryo length (100% is the anterior pole) to the posterior pole. It is extremely unlikely that the maternal Hb gradient, which is not perturbed in this mutant combination, could activate expression throughout the posterior region. It is proposed that any activating role for Hb on this enhancer is indirect and might occur by repressing kni, which helps to define a space where the concentrations of both repressors are sufficiently low for activation to occur. kni expands anteriorly in hb mutants and is very sensitive to repression by ectopic Hb, consistent with an indirect role in activation. A similar mechanism has been shown to be important for the correct positioning of eve stripe 2. In this case, the anterior Giant (Gt) domain appears to be required for eve 2 activation, but it does so by strongly repressing Kr, thus creating space for activation in the region between Gt and Kr (Struffi, 2011).

The correct ordering of gene expression boundaries along the AP axis is crucial for establishing the Drosophila body plan. All gap genes analyzed so far seem to function as repressors that differentially position multiple boundaries. However, it is still unclear how differential sensitivity is achieved at the molecular level. Simple correlations of binding site number and affinity with boundary positioning cannot explain the exquisite differences in the sensitivity of individual enhancers, suggesting that they do more than 'count' binding sites and that specific arrangements of repressor and activator sites might control this process. The experiments described here better define the binding characteristics of both Hb and Kni and provide a firm foundation for future experiments designed to decipher the regulatory logic that controls differential sensitivity (Struffi, 2011).

Functions of Zelda orthologs in other species

Zelda and the maternal-to-zygotic transition in cockroaches

In the endopterygote Drosophila melanogaster, Zelda is an activator of the zygotic genome during the maternal-to-zygotic transition (MZT). Zelda binds cis-regulatory elements (TAGteam heptamers), making chromatin accessible for gene transcription. This study examined Zelda in the cockroach Blattella germanica, a hemimetabolan, short germ-band, and polyneopteran species. B. germanica Zelda has the complete set of functional domains, which is typical of species displaying ancestral features concerning embryogenesis. Interestingly, D. melanogaster TAGteam heptamers were found in the B. germanica genome. The canonical one, CAGGTAG, is present at a similar proportion in the genome of these two species and in the genome of other insects, suggesting that the genome admits as many CAGGTAG motifs as its length allows. Zelda-depleted embryos of B. germanica show defects involving blastoderm formation and abdomen development, and genes contributing to these processes are down-regulated. It is concluded that in B. germanica Zelda strictly activates the zygotic genome, within the MZT, a role conserved in more derived endopterygote insects. In B. germanica, zelda is expressed during MZT, whereas in D. melanogaster and T. castaneum it is expressed beyond this transition. In these species and A. mellifera, Zelda has functions even in postembryonic development. The expansion of zelda expression beyond the MZT in endopterygotes might be related to the evolutionary innovation of holometabolan metamorphosis (Ventos-Alfonso, 2019).
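The comparison of CAGGTAG abundance across genomes amounts to counting heptamer occurrences and normalizing by sequence length. The following is a minimal sketch of such a count, assuming both strands are scanned and overlapping matches are allowed. The function names and the toy sequence are illustrative assumptions; the study's actual counting procedure is not described here.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T, N only)."""
    return "".join(COMPLEMENT[b] for b in reversed(seq.upper()))


def count_motif(seq, motif="CAGGTAG"):
    """Count occurrences of `motif` on either strand of `seq`.

    Each start position is counted at most once, so a palindromic
    motif would not be double counted.
    """
    seq = seq.upper()
    motif = motif.upper()
    rc = reverse_complement(motif)
    k = len(motif)
    count = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window == motif or window == rc:
            count += 1
    return count


def motifs_per_kb(seq, motif="CAGGTAG"):
    """Motif occurrences per 1000 bases, for comparing sequences of different length."""
    if not seq:
        return 0.0
    return 1000.0 * count_motif(seq, motif) / len(seq)


# Toy 20-base sequence (made up): one forward and one reverse-strand match.
toy_seq = "AACAGGTAGTTCTACCTGAA"
toy_count = count_motif(toy_seq)
toy_density = motifs_per_kb(toy_seq)
```

Normalizing per kilobase is one simple way to compare genomes of very different size; a comparison against the frequency expected from base composition would be a stricter test.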


Blythe, S. A. and Wieschaus, E. F. (2016). Establishment and maintenance of heritable chromatin structure during early Drosophila embryogenesis. Elife 5. PubMed ID: 27879204

Cho, C. Y. and O'Farrell, P. H. (2023). Stepwise modifications of transcriptional hubs link pioneer factor activity to a burst of transcription. Nat Commun 14(1): 4848. PubMed ID: 37563108

De, S., Mitra, A., Cheng, Y., Pfeifer, K. and Kassis, J. A. (2016). Formation of a polycomb-domain in the absence of strong polycomb response elements. PLoS Genet 12(7): e1006200. PubMed ID: 27466807

De Renzis, S., Elemento, O., Tavazoie, S. and Wieschaus, E. F. (2007). Unmasking activation of the zygotic genome using chromosomal deletions in the Drosophila embryo. PLoS Biol. 5: e117. PubMed ID: 17456005

Edgar, B. A. and Schubiger, G. (1986). Parameters controlling transcriptional activation during early Drosophila development. Cell 44: 871-877. PubMed ID: 2420468

Eagen, K. P., Aiden, E. L. and Kornberg, R. D. (2017). Polycomb-mediated chromatin loops revealed by a subkilobase-resolution chromatin interaction map. Proc Natl Acad Sci U S A 114(33): 8764-8769. PubMed ID: 28765367

Fernandes, G., Tran, H., Andrieu, M., Diaw, Y., Perez Romero, C., Fradin, C., Coppey, M., Walczak, A. M. and Dostatni, N. (2022). Synthetic reconstruction of the hunchback promoter specifies the role of Bicoid, Zelda and Hunchback in the dynamics of its transcription. Elife 11. PubMed ID: 35363606

Foo, S. M., Sun, Y., Lim, B., Ziukaite, R., O'Brien, K., Nien, C. Y., Kirov, N., Shvartsman, S. Y. and Rushlow, C. A. (2014). Zelda potentiates morphogen activity by increasing chromatin accessibility. Curr Biol 24: 1341-1346. PubMed ID: 24909324

Gawlinski, P., Nikolay, R., Goursot, C., Lawo, S., Chaurasia, B., Herz, H. M., Kussler-Schneider, Y., Ruppert, T., Mayer, M., and Grosshans, J. (2007). The Drosophila mitotic inhibitor Fruhstart specifically binds to the hydrophobic patch of cyclins. EMBO Rep. 8: 490-496. PubMed ID: 17431409

Giorgetti, L., Lajoie, B. R., Carter, A. C., Attia, M., Zhan, Y., Xu, J., Chen, C. J., Kaplan, N., Chang, H. Y., Heard, E. and Dekker, J. (2016). Structural organization of the inactive X chromosome in the mouse. Nature 535(7613): 575-579. PubMed ID: 27437574

Gross, S. P., Guo, Y., Martinez, J. E. and Welte, M. A. (2003). A determinant for directionality of organelle transport in Drosophila embryos. Curr. Biol. 13: 1660-1668. PubMed ID: 14521831

Grosshans, J., Müller, H. and Wieschaus, E. (2003). Control of cleavage cycles in Drosophila embryos by frühstart. Dev. Cell 5: 285-294. PubMed ID: 12919679

Harrison, M. M., Botchan, M. R. and Cline, T. W. (2010). Grainyhead and Zelda compete for binding to the promoters of the earliest-expressed Drosophila genes. Dev. Biol. 345(2): 248-55. PubMed ID: 20599892

Harrison, M. M., et al. (2011). Zelda binding in the early Drosophila melanogaster embryo marks regions subsequently activated at the maternal-to-zygotic transition. PLoS Genet. 7(10): e1002266. PubMed ID: 22028662

He, S., Zhang, G., Wang, J., Gao, Y., Sun, R., Cao, Z., Chen, Z., Zheng, X., Yuan, J., Luo, Y., Wang, X., Zhang, W., Zhang, P., Zhao, Y., He, C., Tao, Y., Sun, Q. and Chen, D. (2019). 6mA-DNA-binding factor Jumu controls maternal-to-zygotic transition upstream of Zelda. Nat Commun 10(1): 2219. PubMed ID: 31101825

Hug, C. B., Grimaldi, A. G., Kruse, K. and Vaquerizas, J. M. (2017). Chromatin architecture emerges during zygotic genome activation independent of transcription. Cell 169(2): 216-228. PubMed ID: 28388407

Ke, Y., Xu, Y., Chen, X., Feng, S., Liu, Z., Sun, Y., Yao, X., Li, F., Zhu, W., Gao, L., Chen, H., Du, Z., Xie, W., Xu, X., Huang, X. and Liu, J. (2017). 3D chromatin structures of mature gametes and structural reprogramming during mammalian embryogenesis. Cell 170(2): 367-381 e320. PubMed ID: 28709003

Kvon, E. Z., et al. (2012). HOT regions function as patterned developmental enhancers and have a distinct cis-regulatory signature. Genes Dev. 26(9): 908-13. PubMed ID: 22499593

Lam, K. C., Muhlpfordt, F., Vaquerizas, J. M., Raja, S. J., Holz, H., Luscombe, N. M., Manke, T. and Akhtar, A. (2012). The NSL complex regulates housekeeping genes in Drosophila. PLoS Genet 8(6): e1002736. PubMed ID: 22723752

Larson, E. D., Komori, H., Gibson, T. J., Ostgaard, C. M., Hamm, D. C., Schnell, J. M., Lee, C. Y. and Harrison, M. M. (2021). Cell-type-specific chromatin occupancy by the pioneer factor Zelda drives key developmental transitions in Drosophila. Nat Commun 12(1): 7153. PubMed ID: 34887421

Lee, D., Karchin, R. and Beer, M. A. (2011). Discriminative prediction of mammalian enhancers from DNA sequence. Genome Res 21: 2167-2180. PubMed ID: 21875935

Li, X. et al. (2008). Transcription factors bind thousands of active and inactive regions in the Drosophila blastoderm. PLoS Biol. 6: 365-388. PubMed ID: 18271625

Li, X. Y., Harrison, M. M., Villalta, J. E., Kaplan, T. and Eisen, M. B. (2014). Establishment of regions of genomic activity during the maternal to zygotic transition. Elife 3 [Epub ahead of print]. PubMed ID: 25313869

Liang, H. L., Nien, C. Y., Liu, H. Y., Metzstein, M. M., Kirov, N. and Rushlow, C. (2008). The zinc-finger protein Zelda is a key activator of the early zygotic genome in Drosophila. Nature 456(7220): 400-3. PubMed ID: 18931655

Liberman, L. M. and Stathopoulos, A. (2009). Design flexibility in cis-regulatory control of gene expression: synthetic and comparative evidence. Dev. Biol. 327(2): 578-89. PubMed ID: 19135437

Lieberman-Aiden, E., van Berkum, N. L., Williams, L., Imakaev, M., Ragoczy, T., Telling, A., Amit, I., Lajoie, B. R., Sabo, P. J., Dorschner, M. O., Sandstrom, R., Bernstein, B., Bender, M. A., Groudine, M., Gnirke, A., Stamatoyannopoulos, J., Mirny, L. A., Lander, E. S. and Dekker, J. (2009). Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science 326(5950): 289-293. PubMed ID: 19815776

Ling, J., Umezawa, K. Y., Scott, T. and Small, S. (2019). Bicoid-dependent activation of the target gene Hunchback requires a two-motif sequence code in a specific basal promoter. Mol Cell 75(6): 1178-1187 e1174. PubMed ID: 31402096

Mirny, L. A. (2010). Nucleosome-mediated cooperativity between transcription factors. Proc Natl Acad Sci U S A 107: 22534-22539. PubMed ID: 21149679

Nien, C. Y., et al. (2011). Temporal coordination of gene networks by Zelda in the early Drosophila embryo. PLoS Genet. 7(10): e1002339. PubMed ID: 22028675

Ogiyama, Y., Schuettengruber, B., Papadopoulos, G. L., Chang, J. M. and Cavalli, G. (2018). Polycomb-dependent chromatin looping contributes to gene silencing during Drosophila development. Mol Cell 71(1): 73-88. PubMed ID: 30008320

Satija, R. and Bradley, R. K. (2012). The TAGteam motif facilitates binding of 21 sequence-specific transcription factors in the Drosophila embryo. Genome Res. 22(4): 656-65. PubMed ID: 22247430

Schulz, K. N., Bondra, E. R., Moshe, A., Villalta, J. E., Lieb, J. D., Kaplan, T., McKay, D. J. and Harrison, M. M. (2015). Zelda is differentially required for chromatin accessibility, transcription-factor binding and gene expression in the early Drosophila embryo. Genome Res [Epub ahead of print]. PubMed ID: 26335634

Staudt, N., Fellert, S., Chung, H., Jäckle, H. and Vorbrüggen, G. (2006). Mutations of the Drosophila zinc finger-encoding gene vielfältig impair mitotic cell divisions and cause improper chromosome segregation. Mol. Biol. Cell 17: 2356-2365. PubMed ID: 16525017

Struffi, P., Corado, M., Kaplan, L., Yu, D., Rushlow, C. and Small, S. (2011). Combinatorial activation and concentration-dependent repression of the Drosophila even skipped stripe 3+7 enhancer. Development 138(19): 4291-9. PubMed ID: 21865322

Sun, Y., Nien, C. Y., Chen, K., Liu, H. Y., Johnston, J., Zeitlinger, J. and Rushlow, C. (2015). Zelda overcomes the high intrinsic nucleosome barrier at enhancers during Drosophila zygotic genome activation. Genome Res. PubMed ID: 26335633

Sung, H.-w. et al. (2012). Number of nuclear divisions in the Drosophila blastoderm controlled by onset of zygotic transcription. Curr. Biol. 23(2): 133-8. PubMed ID: 23290555

ten Bosch, J. R., Benavides, J. A. and Cline, T. W. (2006). The TAGteam DNA motif controls the timing of Drosophila pre-blastoderm transcription. Development 133: 1967-1977. PubMed ID: 16624855

Tsurumi, A., et al. (2011). STAT is an essential activator of the zygotic genome in the early Drosophila embryo. PLoS Genet. 7(5): e1002086. PubMed ID: 21637778

Ventos-Alfonso, A., Ylla, G. and Belles, X. (2019). Zelda and the maternal-to-zygotic transition in cockroaches. FEBS J. PubMed ID: 30993896

Yanez-Cuna, J. O., et al. (2012). Uncovering cis-regulatory sequence requirements for context specific transcription factor binding. Genome Res. [Epub ahead of print] PubMed ID: 22534400

date revised: 10 June 2024

Home page: The Interactive Fly © 2008 Thomas Brody, Ph.D.

The Interactive Fly resides on the
Society for Developmental Biology's Web server.