Motif 1 Binding Protein: Biological Overview | References
Gene name - Motif 1 Binding Protein
Cytological map position - 85A9-85A9
Function - transcription factor
Keywords - a transcriptional activator that associates with a core promoter element known as Motif 1 - a member of the Enhancers of trithorax and polycomb (ETP) family - boundary element binding protein - participates in co-regulation of ribosomal protein genes
Symbol - M1BP
FlyBase ID: FBgn0037621
Genetic map position - chr3R:8,729,790-8,731,698
Classification - Zinc-finger associated domain (zf-AD); Zinc-finger double domain
Cellular location - nuclear
Despite an abundance of new studies about topologically associating domains (TADs), the role of genetic information in TAD formation is still not fully understood. This study used HiCExplorer to annotate >2800 high-resolution (570 bp) TAD boundaries in Drosophila melanogaster. Eight DNA motifs enriched at boundaries were identified, including a motif bound by the M1BP protein, and two new boundary motifs. In contrast to mammals, the CTCF motif is only enriched on a small fraction of boundaries flanking inactive chromatin while most active boundaries contain the motifs bound by the M1BP or Beaf-32 proteins. Boundaries can be accurately predicted using only the motif sequences at open chromatin sites. It is proposed that DNA sequence guides the genome architecture by allocation of boundary proteins in the genome. Finally, an interactive online database is presented to access and explore the spatial organization of fly, mouse and human genomes (Ramirez, 2018).
How DNA is packed into the nucleus, and how this packing coordinates functional activities, is a long-standing question in biology. Recent studies have shown that the genomes of different organisms are partitioned into chromatin domains, usually called topologically associating domains (TADs), which are invariable between cell types and evolutionarily conserved in related species (Ramirez, 2018).
To understand TAD formation, researchers have focused on the proteins found at TAD boundaries. In mammalian cells, the CCCTC-binding factor (CTCF) protein has been shown to be enriched at chromatin loops, which also demarcate a subset of TAD boundaries (referred to as 'loop domains'). A proposed mechanism, based on the extrusion of DNA by cohesin, suggests that the DNA-binding motif of CTCF and its orientation determine the start and end of the loop. In line with this hypothesis, deletions of the CTCF DNA motif effectively removed or altered the loop or caused changes in gene-enhancer interactions that led to developmental abnormalities in mouse embryos. Additionally, acute depletion of CTCF leads to loss of TAD structure at CTCF-containing boundaries. However, CTCF-cohesin loops only explain a fraction (<39%) of human TAD boundaries, while plants and bacteria lack CTCF homologs but also show TAD-like structures. Thus, it is possible that additional factors are involved in the formation of TADs (Ramirez, 2018).
In contrast to mammals, the genetic manipulation tools available in flies have allowed the characterization of several proteins that, like CTCF, are capable of inhibiting enhancer-promoter interactions. Throughout this paper these proteins are referred to as 'insulator proteins' and their binding motifs as 'insulators' or 'insulator motifs'. In flies, apart from CTCF, the following DNA-binding insulator proteins have been associated with boundaries: Boundary Element Associated Factor-32 (Beaf-32), Suppressor of Hairy-wing (Su(Hw)), and GAGA factor (GAF). Also, Zeste-white 5 (Zw5) has been proposed to bind boundaries. These insulator proteins recruit co-factors critical for their function, such as Centrosomal Protein-190 (CP190) and Mod(mdg4). Recently, novel insulator proteins have been described as binding partners of CP190: the zinc finger protein interacting with CP190 (ZIPIC); Pita, which appears to have human homologs and localizes to TAD boundaries; and the Insulator binding factors 1 and 2 (Ibf1 and Ibf2). Except for CP190 and Mod(mdg4), all previously characterized boundary-associated proteins bind to specific DNA motifs, suggesting that the 3D conformation of chromatin can be encoded by these motifs (Ramirez, 2018).
This study sought to identify the DNA encoding behind TAD boundaries in flies. First, software (HiCExplorer) was developed to obtain boundary positions at 0.5 kilobase resolution based on published Hi-C sequencing data from the Drosophila melanogaster Kc167 cell line. Using these high-resolution TAD boundaries, eight significantly enriched DNA motifs were identified. Five of these motifs are known to be bound by insulator proteins: Beaf-32, CTCF, the heterodimer Ibf1 and Ibf2, Su(Hw) and ZIPIC. A large fraction of boundaries contain the motif bound by the motif-1 binding protein (M1BP), a protein associated with constitutively expressed genes. This motif has recently been found at boundaries. The two remaining DNA motifs have not been associated with boundaries before. Surprisingly, it was found that depletion of Beaf-32 has no major effect on chromosome organisation, while the depletion of M1BP leads to cell arrest in M-phase and dramatically affects the Hi-C results. Using machine learning methods based on the acquired DNA-motif information, boundaries were accurately distinguished from non-boundaries, and TAD boundaries that were missed when using only Hi-C data were identified. The results suggest that the genome architecture of flies can be explained predominantly by the genetic information. The methods for Hi-C data processing, TAD calling and visualization were implemented in an easy-to-use tool called HiCExplorer. To facilitate exploration of available Hi-C data, an interactive online database was provided containing processed high-resolution Hi-C data sets from the fly, mouse and human genomes (Ramirez, 2018).
This study used high-resolution (DpnII restriction enzyme) and deeply sequenced (~246 million reads) Hi-C data to map the genomic positions of TAD boundaries within ~600 bp in D. melanogaster. This analysis revealed a larger number of TADs, including many small active TADs (23 kb mean length), that were absent in previous reports. TAD size, boundary strength, chromatin marks, gene orientation, and transcription at the TADs were characterized. Motif calling was performed at boundaries, validating the presence of known insulators, along with the M1BP motif, which has recently also been shown to be associated with boundaries, and core promoter motif 6 and motif 8, which have not been associated with boundaries before. Using different machine learning methods, this study found that DNA motifs and open chromatin are sufficient to accurately predict a major fraction of fly boundaries. Finally, a set of useful tools and a resource is presented for visualization and annotation of TADs in different organisms (Ramirez, 2018).
This study verifies various properties of fly boundaries indicated in previous publications. Most boundaries associate with promoters and active chromatin (Hou, 2012), and known insulator proteins are enriched at boundaries. A comprehensive set of core promoter motifs is detected at boundaries, including the newly discovered M1BP motif, and motifs which have been associated with housekeeping gene expression. However, some of the results contradict previous observations. For example, it was found that genes at boundaries have higher expression and lower variability of expression throughout fly development. This is in line with Hug (2017) and Ulianov (2016) but in contrast with Hou (2012), who suggested that gene density and not the transcriptional state is important for boundary formation. Unlike Hou (2012), this study found that genes at boundaries tend to be divergently transcribed. In contrast to various earlier studies, CTCF does not appear to be a major boundary-associated insulator in flies. This study also shows that the number of insulator motifs at boundaries correlates very little with boundary strength (Ramirez, 2018).
Most of these differences are due to the increased resolution of detected boundaries and the combined analysis of DNA motifs with ChIP-seq data, rather than ChIP-seq peaks alone. This study shows that correlating boundaries with ChIP-seq peaks alone is not a good measure of the determinants of boundary formation. Many DNA-binding proteins show co-localization in ChIP-seq data without the presence of the corresponding DNA motifs. This is possibly due to cross-linking artifacts and indirect binding, which are, in fact, aggravated at boundaries, because boundaries tend to contact each other in 3D space (Ramirez, 2018).
Another argument for considering motifs is the contradictory case of CTCF at boundaries. In contrast to earlier studies based on CTCF ChIP-seq, this study found that the CTCF motif is rarely associated with boundaries. This difference is caused by the quality of the ChIP-seq data, which can produce spurious peaks. For example, a significant enrichment of CTCF at boundaries was observed in the ChIP-seq data from another study. On the other hand, ChIP-seq data sets show significant enrichment if only those ChIP-seq peaks that contain the CTCF motif are considered. For CTCF, and in general for ChIP-seq experiments in flies, 'phantom peaks' are known to occur at active promoters. Thus, to avoid misleading results, the current analyses are based on motif presence when possible, and for ChIP-seq data sets a significance threshold together with motif binding affinity is used for analysis (instead of a significance cutoff alone) (Ramirez, 2018).
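The filtering idea described above (keep a ChIP-seq peak only if it passes a significance threshold and also contains a sufficiently strong motif match) can be sketched as follows. This is a minimal illustration, not the procedure used by Ramirez (2018): the `Peak` and `MotifHit` records, the threshold values, and all coordinates are hypothetical, and a single chromosome is assumed.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Peak:
    start: int          # peak start (bp), inclusive
    end: int            # peak end (bp), exclusive
    neg_log10_p: float  # peak significance, higher is more significant

@dataclass(frozen=True)
class MotifHit:
    position: int  # motif match position (bp)
    score: float   # motif match score, used here as a stand-in for binding affinity

def motif_supported_peaks(peaks, hits, min_neg_log10_p, min_motif_score):
    """Keep peaks that are significant AND overlap a strong enough motif hit."""
    kept = []
    for peak in peaks:
        if peak.neg_log10_p < min_neg_log10_p:
            continue  # fails the significance cutoff
        has_motif = any(
            peak.start <= hit.position < peak.end and hit.score >= min_motif_score
            for hit in hits
        )
        if has_motif:
            kept.append(peak)
    return kept

# Hypothetical data: the second peak is significant but has only a weak motif
# match (a candidate 'phantom peak'); the third peak is not significant.
peaks = [Peak(100, 300, 12.0), Peak(1000, 1200, 15.0), Peak(2000, 2200, 3.0)]
hits = [MotifHit(150, 9.5), MotifHit(1100, 2.0), MotifHit(2100, 11.0)]
supported = motif_supported_peaks(peaks, hits, min_neg_log10_p=5.0, min_motif_score=8.0)
print(supported)
```

With these made-up inputs only the first peak survives, which mirrors the point in the text that a significance cutoff alone would also have retained the motif-less second peak.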
This study observed that boundary strength is associated with the chromatin states of flanking TADs and with particular motif combinations, but is not associated with the number of co-occurring boundary motifs. Boundary strength is higher between active and inactive/PcG TADs, while it is lower at boundaries separating two TADs within the same state (e.g., active-active, inactive-inactive). Boundaries containing Beaf-32 are stronger when present together with motif 6, the Pita motif, or the ZIPIC motif, and weaker with motif 8. Although the mechanism by which combinations of insulators alter the boundary strength remains unclear, an association of Nup98 with the Pita motif, motif 6, and CTCF was observed, suggesting that association with nuclear pore proteins may result in stronger boundaries. Nup98 has now been shown to be functionally important in mediating enhancer-promoter looping in the Drosophila genome (Ramirez, 2018).
The current results indicate that the two sets of boundary motifs (promoter and non-promoter) participate in the compartmentalization of different types of chromatin. Boundaries containing core promoter motifs either flank or are surrounded by active chromatin regions. In contrast, the boundaries containing non-promoter motifs tend to be within or at the borders of inactive or repressed chromatin. This finding is in line with previous reports showing an enrichment of CTCF at the borders of H3K27me3 domains and an enrichment of Beaf-32 in active chromatin. This indicates that insulator proteins might serve different functions guided by the DNA sequence. For example, this study observed that the GAF motif, whose presence is negatively associated with TAD boundaries, is instead detected alone at 'loop domains' (Ramirez, 2018).
These analyses indicate that the depletion of the well-studied insulator protein Beaf-32 has no significant effect on the chromosome conformation. However, in Drosophila melanogaster, both the Beaf-32 and DREF proteins bind exactly the same DNA motif. Thus, the current results, as well as those of others, suggest that DREF, a protein that unlike Beaf-32 is conserved in humans, might have a more prominent role in genome organization than previously thought (Ramirez, 2018).
On the other hand, cells under M1BP knockdown grow more slowly in culture and arrest in M-phase, probably because M1BP is a transcription factor of constitutively expressed genes. Since M1BP-depleted cells show cell cycle defects, it is difficult to separate the direct role of M1BP at boundaries from the indirect effects caused by deregulation of thousands of genes. To study the direct role of M1BP at boundaries, it would be useful either to delete the M1BP motif at boundaries using CRISPR, as shown for CTCF in mammals, or to achieve acute and complete depletion of M1BP (Ramirez, 2018).
This study presents evidence that the DNA sequence contains features that can guide the formation of higher order chromosome organisation. The association of boundary types with a combination of motifs, and the fact that boundaries can be predicted using DNA sequence alone, in the absence of any information about associated proteins or histone marks, lead the authors to propose a DNA-guided chromatin assembly model. In this model, the boundary elements are recognized by their proteins, which help load TAD assembly factors onto chromatin. Promoter and non-promoter boundaries can thus have different mechanisms of formation. DNA motifs at inactive regions can attract proteins that may establish TADs by setting up barriers for chromatin marks. Although the overall barrier activity of insulator proteins has been controversial, it is plausible that the barrier mechanism is active only at a subset of boundaries (like those of inactive TADs). DNA motifs at gene promoters can associate with core-promoter proteins, which then guide the assembly of the Pol-II pre-initiation complex. The pre-initiation complex can then recruit condensins. Once recruited, condensins can perform loop extrusion independent of Pol-II transcriptional activity, leading to the emergence of TADs. Condensins can also remain associated with chromatin during mitosis, to re-establish TADs after cell division. In general, the results indicate that active transcription and chromosome conformation are related. Future studies investigating the association of the Pol-II pre-initiation complex and condensin activity at gene promoters would advance understanding of the mechanism of TAD formation (Ramirez, 2018).
The core promoter of protein-encoding genes plays a central role in regulating transcription. M1BP is a transcriptional activator that associates with a core promoter element known as Motif 1 that resides at thousands of genes in Drosophila. To gain insight into how M1BP functions, this study identified an interacting protein called GFZF. GFZF had been previously identified in genetic screens for factors involved in maintenance of hybrid inviability, the G2-M DNA damage checkpoint, and RAS/MAPK signaling, but its contribution to these processes was unknown. This study shows that GFZF resides in the nucleus and functions as a transcriptional co-activator. In addition, GFZF is a glutathione S-transferase (GST). Thus, GFZF is the first transcriptional co-activator with intrinsic GST activity, and its identification as a transcriptional co-activator provides an explanation for its role in numerous biological processes (Baumann, 2017b).
Regulation of RNA polymerase II (Pol II)-transcribed genes is one of the primary mechanisms by which cells coordinate the myriad of processes required for survival, proliferation, and development. The core promoter, defined as the 80- to 100-bp region centered on the transcription start site (TSS), is the hub of transcription regulation. Transcription initiates when general transcription factors (GTFs) bind elements within the core promoter region, forming a complex consisting of Pol II and other highly conserved Pol II-associated transcription factors. In recent years, understanding has advanced from a model where the core promoter and the GTFs act as static integrators of signals from sequence-specific transcription factors that bind enhancer regions and modulate transcription levels to one where the core promoter and its machinery constitute a more dynamic assembly with different enhancer specificities and intrinsic regulatory properties (Baumann, 2017b).
One particular core promoter element that provides a clear contrast to the models arising from canonical promoters has emerged. The element, named Motif 1, is present in the promoter regions of thousands of genes in Drosophila. A protein that binds this conserved element was identified and named M1BP (Li, 2013). M1BP is enriched at housekeeping gene promoters, and M1BP-bound genes tend to have moderate to high levels of paused Pol II, are constitutively expressed, and show little spatiotemporal fluctuation in transcription levels (Li, 2013). Additionally, Motif 1 and, by extension, M1BP-bound promoters tend to lack many of the elements once thought to be essential for initiation, such as the TATA box and initiator, so how initiation occurs at these promoters remains a mystery. Thus, the study of M1BP promoters might provide insights into previously unknown mechanisms of transcription initiation and activation (Baumann, 2017b).
This study characterized a factor called GFZF that M1BP recruits to promoters. GFZF turns out to be a novel transcriptional coactivator that has glutathione S-transferase (GST) activity. GFZF has been identified in many genetic screens since its initial characterization (Dai, 2004). These screens have implicated GFZF in a wide variety of processes, including regulation of the cell cycle (Ambrus, 2009), DNA damage checkpoints during the transition from G2 to M phase, transcriptional and splicing control of RAS/mitogen-activated protein kinase (MAPK) signaling, response to oxidative stress, three-dimensional organization of polycomb complexes, and speciation, among other processes. Despite its involvement in these critical cellular processes, little is known about the mechanism by which it carries out these seemingly disparate functions. Early work reported that GFZF resides in the cytoplasm. This study presents data supporting a parsimonious conclusion that GFZF is a transcription factor required for expression of the many factors that carry out the functions described in the above-mentioned screens (Baumann, 2017b).
Historically, GSTs have been studied for their role in cellular detoxification. However, there are notable examples of GSTs performing additional cellular functions, which include the regulation of signal transduction, inhibition of apoptosis, and the response to oxidative stress. Thus, it seems that GSTs play a critical, and perhaps underappreciated, role in cellular function and homeostasis. The unprecedented finding of a transcription factor with GST activity raises the possibility of additional layers of complexity to the already-complex process of metazoan transcriptional regulation (Baumann, 2017b).
In order to understand the function of the core promoter, it is essential to know what factors associate with it. To identify factors that associate with Motif 1-containing core promoters, promoter DNA from a mitochondrial ribosomal protein subunit gene (mRpS30) with a strong consensus Motif 1 was immobilized and incubated with Drosophila embryo nuclear extract. As a negative control, extract was incubated with a mutant version of the promoter DNA that no longer binds M1BP. Bound proteins were then separated by SDS-PAGE and identified by mass spectrometry. Comparison of the factors bound to these two promoters identified several factors, including Putzig and GFZF. The identification of Putzig is consistent with previous findings that Putzig exists in a complex with TRF2 (Hochheimer, 2002) and that TRF2 interacts with M1BP (Baumann, 2017b).
To determine if M1BP and GFZF interact in the absence of a DNA template, pulldowns were performed with purified maltose binding protein (Mal) fusions. Using either the alpha fragment of LacZ as a control or full-length M1BP fused to Mal, it was determined that GFZF interacts specifically with M1BP. Notably, the immobilized template pulldown showed a roughly stoichiometric recovery of both GFZF and M1BP, whereas in the case of the Mal fusion pulldowns, GFZF is recovered substoichiometrically. This suggests that GFZF may have a greater propensity to bind M1BP in a DNA-templated context (Baumann, 2017b).
While GFZF was originally reported to be primarily a cytoplasmic protein (Dai, 2004), the results of immobilized template pulldown experiments indicated that GFZF might associate with chromosomes. To test this, immunofluorescence microscopy with GFZF antibody was used to detect GFZF on polytene chromosomes. Antibody against GFZF localized it to distinct bands broadly distributed across each chromosome. Since the pulldown analysis indicated that M1BP and GFZF associate with each other, their distributions on chromosomes were compared. A comparison of M1BP and GFZF staining patterns on different polytene chromosome spreads revealed very similar staining patterns. However, since both M1BP and GFZF antisera were prepared in rabbits, it was not possible to detect both proteins at the same time on the same specimens. To circumvent this problem, a transgenic fly line was constructed that expresses FLAG-tagged M1BP and the two proteins were localized with a mouse monoclonal antibody targeting the FLAG epitope on M1BP and rabbit antibody targeting GFZF. This revealed significant overlap in staining for the two factors, suggesting that M1BP and GFZF bind the same genomic regions (Baumann, 2017b).
To gain further insights into GFZF's role, the distribution of GFZF on the genome was mapped using chromatin immunoprecipitation with exonuclease (ChIP-exo). In ChIP-exo, chromatin is isolated from cross-linked cells, an immunoprecipitation is performed, and libraries are generated. In the course of library preparation, lambda exonuclease is applied to the immunoprecipitates and digests DNA in the 5'-to-3' direction until its progression is stopped, typically when it encounters a cross-link point that prevents its continued digestion. This provides a snapshot of the 5' cross-link borders of factors present at the time of cross-linking. This was applied to both M1BP and GFZF and it was found that GFZF was present on over 1,000 promoters in proliferating Drosophila S2R+ cells. A composite plot shows that the ChIP-exo signal for GFZF largely overlaps with M1BP and is concentrated in a 100-bp region just upstream from the transcription start site. Additionally, after calling peaks for both factors, genes were identified that have an M1BP or GFZF peak within 100 bp of the TSS. A total of 3,013 genes are bound by M1BP, while 1,885 are bound by GFZF. Furthermore, both factors are bound almost exclusively to the promoter regions of active genes. Gene ontology analysis of genes with a GFZF peak within 100 bp of the TSS revealed that, like M1BP (Li, 2013), GFZF is highly enriched at the promoters of genes that perform housekeeping functions (i.e., metabolism, organization, and cellular physiology). Thus, it is concluded that GFZF and M1BP show remarkable overlap throughout the genome (Baumann, 2017b).
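The gene-assignment step described above (call a gene 'bound' when a peak lies within 100 bp of its TSS, then compare the M1BP-bound and GFZF-bound gene sets) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the study's pipeline: gene names, coordinates and the helper function are hypothetical, peaks are reduced to single summit positions, and everything is assumed to lie on one chromosome.

```python
from bisect import bisect_left

def genes_with_peak_near_tss(tss_by_gene, peak_positions, window=100):
    """Return the set of genes whose TSS lies within `window` bp of any peak summit."""
    peaks = sorted(peak_positions)
    bound = set()
    for gene, tss in tss_by_gene.items():
        # Index of the first peak at or to the right of the window's left edge.
        i = bisect_left(peaks, tss - window)
        # That peak is inside the window if it does not exceed the right edge.
        if i < len(peaks) and peaks[i] <= tss + window:
            bound.add(gene)
    return bound

# Hypothetical TSS coordinates and peak summits (bp), for illustration only.
tss_by_gene = {"geneA": 1000, "geneB": 5000, "geneC": 9000}
m1bp_peaks = [950, 5080, 9500]
gfzf_peaks = [1020, 7000]

m1bp_bound = genes_with_peak_near_tss(tss_by_gene, m1bp_peaks)
gfzf_bound = genes_with_peak_near_tss(tss_by_gene, gfzf_peaks)
shared = m1bp_bound & gfzf_bound
print(sorted(m1bp_bound), sorted(gfzf_bound), sorted(shared))
```

Sorting the peaks once and using a binary search keeps the lookup fast when thousands of genes are checked; a real analysis would additionally group peaks by chromosome and strand.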
The extensive colocalization of GFZF with M1BP, a known transcription factor (Li, 2013), raises the possibility that GFZF is a transcription factor. To test whether GFZF activates transcription, a dual-luciferase reporter assay was performed following GFZF depletion in S2R+ cells. The GFZF-associated promoters for the ribosomal protein gene RpLP1, sex-lethal (Sxl) gene, roX2 gene, or abnormal wing disc gene (awd) were used to drive transcription of a firefly luciferase reporter. These promoters were chosen because previous studies had linked GFZF to processes and pathways in which their genes or gene products are involved. As an internal control for the transfection efficiency, the RpIII128 promoter, which lacks M1BP and GFZF, was used to drive expression of a sequence coding Renilla luciferase. Both firefly and Renilla luciferase-encoding plasmids were transfected with either an empty expression vector or one that expressed a FLAG-tagged version of GFZF. Cells were treated for 1 day with double-stranded RNA (dsRNA) targeting either lacZ as a control, exon 2 of GFZF, or the 5' untranslated region (5' UTR) of GFZF and subsequently transfected with reporter plasmids. Two days later, cells were lysed and assayed for firefly and Renilla luciferase activity. Ectopically expressed FLAG-GFZF activated each of the promoters in the presence of the lacZ RNA interference (RNAi) control. This suggests that GFZF levels in the cell are limiting. RNAi targeting exon 2 of both the endogenous and ectopic GFZF inhibited GFZF-dependent activation. In contrast, RNAi targeting the 5' UTR of endogenous GFZF, which is different from the 5' UTR of ectopic GFZF, did not inhibit activation by FLAG-GFZF. Instead, the level of expression mediated by endogenous GFZF was diminished (Baumann, 2017b).
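The normalization implied by a dual-luciferase assay can be sketched as follows: the firefly signal (test promoter) is divided by the Renilla signal (internal transfection control), and the resulting ratio is expressed relative to a control condition. This is a minimal sketch assuming that standard ratio-of-ratios calculation; the function names and luminometer readings are hypothetical and are not data from the study.

```python
def normalized_activity(firefly: float, renilla: float) -> float:
    """Firefly signal corrected for transfection efficiency via the Renilla control."""
    if renilla <= 0:
        raise ValueError("Renilla signal must be positive")
    return firefly / renilla

def fold_activation(sample, control) -> float:
    """Fold change of a sample over a control; each is a (firefly, renilla) pair."""
    return normalized_activity(*sample) / normalized_activity(*control)

# Hypothetical readings in arbitrary light units, for illustration only.
empty_vector = (1200.0, 400.0)  # reporter plus empty expression vector
flag_gfzf = (4500.0, 500.0)     # reporter plus FLAG-GFZF expression vector

print(fold_activation(flag_gfzf, empty_vector))
```

Dividing by the Renilla signal first means that a well with more transfected cells does not appear as a more active promoter.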
To determine if GFZF is involved in activation of endogenous genes, the level of GFZF was knocked down, and ChIP was used to monitor the association of GFZF, M1BP, and Pol II with the same promoters that were tested in the transient-expression assay. RNAi targeting GFZF caused significant decreases in the level of GFZF associated with the RpLP1, Sxl, roX2, and awd promoters. These results confirm that the ChIP-exo analysis with the GFZF antibody indeed monitors GFZF. Furthermore, the ChIP analysis also confirmed the ChIP-exo data showing that little if any GFZF associates with the hsp70 promoter. Knockdown of GFZF also caused decreases in the level of M1BP associating with the promoters. This was unexpected since biochemical analysis showed that M1BP bound a promoter fragment independently of GFZF. Western blot analysis showed that the knockdown of GFZF does not affect the level of M1BP. Thus, the contribution of GFZF to M1BP promoter occupancy must reflect some role for GFZF in the cellular context. GFZF might be stabilizing the binding of M1BP or inducing some conformational change that affects the cross-linking efficiency. In accordance with the transient-expression data, the knockdown of GFZF caused a marked decrease in the level of a Pol II subunit, Rpb3, detected at GFZF-associated promoters but had an insignificant impact on Rpb3 associated with the hsp70 promoter. Taken together, the transient-expression data and the ChIP analysis establish that GFZF is a transcriptional coactivator (Baumann, 2017b).
An intriguing feature of GFZF is its glutathione S-transferase (GST) homology region, which is unprecedented for a transcription factor. A previous study demonstrated that GFZF binds a glutathione (GSH) column and can be eluted with GSH in a dose-dependent fashion. To test whether GFZF functions as a glutathione S-transferase and to measure its affinity for GSH, the GST domain of GFZF was expressed with a His tag in E. coli and purified using metal affinity and ion-exchange chromatography. At the same time, a catalytic mutant (S876A) of GFZF was designed and expressed using the structure of a related GST in silkworm for reference. GST activity was assayed by monitoring the increase in absorbance at 340 nm that results when GSH is conjugated to 1-chloro-2,4-dinitrobenzene (CDNB). Based on initial reaction velocities, the Kms for glutathione for the wild-type (wt) and the S876A mutant were determined to be 0.07 mM and 3.28 mM, respectively. The Km of wt GFZF falls well below the physiological range of GSH concentrations, which has been reported to be between 1 and 10 mM, though it has been reported that GSH concentrations are lower in the nucleus. Thus, GFZF's high affinity for GSH suggests that it is probably almost always bound in a cellular context (Baumann, 2017b).
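To make the comparison between the reported Km values and physiological GSH concentrations concrete, the fraction of maximal velocity can be computed from the Michaelis-Menten relation v/Vmax = [S] / (Km + [S]). This is a minimal sketch that assumes simple Michaelis-Menten behavior with respect to GSH; only the two Km values come from the text, while the function name and the GSH concentrations sampled are illustrative choices.

```python
def fractional_velocity(substrate_mM: float, km_mM: float) -> float:
    """Return v/Vmax for a given substrate concentration (assumed Michaelis-Menten)."""
    if substrate_mM < 0 or km_mM <= 0:
        raise ValueError("substrate must be >= 0 and Km must be > 0")
    return substrate_mM / (km_mM + substrate_mM)

KM_WT_MM = 0.07     # reported Km for GSH, wild-type GST domain
KM_S876A_MM = 3.28  # reported Km for GSH, S876A mutant

# Illustrative GSH concentrations (mM); 1 to 10 mM is the range quoted in the text,
# and 0.1 mM stands in for a hypothetical lower nuclear concentration.
for gsh_mM in (0.1, 1.0, 10.0):
    wt = fractional_velocity(gsh_mM, KM_WT_MM)
    mutant = fractional_velocity(gsh_mM, KM_S876A_MM)
    print(f"GSH {gsh_mM:5.1f} mM: wt {wt:.2f} of Vmax, S876A {mutant:.2f} of Vmax")
```

Under this model the wild-type domain is already above 90% of Vmax at 1 mM GSH, whereas the S876A mutant is below 25%, which illustrates why the wild-type protein is expected to be nearly saturated with GSH in cells.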
To determine if the GST activity was involved in transcriptional activation, activation of the luciferase reporter genes was measured in the presence of a wild-type GFZF, a mutant GFZF (S876A), or a truncated GFZF which has the GST domain deleted. The wt and S876A mutant activated transcription to similar extents, while the truncated GST-less mutant had approximately half as much activity. While wt GFZF activates transcription more robustly than the GST-less mutant in the luciferase assay, it was critical to assess whether differences in protein expression could account for the differences between those samples. To that end, Western blotting was performed against the FLAG epitope to quantify ectopic GFZF expression in cells. Upon comparing the fold increase in luciferase activity with the fold increase in ectopic protein expression, it is concluded that the GST portion of GFZF does not contribute to its ability to activate transcription in this assay (Baumann, 2017b).
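The comparison described above (fold increase in luciferase activity versus fold increase in ectopic protein) amounts to asking how much activation each construct gives per unit of expressed protein. The sketch below illustrates that reasoning only; the exact calculation used in the study is not given here, and all numbers are hypothetical, chosen so that the GST-less construct shows half the raw activity of wild type, as stated in the text.

```python
def activity_per_protein(fold_luciferase: float, fold_protein: float) -> float:
    """Fold activation divided by fold protein expression (both relative to control)."""
    if fold_protein <= 0:
        raise ValueError("fold protein expression must be positive")
    return fold_luciferase / fold_protein

# Hypothetical (fold luciferase activity, fold FLAG-GFZF protein) per construct.
constructs = {
    "wt": (4.0, 2.0),
    "S876A": (4.0, 2.0),
    "GST-less": (2.0, 1.0),  # half the raw activity, but also half the protein
}

normalized = {
    name: activity_per_protein(luc, protein)
    for name, (luc, protein) in constructs.items()
}
print(normalized)
```

In this made-up case all three constructs give the same activation per unit protein, which is the kind of outcome that would support the conclusion that lower expression, rather than loss of the GST domain itself, accounts for the reduced raw activity.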
Since its initial discovery, GFZF has appeared as a 'hit' in numerous screens. While possible explanations for GFZF's appearance in these screens have been put forth, they have lacked a unifying cellular function that could explain GFZF's seemingly disparate roles. This study shows that GFZF binds approximately 1,800 genes and functions as a transcriptional coactivator. This new information can explain the broad functionality of GFZF. The GFZF gene was first identified in Drosophila as a suppressor of a gene called killer of prune (also known as awd). Mutations in awd alone cause no phenotype, but these mutations cause lethality in flies that are homozygous for nonlethal mutations in another gene, called prune. It was proposed that mutations in the GFZF gene suppressed the lethality caused by the combination of mutations in awd and prune because wild-type GFZF was generating something toxic by conjugating glutathione to a metabolic product derived from the activities of mutant prune (encoding a cyclic AMP phosphodiesterase) and mutant awd (encoding a nucleoside diphosphate kinase). However, the data provide a simpler explanation: GFZF associates with the awd promoter and activates transcription. Hence, mutations in the GFZF gene would reduce the level of expression of mutant awd so there would no longer be sufficient mutant Awd protein to cause lethality with mutant Prune protein. In another case, GFZF's appearance in a screen for RAS-mediated MAPK activation can be explained by GFZF's binding to the core promoter region of mek (Dsor1). In accordance with GFZF's function as a transcriptional coactivator, knockdown of GFZF results in reduced levels of mek transcripts. Likewise, GFZF's appearance in the G2-M DNA damage checkpoint screen could be simply explained by GFZF being required for the transcription of other factors involved in this DNA damage checkpoint. 
ChIP-exo analysis indicates that GFZF associates with 22 of the 64 genes that were identified in this screen, among them the promoters of factors known to have roles in this DNA damage checkpoint, such as myt1, 14-3-3ε, and tefu (Baumann, 2017b).
GFZF was also identified in a screen for mutations that affect hybrid inviability. When female Drosophila melanogaster organisms are mated to male Drosophila simulans organisms, no male progeny are produced. Mutations in GFZF in male D. simulans allowed production of male progeny in this interspecies mating. GFZF binds to the promoters of three (msl-1, msl-2, and mle) out of five subunits that comprise the male-specific lethal (MSL) complex in flies. Additionally, it binds to the promoter region of roX2, one of the noncoding RNAs (ncRNAs) that is part of the MSL complex. The MSL complex functions in dosage compensation in male flies by doubling the amount of transcription arising from genes on the X chromosome; disrupting the function of the MSL complex causes male lethality. Since GFZF is a transcriptional coactivator and binds the promoters of several genes encoding the MSL complex, it is speculated that hybrid-specific GFZF-mediated misregulation of MSL components might be contributing to male lethality. This would be consistent with the work of others who have provided evidence that defects in dosage compensation contribute to hybrid inviability. However, a follow-up study which tested the hypothesis that defects in the MSL complex contribute to hybrid inviability concluded that defects in MSL function cannot fully explain hybrid inviability. It could be that GFZF's role in hybrid inviability is more nuanced than misregulation of MSL complex components and might involve misexpression of other factors involved in maintaining incompatibility. Whatever the case, it is reasonable to speculate that GFZF's role will involve misregulation of genes required for maintenance of hybrid inviability (Baumann, 2017b).
GFZF is unusual because of its unique combination of zinc fingers and a functional GST domain. Search for homologous genes in other organisms indicates that genes sharing homology to the entirety of GFZF are limited to Schizophora, the section of true flies which includes the common housefly. Since other neopterans, including mosquitoes, have GST proteins that share homology with GFZF's GST domain but lack GFZF's zinc fingers, it is likely that GFZF evolved recently as a result of a gene fusion. In accordance with this hypothesis, mRNA expression data show that there is a second promoter in the intron that immediately precedes the GST domain of the full-length GFZF gene, and the resulting transcript is predicted to encode a functional GST. This transcript is detected from 14-h-old embryos to adults, whereas the full-length GFZF is detected throughout development beginning with 0- to 2-h-old embryos (Baumann, 2017b).
At this point, the function of the GST domain is unclear. It was found that deletion of this domain reduced the level of expression of the remainder of the protein and that the remaining part still activated transcription. Since function was assayed only on transiently transfected DNA, it remains possible that the GST activity is important in a natural chromatin context, which is not formed on transiently transfected DNA. Mutations in the GST domain of GFZF that cause larval lethality have been identified, so the domain appears to be essential (Baumann, 2017b).
It is possible that the gene fusion resulting in GFZF is fortuitous and that the GST domain's function is not linked to gene regulation. On the other hand, this fusion raises the intriguing possibility that GST activity is important for gene expression and that other organisms bring GST activity to a gene's promoter through protein-protein interactions. GST proteins are best known for their roles in protecting cells from toxic endogenous and xenobiotic compounds, so GST might function at promoters to inhibit DNA damage. Another possibility is that GFZF serves as a sensor of the redox potential of the cell. Having a GST transcription factor act as a nuclear sensor of the redox state of the cell could ensure that cells can quickly alter their transcriptional program in response to stress and chemical insult. There is precedent for redox regulation of transcription factors, both directly and through signal transduction. Brf2, a Pol III core transcription factor, has a single oxidation-prone cysteine residue that when oxidized inhibits Brf2's ability to form a complex with TATA binding protein (TBP) at some Pol III-dependent promoters. In cells, oxidative stress caused a sharp decline in Brf2-dependent gene transcripts. In an example of redox regulation through signal transduction, a GST protein acts to inhibit c-Jun N-terminal kinase (JNK) activity under normal physiological conditions. However, when cells are treated with hydrogen peroxide or UV irradiation, the GST dimerizes and no longer inhibits JNK, thus allowing the signaling cascade to commence. As further evidence of redox-driven transcriptional regulation, sublethal levels of hydrogen peroxide globally reduce the turnover rate of Pol II paused in the promoter-proximal regions of genes. Finally, PrfA, a protein in the intracellular pathogenic bacterium Listeria monocytogenes, appears to be allosterically regulated by glutathione. 
If, as in the above examples, such a molecular switch regulates GFZF function in response to redox perturbations, it would represent an elegant means of quickly altering the expression of a multitude of genes in response to stress (Baumann, 2017b).
In metazoans, the pausing of RNA polymerase II at the promoter (paused Pol II) has emerged as a widespread and conserved mechanism in the regulation of gene transcription. While critical in recruiting Pol II to the promoter, the role transcription factors play in transitioning paused Pol II into productive Pol II is, however, little known. By studying how Drosophila Hox transcription factors control transcription, this study uncovered a molecular mechanism that increases productive transcription. The Hox proteins AbdA and Ubx target gene promoters previously bound by the transcription pausing factor M1BP, containing paused Pol II and enriched with promoter-proximal Polycomb Group (PcG) proteins, yet lacking the classical H3K27me3 PcG signature. AbdA binding to M1BP-regulated genes results in reduction in PcG binding, the release of paused Pol II, increases in promoter H3K4me3 histone marks and increased gene transcription. Linking transcription factors, PcG proteins and paused Pol II states, these data identify a two-step mechanism of Hox-driven transcription, with M1BP binding leading to Pol II recruitment followed by AbdA targeting, which results in a change in the chromatin landscape and enhanced transcription (Zouaz, 2017).
Understanding Hox transcriptional networks is central to understanding their wide repertoire of functions, yet observing where they bind in the genome does not explain why they bind there. In using a homogenous cell-based system devoid of endogenous Hox expression to conditionally express the Hox protein Ubx or AbdA, this study has demonstrated that Drosophila Hox proteins target proximal promoters genome-wide, which is conserved (for Ubx at least) in developing embryos. While studies into Hox genomic binding have historically focussed on enhancer elements in spatially and temporally controlling individual gene expression, genome-wide promoter enrichment of Hox proteins is known to occur for mouse HoxB4 in hematopoietic stem cells, mouse Hoxa2 in the second branchial arches and for zebrafish Hoxb1a in early embryogenesis. However, why Hox proteins target the promoter-proximal region has been little explored. A major advantage of the Drosophila S2 cell system is that the conditional Hox expression system allows studying in fine detail the sequence of events occurring upon promoter binding and the impact on gene expression (Zouaz, 2017).
The promoters targeted by both AbdA and Ubx in Drosophila are essentially promoters containing either GAF or M1BP. GAF controls mainly developmental and morphogenic genes, whereas M1BP controls genes mainly involved in basic cellular processes (Li, 2013), and this distinction in gene ontology is reflected in the genes whose promoters are targeted by AbdA. As AbdA and Ubx also target enhancer regions, it cannot be ruled out that the observed promoter binding is the result of enhancer-promoter interaction. However, given that the majority of genes controlled by M1BP do not have distal enhancers (Zabidi, 2015), it is unlikely that this is the case for M1BP-targeted promoters. Both GAF and M1BP are important and distinct Drosophila Pol II pausing factors, a role that proved important in understanding the nature of promoters targeted by AbdA and Ubx, since the majority of all promoters targeted by the Hox proteins contained poised Pol II. GAF binding sites have previously been shown to be enriched at Ubx targets, although a link between Hox and GAF in regulating gene transcription was not demonstrated. Similarly, in S2-AbdA cells, AbdA binding at GAF-regulated promoters has little clear-cut effect on poised Pol II status, although the amount of elongating Pol II and gene transcription appears reduced. It was at M1BP-bound promoters where AbdA was found to have an effect on Pol II pausing, whereby AbdA binding results in a reduction in poised Pol II giving rise to increased productive transcription. Taken together with the findings that both Ubx and AbdA target nearly identical promoters, that AbdA and M1BP synergise in reporter gene expression, that both Ubx and AbdA interact with M1BP in embryos and that AbdA colocalises with M1BP on polytene chromosomes, these data suggest functional cooperation between M1BP and AbdA/Ubx.
To this end, demonstrating that M1BP expression is essential in inhibiting autophagy in the larval fat body, an Exd-independent cellular function shared by all Drosophila Hox proteins where the loss of expression of all Hox genes is essential for autophagy induction, suggests that M1BP may function with Hox proteins in their generic function of autophagy inhibition (Zouaz, 2017).
Similar to the distinct mechanisms at play in pausing Pol II at M1BP- and GAF-controlled promoters, these data suggest that distinct mechanisms of release of paused Pol II exist at the two classes of poised Pol II promoters: additional factors that are not present in S2 cells are likely required to permit Hox-induced productive transcription at GAF-controlled promoters since little evidence is found that AbdA binding affects Pol II pausing, whereas at M1BP-controlled genes, AbdA binding is sufficient to increase gene transcription through the release of paused Pol II (Zouaz, 2017).
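One common way to put a number on a shift from paused to productive Pol II is a pausing index: the ratio of promoter-proximal Pol II density to gene-body density. The source does not state that this metric was used, so the following is a minimal sketch under that assumption. The window lengths and read counts are invented toy values, and the function name is hypothetical.

```python
def pausing_index(promoter_reads, promoter_len, body_reads, body_len):
    """Ratio of promoter-proximal to gene-body Pol II read density."""
    if promoter_len <= 0 or body_len <= 0:
        raise ValueError("window lengths must be positive")
    if body_reads == 0:
        return float("inf")  # no elongating signal at all
    promoter_density = promoter_reads / promoter_len
    body_density = body_reads / body_len
    return promoter_density / body_density


# TOY numbers (invented): one gene before and after Hox binding,
# using a 150 bp promoter-proximal window and a 2000 bp gene body.
before = pausing_index(promoter_reads=300, promoter_len=150,
                       body_reads=200, body_len=2000)
after = pausing_index(promoter_reads=150, promoter_len=150,
                      body_reads=500, body_len=2000)

print(f"pausing index before: {before:.1f}")
print(f"pausing index after:  {after:.1f}")
```

In this toy case the index falls because promoter-proximal signal drops while gene-body signal rises. That is the pattern expected for release of paused Pol II into productive elongation.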
Testing the association of AbdA ChIP peaks in S2-AbdA cells with those of numerous publicly available histone-modifying proteins and histone marks in S2 cells, this study found that the M1BP- and GAF-poised Pol II promoters targeted by AbdA were enriched for PcG proteins and H3K4me3. The finding that AbdA-enhanced transcription at M1BP promoters was more consistently concomitant with a loss of promoter PcG protein binding than at GAF-controlled promoters suggests that the emerging role for PcG proteins in maintaining a poised Pol II state can be perturbed by Hox binding. Indeed, it is noteworthy that of the PcG proteins tested here, it is promoter-bound dRing that is most affected upon AbdA binding, suggesting that, like in vertebrates where Ring1 plays a major role in restraining the poised Pol II at promoters, dRing may play a major role in tethering the poised Pol II state in Drosophila. Given that no clear effect on PcG binding occurs at GAF-controlled promoters, even when these genes are repressed upon AbdA binding, it reinforces the notion that contrary to M1BP targets, the control of gene expression by AbdA at GAF genes is unlikely to occur through the regulation of poised Pol II status (Zouaz, 2017).
Where PcG proteins are linked to maintaining gene repression, trithorax group proteins (trxG) are the PcG antagonists, responsible for maintaining gene expression. As a transcription factor, GAF has been traditionally classified as a trxG protein although it displays repressive activity and can recruit PcG complexes. As such, GAF can be classified as a member of the growing family of genes that display both PcG and trxG phenotypes, the so-called enhancers of trithorax and polycomb (ETP) family. This study shows that M1BP colocalises with PcG proteins at promoters in S2 cells and phenotypically enhances the PcG homeotic phenotype of extra male sex combs on the second and third pairs of legs. Indeed, M1BP-/Pc- transheterozygous males have an average of 5.3 legs displaying sex combs, which is more than most combinations of PcG mutant transheterozygotes, demonstrating the large increase in penetrance of the Pc phenotype upon M1BP mutation. As such, M1BP would be genetically classified as a PcG gene. However, PcG genes, by definition, display homeotic phenotypes due to the derepression of Hox genes when mutated and so since this study observed neither increased derepression of the upstream Hox gene responsible for sex comb development, Scr, upon M1BP mutation nor Hox expression in fat body cells following RNAi, M1BP cannot thus be classified as a PcG gene. Given that M1BP is a transcription factor involved in gene expression (Li, 2013), the hypothesis is therefore favoured that, like GAF, M1BP is likely to be a member of the ETP family. How ETP proteins can enhance the phenotypes of both repressors (PcG) and activators (trxG) has long remained a mystery. 
Demonstrating here that GAF and M1BP colocalise with PcG at poised Pol II promoters with the loss of PcG at those genes displaying increased expression upon AbdA binding, may go a long way to better understand how transcription factors and transcriptional repressors intricately cooperate to regulate gene transcription (Zouaz, 2017).
In summary, this work identifies a novel mechanism for Pol II pausing release mediated by AbdA: at genes bound by M1BP, targeting of AbdA results in the specific loss of PcG proteins, the release of poised Pol II and increases in H3K4me3 histone marks, which results in promoting productive transcription. Identified in S2 cells where Hox PBC-class cofactors are absent, this mechanism may more generally apply to Hox-generic functions that are independent of PBC-class cofactors, such as the repression of autophagy in the Drosophila fat body or sex comb development in Drosophila males. It may also apply to PBC-dependent Hox target gene regulation by cooperating with Hox PBC-bound genomic regions located remote from the promoter. Further work aimed at studying Hox PBC-bound enhancers together with poised Pol II status, and promoter-proximal Hox and PcG binding, should provide further insight into how enhancer-bound protein complexes influence the basic mechanisms of transcription regulated through poised Pol II. Uncovering such a Hox-driven mechanism of gene regulation by sequence-specific transcription factors, PcG proteins and poised Pol II in the developing animal would have been fraught with difficulties, not least of which is the quagmire of PcG proteins being essential global repressors of all Hox genes (Zouaz, 2017).
Ribosomal protein (RP) genes must be coordinately expressed for proper assembly of the ribosome, yet the mechanisms that control expression of RP genes in metazoans are poorly understood. Recently, TATA-binding protein-related factor 2 (TRF2) rather than the TATA-binding protein (TBP) was found to function in transcription of RP genes in Drosophila. Unlike TBP, TRF2 lacks sequence-specific DNA binding activity, so the mechanism by which TRF2 is recruited to promoters is unclear. This study shows that the transcription factor M1BP, which associates with the core promoter region, activates transcription of RP genes. Moreover, M1BP directly interacts with TRF2 to recruit it to the RP gene promoter. High-resolution ChIP-exo was used to analyze in vivo the association of M1BP, TRF2 and the TFIID subunit TAF1. Despite recent work suggesting that TFIID does not associate with RP genes in Drosophila, it was found that TAF1 is present at RP gene promoters and that its interaction might also be directed by M1BP. Although M1BP associates with thousands of genes, its colocalization with TRF2 is largely restricted to RP genes, suggesting that this combination is key to coordinately regulating transcription of the majority of RP genes in Drosophila (Baumann, 2017a).
This study shows that M1BP activates transcription of RP genes in Drosophila and that it can do so by recruiting TRF2 to RP gene promoters in cells. These conclusions are based on the demonstration that M1BP is detected in the core promoter region of the majority of RP genes in cells and that mutation of Motif 1 diminished the level of expression from RP reporter genes. Additionally, it was demonstrated that M1BP activates transcription of RP gene promoters in nuclear extracts. Also, M1BP was shown to recruit TRF2 to promoter DNA in vitro, and M1BP and TRF2 were shown to colocalize on the RP gene promoters in cells. M1BP, therefore, is the first sequence-specific DNA-binding protein that has been directly shown to activate RP gene transcription in metazoans. DREF is possibly the only other protein, but it remains to be determined if it activates RP genes in vitro. Since these transcription factors associate with a broad spectrum of genes, loss-of-function assays in cells must be viewed with caution, as it is difficult to distinguish between direct and indirect effects regardless of whether the protein can be detected at a particular gene (Baumann, 2017a).
Mechanisms by which TRF2 associates with promoters are not well understood. DREF was purified in a complex with TRF2 but no direct measurement of TRF2 recruitment to DNA by this complex was provided. An uncharacterized TRF2 complex associates with promoters bearing the DPE and canonical initiator, but RP genes lack both of these DNA elements. This study provides a direct mechanism that involves M1BP associating with its cognate binding site and interacting directly with TRF2. Since there is little overlap between M1BP and TRF2 outside of RP gene promoters, it follows that additional cis-elements are required for TRF2's association with M1BP. It is suspected that the TCT motif, along with M1BP and DREF, may be an additional key contributor to TRF2's association with gene promoters. Additionally, ChIP-exo data reveal that TAF1 is present at virtually all promoters that are associated with Pol II, including most promoters that associate with TRF2. Therefore, since TAF1 recognizes sequences at the initiator and the DPE, TAF1 could be part of the currently uncharacterized TRF2-containing complex that selectively binds the initiator-DPE-containing promoters (Baumann, 2017a).
The total number of TRF2 peaks that were observed is considerably lower than that reported previously. There could be a couple of reasons for this discrepancy. First, the previous study used 2-4 h embryos, whereas this study used S2R+ cells. It is possible that TRF2 functions at a broader spectrum of developmentally regulated genes in the early embryo than in S2R+ cells. Additionally, the difference could be due to the increased signal-to-noise ratio afforded by ChIP-exo, which results in more reliable peak detection (Baumann, 2017a).
Detection of TAF1 on RP gene promoters was unexpected because TAF1 is best known for being a subunit of the general transcription factor TFIID, and biochemical evidence argued against TFIID being involved in RP gene transcription. Moreover, previous analysis of the PCNA promoter showed that immunodepletion of TFIID with TAF1 antibody from a Drosophila transcription reaction did not inhibit transcription of a TRF2-dependent promoter. ChIP-exo data provide evidence for M1BP being in close proximity to, and potentially interacting with, TRF2 and TAF1 on RP gene promoters. The ChIP-exo data showed a peak of M1BP contacts downstream from the TSS, yet Motif 1, which binds M1BP, typically resides upstream from the TSS. Since the ChIP-exo data for M1BP and TAF1 display overlapping peaks in the +30 to +50 region, it is proposed that M1BP is in contact with TAF1 and that the ChIP-exo signal for M1BP in this region is a consequence of M1BP crosslinking to TAF1 and TAF1 in turn crosslinking to the +30 to +50 region. In contrast, the ChIP-exo signals for M1BP and TRF2 are shifted relative to each other by ~10 nt, suggesting that M1BP might position TRF2 on the DNA adjacent to M1BP (Baumann, 2017a).
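An offset between two ChIP-exo profiles, such as the ~10 nt shift described above, can in principle be estimated by sliding one signal against the other and keeping the lag with the highest overlap. The sketch below shows that idea on synthetic data only. The peak positions, profile shape and function names are invented for illustration and do not reproduce the analysis in the source.

```python
from math import exp


def gaussian_profile(center, length=120, sigma=4.0):
    """Toy ChIP-exo-like signal: one smooth peak centred at `center`."""
    return [exp(-((x - center) ** 2) / (2 * sigma ** 2)) for x in range(length)]


def best_shift(reference, query, max_lag=30):
    """Return the lag (in nt) at which `query` best matches `reference`.

    A positive lag means the query peak lies at a higher coordinate
    than the reference peak.
    """
    def score(lag):
        return sum(
            reference[i] * query[i + lag]
            for i in range(len(reference))
            if 0 <= i + lag < len(query)
        )
    return max(range(-max_lag, max_lag + 1), key=score)


# SYNTHETIC data: peak positions are invented for illustration.
m1bp_like = gaussian_profile(center=50)
trf2_like = gaussian_profile(center=60)

shift = best_shift(m1bp_like, trf2_like)
print(f"estimated offset: {shift} nt")
```

Real ChIP-exo data are strand-specific and noisy, so an actual analysis would need strand handling and aggregation over many promoters. This sketch covers only the lag-scanning step.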
A unique feature of the RP gene promoters in Drosophila and humans is the presence of the TCT motif located at the TSS. What recognizes this motif is currently not known. Since TAF1 is known to recognize the canonical initiator element, its presence at RP gene promoters raises the possibility that this TAF also recognizes the TCT motif. DNase I footprinting analysis of TFIID binding to RP gene promoters indicated that binding was extremely weak. However, close inspection of the DNase I cutting patterns in the absence and presence of TFIID reveals the appearance of weak hypersensitive cut sites near the TCT initiator. One possibility is that M1BP together with TRF2 enhances the affinity of TAF1 for the RP gene initiator (Baumann, 2017a).
It is proposed that M1BP functions as a hub to recruit TRF2 and TAF1. Since the only known TAF1-containing complex in metazoans is TFIID, it is proposed that TFIID still binds to RP gene promoters along with TRF2. One possibility is that TRF2 displaces TBP at the RP gene promoter. A recent model of TFIID bound to promoter DNA indicates that TFIIA is involved in connecting TBP to TAF1. Since TRF2 associates with TFIIA, displacement of TBP from TAF1 by TRF2 is tenable (Baumann, 2017a).
This analysis of STARR-seq data indicates that RP gene promoters can act as enhancers and that they are selective in activating housekeeping gene core promoters and not core promoters of developmental and stress activated genes. The RP gene promoters more strongly activated the candidate RP gene promoter over all the other tested candidates. This selectivity could establish a network in which active RP genes and other housekeeping genes act reciprocally to activate each other. In addition, the selectivity of the enhancer activity of these RP promoters would prevent them from inadvertently activating nearby developmentally regulated genes (Baumann, 2017a).
Thousands of genes in Drosophila have Pol II paused in the promoter-proximal region. Almost half of these genes are associated with either GAGA factor (GAF) or a newly discovered factor called M1BP. Although both factors dictate the association of Pol II at their target promoters, they are nearly mutually exclusive on the genome and mediate different mechanisms of regulation. High-resolution mapping of Pol II using permanganate-ChIP-seq indicates that pausing on M1BP genes is transient and could involve the +1 nucleosome. In contrast, pausing on GAF genes is much stronger and largely independent of nucleosomes. Distinct regulatory mechanisms are reflected by transcriptional plasticity: M1BP genes are constitutively expressed throughout development while GAF genes exhibit much greater developmental specificity. M1BP binds a core promoter element called Motif 1. Motif 1 potentially directs a distinct transcriptional mechanism from the canonical TATA box, which does not correlate with paused Pol II on the genomic scale. In contrast to M1BP and GAF genes, a significant portion of TATA box genes appear to be controlled at preinitiation complex formation (Li, 2013).
Ambrus, A. M., Rasheva, V. I., Nicolay, B. N. and Frolov, M. V. (2009). Mosaic genetic screen for suppressors of the de2f1 mutant phenotype in Drosophila. Genetics 183(1): 79-92. PubMed ID: 19546319
Baumann, D. G. and Gilmour, D. S. (2017a). A sequence-specific core promoter-binding transcription factor recruits TRF2 to coordinately transcribe ribosomal protein genes. Nucleic Acids Res 45(18): 10481-10491. PubMed ID: 28977400
Baumann, D. G., Dai, M. S., Lu, H. and Gilmour, D. S. (2017b). GFZF, a glutathione S-transferase protein implicated in cell cycle regulation and hybrid inviability, is a transcriptional co-activator. Mol Cell Biol [Epub ahead of print]. PubMed ID: 29158293
Dai, M. S., Sun, X. X., Qin, J., Smolik, S. M. and Lu, H. (2004). Identification and characterization of a novel Drosophila melanogaster glutathione S-transferase-containing FLYWCH zinc finger protein. Gene 342(1): 49-56. PubMed ID: 15527965
Hochheimer, A., Zhou, S., Zheng, S., Holmes, M. C. and Tjian, R. (2002). TRF2 associates with DREF and directs promoter-selective gene expression in Drosophila. Nature 420(6914): 439-445. PubMed ID: 12459787
Hou, C., Li, L., Qin, Z. S. and Corces, V. G. (2012). Gene density, transcription, and insulators contribute to the partition of the Drosophila genome into physical domains. Mol Cell 48(3): 471-484. PubMed ID: 23041285
Hug, C. B., Grimaldi, A. G., Kruse, K. and Vaquerizas, J. M. (2017). Chromatin architecture emerges during zygotic genome activation independent of transcription. Cell 169(2): 216-228. PubMed ID: 28388407
Li, J. and Gilmour, D. S. (2013). Distinct mechanisms of transcriptional pausing orchestrated by GAGA factor and M1BP, a novel transcription factor. EMBO J 32(13): 1829-1841. PubMed ID: 23708796
Ramirez, F., Bhardwaj, V., Arrigoni, L., Lam, K. C., Gruning, B. A., Villaveces, J., Habermann, B., Akhtar, A. and Manke, T. (2018). High-resolution TADs reveal DNA sequences underlying genome organization in flies. Nat Commun 9(1): 189. PubMed ID: 29335486
Ulianov, S. V., Khrameeva, E. E., Gavrilov, A. A., Flyamer, I. M., Kos, P., Mikhaleva, E. A., Penin, A. A., Logacheva, M. D., Imakaev, M. V., Chertovich, A., Gelfand, M. S., Shevelyov, Y. Y. and Razin, S. V. (2016). Active chromatin and transcription play a key role in chromosome partitioning into topologically associating domains. Genome Res 26(1): 70-84. PubMed ID: 26518482
Zabidi, M. A., Arnold, C. D., Schernhuber, K., Pagani, M., Rath, M., Frank, O. and Stark, A. (2015). Enhancer-core-promoter specificity separates developmental and housekeeping gene regulation. Nature 518: 556-559. PubMed ID: 25517091
Zouaz, A., Auradkar, A., Delfini, M. C., Macchi, M., Barthez, M., Ela Akoa, S., Bastianelli, L., Xie, G., Deng, W. M., Levine, S. S., Graba, Y. and Saurin, A. J. (2017). The Hox proteins Ubx and AbdA collaborate with the transcription pausing factor M1BP to regulate gene transcription. EMBO J. PubMed ID: 28871058
date revised: 6 January 2019
Home page: The
Interactive Fly © 2011 Thomas Brody, Ph.D.