InteractiveFly: GeneBrief

Gene name - Lost PHDs of trr

Synonyms - cara mitad

Cytological map position - 60A9-60A9

Keywords - cofactor for TRR, associates with EcR-USP receptor, required for hormone-dependent transcription, chromatin modification

Symbol - Lpt

FlyBase ID: FBgn0263667

Genetic map position - chr2R:19786102-19791187

Classification - RING-finger, Bromo Adjacent Homology domain, High Mobility Group, PHD zinc finger

Cellular location - nuclear

NCBI link: EntrezGene

Lpt orthologs: Biolitmine

Recent literature

Zraly, C. B., Zakkar, A., Perez, J. H., Ng, J., White, K. P., Slattery, M. and Dingwall, A. K. (2020). The Drosophila MLR COMPASS complex is essential for programming cis-regulatory information and maintaining epigenetic memory during development. Nucleic Acids Res. PubMed ID: 32052053
Summary:
The MLR COMPASS complex monomethylates H3K4 that serves to epigenetically mark transcriptional enhancers to drive proper gene expression during animal development. Chromatin enrichment analyses of the Drosophila MLR complex reveal dynamic association with promoters and enhancers in embryos with late stage enrichments biased toward both active and poised enhancers. RNAi depletion of the Cmi (also known as Lpt) subunit that contains the chromatin binding PHD finger domains attenuates enhancer functions, but unexpectedly results in inappropriate enhancer activation during stages when hormone responsive enhancers are poised, revealing critical epigenetic roles involved in both the activation and repression of enhancers depending on developmental context. Cmi is necessary for robust H3K4 monomethylation and H3K27 acetylation that mark active enhancers, but not for the chromatin binding of Trr, the MLR methyltransferase. These data reveal two likely major regulatory modes of MLR function: contributions to enhancer commissioning in early embryogenesis and bookmarking enhancers to enable rapid transcriptional re-activation at subsequent developmental stages.

Huang, W., Zhu, J. Y., Fu, Y., van de Leemput, J. and Han, Z. (2022). Lpt, trr, and Hcf regulate histone mono- and dimethylation that are essential for Drosophila heart development. Dev Biol 490: 53-65. PubMed ID: 35853502
Summary:
Mammalian KMT2C, KMT2D, and HCFC1 are expressed during heart development and have been associated with congenital heart disease, but their roles in heart development remain elusive. This study found that the Drosophila Lpt and trr genes encode the N-terminal and C-terminal homologs, respectively, of mammalian KMT2C or KMT2D. Lpt and trr mutant embryos showed reduced cardiac progenitor cells. Silencing of Lpt, trr, or both simultaneously in the heart led to similar abnormal cardiac morphology, tissue fibrosis, and cardiac functional defects. Like KMT2D, Lpt and trr were found to modulate histone H3K4 mono- and dimethylation, but not trimethylation. Investigation of downstream genes regulated by mouse KMT2D in the heart showed that their fly homologs are similarly regulated by Lpt or trr in the fly heart, suggesting that Lpt and trr regulate an evolutionarily conserved transcriptional network for heart development. Moreover, this study showed that cardiac silencing of Hcf, the fly homolog of mammalian HCFC1, leads to heart defects similar to those observed in Lpt and trr silencing, as well as reduced H3K4 monomethylation. These findings suggest that Lpt and trr function together to execute the conserved function of mammalian KMT2C and KMT2D in histone H3 lysine 4 mono- and dimethylation required for heart development, possibly aided by Hcf, which plays a related role in H3K4 methylation during fly heart development.

Zraly, C. B., Schultz, R., Diaz, M. O. and Dingwall, A. K. (2023). New twists of a TAIL: novel insights into the histone binding properties of a highly conserved PHD finger cluster within the MLR family of H3K4 mono-methyltransferases. Nucleic Acids Res 51(18): 9672-9689. PubMed ID: 37638761
Summary:
Enhancer activation by the MLR family of H3K4 mono-methyltransferases requires proper recognition of histones for the deposition of the mono-methyl mark. MLR proteins contain two clusters of PHD zinc finger domains implicated in chromatin regulation. The second cluster is the most highly conserved, preserved as an ancient three finger functional unit throughout evolution. Studies of the isolated 3rd PHD finger within this cluster suggested specificity for the H4 [aa16-20] tail region. The histone binding properties of the full three PHD finger cluster b module (PHDb) from the Drosophila Cmi protein were determined, revealing unexpected recognition of an extended region of H3. Importantly, the zinc finger spacer separating the first two PHDb fingers from the third is critical for proper alignment and coordination among fingers for maximal histone engagement. Human homologs, MLL3 and MLL4, also show conservation of H3 binding, expanding current views of histone recognition for this class of proteins. Chromatin remodeling by the SWI/SNF complex was further implicated as a possible mechanism for the accessibility of PHDb to globular regions of histone H3 beyond the tail region. These results suggest a two-tail histone recognition mechanism by the conserved PHDb domain involving a flexible hinge to promote interdomain coordination.

BIOLOGICAL OVERVIEW

MLL2 and MLL3 histone lysine methyltransferases are conserved components of COMPASS-like co-activator complexes. In vertebrates, the paralogous MLL2 and MLL3 contain multiple domains required for epigenetic reading and writing of the histone code involved in hormone-stimulated gene programming, including receptor-binding motifs, SET methyltransferase, HMG and PHD domains. The genes encoding MLL2 and MLL3 arose from a common ancestor. Phylogenetic analyses reveal that the ancestral gene underwent a fission event in some Brachycera dipterans, including Drosophila species, creating two independent genes corresponding to the N- and C-terminal portions. In Drosophila, the C-terminal SET domain is encoded by trithorax-related (trr), which is required for hormone-dependent gene activation. This study identified the cara mitad (cmi) gene, which encodes the previously undiscovered N-terminal region consisting of PHD and HMG domains and receptor-binding motifs. The cmi gene is essential and its functions are dosage sensitive. CMI associates with TRR, as well as the EcR-USP receptor, and is required for hormone-dependent transcription. Unexpectedly, although the CMI and MLL2 PHDf3 domains could bind histone H3, neither showed preference for trimethylated lysine 4. Genetic tests reveal that cmi is required for proper global trimethylation of H3K4 and that hormone-stimulated transcription requires chromatin binding by CMI, methylation of H3K4 by TRR and demethylation of H3K27 by the demethylase UTX. The evolutionary split of MLL2 into two distinct genes in Drosophila provides important insight into distinct epigenetic functions of conserved readers and writers of the histone code (Chauhan, 2012).

Nuclear receptors (NRs) function as transcription factors that respond to cellular signals to initiate new gene expression programs and have essential roles in embryonic development, growth and differentiation. NRs collaborate with greater than 300 co-factors that provide important enzymatic and regulatory functions. Co-factors can be activators or repressors and are typically recruited to gene promoters through associations with receptors (Bulynko, 2011). Some co-factors direct changes in the epigenetic environment of target genes by direct covalent chromatin modification or nucleosome remodeling. Co-activators are recruited in a ligand-dependent manner, whereas unliganded receptors often associate with co-repressors. Co-activators exist in large complexes required for the transcription of genes that are regulated by at least 48 vertebrate NRs, including retinoic acid receptor (RAR), liver-X-receptor (LXR), farnesoid-X-receptor (FXR), as well as a co-activator for p53. Disruptions of both NRs and their co-regulators have been linked to many cancers and developmental disorders (Chauhan, 2012).

Hormone signaling pathways in Drosophila melanogaster rely on two primary hormones, the steroid hormone 20-hydroxyecdysone (20HE) and sesquiterpenoid juvenile hormone (JH), and 18 receptors representing all major conserved nuclear receptor subfamilies. Drosophila Ecdysone Receptor (EcR) is an FXR/LXR ortholog, whereas its heterodimeric partner Ultraspiracle (USP) is an RXR ortholog (Chauhan, 2012).

Drosophila Trithorax-related (TRR) is a co-activator of EcR-USP. TRR is a histone lysine methyltransferase (HMT) that trimethylates histone 3 on lysine 4 (H3K4me3) and TRR functions are essential for activating ecdysone-regulated genes (Sedkov, 2003). TRR is closely related to another Drosophila protein, Trithorax (TRX), which regulates homeotic (Hox) gene expression through similar methyltransferase activity. The mammalian counterparts of TRR are MLL2 (also known as ALR or MLL4) and MLL3 (also known as HALR). MLL2 and MLL3 are enormous (5537 aa and 4911 aa, respectively), with multiple conserved domains, including histone methyltransferase (SET domain), five plant homeodomain (PHD) zinc fingers, an HMG-I binding motif, LXXLL NR binding motifs and FY-rich regions. Through the SET domain, both MLL2 and MLL3 directly methylate histone H3 to mediate transcription activation (Chauhan, 2012).

MLL2 and MLL3 are components of large SET1/COMPASS-like co-activator complexes that are required for NR-directed gene regulation. These complexes have important human disease connections, including developmental disorders and cancers. MLL2 and MLL3 are mutated in many Kabuki syndrome patients. MLL2 is frequently mutated in childhood medulloblastomas (14%), follicular lymphoma (89%) and diffuse large B-cell lymphoma (32%) (the two most common forms of non-Hodgkin lymphoma), suggesting that MLL2 and MLL3 COMPASS-like complex activities have important epigenetic gene regulatory roles that normally function to inhibit cancer progression (Chauhan, 2012).

Proteins that co-purify with MLL2 include ASH2, RBBP5 (RBQ3), DPY30, WDR5, adaptor protein ASC2, PTIP, PA1 and histone demethylase UTX. Recently, TRR was found in Drosophila COMPASS-like complexes (Mohan, 2011). Despite functional similarities, TRR is much smaller than MLL2 or MLL3 with homology limited to the C-terminal SET domain portion (Sedkov, 2003). TRR lacks the N-terminal PHD and HMG domains that might contribute to chromatin binding. MLL2-related family members are always encoded by large single genes in species other than Brachycera dipterans. To further studies on epigenetic regulation of ecdysone target genes, Drosophila genes were sought that could encode a protein highly related to the N-terminal half of MLL2, and a single open reading frame (CG5591) was identified. The gene was named cara mitad (cmi; translated as 'dear half'). Although cmi is unlinked to trr in the genome, genetic studies using null mutants, in vivo depletion and overexpression revealed functions for cmi as a nuclear receptor co-factor necessary for hormone-regulated gene expression. Unexpectedly, the CMI type 3 PHD finger (PHDf3) was found to accommodate non-methylated, mono- and dimethylated H3K4, rather than trimethylated H3K4. Moreover, CMI-dependent activation also required demethylation functions of UTX, suggesting that NR-stimulated transcription involved at least three steps: binding of H3K4me1/2 by CMI, trimethylation of H3K4 by TRR and demethylation of H3K27 by UTX. The intriguing possibility that COMPASS-like functions in NR-directed transcription are associated with two independent proteins in flies suggests that recognition and binding to modified histones is a distinct step, separate from the epigenetic modification associated with other enzymes in the complex. This presents a unique opportunity to examine functions of histone recognition/binding and covalent histone tail lysine modifications as separate and essential features of NR-directed activation (Chauhan, 2012).

Although the precise roles of proteins directly participating in nuclear receptor signaling remain largely speculative, many are thought to regulate transcription through effects on chromatin. The MLL2 and MLL3 co-activators function to epigenetically decode or modify histone lysine residues and provide activation functions for NR signaling at target genes. In Drosophila, CMI and TRR together have a single MLL family homolog. This is the first example of an evolutionary 'splitting' of an epigenetic regulator involved in nuclear receptor signaling, whereby the essential gene regulatory functions of one protein have been parsed into two distinct proteins. CMI forms complexes with TRR, associates directly with hormone receptors and interacts with other putative COMPASS-like components, suggesting that Drosophila contains a functional counterpart to the mammalian ASCOM-MLL2 nuclear receptor co-activator complex (Chauhan, 2012).

The MLL histone lysine methyltransferases (KMTs) can be divided into two conserved groups, the MLL1-MLL4(2) and MLL2(4)/ALR-MLL3/HALR subfamilies. Each MLL member is capable of forming related discrete complexes with several common components. The MLL-based complexes activate transcription in part through methyltransferase activity on histone H3 Lys4 residues within promoter-associated nucleosomes. There might be partial functional overlap between MLL2 and MLL3; however, they are not redundant with the MLL1-MLL4 subfamily. The SET-domain methyltransferase activity of the MLL proteins is essential for transcription activation through histone lysine methylation, but the precise biological role of PHD fingers remains somewhat elusive. Closely related PHDf3 fingers bind H3K4me3/2, the product of the methyltransferase activity. Within the context of a single protein, such as MLL1, the PHDf3 recognition and binding of H3K4me3 is required for transcription activation of target genes (Chang, 2010; Chauhan, 2012).

The findings that CMI and TRR function coordinately in a COMPASS-like complex suggest that cmi and trr probably split from a common ancestor. Gene-protein fusions are four times more common than fissions, perhaps reflecting a simpler genetic event. In cases in which fissions occur, it has been suggested that many involve subunits of multimeric complexes in which the two independent proteins interact physically. The process of splitting into two independent genes might involve gene duplication with subsequent partial degeneration, as has been observed in the monkey king (mkg) gene family in Drosophila (Chauhan, 2012).

The notion that a large protein contains domains that function both together and independently is not without precedent. TRX and MLL1 are cleaved by a specific protease, taspase-1. The two 'halves' interact with each other in a functional complex, but there is evidence that the N-terminal TRX peptide (TRX-N) binds chromatin without its TRX-C partner in transcribed regions of Hox genes (Schuettengruber, 2009; Schwartz, 2010). Transcription factor TFIIA and herpes simplex virus host cell factor (HCF1) are cleaved during maturation, with both halves necessary for a functional product. There is presently no evidence that MLL2 or MLL3 are cleaved or processed (Chauhan, 2012).

An important question is whether both the chromatin-binding and methyltransferase functions of the MLL family are required for transcription activation. The data indicate that depletion of trr can suppress the effects of overexpressing cmi, suggesting that the activation potential of CMI depends on TRR methyltransferase activity. Similarly, simultaneous depletion of cmi and trr produces stronger phenotypes than depletion of either alone, indicative of cooperation on similar gene targets. Moreover, in vivo depletion of cmi results in reduced global H3 trimethylation, despite a functional trr gene (Chauhan, 2012).

Phenotypes associated with changes in CMI levels reveal important functions in hormone-regulated development. The larval defects in molting, morphogenetic furrow progression and necrosis associated with a cmi null allele, similar to trr, are consistent with impaired hormone signaling. Similarly, depletion of MLL2 in HeLa cells using siRNA led to reduced expression of genes known to be important for development and trimethylation of H3K4 was reduced at some promoters. Knockdown of MLL2 in MCF-7 cells impaired estrogen receptor (ERα) transcription activity and inhibited estrogen-dependent growth. Inactivation of the murine Mll3 resulted in stunted growth and reduced PPARγ-dependent adipogenesis with increased insulin sensitivity (Lee, 2008). Perhaps reflecting synonymous functions in Drosophila, cmi/CG5591 was found to be important for regulating muscle triglyceride levels, suggesting conserved adipogenic functions (Pospisilik, 2010). Furthermore, CG5591 (cmi) is involved in phagocytosis (Stroschein-Stevenson, 2006) and regulation of caspase functions in response to cellular stress (Yi, 2007), implicating cmi in immune-cell regulation. The increased hemocyte number associated with elevated CMI suggests functions in hemocyte development, perhaps as an effector of chromatin remodeling or signaling (JAK/STAT, Hedgehog, Notch) pathways. It was previously shown that trr was important for Hedgehog (HH)-dependent signaling during eye development (Sedkov, 2003) and cmi overexpression and depletion data are consistent with that possibility. However, the dosage-dependent cmi wing phenotypes are not consistent with changes in HH signaling, raising the possibility that cmi and trr are important for other growth and signaling pathways in wing development, including Decapentaplegic (DPP/TGFβ) and Wingless (WG/WNT) pathways (Chauhan, 2012).

Several steps are involved in activation of hormone-responsive target genes, including methylation of H3K4 by the MLL2-MLL3 COMPASS-like complex and displacement of demethylases (Vicent, 2011). Reduced cmi function resulted in lower hormone-responsive enhancer activation and genetic interactions between cmi, trr and Utx revealed that chromatin binding by CMI was important for gene activation in vivo. Furthermore, RNAi depletion of Utx suppressed HA-cmi overexpression wing phenotypes, suggesting that demethylation of H3K27 is a pre-requisite for activation of some hormone target genes. This is supported by genetic evidence from C. elegans that indicated both histone H3K4 methylation by SET-16 (MLL2/MLL3 ortholog) and H3K27 demethylation by UTX-1 were required for attenuation of RAS signaling in the vulva (Fisher, 2010; Li, 2011) and MLL2-MLL3 complex-related components were required for proper germ line development (Li, 2011). Genetic epistasis data reveal that Utx, trr and cmi functions are all required for activation in Drosophila (Chauhan, 2012).

Unexpectedly, the CMI PHDf3.b showed binding to mono- and dimethylated H3K4, rather than trimethylated H3K4 (Sanchez, 2011). Although CMI contains two PHDf3 domains in two clusters similar to MLL3, MLL2 contains one PHDf3 most closely related to the CMI and MLL3 PHDf3.b domains. The second cluster appears in all isoforms of MLL3, whereas the N-terminal 'a' cluster is optional. Additionally, the 'b' cluster is more closely related to the PHD cluster found in other MLL family proteins. PHD modules are thought to bind histones and present tail residues to the modifying enzyme subunits or stabilize those enzymes with their substrates. Recently, RNAi knockdown of trr in S2 cells was shown to affect H3K4 mono-, di-, and trimethylation, revealing widespread functions in regulating methylation in vivo (Ardehali, 2011) and suggesting that loss of TRR might destabilize the co-activator complex leading to de-protection of H3K4 methylation. One possibility is that CMI binds mono- and dimethylated H3K4 to prevent demethylation and stabilize TRR to allow for hormone-stimulated methylation and gene activation. CMI might disengage to allow for removal of methylation marks as hormone levels decrease and gene transcription is reduced. In contrast to MLL1-TRX function in maintenance of active gene transcription, CMI and TRR might be required for NR-targeted gene activation in response to temporally restricted hormone-dependent genome reprogramming (Chauhan, 2012).

Transcriptional cofactors display specificity for distinct types of core promoters

Transcriptional cofactors (COFs) communicate regulatory cues from enhancers to promoters and are central effectors of transcription activation and gene expression. Although some COFs have been shown to prefer certain promoter types over others, the extent to which different COFs display intrinsic specificities for distinct promoters is unclear. This study used a high-throughput promoter-activity assay in Drosophila melanogaster S2 cells to screen 23 COFs for their ability to activate 72,000 candidate core promoters (CPs). Differential activation of CPs was observed, indicating distinct regulatory preferences or 'compatibilities' between COFs and specific types of CPs. These functionally distinct CP types are differentially enriched for known sequence elements, such as the TATA box, downstream promoter element (DPE) or TCT motif, and display distinct chromatin properties at endogenous loci. Notably, the CP types differ in their relative abundance of H3K4me3 and H3K4me1 marks, suggesting that these histone modifications might distinguish trans-regulatory factors rather than promoter- versus enhancer-type cis-regulatory elements. The existence was confirmed of distinct COF-CP compatibilities in two additional Drosophila cell lines and in human cells, for which COFs were found that prefer TATA-box or CpG-island promoters, respectively. Distinct compatibilities between COFs and promoters can explain how different enhancers specifically activate distinct sets of genes, alternative promoters within the same genes, and distinct transcription start sites within the same promoter. Thus, COF-promoter compatibilities may underlie distinct transcriptional programs in species as divergent as flies and humans (Haberle, 2019).

To systematically test intrinsic COF-CP preferences for many CPs in a standardized setup, a plasmid-based high-throughput promoter-activity assay and self-transcribing active core promoter-sequencing (STAP-seq) were combined with the specific GAL4 DNA-binding-domain (GAL4-DBD)-mediated recruitment of individual COFs. Using this assay in S2 cells, this study tested whether 13 different individually tethered D. melanogaster COFs, representing different functional classes and enzymatic activities (two acetyltransferases (P300/CBP and Mof), three H3K4-methyltransferase-complex components (Lpt, Trr and Trx), two chromo and chromo-shadow-domain COFs (Chro and Mof) and three bromodomain COFs (Brd4, Brd8 and Brd9), the mediator complex subunits MED15 and MED25, and two less well-characterized COFs (EMSY and Gfzf)), could activate transcription from any of 72,000 CP candidates, 133 base pair (bp) long DNA fragments around a comprehensive genome-wide set of transcription start sites (TSSs) and negative controls. If a tethered COF activates a candidate CP, this generates reporter RNAs with a short 5' sequence tag, derived from the 3' end of the corresponding CP. These reporter transcripts were captured with a 5' RNA linker that includes a 10 nucleotide (nt) long unique molecular identifier (UMI), enabling counting of individual reporter RNA molecules and quantifying of productive transcription initiation events at single-base-pair resolution for all candidate CPs in the library (Haberle, 2019).
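The UMI-based counting described above can be illustrated with a minimal Python sketch. The read tuples, their field layout and the CP names are hypothetical and are not taken from the study's pipeline; the sketch only shows how collapsing identical UMIs at each position keeps PCR duplicates from being counted as separate initiation events.

```python
from collections import defaultdict

def count_initiation_events(reads):
    """Count unique UMIs per (core promoter, initiation position).

    `reads` is an iterable of (cp_id, position, umi) tuples; this layout
    is an assumption made for illustration only.
    """
    umis = defaultdict(set)
    for cp_id, position, umi in reads:
        # A set keeps each UMI once, so duplicated reads collapse here.
        umis[(cp_id, position)].add(umi)
    return {key: len(seen) for key, seen in umis.items()}

# Hypothetical reads: the second read duplicates the first (same CP,
# same position, same 10-nt UMI) and must not be counted twice.
reads = [
    ("CP_A", 67, "ACGTACGTAC"),
    ("CP_A", 67, "ACGTACGTAC"),
    ("CP_A", 67, "TTGCAAGGCT"),
    ("CP_A", 69, "GGATCCATGA"),
    ("CP_B", 70, "ACGTACGTAC"),
]
counts = count_initiation_events(reads)
```

Keying on both the CP and the position is what gives per-base resolution: the same UMI seen at two different CPs is treated as two independent molecules.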

Three independent COF-STAP-seq screens for each of the 13 COFs and positive (P65) and negative (GFP) controls in S2 cells were highly similar (all pairwise Pearson's correlation coefficients (PCCs) ≥ 0.89) and showed more initiation events for P65 and the 13 COFs than for GFP, as expected. Initiation mainly occurred at CPs corresponding to annotated gene starts, whereas random negative controls showed the least initiation, corroborating previous findings that gene CPs are specialized sequences, able to strongly respond to activating enhancers (Haberle, 2019).
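As a minimal sketch of such a reproducibility check, the following Python code computes all pairwise Pearson correlation coefficients among replicate screens. The per-CP counts are invented for illustration; only the 0.89 threshold comes from the text, and the study's own normalization steps are not reproduced here.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Hypothetical initiation counts for the same five CPs in three replicates.
replicates = {
    "rep1": [0, 5, 12, 40, 3],
    "rep2": [1, 6, 10, 38, 2],
    "rep3": [0, 4, 13, 45, 4],
}

# One coefficient per unordered pair of replicates.
pairwise = {
    (a, b): pearson(replicates[a], replicates[b])
    for a, b in combinations(replicates, 2)
}
reproducible = all(r >= 0.89 for r in pairwise.values())
```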

Each COF showed differential activation of CPs and activated a unique set of CPs. For example, within a representative genomic locus, MED25 and Lpt most strongly activated the CP of CG9782, Chro and Gfzf most strongly activated the CP of RpS19a, and Mof most strongly activated the CPs of mbt and SmG. Indeed, the activation profiles of the COFs across all CPs were characteristically different, as revealed by hierarchical clustering. The differential CP activation was validated by luciferase reporter assays with MED25, Lpt, Mof and Chro for 50 CPs. The two assays agreed well (PCCs ≥ 0.72 except Mof, with PCC = 0.58), and it was confirmed that COFs activate some CPs more strongly than others (MED25, Mof and Chro, for example, each preferentially activate a different subset of the tested CPs), which is referred to as distinct preferences, specificities, or 'compatibilities' towards different CPs (Haberle, 2019).

To test whether the COF-CP compatibilities generalize beyond S2 cells, three independent COF-STAP-seq screens were performed for six COFs (MED25, P300, Lpt, Gfzf, Chro and Mof) in two additional D. melanogaster cell lines, one derived from embryos (Kc167) and one from adult ovaries (ovarian somatic cells (OSCs)). For each of the six COFs, the screens were highly similar across all three cell lines (all PCCs ≥ 0.69), validating the distinct CP preferences of the COFs and the observed COF-CP compatibilities. These results establish the observed COF-CP compatibilities as a cell-type-independent, COF- and CP-sequence-intrinsic regulatory principle (Haberle, 2019).

To test whether the COF-CP preferences reflect endogenous gene regulation, the binding was assessed of each COF to genomic CPs of genes expressed in S2 cells. Published chromatin immunoprecipitation followed by sequencing (ChIP-seq) data for P300, Brd4, Trx, Trr, Lpt and Mof from S2 cells, and Chro from D. melanogaster embryos showed stronger COF binding at CPs that were strongly activated in STAP-seq by the respective COF (top 25%) and weaker binding at CPs that were more weakly activated (bottom 25%). Next, the COF-CP preferences were compared with the impact of COF inhibition or depletion on endogenous gene expression. Analyses of published gene expression data upon COF inhibition with small molecules (P300) or RNA interference (RNAi; Brd4 and Trx) revealed that genes associated with the top 25% of CPs preferentially activated by P300, Brd4 or Trx displayed stronger downregulation upon inhibition of the respective COF, compared to genes associated with the bottom 25% of CPs. Conversely, the CPs of all genes that are downregulated upon inhibition of P300, Brd4 or Trx showed stronger activation by the respective COF in STAP-seq than the CPs of genes not affected by COF inhibition. Together, these results suggest that the distinct COF-CP preferences that were observed are employed during endogenous gene regulation in vivo (Haberle, 2019).
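The top 25% versus bottom 25% comparison can be sketched in a few lines of Python. The activation and binding values below are invented, and the use of a simple mean as the summary statistic is an assumption; the text does not specify how binding was summarized.

```python
from statistics import mean

def quartile_means(activation, binding):
    """Mean binding for the top and bottom 25% of CPs ranked by activation.

    `activation` and `binding` map CP ids to a STAP-seq activation score
    and a ChIP-seq signal; both dictionaries are hypothetical inputs.
    """
    ranked = sorted(activation, key=activation.get, reverse=True)
    k = max(1, len(ranked) // 4)          # size of one quartile
    top, bottom = ranked[:k], ranked[-k:]
    return (mean(binding[cp] for cp in top),
            mean(binding[cp] for cp in bottom))

# Eight hypothetical CPs, so each quartile holds two of them.
activation = {"cp1": 90, "cp2": 70, "cp3": 50, "cp4": 30,
              "cp5": 20, "cp6": 10, "cp7": 5, "cp8": 1}
binding = {"cp1": 8.0, "cp2": 6.0, "cp3": 5.0, "cp4": 4.0,
           "cp5": 3.0, "cp6": 2.5, "cp7": 2.0, "cp8": 1.0}

top_mean, bottom_mean = quartile_means(activation, binding)
```

The same function applies to the expression comparison by passing log fold changes upon COF inhibition in place of the binding signal.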

The observed COF-CP compatibilities suggest the existence of distinct CP classes that differentially respond to specific COFs. To address this, K-means clustering was used to define groups of CPs with similar responses. Around 75% of the variance can be explained by five CP groups, which were activated preferentially by: (1) MED25, P300, and strongly by P65; (2) MED25, P300, and weakly by P65; (3) Mof, and weakly by Lpt and Chro; (4) Chro and Gfzf; and (5) Gfzf. Although additional types of CPs are likely to exist in more specialized cell types such as germline cells, screening ten additional COFs—including subunits of prominent COF complexes with diverse enzymatic activities (for example, SAGA, ATAC, NuA4/Tip60 and Enok) and general transcription factors (GTFs; for example, TBP, Trf2 and Taf4)—did not reveal additional CP types in S2 cells, presumably because each of the additional COFs was highly similar to at least one of the original 13 COFs (Haberle, 2019).
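The clustering step can be illustrated with a small, dependency-free Python sketch of K-means (Lloyd's algorithm) together with the fraction of variance explained by the resulting groups. The two-dimensional activation vectors and the starting centroids are hypothetical; the study clustered responses across many more COFs, and its exact settings are not given here.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))

def assign(points, centers):
    """Group points by their nearest center."""
    clusters = [[] for _ in centers]
    for p in points:
        nearest = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
        clusters[nearest].append(p)
    return clusters

def kmeans(points, centers, iterations=20):
    """Lloyd's algorithm starting from the given centers."""
    for _ in range(iterations):
        clusters = assign(points, centers)
        # An empty cluster keeps its previous center.
        centers = [centroid(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, assign(points, centers)

def variance_explained(points, centers, clusters):
    """1 - (within-cluster sum of squares / total sum of squares)."""
    grand = centroid(points)
    total = sum(sq_dist(p, grand) for p in points)
    within = sum(sq_dist(p, centers[i])
                 for i, c in enumerate(clusters) for p in c)
    return 1 - within / total

# Hypothetical CPs described by activation from two COFs (COF_1, COF_2).
points = [(10, 1), (9, 2), (11, 1), (1, 10), (2, 9), (1, 11)]
centers, clusters = kmeans(points, centers=[(10, 1), (1, 10)])
explained = variance_explained(points, centers, clusters)
```

Repeating this for increasing numbers of groups and watching where the explained variance levels off is one common way to settle on a group count, although whether the authors chose five groups this way is not stated.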

Given that COF-STAP-seq measures COF-CP compatibility in an otherwise constant reporter setup, distinct compatibilities are likely to arise from differences in CP sequences. Indeed, the five groups of CPs displayed marked differences in the occurrence of known CP motifs. Group 1 is strongly enriched for the TATA box and a variant of the DPE, whereas group 2 is enriched for a different DPE variant. By contrast, groups 1 and 2 are depleted in motifs Ohler 1, 6 and 7, and in the DNA replication-related element (DRE), all of which are enriched in group 3 and to a lesser extent in group 4. Group 4 is the only group with a strong enrichment for the TCT motif that is known to occur in the promoters of genes encoding ribosomal proteins and other proteins involved in translation, which are indeed among the top 10% of CPs preferentially activated by Chro. In accordance with the differential occurrence of CP motifs, published datasets reveal differential binding of GTFs to these CPs in their endogenous genomic contexts. For instance: the TATA-binding protein (TBP) bound more strongly to group 1 CPs, which are enriched for the TATA box; TAF1 bound more strongly to group 1 and 2 CPs, which are enriched for the Inr motif; and motif 1-binding protein (M1BP) and DRE factor (DREF) bound more strongly to group 3 CPs, which are enriched for motif 1 and DRE. Last, the TBP paralogue TRF2 bound more strongly to group 3, 4 and 5 CPs, consistent with reports that TRF2 regulates ribosomal protein genes. The differences in motif occurrence and GTF binding between the CP groups suggest that COF compatibility might relate to GTF composition at the CP, which is determined by the CP sequence (Haberle, 2019).

The CP groups defined by their COF responsiveness are reminiscent of groups previously defined on the basis of motif content and transcription initiation patterns that differ in chromatin properties, gene function and expression, and enhancer responsiveness. The dataset used in this study might provide a functional link between these observations and the activation of distinct CP types by specific COFs. Indeed, group 1 and 2 CPs are associated with genes that are expressed highly variably across cells in Drosophila embryos and have cell-type-specific or developmental functions, whereas group 3 and 4 CPs are associated with genes that are expressed more uniformly and have housekeeping functions. Furthermore, both the upstream sequences and the nearest enhancers of these CPs were enriched for transcription-factor binding motifs known to occur preferentially in developmental versus housekeeping enhancers, and developmental and housekeeping enhancers indeed preferentially activated group 1 and 2 versus group 3 and 4 CPs, respectively, when tested by STAP-seq. Together, these results directly link enhancer-CP specificity to COF-CP compatibility (Haberle, 2019).

Because COFs can modify nucleosomes and alter the chromatin structure, the endogenous genomic contexts of the five CP groups were tested in S2 cells (only considering CPs of active genes). Nucleosome positioning, DNA accessibility and histone modifications all differed between the CP groups: group 1 and 2 CPs have broader DNA accessible regions around the TSSs and lower nucleosome occupancy and nucleosome phasing downstream of the TSS, compared with group 3 and 4 CPs, which have more narrow nucleosome-depleted regions around the TSS and strongly phased downstream nucleosomes (Haberle, 2019).

Unexpectedly, the CP groups also differed in the methylation status of histone 3 lysine 4 (H3K4). H3K4me3 is thought to be universally associated with active promoters, and indeed strongly marks CPs of groups 3, 4 and 5. By contrast, group 1 and 2 CPs have lower levels of H3K4me3 but higher levels of H3K4me1, a modification that is typically considered an enhancer mark, compared with group 3, 4 and 5 CPs. This difference is consistent with the differential binding of Trr and Set1, which deposit H3K4me1 and H3K4me3, respectively, and does not seem to stem from higher levels of Pol II binding or transcription at group 3, 4 or 5 CPs. Consistent with reports that developmental promoters lack H3K4me3, these results suggest that high levels of H3K4me3 versus H3K4me1 might not be a universal feature of promoters that distinguishes them from enhancers as previously suggested, and instead might depend on the COFs that regulate the respective promoters. Indeed, ranking all active CPs in S2 cells by their H3K4me1:H3K4me3 ratio revealed that those with the highest ratio are preferentially activated by P300 and MED25, and those with the lowest ratio are preferentially activated by Mof or Chro (Haberle, 2019).
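Ranking CPs by their H3K4me1:H3K4me3 ratio can be sketched as follows. The signal values and CP names are invented, and the pseudocount (added to avoid division by zero) is an assumption of this sketch rather than a detail reported for the study.

```python
def rank_by_me1_me3_ratio(signals, pseudocount=1.0):
    """Return CP ids sorted from highest to lowest H3K4me1:H3K4me3 ratio.

    `signals` maps a CP id to a (H3K4me1, H3K4me3) signal pair.
    """
    def ratio(cp):
        me1, me3 = signals[cp]
        return (me1 + pseudocount) / (me3 + pseudocount)
    return sorted(signals, key=ratio, reverse=True)

# Hypothetical signals: a developmental-type CP (high me1, low me3),
# an intermediate CP, and a housekeeping-type CP (low me1, high me3).
signals = {
    "CP_hk": (3.0, 50.0),
    "CP_dev": (30.0, 4.0),
    "CP_mid": (10.0, 10.0),
}
ranking = rank_by_me1_me3_ratio(signals)
```

With such a ranking in hand, the COF activation scores of CPs at the top and bottom of the list can be compared directly.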

To test whether regulatory compatibilities between COFs and CPs exist in other species, proof-of-principle screens were performed in human HCT116 cells for five human COFs (BRD4, MED15, EP300, MLL3 and EMSY) and P65, using a focused library containing 12,000 human CP candidates selected to cover the diversity of human CPs. These screens reveal that CPs also respond differently to different COFs in human cells: whereas the TATA-box-containing CP of REN is, for example, only activated by MED15 and P65, the CpG-island CP of IRAK1 responds most strongly to MLL3; and the tested COFs consistently displayed distinct CP-preferences across the entire CP library. Overall, the CPs most strongly activated by MED15 are enriched for TATA boxes, whereas CPs preferentially activated by MLL3 exhibit a higher GC and CpG content, suggesting that MLL3, but not MED15, preferentially activates CpG-island promoters. Together, this establishes that sequence-encoded COF-CP compatibilities exist in species as divergent as fly and human, suggesting that they constitute a general principle with important implications for transcriptional regulation (Haberle, 2019).
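
GC content and CpG content of a promoter sequence are simple sequence statistics. The Python sketch below shows one common way to compute them, using the observed/expected CpG ratio; the function names and example sequences are hypothetical, and the study may have used a different definition of CpG content.

```python
def gc_content(seq: str) -> float:
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CG * length) / (#C * #G)."""
    seq = seq.upper()
    c_count, g_count = seq.count("C"), seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0  # no CpG dinucleotide is possible without both C and G
    return seq.count("CG") * len(seq) / (c_count * g_count)


# Hypothetical toy sequences: one CpG-rich, one AT-rich.
cpg_rich = "CGCGCGCG"
at_rich = "TATAAAAG"
```

On these toy inputs the CpG-rich sequence scores high on both measures and the AT-rich sequence scores low, which is the kind of contrast described for MLL3-preferring versus MED15-preferring CPs.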

The regulatory compatibilities between COFs and CPs that were observed enable separate transcriptional programs to independently regulate not only different genes, but also alternative promoters and thus different isoforms of the same gene. Notably, composite promoters with differentially activated closely spaced TSSs exist and enable regulation by different COFs and programs, potentially in different developmental contexts. As the CP types differ in sequence elements, these might instruct the assembly of functionally distinct pre-initiation complexes (PICs) that differ in GTF composition or create distinct rate-limiting steps that require activation by different COFs, enabling specific and synergistic regulation. The existence of regulatory COF-CP compatibilities impacts promoter activation and gene expression in endogenous contexts and biotechnological applications and, together with other mechanisms that determine enhancer-promoter targeting in the context of the three-dimensional chromatinized genome, helps to explain how different genes or alternative promoters can be distinctly regulated in species as divergent as flies and humans (Haberle, 2019).

The COMPASS family of H3K4 methylases in Drosophila

Methylation of histone H3 lysine 4 (H3K4) in Saccharomyces cerevisiae is implemented by Set1/COMPASS, which was originally purified based on the similarity of yeast Set1 to human MLL1 and Drosophila Trithorax (Trx). While humans have six COMPASS family members, Drosophila has a representative of the three subclasses within COMPASS-like complexes: dSet1 (human SET1A/SET1B), Trx (human MLL1/2), and Trr (human MLL3/4). This study reports the biochemical purification and molecular characterization of the Drosophila COMPASS family. A one-to-one similarity occurs in subunit composition with their mammalian counterparts, with the exception of LPT (lost plant homeodomains [PHDs] of Trr), which copurifies with the Trr complex. LPT is a previously uncharacterized protein that is homologous to the multiple PHD fingers found in the N-terminal regions of mammalian MLL3/4 but not Drosophila Trr, indicating that Trr and LPT constitute a split gene of an MLL3/4 ancestor. This study demonstrates that all three complexes in Drosophila are H3K4 methyltransferases; however, dSet1/COMPASS is the major monoubiquitination-dependent H3K4 di- and trimethylase in Drosophila. Taken together, this study provides a springboard for the functional dissection of the COMPASS family members and their role in the regulation of histone H3K4 methylation throughout development in Drosophila (Mohan, 2011).

Histone H3 lysine 4 methylation (H3K4me) is associated with the transcriptionally active regions of the genome in yeast, flies, and mammals. Set1, the first H3K4 methylase to be identified, is a component of a macromolecular protein complex named COMPASS (complex of proteins associated with Set1) and is responsible for all mono-, di-, and trimethylation of H3K4 in yeast. In Drosophila, four SET domain-containing proteins, namely, Trithorax (Trx), Trithorax-related (Trr), dSet1, and Ash1, have been reported to implement H3K4 methylation. All but Ash1, which has subsequently been demonstrated to be an H3K36 methyltransferase, are related to subunits of the six COMPASS and COMPASS-like complexes in mammals. trx was originally characterized as a gene that when mutated caused homeotic transformations. Detailed genetic and molecular analyses showed that Trx is required to maintain activation states of its target genes throughout development and counteracts the repressive effects of the Polycomb group proteins (PcG). Trr was identified based on sequence similarity to Trx but was shown to function in the regulation of hormone-responsive gene expression (Sedkov, 2003). dSet1 was identified based on sequence homology to the Saccharomyces cerevisiae and mammalian Set1 proteins (Mohan, 2011).

In mammals, there are at least six SET1-related proteins that form COMPASS-like complexes, namely, SET1A, SET1B, and MLL1 to MLL4. SET1A and SET1B are orthologous to dSet1; MLL1 and MLL2 are orthologous to Drosophila Trx; MLL3 and MLL4 (also known as ALR) are orthologous to Drosophila Trr (Mohan, 2010; Shilatifard, 2008; Smith, 2010). All of the mammalian COMPASS family of H3K4 methylases share ASH2L, RBBP5, DPY30, and WDR5 as common components. Analysis of the mammalian complexes allows classification into three classes based on unique components within each class: COMPASS, represented by SET1A and SET1B, contains WDR82 and CXXC1, proteins implicated in regulating trimethylation by yeast COMPASS; the MLL1/2 complexes contain Menin, implicated in targeting MLL1 to the Hox genes; the MLL3/4 complexes contain PTIP, PA-1, and NCOA6 (Cho, 2007), which are important for the gene-specific targeting of these complexes, and UTX, a histone H3K27 demethylase thought to be involved in counteracting PcG-mediated gene silencing (Eissenberg, 2010; Hughes, 2005; Lee, 2007; Mohan, 2011 and references therein).
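
The three-class organization described above can be summarized as a small lookup table. The Python sketch below is only an illustrative encoding of the relationships stated in the text; the dictionary layout and the helper function `fly_ortholog` are hypothetical names introduced here.

```python
from typing import Dict, Tuple

# Components shared by all mammalian COMPASS family complexes (from the text).
COMMON_SUBUNITS: Tuple[str, ...] = ("ASH2L", "RBBP5", "DPY30", "WDR5")

# One entry per class: Drosophila methylase, human orthologs, class-specific components.
COMPASS_CLASSES: Dict[str, Dict[str, object]] = {
    "COMPASS": {
        "fly": "dSet1",
        "human": ("SET1A", "SET1B"),
        "specific": ("WDR82", "CXXC1"),
    },
    "MLL1/2": {
        "fly": "Trx",
        "human": ("MLL1", "MLL2"),
        "specific": ("Menin",),
    },
    "MLL3/4": {
        "fly": "Trr",
        "human": ("MLL3", "MLL4"),
        "specific": ("PTIP", "PA-1", "NCOA6", "UTX"),
    },
}


def fly_ortholog(human_protein: str) -> str:
    """Return the Drosophila methylase orthologous to a human SET1/MLL protein."""
    for entry in COMPASS_CLASSES.values():
        if human_protein in entry["human"]:
            return str(entry["fly"])
    raise KeyError(f"{human_protein} is not a listed human COMPASS family methylase")
```

For example, `fly_ortholog("MLL3")` and `fly_ortholog("MLL4")` both resolve to Trr, reflecting the two-to-one relationship between the mammalian and Drosophila proteins.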

This study purified and characterized the dSet1, Trx, and Trr complexes. In contrast to a previous report that Trx formed a heterotrimeric complex with CBP and SBF1, this study found instead that Trx forms a COMPASS-like complex containing orthologs of all known components of the MLL1 complex in mammals. These studies also demonstrate that Drosophila Set1 is the major contributor to the bulk in vivo dimethylation and trimethylation of H3K4 and that this depends on a conserved form of histone cross talk, where monoubiquitinated H2B is required for H3K4 trimethylation by dSet1. It was also found that mammalian MLL3/4 are represented in flies by two genes, Trr and LPT, and that the encoded proteins exist together in a COMPASS-like Trr complex. Taken together, this evidence for the existence of one representative complex in Drosophila for each of the three classes of the six COMPASS family proteins in mammals provides a unique opportunity to discover the differences in the targeting and function of H3K4 methylation by these complexes (Mohan, 2011).

Modifications of histones and the protein machinery for the generation and removal of such modifications are highly conserved and are associated with processes such as transcription, replication, recombination, repair, and RNA processing. Histone H3K4 methylation, particularly trimethylation, has been mapped to transcription start sites in all eukaryotes tested and is generally believed to be a hallmark of active transcription. The H3K4 methylation machinery was first identified in yeast and named Set1/COMPASS. Six H3K4 methyltransferase complexes have been identified in humans, including SET1A/B, which are subunits of human COMPASS, and MLL1 to MLL4, which are found in COMPASS-like complexes (Mohan, 2011).

Although Trx and Trr were identified quite some time ago, their relative contributions to different states of overall H3K4 methylation were not known. Studies of human cells and Drosophila cells have shown that SET1 is the major contributor to H3K4 trimethylation levels in cells. During the preparation of the manuscript, a study of Drosophila also showed that dSet1, as a part of COMPASS, is responsible for the majority of H3K4 di- and trimethylation (Ardehali, 2011), which is in line with the findings presented in this study. These findings suggest that dSet1 could be responsible for the deposition of H3K4 trimethylation at the transcription start sites of the most actively transcribed genes as a consequence of postinitiation recruitment via the PAF complex (Smith, 2010; see Recruitment of histone-modifying activities by RNA Pol II). Trx and Trr both show extensive distribution along polytene chromosomes, although neither protein is required for bulk levels of H3K4me3. Perhaps Trx and Trr implement H3K4 methylation in a more gene-specific manner, at distinct stages of transcriptional regulation, or alternatively, have other substrates or functions (Mohan, 2011).

These biochemical studies have demonstrated that the Drosophila complexes are very similar to their mammalian counterparts in subunit composition. These studies have also demonstrated the utility of a baculovirus superinfection system for expressing proteins in Drosophila cells. Large-scale transient transfections offer several potential advantages over generating clonal stable cell lines, one of which is that the overexpression of some proteins could be toxic to cells. This can be a problem even when using inducible promoters, such as the Mtn promoter, due to leaky expression under uninduced conditions. Moreover, the baculovirus infection and expression strategy took about 3 weeks, from cloning the cDNA into the viral vector, through generating the virus and infecting S2 cells, to purifying the complexes from nuclear extracts. In contrast, conventional cloning took 4 months from cloning the cDNA into the vector to generating and characterizing the clonal cell lines. FLAG-HA-dWDR82 was purified both from stably transfected S2 cells and from the superinfection system, and both strategies yielded a strikingly similar enrichment of target proteins (Mohan, 2011).

All of the COMPASS family members in Drosophila have several common subunits, namely, Ash2, Rbbp5, Wdr5, and Dpy30, which are homologs of CPS60, CPS50, CPS30, and CPS25, respectively, as well as each having complex-specific subunits. Many of these subunits have established, conserved roles in both the yeast and mammalian complexes: ASH2L is required for proper H3K4 trimethylation, as is CPS60 in yeast; both WDR5 in humans and CPS30 in yeast are required for the mono-, di-, and trimethylation of H3K4, and each is required for proper formation of the COMPASS and MLL complexes. Conservation of this degree in the H3K4 methylation machinery suggests that Drosophila might have similar machinery. However, it had previously been reported that Trx forms a complex with CBP and SBF1, but no corresponding complexes have been found in mammals (Mohan, 2011).

The demonstration of the presence of shared components between COMPASS and COMPASS-like complexes in Drosophila supports the findings that these proteins are required for the proper functional architecture critical for the methylation of H3K4. The complex-specific components found in association with the dSet1, Trx, and Trr complexes further demonstrate a one-to-one correspondence of subunits between the Drosophila and human COMPASS family members that will allow the use of Drosophila as a model system for understanding the function of the human complexes. For example, while Set1/COMPASS is conserved from yeast to humans, it is possible that the metazoan complexes have additional functions needed for development. As the subunit compositions of both the SET1A and SET1B complexes are identical, it is likely that their functional analysis would be hindered by redundancy between the two complexes. The presence of a single dSet1 complex in flies may serve as an excellent starting point to dissect the metazoan-specific functions of the SET1 complexes (Mohan, 2011).

MLL-related proteins are multidomain proteins with the capacity to bind to many other proteins that may modulate their function. For example, Menin binds to the extreme N terminus of MLL1/2 and is required for proper targeting of the MLL1/2 complex to chromatin. Owing to its conserved components and interactions, but nonredundant nature, investigation of the Drosophila Trx complex promises to aid in understanding of the MLL1 and MLL2 complexes, specifically in their role in development (Mohan, 2011).

Currently there is very limited understanding of the functions of the various domains within the MLL3/4 proteins. The identification of LPT, which is homologous to the N terminus of MLL3/4, as a component of the Trr complex indicates the importance of PHD fingers residing in the LPT protein for the proper functioning and/or targeting of the Trr complex to chromatin. This separation of the MLL3/4 protein in Drosophila as Trr and LPT could allow dissection of the functions of N and C termini. Various studies have identified mutations in MLL3, MLL4, and UTX in a variety of cancers. Therefore, studies of the LPT-Trr complex could improve understanding of the targeting and regulation of these complexes with relevance to human disease (Mohan, 2011).

Importantly, Drosophila has a single representative of each class of COMPASS family members found in mammals, in which two representatives of each class exist. In contrast, nematodes, such as the genetically tractable C. elegans, contain only Set1- and MLL3/4-related proteins, but no MLL1/2 representative. Given the power of genetic manipulation, the identification of the COMPASS, Trx, and Trr complexes in Drosophila that share similar subunits with their mammalian counterparts will greatly facilitate an understanding of the biological functions of the H3K4 methylation machinery in development and differentiation (Mohan, 2011).

REFERENCES

Ardehali, M. B., et al. (2011). Drosophila Set1 is the major histone H3 lysine 4 trimethyltransferase with role in transcription. EMBO J. 30: 2817-2828. PubMed ID: 21694722

Bulynko, Y. A. and O'Malley, B. W. (2011). Nuclear receptor coactivators: Structural and functional biochemistry. Biochemistry 50: 313-328. PubMed ID: 21141906

Chang, P. Y., et al. (2010). Binding of the MLL PHD3 finger to histone H3K4me3 is required for MLL-dependent gene transcription. J. Mol. Biol. 400: 137-144. PubMed ID: 20452361

Chauhan, C., Zraly, C. B., Parilla, M., Diaz, M. O. and Dingwall, A. K. (2012). Histone recognition and nuclear receptor co-activator functions of Drosophila Cara Mitad, a homolog of the N-terminal portion of mammalian MLL2 and MLL3. Development 139(11): 1997-2008. PubMed ID: 22569554

Eissenberg, J. C. and Shilatifard, A. (2010). Histone H3 lysine 4 (H3K4) methylation in development and differentiation. Dev. Biol. 339: 240-249. PubMed ID: 19703438

Fisher, K., et al. (2010). Methylation and demethylation activities of a C. elegans MLL-like complex attenuate RAS signalling. Dev. Biol. 341: 142-153. PubMed ID: 20188723

Haberle, V., Arnold, C. D., Pagani, M., Rath, M., Schernhuber, K. and Stark, A. (2019). Transcriptional cofactors display specificity for distinct types of core promoters. Nature 570(7759):122-126. PubMed ID: 31092928

Hughes, C. M., et al. (2004). Menin associates with a trithorax family histone methyltransferase complex and with the hoxc8 locus. Mol. Cell 13: 587-597. PubMed ID: 14992727

Lee, S., et al. (2008). Activating signal cointegrator-2 is an essential adaptor to recruit histone H3 lysine 4 methyltransferases MLL3 and MLL4 to the liver X receptors. Mol. Endocrinol. 22: 1312-1319. PubMed ID: 18372346

Li, T. and Kelly, W. G. (2011). A role for Set1/MLL-related components in epigenetic regulation of the Caenorhabditis elegans germ line. PLoS Genet. 7: e1001349. PubMed ID: 21455483

Mohan, M., Lin, C., Guest, E. and Shilatifard, A. (2010). Licensed to elongate: a molecular mechanism for MLL-based leukaemogenesis. Nat. Rev. Cancer 10: 721-728. PubMed ID: 20844554

Mohan, M., et al. (2011). The COMPASS family of H3K4 methylases in Drosophila. Mol. Cell. Biol. 31: 4310-4318. PubMed ID: 21875999

Pospisilik, J. A., et al. (2010). Drosophila genome-wide obesity screen reveals hedgehog as a determinant of brown versus white adipose cell fate. Cell 140: 148-160. PubMed ID: 20074523

Sanchez, R. and Zhou, M. M. (2011). The PHD finger: a versatile epigenome reader. Trends Biochem. Sci. 36: 364-372. PubMed ID: 21514168

Schuettengruber, B., et al. (2009). Functional anatomy of polycomb and trithorax chromatin landscapes in Drosophila embryos. PLoS Biol. 7: e13. PubMed ID: 19143474

Schwartz, Y. B., et al. (2010). Alternative epigenetic chromatin states of polycomb target genes. PLoS Genet. 6: e1000805. PubMed ID: 20062800

Sedkov, Y., et al. (2003). Methylation at lysine 4 of histone H3 in ecdysone-dependent development of Drosophila. Nature 426: 78-83. PubMed ID: 14603321

Shilatifard, A. (2008). Molecular implementation and physiological roles for histone H3 lysine 4 (H3K4) methylation. Curr. Opin. Cell Biol. 20:341-348. PubMed ID: 18508253

Smith, E. and Shilatifard, A. (2010). The chromatin signaling pathway: diverse mechanisms of recruitment of histone-modifying enzymes and varied biological outcomes. Mol. Cell 40: 689-701. PubMed ID: 21145479

Stroschein-Stevenson, S. L., et al. (2006). Identification of Drosophila gene products required for phagocytosis of Candida albicans. PLoS Biol. 4: e4. PubMed ID: 16336044

Vicent, G. P., et al. (2011). Four enzymes cooperate to displace histone H1 during the first minute of hormonal gene activation. Genes Dev. 25: 845-862. PubMed ID: 21447625

Yi, C. H., et al. (2007). A genome-wide RNAi screen reveals multiple regulators of caspase activation. J. Cell Biol. 179: 619-626. PubMed ID: 17998402

date revised: 25 April 2024

The Interactive Fly resides on the
Society for Developmental Biology's Web server.