The Interactive Fly

Zygotically transcribed genes

RNA polymerase and general transcription factors


Proteins involved in messenger RNA synthesis

General Transcription Factors, as the protein factors involved in messenger RNA synthesis are known, are conserved across species as diverse as Saccharomyces cerevisiae, Drosophila and humans. TF stands for transcription factor; they were named in chronological order of their discovery. The entire set of General Transcription Factors is composed of about 30 subunits. Although the model below assumes that the factors are assembled by stages, there is some reason to believe that all thirty are also found assembled in a holoenzyme (Orphanides, 1996 and references).

Note: General Transcription Factors are listed below in order of recruitment to the promoter.

TFIID

TFIID is a multiprotein complex containing the TATA box binding protein (TBP) and (in Drosophila) at least seven other proteins known as TAFs, or TBP-associated factors. The first protein recruited to the promoter is TBP, which serves to induce a bend in the DNA. The 240 kD subunit (TAF250kd) contains an HMG-box, bromodomains, a serine kinase, and histone acetyltransferase activity. The smaller subunits are similar in structure to histones. Drosophila TBP-associated factor 60kD (also known as dTAFII62) and TBP-associated factor 40kD (also known as dTAFII42) are homologous to human hTAFII80 and hTAFII31, respectively; these Drosophila and human proteins are homologous to histone H4 and histone H3, respectively. Drosophila and human TFIID also contain dTAFII30 alpha and hTAFII20, respectively, which are putative histone H2B homologues. In solution and in the crystalline state, the dTAFII42/dTAFII62 complex exists as a heterotetramer, resembling the (H3/H4)2 heterotetrameric core of the histone octamer, suggesting that TFIID contains a histone octamer-like substructure. TBP participates in TFIID function even in promoters lacking a TATA box (Xie, 1996).

     Drosophila                        FlyBase ID       Human homologs        Yeast homologs
     -----------------                 ----------       --------------------  --------------
     TATA binding protein              FBgn0003687      TATA binding protein  TATA binding protein
     Tbp-related factor (Trf-1)        FBgn0010287      unknown
     Trf2                              FBgn0026758      TLF/TRF2
     TBP-associated factor (TAF) 250kD FBgn0010355      TAFII250              p130
     Bip2 (TAFII155)                   FBgn0026262      TAFII140              yTAFII47
     TBP-associated factor 150kD       FBgn0011836      not characterized     p150
     TBP-associated factor 110kD       FBgn0010280      TAFII135              not characterized
     No hitter (testis specific)       FBgn0041103
     TBP-associated factor 80kD        FBgn0010356      TAFII85               p90
     Cannonball (testis specific)      FBgn0011569
     Cabeza                            FBgn0011571      TAFII68
     TBP-associated factor 60kD        FBgn0010417      TAFII80               p60
     Taf55                             FBgn0024909      TAFII55               TAFII67
     TBP-associated factor 40kD        FBgn0011302      TAFII31               not characterized
     TAF 30kD subunit alpha            FBgn0011290      hTAFII20              not characterized
     TAF 30kD subunit beta             FBgn0011291      hTAFII28              p40
     TATA binding protein associated
       factor 24kD subunit             FBgn0028398      TAFII30
     Taf18                             FBgn0026324      TAFII18               TAFII19
     TBP-associated factor 16          FBgn0026324      TAFII60
     ENL/AF9                           FBgn0026441      TAFII60               TAFII30

TFIIB

TFIIB associates with TBP on the opposite side of the DNA helix. The TFIIB-TBP-DNA ternary complex is formed by TFIIB clamping the acidic C-terminal stirrup of TBP in its basic cleft, and interacting with the phosphoribose backbone upstream and downstream of the center of the TATA element.

TFIIB physically links TFIID at the promoter with the pol II/TFIIF complex.

     Drosophila                        FlyBase ID       Human homologs
     -----------------                 ----------       ------------------
     Transcription factor IIB          FBgn0004915      TFIIB

TFIIA

TFIIA is required for activation of transcription.

     Drosophila                        FlyBase ID       Human homologs
     -----------------                 ----------       ------------------
     Transcription factor IIA S        FBgn0013347      TFIIA gamma
     Transcription factor IIA L        FBgn0011289      TFIIA alpha and beta

TFIIE

TFIIE contains a zinc-binding domain and is involved in promoter melting. TFIIE recruits TFIIH to the promoter.

     Drosophila                        FlyBase ID       Human homologs
     -----------------                 ----------       ------------------
     Transcription factor IIEalpha     FBgn0015828      TFIIEalpha (56 kD)
     Transcription factor IIEbeta      FBgn0015829      TFIIEbeta (34 kD)

TFIIF

TFIIF is the homolog of the bacterial sigma subunit. Polymerase II cannot stably associate with the TFIID and TFIIB assembly at the promoter and must be escorted to the promoter by TFIIF. TFIIF also stimulates elongation.

     Drosophila                        FlyBase ID       Human homologs
     -----------------                 ----------       ------------------
     Transcription factor TFIIFalpha   FBgn0010282      TFIIF RAP74
     Transcription factor TFIIFbeta    FBgn0010421      TFIIF RAP30

RNA polymerase

For RNA polymerase II, the transition from initiation to elongation is accompanied by covalent modification of an unusual structure at the carboxy terminus of its largest subunit. This evolutionarily conserved structure consists of multiple tandem repeats of a heptapeptide, the RNA pol II carboxy-terminal domain (CTD). The number of times this sequence is repeated varies from 26 in yeast to 52 in humans and seems to be directly related to genome complexity. The phosphorylation of the CTD is central to the transcription mechanism of pol II. The unphosphorylated form of pol II is the form recruited to the initiation complex. During initiation of RNA synthesis, the CTD becomes extensively phosphorylated on serine and threonine residues.

     Drosophila                        FlyBase ID       Human homologs
     -----------------                 ----------       -----------------
     RNA polymerase II 215kD subunit   FBgn0003277      RNA polymerase II large subunit
     RNA polymerase II 140kD subunit   FBgn0003276      RNA polymerase II small subunit


TFIIH

TFIIH is a multisubunit factor with 3'-5' helicase activity. The Drosophila TFIIH consists of 8 subunits (two listed here) similar to their human counterparts. Besides the helicase activity, TFIIH has an RNA pol II C-terminal domain kinase activity (CDK7) and a cyclin partner for the kinase (Cyclin H). Cyclin H forms a ternary complex with CDK7 and MAT1. This tripartite Cdk-activating kinase occurs in a free form and in association with 'core' TFIIH.

     Drosophila                        FlyBase ID       Human homologs
     -----------------                 ----------       ------------------
     Transcription factor IIH          FBgn0015830      TFIIH (ERCC3)
     Cyclin-dependent kinase 7         FBgn0015617      CDK7




P-TEFb

P-TEFb is a dimer of Cdk9 and Cyclin T that targets the RNA polymerase II C-terminal domain. It functions to overcome promoter-proximal pausing and premature termination, promoting polymerase entry into productive elongation.

     Drosophila                        FlyBase ID       Human homologs
     -----------------                 ----------       ------------------
     Cyclin dependent kinase 9         FBgn0019949      Cdk9
     Cyclin T                          FBgn0025455      Cyclin T


TFIIS

TFIIS is critical for efficient release of stalled RNA Pol II from intrinsic stop sites in promoter regions. It promotes transcriptional elongation and decreases pausing.

     Drosophila                             FlyBase ID       Human homologs
     -----------------                      ----------       ------------------
     RNA polymerase II elongation factor    FBgn0010422      TFIIS



Factors involved in function of RNA polymerase II

Factors involved in function of RNA polymerase III

Paf1 complex (coordinates histone modifications and changes in nucleosome structure with transcription activation and Pol II elongation)

How does messenger RNA synthesis take place?

The conventional model for formation of a preinitiation complex and ordered transcription by RNA polymerase II (pol II) is characterized by a distinct series of events: (1) recognition of core promoter elements by TFIID (containing TBP and several other protein subunits), (2) recognition of and binding to the TFIID-promoter complex by TFIIB, (3) recruitment of a TFIIF/pol II complex by TFIIB (TFIIF is related to bacterial sigma), (4) binding of TFIIE and TFIIH (containing a helicase required for promoter melting) to complete the preinitiation complex, (5) promoter melting and formation of an "open" initiation complex, (6) synthesis of the first phosphodiester bond of the nascent mRNA transcript, (7) release of pol II contacts with the promoter (promoter clearance), and (8) elongation of the RNA transcript. TFIIA can join the complex at any stage after TFIID binding and stabilizes the initiation complex. TFIID can remain bound to the core promoter, supporting reinitiation of transcription (Orphanides, 1996 and Nikolov, 1997).
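
The ordered recruitment in this model can be sketched as a small dependency check. This is only an illustrative toy, not a biochemical simulation: the prerequisite table is an assumption distilled from the factor descriptions on this page (TFIIB binds the TFIID-promoter complex, pol II is escorted by TFIIF, TFIIE recruits TFIIH, TFIIA may join any time after TFIID), and the names `PREREQUISITES`, `assemble`, and `is_complete` are hypothetical.

```python
# Toy model of stepwise preinitiation complex (PIC) assembly.
# Each factor maps to the factors assumed to be bound before it can join.
PREREQUISITES = {
    "TFIID": set(),                       # recognizes core promoter elements
    "TFIIA": {"TFIID"},                   # may join at any stage after TFIID
    "TFIIB": {"TFIID"},                   # binds the TFIID-promoter complex
    "TFIIF/pol II": {"TFIID", "TFIIB"},   # pol II is escorted by TFIIF
    "TFIIE": {"TFIIF/pol II"},
    "TFIIH": {"TFIIE"},                   # TFIIE recruits TFIIH
}

# Factors that must all be present for the complex to count as complete here.
PIC_CORE = {"TFIID", "TFIIB", "TFIIF/pol II", "TFIIE", "TFIIH"}


def assemble(order):
    """Bind factors in the given order, rejecting any out-of-order step."""
    bound = []
    for factor in order:
        missing = PREREQUISITES[factor] - set(bound)
        if missing:
            raise ValueError(f"{factor} cannot bind before {sorted(missing)}")
        bound.append(factor)
    return bound


def is_complete(bound):
    """True when every core PIC component has been bound."""
    return PIC_CORE <= set(bound)


if __name__ == "__main__":
    pic = assemble(["TFIID", "TFIIA", "TFIIB", "TFIIF/pol II", "TFIIE", "TFIIH"])
    print(pic, is_complete(pic))
```

A holoenzyme-style pathway, in which the factors arrive preassembled, would not fit this stepwise check; the sketch covers only the staged model.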

This model has been further refined to incorporate known alterations in the level of phosphorylation of the carboxy-terminal domain (CTD) of RNA polymerase II (Cho, 1999). Stable association of RNAPII with promoter sequences requires TFIID (or TBP), TFIIB, and TFIIF. However, the RNAPII transcription system is unique because, after the polymerase has stably associated with promoter sequences, two additional factors, TFIIE and TFIIH, are necessary for transcription. This requirement is likely related to a unique structure found at the carboxyl terminus of the largest subunit of RNAPII known as the carboxy-terminal domain (CTD). This conserved structure consists of multiple tandem repeats of the heptapeptide Tyr-Ser-Pro-Thr-Ser-Pro-Ser, which serves as a substrate for a number of protein kinases. At least two forms of RNAPII have been detected in cells. The most abundant form contains a phosphorylated CTD (RNAPIIO). A second form contains an unphosphorylated CTD and is known as RNAPIIA. The phosphorylation of the CTD has been correlated with function. It was found that the nonphosphorylated form of RNAPII is recruited to the initiation complex, whereas the elongating polymerase is found with a phosphorylated CTD. TFIIH contains a CTD kinase activity and this activity is efficient after RNAPII has associated with promoter sequences. A 150-kD polypeptide termed FCP1 has now been isolated. Together with RNAPII, FCP1 reconstitutes a highly specific CTD phosphatase activity. Functional analysis demonstrates that the CTD phosphatase allows recycling of RNAPII. Upon reaching termination sequences, the CTD becomes dephosphorylated by the FCP1 phosphatase within the ternary complex (consisting of DNA, polymerase and phosphatase) or immediately after the release of RNAPII from the DNA template. The phosphatase dephosphorylates the CTD, allowing efficient recycling of RNAPII into transcription initiation complexes, which results in increased transcription. The phosphatase is found to stimulate elongation by RNAPII; however, this function is independent of its catalytic activity (Cho, 1999 and references).
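
To give a sense of scale for the CTD as a kinase substrate, the following sketch builds an idealized CTD from perfect copies of the consensus heptapeptide and counts its serine and threonine residues. It assumes every repeat matches the consensus exactly, which real CTDs do not, and it uses the repeat counts quoted on this page (26 in yeast, 52 in humans). The function names are hypothetical.

```python
# Consensus heptapeptide Tyr-Ser-Pro-Thr-Ser-Pro-Ser in one-letter code.
HEPTAPEPTIDE = "YSPTSPS"

# Repeat counts as quoted in the text above.
REPEATS = {"yeast": 26, "human": 52}


def idealized_ctd(n_repeats):
    """Return an idealized CTD made of n perfect consensus repeats."""
    return HEPTAPEPTIDE * n_repeats


def ser_thr_sites(sequence):
    """Count serine and threonine residues, the candidate phosphoacceptors."""
    return sequence.count("S") + sequence.count("T")


def summarize(species):
    """Length and Ser/Thr count of the idealized CTD for one species."""
    ctd = idealized_ctd(REPEATS[species])
    return {"residues": len(ctd), "ser_thr_sites": ser_thr_sites(ctd)}


if __name__ == "__main__":
    for species in REPEATS:
        print(species, summarize(species))
```

Under these assumptions each repeat contributes four Ser/Thr residues, so the idealized yeast CTD has 182 residues with 104 candidate sites and the idealized human CTD has 364 residues with 208.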

A model is presented detailing the role of cycling of CTD phosphorylation in the function of RNAPII. After the termination of the previous transcription cycle, TBP remains bound to the TATA motif and provides the foundation for association of TFIIB. RNAPII, through its interactions with TFIIF, recognizes the TBP-TFIIB complex associated with the TATA motif. Because TFIIF has been found to interact with both the phosphorylated and nonphosphorylated forms of RNAPII and FCP1 and to stimulate FCP1 activity, its association with RNAPII prior to association with the TBP-TFIIB complex may be important in attaining an RNAPII that is fully dephosphorylated. The association of RNAPII with promoter sequences provides the foundation for the entry of TFIIE and allows the association of TFIIH, resulting in the formation of a fully competent transcription initiation complex. During the process of initiation and prior to the formation of a fully competent elongation complex, the CTD becomes phosphorylated in a TFIIH-dependent manner. Phosphorylation of the CTD does not affect elongation efficiency, but allows RNAPII to disengage from the promoter and from transcription initiation factors. In the presence of the ribonucleoside triphosphates, the transcription initiation complex disassembles with the release of TFIIB, TFIIE, and TFIIH. CTD phosphorylation provides a foundation for the association of factors involved in RNA processing, such as the capping enzyme, splicing factors, and factors involved in 3'-end formation. Upon transcription of termination/polyadenylation signals, the elongating complex is altered, resulting in the release of RNAPII from the template by an unknown process. It is possible that RNAPII is converted to the nonphosphorylated form prior to, or concomitant with, its release from the DNA template.
This possibility is supported by studies demonstrating that FCP1 is capable of dephosphorylating the CTD of RNAPII not only in solution prior to incorporation into transcription initiation complexes, but also in active ternary elongation complexes stalled as a result of nucleotide starvation. The finding that FCP1 also stimulates elongation by RNAPII, independent of its phosphatase activity, suggests that FCP1 may remain associated with RNAPII during elongation. The finding that FCP1 is active in ternary complexes has implications for the mechanism of transcription termination as well as for the down-regulation of RNA processing. Similar to the signal imposed on phosphorylation of the CTD (disengagement of RNAPII from the promoter and from interaction with initiation factors), dephosphorylation of the CTD may result in a signal that releases factors from RNAPII that are involved in RNA maturation (Cho, 1999 and references).
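
The phosphorylation cycle described in this model can be summarized as a small state machine. The sketch below is a deliberately simplified illustration: it treats the CTD as either fully phosphorylated (RNAPIIO) or fully unphosphorylated (RNAPIIA), collapses promoter clearance into the TFIIH phosphorylation step, and lets FCP1 act at any point, reflecting the report that it is active both in solution and in ternary elongation complexes. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PolII:
    """Toy model of one RNAPII molecule moving through the cycle."""
    ctd_phosphorylated: bool = False
    location: str = "free"  # "free", "promoter", or "elongating"

    @property
    def form(self):
        # RNAPIIO carries a phosphorylated CTD; RNAPIIA does not.
        return "RNAPIIO" if self.ctd_phosphorylated else "RNAPIIA"


def recruit(pol):
    """Only free, unphosphorylated polymerase enters the initiation complex."""
    if pol.location != "free":
        raise ValueError("polymerase is already engaged on a template")
    if pol.ctd_phosphorylated:
        raise ValueError("only RNAPIIA is recruited; FCP1 must act first")
    pol.location = "promoter"


def tfiih_phosphorylate(pol):
    """TFIIH-dependent CTD phosphorylation lets pol II leave the promoter."""
    if pol.location != "promoter":
        raise ValueError("TFIIH acts on promoter-bound polymerase")
    pol.ctd_phosphorylated = True
    pol.location = "elongating"


def terminate(pol):
    """Release from the template at termination signals."""
    if pol.location != "elongating":
        raise ValueError("only an elongating polymerase can terminate")
    pol.location = "free"


def fcp1_dephosphorylate(pol):
    """FCP1 removes CTD phosphates, before or after release from the template."""
    pol.ctd_phosphorylated = False


if __name__ == "__main__":
    pol = PolII()
    recruit(pol)
    tfiih_phosphorylate(pol)
    terminate(pol)
    fcp1_dephosphorylate(pol)
    recruit(pol)  # recycled into a new initiation complex
    print(pol)
```

In this toy, skipping `fcp1_dephosphorylate` makes the second `recruit` call fail, which mirrors the point that FCP1 is needed for efficient recycling of RNAPII.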

Evolution of general transcription factors

How have the factors required for transcription initiation (TFIIA, TFIIB, TFIID, TFIIE, TFIIF, TFIIH, and RNA polymerase II [pol II]) evolved to accommodate the elaborate transcriptional programs required for growth, differentiation, and development of multicellular organisms? Analysis of the complete Drosophila genome sequence, as well as those of C. elegans, Saccharomyces cerevisiae, and humans sheds light on this well studied question in eukaryotic biology. All four organisms encode single isoforms of RNA pol II, TFIIB, TFIIE, TFIIF, and TFIIH components, but multiple, sequence-related isoforms of TFIID components. In addition, Drosophila and humans encode multiple isoforms of TFIIA components. Current evidence indicates that tissue- and cell type-specific transcription is directed by differentially expressed TFIID and possibly TFIIA isoforms. Thus, in accord with experimental data, this analysis points to TFIIA and TFIID as the factors that help generate the broad transcriptional repertoire of multicellular organisms. The identification of the complete set of TFIIA and TFIID components in a genetically and biochemically tractable organism like Drosophila is an important step toward understanding the mechanisms governing developmentally regulated transcription not only in Drosophila but also in humans (Aoyagia, 2000 and references therein).

Biochemical fractionation of Drosophila embryos, human cells, and yeast cells has defined a set of multiprotein complexes termed general transcription factors (GTFs; TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIIH) required for mRNA transcription initiation in vitro. Transcription is initiated by recognition of core promoter elements by TFIID and sequential or concerted assembly of the other GTFs and RNA pol II to form the preinitiation complex (PIC). Although GTFs play essential roles during transcription initiation, it is the factors that regulate the ability of the GTFs to assemble and stably bind a core promoter that are probably major determinants of gene-specific transcription levels. For example, activators and coactivators are thought to stimulate transcription by recruiting GTFs to a promoter, thereby accelerating PIC assembly (Aoyagia, 2000 and references therein).

The GTF TFIID is composed of TATA-binding protein (TBP) and coactivator subunits termed TBP-associated factors (TAFIIs). TAFIIs not only function as 'conventional' coactivators by serving as physical links between DNA-binding activator proteins and the PIC but also possess enzymatic or promoter recognition activities that presumably enhance the efficiency of PIC assembly. TFIIA has also been described as a coactivator and displays a number of TAFII-like properties: it binds to TBP and TAFIIs; it interacts with specific transcriptional activators; it is generally required for activated transcription in vitro; and it contributes to promoter selectivity (Aoyagia, 2000 and references therein).

Inactivation of individual TAFIIs in Drosophila, mammalian, and yeast cells has demonstrated that TAFIIs are not required for the transcription of all RNA pol II genes, and in fact there is great variation in regard to the identity and number of gene targets for individual TAFIIs. Furthermore, different domains within a single TAFII can play gene-specific roles in transcription. The isolation of a human B cell-specific isoform of TAFII130 (TAFII105) raises the possibility that substoichiometric subunits of TFIID mediate tissue- or cell type-specific transcription and that additional components of TFIID may have escaped detection because of their low abundance. These possibilities have been borne out in Drosophila, where isoforms of TAFII110 and TAFII80 (No hitter [Nht] and Cannonball [Can], respectively) are expressed exclusively in testis and regulate transcription of a subset of genes required for spermatogenesis, and isoforms of TBP (TBP-related factors [TRF1 and TRF2]) are expressed in a tissue-specific manner and bind different genes in salivary gland cells. Similarly, analysis of the human TFIIA-L isoform ALF (TFIIAalpha/beta-like factor) reveals that its expression is restricted to the testis; however, it remains to be determined if it is used for the transcription of testis-specific genes. In Drosophila, TFIIA-S is expressed in a dynamic pattern during eye development and is transiently upregulated in photoreceptor precursor cells before their fate is determined. Therefore, the role of TFIIA and TFIID in transcription initiation is governed by the expression patterns and activities of their varied components (Aoyagia, 2000 and references therein).

Finally, it is critical to note that analysis of the function of TAFIIs is complicated by the fact that they are components of at least two other complexes that lack TBP: p300/CBP-associated factor (PCAF) and TBP-free TAFII-containing complex (TFTC). The human PCAF histone acetyltransferase (HAT) complex contains three TAFIIs that are shared with TFIID (TAFII31/32, TAFII20/15, and TAFII30) and three TAFII isoforms (PCAF-associated factor 65beta [PAF65beta], PAF65alpha, and SPT3) related to TAFII100, TAFII70/80, and TAFII18, respectively. Yeast possess an analogous complex, Spt-Ada-Gcn5-acetyltransferase (SAGA), containing TFIID TAFIIs and the Gcn5 HAT, and Drosophila may also, since it contains a Gcn5/PCAF homolog that interacts with TAFII24 (Aoyagia, 2000 and references therein).

Searches of the completed Drosophila, C. elegans, and yeast genomes and the partial human genome for sequence homologs of biochemically identified components of the general transcription machinery have led to the following conclusions: (1) all of the components of RNA pol II, TFIIB, TFIIE, TFIIF, and TFIIH are encoded by single copy genes in Drosophila, C. elegans, and yeast; (2) multiple isoforms of TFIID components are encoded in Drosophila, C. elegans, humans, and yeast, and multiple isoforms of TFIIA components are encoded in Drosophila and humans; (3) each organism encodes isoforms of different sets of TFIIA and TFIID components, some of which are unique to a particular organism (Aoyagia, 2000 and references therein).

Sequence comparisons uncovered Drosophila homologs of TAFIIs previously identified in yeast or humans by biochemical means but which had not been described in Drosophila (yeast TAFII67/human TAFII55, yeast TAFII30/human ENL/AF-9, and yeast TAFII19/human TAFII18). Thus, all TAFIIs present in both yeast and humans are present in Drosophila, as well as C. elegans. In contrast, yeast TAFII47 and TAFII65 are absent from Drosophila, C. elegans, and apparently from humans, suggesting that these TAFIIs perform a yeast-specific role, such as serving as coactivators for DNA-binding activators that are not present in metazoans. Finally, there are TAFIIs present in Drosophila, C. elegans, and humans that are absent from yeast (human TAFII68/Drosophila Cabeza and multiple TAFII isoforms). In addition to Can and Nht, there are alternatively spliced forms of TAFII30alpha, two genes (TAFII24 and TAFII16) that encode Drosophila homologs of human TAFII30, and TAFII60 and TAF30alpha isoforms (TAFII60-2 and TAF30alpha-2, respectively). TFIIA-S and TFIIA-L are the only other GTF components in Drosophila and humans, respectively, that are expressed in multiple isoforms. The fact that these proteins are unique to multicellular organisms suggests that they play cell-specific roles (Aoyagia, 2000 and references therein).

A number of TAFIIs contain a common structural motif called the histone fold that was originally shown to drive folding and association of each of the core histones (H2A, H2B, H3, and H4) and subsequently shown to play a similar role in association of TAFIIs. TAFII pairs, such as Drosophila TAFII40 and TAFII60, form heterotetramers, analogous to H3 and H4, and numerous other TAFII-TAFII and TAFII-nonTAFII interactions have been shown to involve histone fold motifs. The demonstrated histone fold interaction of human TAFII135 and TAFII20 predicts that the Drosophila isoforms of these proteins, Nht and TAFII30alpha-2, respectively, may heterodimerize, and hints at the existence of a human TAFII20 isoform that would heterodimerize with the TAFII135 isoform, TAFII105. B cell-specific expression of the hypothetical TAFII20 isoform may explain why TAFII105 associates with TFIID in B cells but not in other cell types (Aoyagia, 2000 and references therein).

In addition to the TAFIIs indicated above, other Drosophila transcription factors contain histone fold motifs, including Prodos, NF-YC-like (CG3075), CG11301, CHRAC-14 (CG13399), CHRAC-16 (CG15736), Dr1 (CG4185), NC2alpha (CG10318), and BIP2 (CG2009). It is interesting to speculate that these factors may be unidentified TAFII components of TFIID or binding partners for known TAFIIs in complexes that lack TBP (Aoyagia, 2000 and references therein).

Analysis of eukaryotic genomes has defined sets of proteins that are similar in sequence to known components of TFIIA and TFIID. Since known components of TFIIA and TFIID have been shown to play key roles in developmentally regulated transcription, it is exciting to speculate that the newly identified genes will play similar roles and that TFIIA and TFIID components have evolved to support tissue- or cell type-specific transcriptional requirements of individual eukaryotic organisms. The challenge now is to determine if TAFIIs that have been identified on the basis of their sequence are components of TBP-containing complexes or other TAFII-containing complexes, whether TAFIIs and TFIIA isoforms are differentially expressed during development, and how differentially expressed TBP, TAFII, and TFIIA isoforms function in concert with the ubiquitously expressed form of TFIID and TFIIA to regulate gene expression. The subunit composition of human PCAF complex leads to the prediction that Drosophila TAFII60-2 and Can and C. elegans Y37E11AL.c are components of PCAF/SAGA and not TFIID. However, protein isoforms that are unique to a particular organism, such as Drosophila TAFII30alpha-2 and C. elegans F54F7.1 and K10D3.3, may be tissue- or cell type-specific components of TFIID and not of PCAF/SAGA. Drosophila may be the most appropriate organism for these studies since the biochemical activities of these factors can be determined using established TFIIA and TFIID purification schemes and in vitro transcription systems, and developmental requirements for these factors can be determined using existing mutants or mutants generated by traditional mutagenesis schemes, P-element insertion, RNA interference (RNAi), or homologous recombination (Aoyagia, 2000 and references therein).

Structures of three distinct activator-TFIID complexes

Sequence-specific DNA-binding activators, key regulators of gene expression, stimulate transcription in part by targeting the core promoter recognition TFIID complex and aiding in its recruitment to promoter DNA. Although it has been established that activators can interact with multiple components of TFIID, it is unknown whether common or distinct surfaces within TFIID are targeted by activators and what changes, if any, in the structure of TFIID may occur upon binding activators. As a first step toward structurally dissecting activator/TFIID interactions, the three-dimensional structures of TFIID bound to three distinct activators (i.e., the tumor suppressor p53 protein, glutamine-rich Sp1 and the oncoprotein c-Jun) were determined by electron microscopy and single-particle reconstruction and then compared. By a combination of EM and biochemical mapping analysis, these results uncover distinct contact regions within TFIID bound by each activator. Unlike the coactivator CRSP/Mediator complex, which undergoes drastic and global structural changes upon activator binding, holo-TFIID shows only a rather confined set of local conserved structural changes when each activator binds. These results suggest that activator contact may induce unique structural features of TFIID, thus providing nanoscale information on activator-dependent TFIID assembly and transcription initiation (Liu, 2009).

3D density difference maps generated from reconstructions of the three independent activator/TFIID assemblies (i.e., p53-IID, Sp1-IID, and c-Jun-IID) and free holo-TFIID have served as a method to map the most likely contact sites of these activators within the native TBP-TAF complex. Remarkably, each activator contacts TFIID via select TAF interfaces within TFIID. The unique and localized arrangements of these three activators contacting different surfaces of TFIID could be indicative of the wide diversity of potential activator contact points within TFIID that would be dependent on both the specificity of activation domains as well as core promoter DNA sequences appended to target gene promoters. It is also possible, however, that these distinct activator-TFIID contacts can form a common scaffold when TFIID binds to the core promoter DNA (Liu, 2009).

It is well established that activators including p53, Sp1, and c-Jun frequently work synergistically with each other or other activators to potentiate selective gene expression programs in response to a variety of stimuli in vivo. Therefore, combinatorial mechanisms of promoter activation might favor distinct nonoverlapping activator-binding sites within TFIID, which can be achieved by specific interactions between selective TAF subunits and activators. Indeed, it was established that TAF1 and TAF4 serve as coactivators for Sp1, while TAF1, TAF6, and TAF9 mediate p53-dependent transactivation, and TAF1 and TAF7 subunits are thought to be coactivators for c-Jun. Since activators make sequence-specific contacts with the DNA template at various positions upstream of the core promoter, it is also plausible that activators bound to unique surfaces of TFIID can influence specific structures of a promoter as the DNA traverses along TFIID, resulting in distinct activator/promoter DNA structures (Liu, 2009).

Activator mapping results also complement and structurally extend the functional relevance of previous biochemical and immunomapping studies of TFIID. For example, label transfer studies show that the N-terminal activation domain of p53 contacts TAF6, confirming previous biochemical evidence showing that amino acids 1-42 of p53 contact TAF6/9. In support of this observation, the p53-IID 3D structure indicates that p53 contacts TFIID at lobes A and C where TAF6/9 are located as determined by EM immunomapping. In addition, previous studies have shown that both TBP and TAF1 can directly contact p53 in the absence of additional TFIID subunits. Interestingly, body-labeled p53 cross-linked to TAF1, TAF5, and weakly to TBP, thus extending the immunomapping studies that determined the locations of TBP and the N terminus of TAF1 at lobe C. Thus, EM activator mapping studies show a significant interface between p53 and specific TAFs located at lobes A and C of TFIID. Likewise, Sp1 label transfer results confirmed previous biochemical data showing a direct interaction between TAF4 and the N-terminal glutamine-rich domains of Sp1. In addition to TAF4, TAF6 was identified as weakly cross-linked to Sp1, suggesting that TAF6 may also be in the vicinity but perhaps more distal to the N terminus of Sp1. The largest TFIID subunit, TAF1, was cross-linked when body-labeled Sp1 was used. This result was not entirely unexpected, since previous studies found that TAF1 is required for Sp1-dependent transactivation, possibly through a direct interaction between TAF1 and Sp1 (Liu, 2009).

In comparison with p53 and Sp1, body-labeled c-Jun was shown to contact TAF1 and TAF6 in label transfer studies with no subunits contacting the N-terminal activation domain of c-Jun. This N-terminal activation domain of c-Jun may be structurally flexible or predominantly unstructured and is apparently positioned away from TFIID contacts. Indeed, successful structural studies of c-Jun thus far have been limited to the C-terminal leucine zipper DNA-binding region when bound to DNA. Previous biochemical assays have shown that the C-terminal basic leucine zipper DNA-binding region also contacts the N terminus of TAF1 (Liu, 2009).

It is worth noting that the extra density representing c-Jun and the other activator polypeptides in EM studies may not reflect the full expected size of the activators. This is due to the presence of large unstructured regions in these proteins that are averaged out during structural analysis. As activators contain multiple molten globular domains that likely interact with different partners, one would expect a high degree of structural disorder in the domains that are not in direct contact with TFIID. Thus, the extra density associated with each activator in the single-particle reconstructions likely represents only the most stably associated portion of the activator bound to TFIID. This common situation would invariably lead to underrepresenting the actual size of the activator, in a manner not unlike crystal structures of domains with flexible loops that become 'invisible' in the crystal structure (Liu, 2009).

Based on EM immunomapping, there are two copies of TAF6 within TFIID, wherein one copy resides in lobe A and another in lobe B. Collectively, the current studies suggest that two distinct activators (p53 and c-Jun) strongly contact the two different TAF6 subunits that are each located in different lobes of TFIID. It is unknown how p53 or c-Jun discriminates between TAF6 on lobe A versus B when binding to TFIID. In the future, it will be interesting to investigate if these two activators can bind to a single TFIID molecule simultaneously and decipher 3D structures of TFIID assemblies bound to select endogenous promoter DNA sequences in the presence and absence of distinct activators that are engaged in synergistic transcriptional activation (Liu, 2009).

It is of note that, unlike the radical, diverse, and global structural changes observed with CRSP/Mediator complexes upon activator binding, TFIID largely retains its overall architecture when bound by three different activators. Interestingly, this study found that two of the activator/IID structures, the p53-IID and Sp1-IID assemblies, appear to be more constricted around the central cavity with narrower ChB-D and ChA-B channels, while the third structure, c-Jun-IID, remains most similar to free holo-TFIID. In particular, the p53-IID structure more closely resembles the closed conformational state of the previous cryo-TFIID structure. To test if p53-bound TFIID mimics the most closed conformational form of holo-TFIID, 3D reconstructions were performed using either the most closed or the most open cryo-TFIID structure as an initial reference volume for refinement. Interestingly, it was found that the newly refined 3D structures generated from the closed and open reference volumes are fairly similar, with possibly a partial occupancy of p53 on lobe A. These findings suggest that the overall p53-TFIID structure tends to move toward the closed conformation with moderate movement at the outer tips of lobes A and B, even though p53-IID is predominantly observed in an intermediate average conformational form between the most closed and open forms. Perhaps factors contacting lobe A or C can induce certain coordinated movements within lobes that lead to a closed conformation of TFIID (Liu, 2009).

Although TFIID largely retains its prototypic global architecture upon activator binding, several common localized structural changes induced upon activator binding were observed in the 3D reconstruction. For example, a prominent and consistent induced extra density protrusion located in lobe D was observed when each of the three different activators binds TFIID. Given that all these activators are represented by distinct densities with unique sizes and shapes within the bound TFIID structure, and the fact that it has been demonstrated that they each can target different subunits within TFIID by a number of independent biochemical assays, it seems reasonable to assign 'unique and significant' extra densities located at distinct sites as representing the different bound activators. In contrast, the common similarly sized extra density seen at lobe D of each activator-IID structure most likely represents a conserved conformational change induced by these three different activators. Interestingly, this protrusion in lobe D resides distal to each of the activator-binding sites, suggesting that these three activators may potentially induce a long-range internal conformational change within TFIID. It would be intriguing to identify which TAF subunits are located at the tip of lobe D and eventually determine the function, if any, of this extended lobe in activator-induced transcription initiation. However, despite the potential significance of these structural changes induced by activators, it is premature to speculate regarding their functional importance (Liu, 2009).

Architecture of an RNA polymerase II transcription pre-initiation complex

The protein density and arrangement of subunits of a complete, 32-protein, RNA polymerase II (pol II) transcription pre-initiation complex (PIC) were determined by means of cryogenic electron microscopy and a combination of chemical cross-linking and mass spectrometry. The PIC showed a marked division in two parts, one containing all the general transcription factors (GTFs) and the other pol II. Promoter DNA was associated only with the GTFs, suspended above the pol II cleft and not in contact with pol II. This structural principle of the PIC underlies its conversion to a transcriptionally active state; the PIC is poised for the formation of a transcription bubble and descent of the DNA into the pol II cleft (Murakami, 2013).

This study has revealed a central principle of the PIC: the association of promoter DNA only with the GTFs and not with pol II. Promoter DNA is suspended above the pol II cleft, contacting three GTFs -- TFIIB, TFIID (TBP subunit), and TFIIE -- at the upstream end of the cleft (TATA box) and contacting TFIIH (Ssl2 helicase subunit) at the downstream end. In between, the DNA is free and available for action of the helicase, which untwists the DNA to introduce negative superhelical strain and thereby promote melting at a distance (Murakami, 2013).

This principle of the PIC is a consequence of the rigidity of duplex DNA. The promoter duplex must follow a straight path, whereas bending through ~90° is required for binding in the pol II cleft. Only after melting can the DNA bend for entry in the cleft. Melting is thermally driven, induced by untwisting strain in the DNA above the cleft. A melted region is short-lived and must be captured by binding to pol II, which occurs rapidly enough because the DNA is positioned above the cleft. The GTFs therefore catalyze the formation of a stably melted region (transcription bubble) in two ways, by the introduction of untwisting strain (by the helicase) and by positioning promoter DNA (Murakami, 2013).

Untwisting strain is distributed throughout the DNA above the pol II cleft, so melting may occur at any point, but only a melted region adjacent to TFIIB is stabilized by binding to pol II. The reason is again the rigidity of duplex DNA, and the requirement for a sharp bend adjacent to TFIIB to penetrate the pol II cleft. A single strand of DNA must extend from the point of contact with TFIIB, ~13 bp downstream of the TATA box, through the binding site for the transcription bubble in pol II. TFIIB may also interact with the single strand to stabilize the bubble (Murakami, 2013).

These conclusions are based on results from both cryo-EM and XL-MS, which served to validate one another: Segmentation and labeling of electron density, based on fitting pol II and other known structures, was consistent with all but three of 266 cross-links observed. The PIC structure is also consistent with partial structural information from x-ray crystallography (pol II-TFIIB, pol II-TFIIS, TFIIA-TBP-TFIIB-DNA, and Tfb2-Tfb5), from nuclear magnetic resonance (Tfb1-Tfa1 and Tfa2-DNA), and from EM (core and holo TFIIH). This consistency provides cross-validation, both supporting this PIC structure and establishing the relevance of the partial structural information. Further consistency was found with the results of FeBABE cleavage mapping of complexes formed in yeast nuclear extract; the locations of proteins along the DNA in the PIC structure and those determined with FeBABE cleavage differ by no more than 5 bp. This PIC structure also agrees with results of protein-DNA cross-linking in a reconstituted human transcription system; positions of TFIIE and TFIIH differ between the two studies by ~20 and ~10 bp, respectively. The location of Ssl2 in this structure, ~30 bp downstream from the TATA box, supports the proposal, made on the basis of previous DNA-protein cross-linking analysis, that helicase action torques the DNA to introduce untwisting strain and thereby to promote melting at a distance (Murakami, 2013).
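The cross-link agreement quoted above can be restated as a fraction. The short Python sketch below only re-expresses the two counts given in the text (266 cross-links, three inconsistent); the variable names are illustrative and are not taken from the original study.

```python
# Counts quoted in the text (Murakami, 2013); names are illustrative.
total_crosslinks = 266   # cross-links observed by XL-MS
inconsistent = 3         # cross-links not explained by the density assignment

consistent = total_crosslinks - inconsistent
fraction_consistent = consistent / total_crosslinks

print(f"{consistent} of {total_crosslinks} cross-links are consistent "
      f"({fraction_consistent:.1%})")
```

Under these counts, just under 99% of the observed cross-links agree with the segmentation of the cryo-EM density.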

Association of the winged helix motif of the TFIIEalpha subunit of TFIIE with either the TFIIEbeta subunit or TFIIB distinguishes its functions in transcription

In eukaryotes, the general transcription factor TFIIE consists of two subunits, α and β, and plays essential roles in transcription. Structure-function studies indicate that TFIIE has three winged helix (WH) motifs, with one in TFIIEα and two in TFIIEβ. Recent studies suggested that, by binding to the clamp region of RNA polymerase II, TFIIEα-WH promotes the conformational change that transforms the promoter-bound inactive preinitiation complex to the active complex. To elucidate its roles in transcription, functional analyses of point-mutated human TFIIEα-WH proteins were carried out. In vitro transcription analyses identified two classes of mutants. One class was defective in transcription initiation, and the other was defective in the transition from initiation to elongation. Analyses of the binding of this motif to other general transcription factors showed that the former class was defective in binding to the basic helix-loop-helix motif of TFIIEβ and the latter class was defective in binding to the N-terminal cyclin homology region of TFIIB. Furthermore, TFIIEα-WH bound to the TFIIH XPB subunit at a third distinct region. Therefore, these results provide further insights into the mechanisms underlying RNA polymerase II activation at the initial stages of transcription (Tanaka, 2015).

dTAF10- and dTAF10b-containing complexes are required for ecdysone-driven larval-pupal morphogenesis in Drosophila melanogaster

In eukaryotes, the TFIID complex is required for preinitiation complex assembly, which positions RNA polymerase II around transcription start sites. Histone acetyltransferase complexes, including SAGA and ATAC, modulate transcription at several steps through modification of specific core histone residues. This study investigated the function of Drosophila proteins TAF10 and TAF10b, which are subunits of dTFIID and dSAGA, respectively. The simultaneous deletion of both dTaf10 genes impaired the recruitment of the dTFIID subunit dTAF5 to polytene chromosomes, while binding of another TFIID subunit, dTAF1, and of RNAPII was not affected. The lack of both dTAF10 proteins resulted in failures in the larval-pupal transition during metamorphosis and in transcriptional reprogramming at this developmental stage. Importantly, the phenotype resulting from dTaf10+dTaf10b mutation could be rescued by ectopically added ecdysone, suggesting that dTAF10- and/or dTAF10b-containing complexes are involved in the expression of ecdysone biosynthetic genes. These data support the idea that the presence of dTAF10 proteins in dTFIID and/or dSAGA is required only at specific developmental steps. It is proposed that distinct forms of dTFIID and/or dSAGA exist during Drosophila metamorphosis, wherein different TAF compositions serve to target RNAPII at different developmental stages and tissues (Pahi, 2015).

Rapid dynamics of general transcription factor TFIIB binding during preinitiation complex assembly revealed by single-molecule analysis

Transcription of protein-encoding genes in eukaryotic cells requires the coordinated action of multiple general transcription factors (GTFs) and RNA polymerase II (Pol II; see Drosophila Pol II). A "step-wise" preinitiation complex (PIC) assembly model has been suggested based on conventional ensemble biochemical measurements, in which protein factors bind stably to the promoter DNA sequentially to build a functional PIC. However, recent dynamic measurements in live cells suggest that transcription factors mostly interact with chromatin DNA rather transiently. To gain a clearer dynamic picture of PIC assembly, this study established an integrated in vitro single-molecule transcription platform reconstituted from highly purified human transcription factors and complemented it by live-cell imaging. Real-time measurements were performed of the hierarchical promoter-specific binding of TFIID, TFIIA, and TFIIB. Surprisingly, it was found that while promoter binding of TFIID and TFIIA is stable, promoter binding by TFIIB is highly transient and dynamic (with an average residence time of 1.5 sec). Stable TFIIB-promoter association and progression beyond this apparent PIC assembly checkpoint occur only in the presence of Pol II-TFIIF. This transient-to-stable transition of TFIIB-binding dynamics has gone undetected previously and underscores the advantages of single-molecule assays for revealing the dynamic nature of complex biological reactions (Zhang, 2016).
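To put the 1.5 sec average residence time in kinetic terms, the following Python sketch converts it into an apparent dissociation rate and half-life. It assumes single-exponential (first-order) dissociation of TFIIB from the promoter, which is a simplifying assumption made here for illustration and is not stated in the summary above; the function and variable names are hypothetical.

```python
import math

# Average TFIIB residence time quoted in the text (Zhang, 2016), in seconds.
mean_residence_s = 1.5

# ASSUMPTION: dissociation is a single first-order process, so the mean
# residence time equals 1 / k_off.
k_off = 1.0 / mean_residence_s            # apparent off-rate, per second
half_life_s = mean_residence_s * math.log(2)

def fraction_still_bound(t_seconds, tau=mean_residence_s):
    """Fraction of binding events lasting longer than t under the exponential model."""
    return math.exp(-t_seconds / tau)

print(f"k_off ~ {k_off:.2f} per second, half-life ~ {half_life_s:.2f} s")
print(f"fraction bound after 5 s: {fraction_still_bound(5.0):.3f}")
```

Under this assumption, fewer than 4% of TFIIB binding events would persist beyond 5 seconds, which illustrates why such events are hard to capture in ensemble assays.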

Identification of regions in the Spt5 subunit of DSIF that are involved in promoter proximal pausing

DRB-sensitivity inducing factor (DSIF, or Spt4/5) is a conserved transcription elongation factor that both inhibits and stimulates transcription elongation in metazoans. In Drosophila and vertebrates, DSIF together with negative elongation factor (NELF) associates with RNA polymerase II (Pol II) during early elongation and causes Pol II to pause in the promoter proximal region of genes. The mechanism by which DSIF establishes pausing is not known. This study constructed Spt5 mutant forms of DSIF and tested their capacity to restore promoter proximal pausing to DSIF-depleted Drosophila nuclear extracts. The C-terminal repeats (CTR) region of Spt5, which has been implicated in both inhibition and stimulation of elongation, is dispensable for promoter proximal pausing. A region encompassing KOW4 and KOW5 of Spt5 is essential for pausing, and mutations in KOW5 specifically shift the location of the pause. RNA crosslinking analysis reveals that KOW5 directly contacts the nascent transcript and deletion of KOW5 disrupts this interaction. These results suggest that KOW5 is involved in promoter proximal pausing through contact with the nascent RNA (Qiu, 2017).

Drosophila TRF2 and TAF9 regulate lipid droplet size and phospholipid fatty acid composition

The general transcription factor TBP (TATA-box binding protein) and its associated factors (TAFs) together form the TFIID complex, which directs transcription initiation. Through RNAi and mutant analysis, this study identified a specific TBP family protein, TRF2, and a set of TAFs that regulate lipid droplet (LD) size in the Drosophila larval fat body. Among the three Drosophila TBP genes, trf2, tbp and trf1, only loss of function of trf2 results in increased LD size. Moreover, TRF2 and TAF9 regulate fatty acid composition of several classes of phospholipids. Through RNA profiling, TRF2 and TAF9 were found to affect the transcription of a common set of genes, including peroxisomal fatty acid beta-oxidation-related genes that affect phospholipid fatty acid composition. Knockdown of several TRF2 and TAF9 target genes results in large LDs, a phenotype which is similar to that of trf2 mutants. Together, these findings provide new insights into the specific role of the general transcription machinery in lipid homeostasis (Fan, 2017).

This study reveals a rather specific role of TRF2 and TAFs, which are general transcription factors, in regulating LD size. In addition, TRF2 and TAF9 affect phospholipid fatty acid composition, most likely through ACOX genes which mediate peroxisomal fatty acid β-oxidation (Fan, 2017).

By binding to their responsive elements in target genes, specific transcription factors like SREBP (see Drosophila Srebp), PPARs and NHR49 play important roles in lipid metabolism. It is interesting to find that the general transcription machinery, in this case TRF2 and core TAFs, also exhibits specificity in regulating lipid metabolism. In the Drosophila late 3rd instar larval fat body, defects in trf2 cause increased LD size, whereas mutations of the other two homologous genes, tbp and trf1, have no obvious effects on lipid storage. Inactivation of taf genes causes a phenotype similar to that of trf2 mutation, suggesting that TRF2 may associate with these TAF proteins to direct transcription of specific target genes. Moreover, trf2 mutants have large LDs at both 2nd and early 3rd instar larval stages, suggesting that general transcription factors are also required at early developmental stages for LD size regulation. Interestingly, taf9 mutants have no obvious phenotype at these stages. It is possible that TAF9 may act as an accessory factor compared to promoter-binding TRF2. This is consistent with the fact that fewer genes are affected in taf9 mutants than in trf2 mutants in RNA-seq analysis. It was also found that knockdown of trf2 in larval and adult fat body leads to different LD phenotypes. This may be due to different lipid storage status or different LD size regulatory mechanisms between larval and adult stages (Fan, 2017).

The findings of this study add to the growing evidence supporting a specific role of general transcription factors in lipid homeostasis. For example, knockdown of RNA Pol II subunits such as RpII140 and RpII33 leads to small and dispersed LDs in Drosophila S2 cells. Mutation in DNA polymerase δ (POLD1) leads to lipodystrophy with a progressive loss of subcutaneous fat. Furthermore, TAF8 and TAF7L were reported to be involved in adipocyte differentiation. Moreover, previous studies showed that several subunits of the Mediator complex interact with specific transcription factors and play important roles in lipid metabolism. Taken together, these lines of evidence strongly support essential and specific roles of the core/basal transcriptional machinery components in lipid metabolism (Fan, 2017).

Using RNA-seq analysis, rescue experiments and ChIP-qPCR, this study identified several target genes regulated by TRF2 and TAF9. It is possible that other genes may regulate LD size but were missed in the RNA-seq analysis and RNAi screening assay because of either insufficient alterations in gene expression (lower than the twofold threshold) or low efficiency of RNAi. Among all the verified target genes of TRF2 and TAF9, CG10315, which strongly rescues the trf2G0071 mutant phenotype when overexpressed and encodes the eukaryotic translation initiation factor eIF2B-δ, may be a good candidate for further study. Although they are best known for their molecular functions in mRNA translation regulation, eIFs have been implicated in several other processes, including cancer and metabolism. For example, in yeast, eIF2B physically interacts with the VLCFA synthesis enzyme YBR159W. In adipocytes, eIF2α activity is correlated with the anti-lipolytic and adipogenesis inhibitory effects of the AMPK activator AICAR. In addition, given the evidence that some eIFs, such as eIF4G and eIF-4a, localize on LDs and that knockdown of some eIFs, including eIF-1A, eIF-2β, eIF3ga, eIF3-S8 and eIF3-S9, results in large LDs in Drosophila S2 cells, it is important to further explore the specific mechanisms of these eIFs in LD size regulation (Fan, 2017).

Although TRF2 exists widely in metazoans and shares sequence homology in its core domain with TBP, it recognizes sequence elements distinct from the TATA-box. A previous study has investigated TRF2- and TBP-bound promoters throughout the Drosophila genome in S2 cells and revealed that some sequence elements, such as DRE, are strongly associated with TRF2 occupancy while the TATA-box is strongly associated with TBP occupancy (Isogai, 2007). This study also identified that DRE is significantly enriched in extended promoters of the 181 target genes. The distribution of TATA-boxes in the core promoters of the 181 target genes compared with all genes was further explored, and it was found that the TATA-box is not enriched in the core promoters of TRF2 target genes. The proportion of TATA-box is 0.155 (75 of 484 isoforms) for the 181 target genes while the proportion is 0.217 (7849 of 36099 isoforms) for all genes as the background. These results suggest that TRF2 and TAF9 may regulate the expression of a subset of genes by recognizing specific sequence elements such as DRE but not the TATA-box (Fan, 2017).
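The two proportions quoted above can be reproduced directly from the isoform counts. The Python sketch below recomputes them and expresses the target-gene proportion relative to the genome-wide background; the variable names and the ratio are illustrative additions, and no statistical test from the original study is implied.

```python
# Isoform counts quoted in the text (Fan, 2017); names are illustrative.
target_with_tata, target_isoforms = 75, 484            # 181 TRF2/TAF9 target genes
background_with_tata, background_isoforms = 7849, 36099  # all genes

def proportion(hits, total):
    """Fraction of isoforms whose core promoter contains a TATA-box."""
    return hits / total

p_target = proportion(target_with_tata, target_isoforms)
p_background = proportion(background_with_tata, background_isoforms)

# Ratio below 1 means the TATA-box is not enriched among the targets.
relative_frequency = p_target / p_background

print(f"targets: {p_target:.3f}, background: {p_background:.3f}, "
      f"ratio: {relative_frequency:.2f}")
```

A ratio below 1 is consistent with the statement that the TATA-box is not enriched in the core promoters of the target genes.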

This study shows that expression of peroxisomal fatty acid β-oxidation pathway genes, including two acyl-CoA oxidase (ACOX) genes, CG4586 and CG9527, the β-ketoacyl-CoA thiolase gene CG9149, and the enoyl-CoA hydratase gene CG9577, is regulated by TRF2 and TAF9. Lipidomic analysis indicates that upon trf2 or taf9 RNAi in the fat body, many phospholipids, such as PA, PC, PG and PI, contain more long chain fatty acids. Furthermore, knockdown of CG4586 and CG9527 in the fat body causes similar changes.

These results coincide with the function of ACOX, which is implicated in the peroxisomal fatty acid β-oxidation pathway for catabolizing very long chain fatty acids and some long chain fatty acids. Similar to these findings, a previous study found that defective peroxisomal fatty acid β-oxidation resulted in enlarged LDs in C. elegans and blocked catabolism of LCFAs, such as vaccenic acid, which probably contributed to LD expansion in mutant worms. Since overexpressing CG4586 or CG9527 only marginally rescues the enlarged LD phenotype of trf2 mutants, it remains to be determined whether the increased level of long chain fatty acid-containing phospholipids contributes to LD size. Regarding the regulation of fatty acid chain length in phospholipids, a recent study reported that there was increased acyl chain length in phospholipids of lung squamous cell carcinoma accompanied by significant changes in the expression of fatty acid elongases (ELOVLs) compared to matched normal tissues. A functional screen followed by phospholipidomic analysis revealed that ELOVL6 is mainly responsible for phospholipid acyl chain elongation in cancer cells. The current findings provide new clues about the regulation of fatty acid chain length in phospholipids. ELOVL and the peroxisomal fatty acid β-oxidation pathway may represent two opposing regulators in determining fatty acid chain length in vivo (Fan, 2017).

Previous studies have shown that TRF2 is involved in specific biological processes including embryonic development, metamorphosis, germ cell differentiation and spermiogenesis. The current results reveal a novel function of TRF2 in the regulation of specialized transcriptional programs involved in LD size control and phospholipid fatty acid composition. Since TRF2 is conserved among metazoans, its role in the regulation of lipid metabolism may be of considerable relevance to various organisms including mammals. These findings may provide new insights into both the regulation of lipid metabolism and the physiological functions of TRF2 (Fan, 2017).

Assembly of SNAPc, Bdp1, and TBP on the U6 snRNA gene promoter in Drosophila melanogaster

U6 snRNA is transcribed by RNA polymerase III (Pol III) and has an external upstream promoter that consists of a TATA sequence recognized by the TBP subunit of the Pol III basal transcription factor IIIB, and a proximal sequence element (PSE) recognized by the small nuclear RNA activating protein complex (SNAPc). Previous work found that Drosophila melanogaster SNAPc (DmSNAPc) bound to the U6 PSE can recruit the Pol III general transcription factor Bdp1 to form a stable complex with the DNA. This study shows that DmSNAPc-Bdp1 can recruit TBP to the U6 promoter, and a region of Bdp1 was identified that is sufficient for TBP recruitment. Moreover, it was found that this same region of Bdp1 cross-links to nucleotides within the U6 PSE at positions that also cross-link to DmSNAPc. Finally, cross-linking mass spectrometry reveals likely interactions of specific DmSNAPc subunits with Bdp1 and TBP. These data, together with previous findings, have allowed the construction of a more comprehensive model of the DmSNAPc-Bdp1-TBP complex on the U6 promoter that includes nearly all of DmSNAPc, a portion of Bdp1, and the conserved region of TBP (Kim, 2020).

RNA polymerase III (Pol III) transcribes genes for tRNAs, 5S rRNA, and various small nuclear RNAs (snRNAs). Genes for the tRNAs and 5S rRNA have gene-internal promoters that usually are TATA-less. However, other genes, including U6 snRNA, 7SK RNA, tRNASec, H1, and MRP RNAs, have gene-external promoters that consist of two distinct elements, a TATA sequence and a proximal sequence element (PSE) centered about 30 and 55 bp, respectively, upstream of the transcription start site. The TATA sequence is recognized by the Pol III general transcription factor TFIIIB, and the PSE is recognized by the small nuclear RNA activating protein complex (SNAPc) (Kim, 2020).

TFIIIB contains three subunits, most often TBP, Brf1, and Bdp1. These three subunits form an architectural scaffold for Pol III recruitment and together coordinate conformational changes that lead to the formation of an open complex. Interestingly, depending upon the type of gene and/or the organism, TFIIIB can exhibit subunit heterogeneity. For example, in the fruit fly Drosophila melanogaster, the TFIIIB that assembles on Pol III genes that have internal promoters contains the TBP-related factor 1 (TRF1) in place of TBP (Verma, 2013). However, U6 and U6-type genes with external promoters utilize a TFIIIB that contains the canonical TBP rather than TRF1 (Verma, 2013). In another example, human Pol III-transcribed genes with internal promoters utilize a TFIIIB that contains canonical Brf1, whereas Pol III-transcribed snRNA genes require an alternative Brf known as Brf2 (Kim, 2020).

SNAPc is a multisubunit factor that binds to the PSE (termed the PSEA in fruit flies, the subject of this paper) to activate the transcription of snRNA genes. D. melanogaster SNAPc (DmSNAPc) consists of three subunits, DmSNAP190, DmSNAP50, and DmSNAP43, that are homologs of the three essential subunits of human SNAPc. Although all three DmSNAPc subunits are required for DNA-binding activity, little is understood of the specific roles that the individual fly or human SNAPc subunits play in the recruitment of TFIIIB and the transcriptional activation of snRNA genes (Kim, 2020).

Previously, by using site-specific protein-DNA photo-cross-linking assays, nucleotide positions were identified where each of the individual DmSNAPc subunits cross-linked as part of the complex to U6 snRNA gene promoter DNA. Likewise, interactions of the TFIIIB subunits (in the absence of DmSNAPc) with specific nucleotides in the U6 snRNA gene promoter were reported. Those studies revealed both the linear positions (translational location along the DNA helix) and rotational positions (face of the DNA double helix) occupied by each of the DmSNAPc and TFIIIB subunits on the DNA. Furthermore, by cleaving the DmSNAPc proteins at specific sites after photo-cross-linking, it was possible to identify domains or regions of DmSNAP190, DmSNAP50, and DmSNAP43 that cross-linked to specific nucleotides within or adjacent to the PSEA (Kim, 2020).

Finally, in more recent work, it was found that DmSNAPc can recruit Bdp1 to the U6 snRNA gene promoter in the absence of TBP and Brf1 (Verma, 2018). Furthermore, an 87-amino-acid region of Bdp1 was identified that was required for Bdp1 to be recruited to the U6 snRNA gene promoter by DmSNAPc. Over the years, this work has built an increasingly comprehensive picture of the architecture of the protein-DNA complex assembled on the U6 promoter (Kim, 2020).

Given the findings from that previous work, this study examined the recruitment of TBP to the U6 snRNA gene promoter by the DmSNAPc-Bdp1 complex. Furthermore, site-specific protein-DNA photo-cross-linking assays were applied to map the DmSNAPc, Bdp1, and TBP interactions with specific nucleotides of the U6 promoter. Finally, the architecture of both the DmSNAPc-Bdp1-U6 promoter complex and the DmSNAPc-Bdp1-TBP-U6 promoter complex was examined by applying cross-linking mass spectrometry (CXMS). The results of these studies allowed development of a more detailed model of the Pol III transcriptional machinery assembled on the U6 snRNA gene promoter that includes nearly all of DmSNAPc and parts of the TFIIIB components Bdp1 and TBP (Kim, 2020).

The canonical pathway for the assembly of the Pol III preinitiation complex (PIC) on tRNA genes involves the binding of TFIIIC to the gene-internal promoter followed by recruitment of TFIIIB (either preassembled or assembled in a stepwise process that involves the initial recruitment of Brf1 and TBP, followed by Bdp1 in a subsequent step) and finally RNA polymerase. (PIC assembly on 5S genes is believed to be similar but requires the prior binding of TFIIIA to aid in the recruitment of TFIIIC.) In contrast, the results raise the interesting possibility that PIC assembly on Pol III genes with external promoters in D. melanogaster proceeds by an alternate pathway that involves the following initial steps: first, DmSNAPc binds to the PSEA; second, DmSNAPc recruits Bdp1; and third, the promoter-bound DmSNAPc-Bdp1 complex and the TATA box recruit TBP. Brf1 and RNA polymerase may, in turn, assemble on the promoter at a subsequent step of PIC formation (Kim, 2020).

This study has proposed a model for the DmSNAPc-Bdp1-TBP complex on the U6 promoter that is consistent with EMSA, site-specific protein-DNA photo-cross-linking, and CXMS experiments. Furthermore, the DmSNAPc model is fully consistent with coimmunoprecipitation experiments that mapped regions of the three DmSNAPc subunits that are required for their assembly with each other. The model further provides a rationale for the recruitment of Bdp1 and TBP by DmSNAPc. Bdp1 cross-links to DNA nucleotide positions that extend upstream of the TATA box into positions that are actually a part of the PSEA. These positions are also occupied by DmSNAP190 and DmSNAP43 (but not DmSNAP50), indicating that Bdp1 must lie in close proximity to DmSNAP190 and DmSNAP43. Also supporting this model, the CXMS experiments revealed cross-linking of Bdp1 with DmSNAP190 and DmSNAP43 (but not with DmSNAP50) (Kim, 2020).

Furthermore, additional evidence was generated, beyond that previously published, that residues 424 to 510 of Bdp1 are involved in the recruitment of Bdp1 by DmSNAPc. For example, an internal deletion of residues 424 to 510 resulted in the complete loss of Bdp1 recruitment by DmSNAPc. Moreover, Bdp1 residues 424 to 510 alone exhibited the same pattern as full-length Bdp1 in site-specific protein-DNA photo-cross-linking, suggesting that this region of Bdp1 extended into the U6 PSEA, where it would reside in close proximity to DmSNAP190 and DmSNAP43. Finally, the CXMS data with full-length Bdp1 showed that Bdp1 residues 424 to 510, together with nearby residues flanking that region, were responsible for the majority of the protein-protein cross-links between DmSNAPc and Bdp1 (Kim, 2020).
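As a quick consistency check, the residue range given above (424 to 510) can be compared with the 87-amino-acid Bdp1 region mentioned earlier in this summary. The Python sketch below counts residues inclusively; the helper name is hypothetical.

```python
def residue_count(first, last):
    """Number of residues in an inclusive range, e.g. residues 424 to 510."""
    if last < first:
        raise ValueError("last residue must not precede first residue")
    return last - first + 1

# Bdp1 region quoted in the text (Kim, 2020).
bdp1_region_length = residue_count(424, 510)
print(f"Bdp1 residues 424-510 span {bdp1_region_length} amino acids")
```

The inclusive count agrees with the 87-amino-acid region described for the earlier recruitment experiments.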

In work by others, the N-terminal region of human Bdp1, more so than the C-terminal region, was found to interact with DmSNAPc. Interestingly, the CXMS studies revealed that lysines within the N-terminal region of Bdp1 (lysines 203, 206, and 231) cross-link to both DmSNAP190 and DmSNAP43. Thus, it is possible that a region of fly Bdp1 N-terminal of the SANT domain, as well as residues 424 to 510 C-terminal of the SANT domain, interact with DmSNAPc. Perhaps this potential N-terminal interaction of fly Bdp1 with DmSNAPc is not stable enough to be detected in the current EMSAs (Kim, 2020).

Interestingly, by EMSA, it has not been possible to convincingly demonstrate the existence of a complex that contains both DmSNAPc and Brf1 together with Bdp1 and TBP. Essentially, either DmSNAPc-Bdp1-TBP or Brf1-Bdp1-TBP, which lacks DmSNAPc, is seen. The modeling suggests a rationale for this result. The finding that a region of DmSNAP43 lies on or near the upper surface of TBP suggests that the binding of DmSNAPc and that of Brf1 are mutually exclusive. Yeast Brf1 was modeled into the proposed SNAPc-Bdp1-TBP complex in accordance with a published cryo-EM structure. Depending upon the exact positioning of DmSNAP43, it may sterically or otherwise interfere with the binding of Brf1 along the upper surface of TBP. If this is true, it would suggest some form of regulation to govern the transition from a DmSNAPc-Bdp1-TBP complex to a Brf1-Bdp1-TBP complex (Kim, 2020).

In light of this potential regulation, the finding cannot be ignored that the C-terminal region of fly DmSNAP190 appears to be structurally related to the ligand-binding domains of members of the nuclear hormone receptor superfamily. The location of this domain, near DmSNAP43 and the SANT domain of Bdp1, raises the intriguing possibility that the activity of D. melanogaster SNAPc and the expression of snRNA genes are regulated by an unknown small organic molecule of intracellular or extracellular origin. This could provide an interesting avenue of future research (Kim, 2020).

The work reported in this study furthermore suggests pathways toward U6 preinitiation complex assembly in flies and humans that are analogous but different with respect to the intermediary factor that acts as a stabilizing bridge between SNAPc and TBP. In flies, PSEA-bound DmSNAPc recruits Bdp1 in a TATA box-independent manner and TBP in a TATA-dependent manner, with Bdp1 acting to stabilize DmSNAPc and TBP on the PSEA and TATA box, respectively. In humans, factor assembly appears to occur analogously but involving Brf2 instead of Bdp1: PSE-bound SNAPc interacts with Brf2 independent of a TATA box, and this complex recruits TBP only in the presence of a TATA box. One obvious explanation for the difference is that flies do not have Brf2, so different mechanisms have evolved in flies and humans for TBP recruitment to U6 gene promoters (Kim, 2020).

In a broader sense, work on snRNA genes has extended the perspective on the diversity of the TFIIIB components that can be assembled into the Pol III PIC: TBP (for snRNA genes) versus TRF1 (for tRNA and 5S RNA genes) in flies and Brf2 (for snRNA genes) versus Brf1 (for tRNA and 5S RNA genes) in humans. The only constant TFIIIB component known so far is Bdp1. The snRNA work has also revealed different pathways for TFIIIB assembly, at least in vitro, on SNAPc-dependent genes versus TFIIIC-dependent genes. The former seem to proceed initially by SNAPc-dependent recruitment of Bdp1 or Brf2, followed by TBP recruitment, whereas the latter are thought to occur by TFIIIC-dependent recruitment of TFIIIB either as a preformed complex or proceeding first through Brf1 and TBP recruitment, followed by Bdp1 in a subsequent step (Kim, 2020).

TFIID Enables RNA Polymerase II Promoter-Proximal Pausing

RNA polymerase II (RNAPII) transcription is governed by the pre-initiation complex (PIC), which contains TFIIA, TFIIB, TFIID, TFIIE, TFIIF, TFIIH, RNAPII, and Mediator. After initiation, RNAPII enzymes pause after transcribing less than 100 bases; precisely how RNAPII pausing is enforced and regulated remains unclear. To address specific mechanistic questions, human RNAPII promoter-proximal pausing was reconstituted in vitro, entirely with purified factors (no extracts). As expected, NELF and DSIF increased pausing, and P-TEFb promoted pause release. Unexpectedly, the PIC alone was sufficient to reconstitute pausing, suggesting RNAPII pausing is an inherent PIC function. In agreement, pausing was lost upon replacement of the TFIID complex with TATA-binding protein (TBP), and PRO-seq experiments revealed widespread disruption of RNAPII pausing upon acute depletion (t = 60 min) of TFIID subunits in human or Drosophila cells. These results establish a TFIID requirement for RNAPII pausing and suggest pause regulatory factors may function directly or indirectly through TFIID (Fant, 2020).

RNA polymerase II (RNAPII) transcribes all protein-coding and many non-coding RNAs in the human genome. RNAPII transcription initiation occurs within the pre-initiation complex (PIC), which contains TFIIA, TFIIB, TFIID, TFIIE, TFIIF, TFIIH, RNAPII, and Mediator. After initiation, RNAPII enzymes typically pause after transcribing 20-80 bases, and paused polymerases represent a common regulatory intermediate. Accordingly, paused RNAPII has been implicated in enhancer function, development and homeostasis, and diseases ranging from cancer to viral pathogenesis. Precisely how RNAPII promoter-proximal pausing is enforced and regulated remains unclear; however, protein complexes, such as NELF and DSIF, increase pausing, whereas the activity of CDK9 (P-TEFb complex) correlates with pause release (Fant, 2020).

Although much has been learned about RNAPII promoter-proximal pausing and its regulation, the underlying molecular mechanisms remain enigmatic. One reason for this is the complexity of the human RNAPII transcription machinery, which includes the ∼4.0 MDa PIC and many additional regulatory factors. Another underlying reason is that much current understanding derives from cell-based assays, which are indispensable but cannot reliably address mechanistic questions. For instance, factor knockdowns or knockouts cause unintended secondary effects and the factors and biochemicals present at each gene in a population of cells cannot possibly be defined. In vitro assays can overcome such limitations, but these have typically involved nuclear extracts, which contain a similarly undefined mix of proteins, nucleic acids, and biochemicals. To circumvent these issues, this study sought to reconstitute RNAPII promoter-proximal pausing entirely from purified human factors (no extracts). Success with this task enabled addressing some basic mechanistic questions and opens the door for future studies to better define the contribution of specific factors in RNAPII promoter-proximal pause regulation (Fant, 2020).

Structural data indicate that TFIID lobe C subunits TAF1 (see Drosophila Taf250) and TAF2 bind promoter DNA downstream of the TSS (Louder, 2016; Patel, 2018). Past studies revealed that insertion of 10-bp DNA at the +15 site relative to the TSS disrupted RNAPII pausing at the HSP70 gene in Drosophila S2 cells (Kwak, 2013). This led to a 'complex interaction' model for pausing, in which a promoter-bound factor(s) establishes an interaction (directly or indirectly) with the paused RNAPII complex. In agreement with this model, a TFIID requirement was observed for RNAPII promoter-proximal pausing in vitro, which is further supported by PRO-seq data in TAF-depleted human and Drosophila S2 cells. Additional evidence for TFIID-dependent regulation of RNAPII pausing derives from correlations among paused genes and DNA sequence elements bound by TFIID. Defects in TFIID function are linked to numerous diseases, including cancer and neurodegenerative disorders. Its requirement for RNAPII promoter-proximal pause regulation may underlie these and other biological functions (Fant, 2020).

Biochemical reconstitution of RNAPII promoter-proximal pausing provides a level of mechanistic control that is simply not possible with cell-based assays; consequently, it was discovered that RNAPII pausing is an inherent property of the human PIC and that TFIID is a key PIC factor that establishes pausing. The results also reveal NELF, DSIF, and P-TEFb as auxiliary factors that, although not required for pausing, enable robust regulation of this common transcriptional intermediate state. Time course experiments indicated that polymerases in the paused region remained active and generated elongated transcripts over time. Experiments with P-TEFb showed enhanced release of paused intermediates, providing further evidence that polymerases in the paused region were active and competent for elongation. However, some transcripts remained in the pause region after the 10-min reactions, even with added P-TEFb. This result is also consistent with current models that invoke alternative outcomes for promoter-proximal paused RNAPII, including premature termination, arrest, or a more stable paused intermediate. Addressing the mechanisms and factors that regulate these distinct outcomes could be explored in future studies (Fant, 2020).

Despite its advantages, the reconstituted in vitro transcription assay does not match the complexity of regulatory inputs that converge upon active promoters in a living cell. To test the TFIID requirement for promoter-proximal pausing in cells, it was possible to rapidly deplete TFIID lobe C subunits TAF1 and TAF2 using Trim-Away, and genome-wide changes in nascent transcription were assessed with PRO-seq. Consistent with the in vitro data, global transcription increased at protein-coding genes upon TAF1/2 knockdown, with evidence for enhanced pause release. PRO-seq reads increased at 5' ends and downstream of promoter-proximal pause sites at thousands of genes in TAF1/2-depleted cells. These data are consistent with increased pause release and increased re-initiation, two processes that are coupled in metazoan cells. Unexpectedly, however, increased pause release did not yield similar genome-wide increases in gene body reads. Instead, the PRO-seq data revealed a sharp reduction in reads downstream of promoter-proximal pause sites, at around +300 from the TSS in both human and Drosophila cells. These results implicate additional regulatory mechanisms, downstream of the pause site, that may terminate or arrest RNAPII. Although future studies are needed to identify the factors involved, it is noted that the Integrator complex was recently shown to cleave nascent transcripts downstream of pause sites at hundreds of genes in Drosophila cells (Tatomer, 2019). Because promoter-proximal pausing helps ensure proper capping of transcripts at their 5' ends, downstream regulatory mechanisms may become important when RNAPII promoter-proximal pausing is disrupted (Fant, 2020).

A TFIID requirement for RNAPII promoter-proximal pausing implies that other pause regulatory factors may function directly or indirectly through TFIID. Although additional mechanistic aspects remain to be addressed, it is notable that pause regulatory factors, including P-TEFb and MYC, interact (directly or indirectly) with TFIID; moreover, TFIID is conformationally flexible and likely undergoes structural reorganization during RNAPII transcription initiation and pause release. Such structural transitions may contribute to TFIID-dependent regulation of RNAPII pausing. Whereas nucleosomes likely affect promoter-proximal pausing, they are not required, based upon our results and data in Drosophila and mammalian systems. TFIID possesses multiple domains that bind chromatin marks associated with transcriptionally active loci, including H3K4me3, which suggests TFIID function is regulated in part through epigenetic mechanisms. Future studies should help establish whether specific chromatin marks contribute to TFIID-dependent regulation of RNAPII pausing, potentially by affecting TFIID promoter occupancy or by impacting TFIID structure and function (Fant, 2020).

The Integrator complex cleaves nascent mRNAs to attenuate transcription

Cellular homeostasis requires transcriptional outputs to be coordinated, and many events post-transcription initiation can dictate the levels and functions of mature transcripts. To systematically identify regulators of inducible gene expression, high-throughput RNAi screening of the Drosophila Metallothionein A (MtnA) promoter was performed. This revealed that the Integrator complex, which has a well-established role in 3' end processing of small nuclear RNAs (snRNAs), attenuates MtnA transcription during copper stress. Integrator is an evolutionarily conserved complex that contains 14 subunits and regulates RNA processing and gene transcription by associating with the C-terminal domain of RNA polymerase II large subunit. Integrator complex subunit 11 (IntS11) endonucleolytically cleaves MtnA transcripts, resulting in premature transcription termination and degradation of the nascent RNAs by the RNA exosome, a complex also identified in the screen. Using RNA-seq, >400 additional Drosophila protein-coding genes were identified whose expression increases upon Integrator depletion. This study focused on a subset of these genes and confirmed that Integrator is bound to their 5' ends and negatively regulates their transcription via IntS11 endonuclease activity. Many noncatalytic Integrator subunits, which are largely dispensable for snRNA processing, also have regulatory roles at these protein-coding genes, possibly by controlling Integrator recruitment or RNA polymerase II dynamics. Altogether, these results suggest that attenuation via Integrator cleavage limits production of many full-length mRNAs, allowing precise control of transcription outputs (Tatomer, 2019).

In response to physiological cues, environmental stress, or exposure to pathogens, specific transcriptional programs are induced. These responses are often coordinated, rapid, and robust, in part because many metazoan genes are maintained in a poised state with RNA polymerase II (RNAPII) engaged prior to induction. In addition to promoter-proximal pausing, there are many regulatory steps post transcription initiation that dictate the characteristics and fate of mature transcripts. For example, alternative splicing and/or 3' end processing events can lead to the production of multiple isoforms from a single locus, and these transcripts can have distinct stabilities, translation potential, or subcellular localization (Tatomer, 2019).

It is particularly important that genes produce full-length functional mRNAs, and mechanisms such as telescripting, involving U1 snRNP, actively suppress premature cleavage and polyadenylation events in eukaryotic cells. Nevertheless, many promoters are known to generate short unstable RNAs. This suggests that premature transcription termination may often occur, thereby limiting RNAPII elongation and production of full-length mRNAs (for review, see Kamieniarz-Gdula and Proudfoot 2019). Moreover, this process can be regulated. For example, it was recently shown that the cleavage and polyadenylation factor PCF11 stimulates premature termination to attenuate the expression of many transcriptional regulators in human cells (Kamieniarz-Gdula, 2019). Potentially deleterious truncated transcripts generated by premature termination are often removed from cells by RNA surveillance mechanisms, including by the RNA exosome. However, the full repertoire of cellular factors and cofactors that control the metabolic fate of nascent RNAs, especially during the early stages of transcription elongation, is still unknown (Tatomer, 2019).

An unbiased genome-scale RNAi screen was performed in Drosophila cells to reveal factors that control the output of a model inducible eukaryotic promoter. Transcription of Drosophila Metallothionein A (MtnA), which encodes a metal chelator, is rapidly induced when the intracellular concentration of heavy metals (e.g., copper or cadmium) is increased. This increase in transcriptional output is dependent on the MTF-1 transcription factor, which relocalizes to the nucleus upon metal stress and binds to the MtnA promoter. The RNAi screen identified MTF-1 and other known regulators of MtnA transcription, but also surprisingly identified the Integrator complex as a potent inhibitor of MtnA during copper stress. Integrator harbors an endonuclease that cleaves snRNAs and enhancer RNAs, and this study has found that Integrator can likewise cleave nascent MtnA transcripts to limit mRNA production. Using RNA-seq, hundreds of additional Drosophila protein-coding genes were found whose expression increases upon Integrator depletion. Focused studies on a subset of these genes confirmed that Integrator can cleave these nascent RNAs, thereby limiting productive transcription elongation. Altogether, it is proposed that Integrator-catalyzed premature termination can function as a widespread and potent mechanism to attenuate expression of protein-coding genes (Tatomer, 2019).

Altogether, the data indicate that the Integrator complex can attenuate the expression of protein-coding genes by catalyzing premature transcription termination. The IntS11 endonuclease cleaves a subset of nascent mRNAs, which ultimately triggers degradation of the transcripts by the RNA exosome along with RNAPII termination. It is suggested that many protein-coding genes are negatively regulated via this attenuation mechanism, and the Drosophila MtnA promoter highlights context-specific regulation by Integrator. Transcription of MtnA is induced by copper or cadmium stress, and yet this study finds that Integrator is robustly recruited to the MtnA promoter only under copper stress conditions. This is not because the Integrator complex is generally disassembled or 'poisoned' by cadmium, as Integrator continues to regulate the outputs of other protein-coding genes. It is instead proposed that context-specific regulation of this locus may be related to the fact that cadmium is a strictly toxic metal, while copper is required for the function of a subset of enzymes and must be maintained in a narrow concentration range. Therefore, homeostatic control of MtnA is required to maintain copper levels, while cells need to maximally produce MtnA in the presence of cadmium. It is thus proposed that regulation of MtnA levels by Integrator during copper stress is for fine-tuning purposes, perhaps to limit maximal transcriptional induction and/or facilitate transcriptional shut-off once copper stress has passed. The results suggest that the Integrator complex can be recruited to gene loci only when needed, thereby ensuring tight control over transcriptional output (Tatomer, 2019).

In addition to cleaving MtnA transcripts, Integrator cleaves multiple other RNA classes in metazoan cells, including enhancer RNAs (Lai, 2015), snRNAs (Baillat, 2005), telomerase RNA (Rubtsova, 2019), and some herpesvirus microRNA precursors (Cazalla, 2011; Xie, 2015). Using RNA-seq, this study has expanded this list of Integrator target loci and identified hundreds of additional protein-coding genes that are negatively regulated by Integrator. Focus was placed on a set of Integrator-dependent genes; Integrator was found to catalyze premature transcription termination of these genes, consistent with prior studies that suggested roles for Integrator in termination (Skaar, 2015; Shah, 2018; Gomez-Orte, 2019). Some of these genes (CG8620, Pepck1, and Sirup) have promoter-proximal RNAPII that rapidly turns over, which may indicate that Integrator can aid in clearing paused or stalled RNAPII. Once Integrator has cleaved the nascent mRNAs, this study finds that they are rapidly degraded from their 3' ends by the RNA exosome. This may be critical for enabling subsequent rounds of transcription (especially at the MtnA locus), perhaps because the small RNAs can form stable RNA-DNA hybrids (R-loops) that block transcription initiation or elongation (Tatomer, 2019).

Endonucleolytic cleavage is critical for Integrator regulation at snRNA and protein-coding genes, but the data indicate that these loci have different dependencies on Integrator subunits. Genetic studies indicate that Integrator subunits 4, 9, and 11 (which form the Integrator cleavage module) are most important for snRNA processing, while the non-catalytic Integrator subunits (all of which currently lack annotated molecular functions) play minor roles. In contrast, large increases in mRNA expression were observed when many of the non-catalytic subunits were depleted (especially IntS1, IntS2, IntS5, IntS6, IntS7, and IntS8). IntS13 was recently shown to be able to function independently from other Integrator subunits at enhancers (Barbieri, 2018), suggesting the existence of submodules or 'specialized' complexes that may enable the activity and function of Integrator to be distinctly regulated depending on the gene locus and cellular state. Future work will reveal the subunit requirements of Integrator complexes at distinct loci and clarify the interplay between IntS11 endonuclease activity and other Integrator subunits. For example, the non-catalytic subunits may be critical for the formation and targeting of the complex to specific loci and/or controlling RNAPII dynamics (Tatomer, 2019).

Finally, it is noted that the metazoan Integrator complex has parallels with the yeast Nrd1-Nab3-Sen1 (NNS) complex that (1) terminates transcription at both mRNA and snRNA loci and (2) interacts with the RNA exosome. Interestingly, the underlying molecular mechanisms of transcription termination carried out by these two complexes are quite distinct. NNS uses the Sen1 helicase to pull the nascent transcript out of the RNAPII active site, while Integrator likely promotes termination by taking advantage of its RNA endonuclease activity and providing an entry site for a 5'-3' exonuclease. There is currently conflicting data on whether the canonical 'torpedo' exonuclease Rat1/Xrn2 is involved in termination at snRNA genes as only subtle termination defects have been observed at these loci when Rat1/Xrn2 is depleted from cells. Notably, Cpsf73 has been shown to behave as both an endonuclease and exonuclease, raising the possibility that IntS11 could support a 'Rat1/Xrn2-like' function and mediate termination. Future studies that compare and contrast the Integrator and NNS complexes, especially how their recruitment and termination activities are controlled, will shed light on this important facet of gene regulation. In summary, transcription attenuation through premature termination was first described decades ago in bacteria, and the current work indicates that the metazoan Integrator complex can function analogously to limit expression from protein-coding genes (Tatomer, 2019).

Mediator and RNA polymerase II clusters associate in transcription-dependent condensates

Models of gene control have emerged from genetic and biochemical studies, with limited consideration of the spatial organization and dynamics of key components in living cells. This study used live-cell superresolution and light-sheet imaging to study the organization and dynamics of the Mediator coactivator and RNA polymerase II (Pol II) directly. Mediator and Pol II each form small transient and large stable clusters in living embryonic stem cells. Mediator and Pol II are colocalized in the stable clusters, which associate with chromatin, have properties of phase-separated condensates, and are sensitive to transcriptional inhibitors. It is suggested that large clusters of Mediator, recruited by transcription factors at large or clustered enhancer elements, interact with large Pol II clusters in transcriptional condensates in vivo (Cho, 2018).

A conventional view of eukaryotic gene regulation is that transcription factors, bound to enhancer DNA elements, recruit coactivators such as the Mediator complex, which is thought to interact with RNA polymerase II (Pol II) at the promoter. This model is supported by a large body of molecular genetic and biochemical evidence, yet the direct interaction of Mediator and Pol II has not been observed and characterized in living cells. Using superresolution and light-sheet imaging, the organization and dynamics of endogenous Mediator and Pol II in live mouse embryonic stem cells (mESCs) were studied. Whether Pol II and Mediator interact in a manner consistent with condensate formation was directly tested, their biophysical properties were quantitatively characterized, and the implications of these observations for transcription regulation in living mammalian cells were considered (Cho, 2018).

To visualize Mediator and Pol II in live cells, mouse embryonic stem cell lines were generated with endogenous Mediator and Pol II labeled with Dendra2, a green-to-red photoconvertible fluorescent protein. Live-cell superresolution imaging was performed and Mediator was found to form clusters with a range of dynamic temporal signatures. Mediator exists in a population of transient small (~100 nm) clusters with an average lifetime of 11.1 ± 0.9 s, comparable to that of transient Pol II clusters observed in this study and previously in differentiated cell types. In addition, it was observed that both Mediator and Pol II form a population of large (>300 nm) clusters (~14 per cell), each comprising ~200 to 400 molecules, that are temporally stable (lasting the full acquisition window of the live-cell superresolution imaging) (Cho, 2018).

The extent to which these clusters depend on the stem cell state was tested. The mESCs were subjected to a protocol to differentiate them into epiblastlike cells (EpiLCs) within 24 h. Differentiation had no apparent effect on the population of transient clusters, consistent with previous observations that transient clusters persist in differentiated cell types. However, both the size and the number of stable clusters decreased along the course of differentiation, suggesting that these stable clusters are prone to change as cells differentiate (Cho, 2018).

Focus was placed on the stable clusters of Mediator and Pol II, and whether they colocalize was investigated. mESCs were generated with endogenous Mediator and Pol II tagged with JF646-HaloTag and Dendra2, respectively. Direct imaging of both JF646-Mediator and Dendra2-Pol II showed bright spots of large accumulations in the nucleus, which corresponded to stable Pol II clusters according to subsequent superresolution imaging of Dendra2-Pol II in the same nuclei. The same observations were made with Dendra2-Mediator. Of 143 Mediator clusters imaged by dual-color light-sheet imaging, 129 (90%) had a colocalizing Pol II cluster. It was concluded that these Mediator and Pol II clusters colocalize in live mESCs (Cho, 2018).

Previous studies have shown that high densities of Mediator are located at enhancer clusters called super-enhancers (SEs) and that some are disrupted by loss of the BET (bromodomain and extraterminal family) protein BRD4 (Drosophila homolog: fs(1)h), which is a cofactor associated with Mediator. This study found that treatment of mESCs with JQ1, a drug that causes loss of BRD4 from enhancer chromatin, dissolved both transient and stable clusters of Mediator and Pol II (Cho, 2018).

After transcription initiation, Pol II transcribes a short distance (~100 base pairs), pauses, and is released to continue elongation when phosphorylated by CDK9. It was hypothesized that inhibition of CDK9 might selectively affect the Pol II stable clusters. It was observed that upon incubation with DRB (5,6-dichloro-1-beta-d-ribofuranosyl-benzimidazole), Pol II stable clusters dissolved but Mediator stable clusters remained. Quantification of Mediator-Pol II colocalization revealed that incubation with DRB progressively decreased the fraction of Mediator stable clusters that colocalized with Pol II. This effect could be reversed when DRB was washed out; the colocalization fraction recovered completely. These results imply that the association between Mediator and Pol II clusters may be hierarchical, with upstream enhancer recruitment controlling both clusters but downstream transcription inhibition selectively affecting Pol II clusters (Cho, 2018).

The long-term dynamics of stable clusters were characterized by using lattice light-sheet imaging in live mESCs. It was observed that clusters can merge upon contact. The time scale of coalescence was very rapid, comparable to the full volumetric acquisition frame rate (15-s time interval). The added-up intensity of the two precursor clusters was close to that of the newly merged cluster. These biophysical dynamics are reminiscent of those of biomolecular condensates in vivo (Cho, 2018).

In addition to coalescence, in vivo condensates had rapid turnover of the molecular components, as shown by fast recovery in fluorescence recovery after photobleaching (FRAP) assays, and were sensitive to a nonspecific aliphatic alcohol, 1,6-hexanediol. FRAP analyses of clusters revealed very rapid dynamics and turnover of their components: 60% of the Mediator and 90% of Pol II components were exchanged within ~10 s within clusters. Moreover, the treatment of mESCs with 1,6-hexanediol resulted in the gradual dissolution of both Mediator and Pol II clusters. Together, these results suggest that the stable clusters are in vivo condensates of Mediator and Pol II (Cho, 2018).

It was hypothesized that a phase separation model with induced condensation at the recruitment step of Mediator to enhancers would qualitatively account for the observations in this study. The model implies that the condensates are chromatin associated and colocalize with enhancer-controlled active genes. Therefore these two specific implications were tested. The diffusion dynamics of Mediator clusters were tracked by computing their mean squared displacement as a function of time (n = 6 cells). On short time scales, the cluster motion was subdiffusive, with an exponent α = 0.40 ± 0.12. This is the same exponent found in the subdiffusional behavior of chromatin loci in eukaryotic cells. The same diffusional parameters were also observed when tracking a chromatin locus labeled by dCas9-based chimeric array of guide RNA oligonucleotides (CARGO) in mESCs. It is concluded that clusters diffuse like chromatin-associated domains (Cho, 2018).
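The mean squared displacement analysis described above can be sketched as follows. This is a minimal illustration, not the pipeline used by Cho (2018): the function names, the 2D track format, and the unweighted log-log least-squares fit are all assumptions made here. An exponent near 1 indicates ordinary diffusion, a value below 1 indicates subdiffusion (such as the reported α = 0.40), and a value of 2 indicates directed motion.

```python
import math


def mean_squared_displacement(track, max_lag):
    """Time-averaged MSD of one 2D track for lags 1..max_lag (in frames).

    `track` is a list of (x, y) positions sampled at a fixed frame interval
    (hypothetical input format; len(track) must exceed max_lag).
    """
    msd = []
    for lag in range(1, max_lag + 1):
        squared = [
            (track[i + lag][0] - track[i][0]) ** 2
            + (track[i + lag][1] - track[i][1]) ** 2
            for i in range(len(track) - lag)
        ]
        msd.append(sum(squared) / len(squared))
    return msd


def anomalous_exponent(msd, frame_interval=1.0):
    """Slope of log(MSD) versus log(lag time), i.e. alpha in MSD ~ t**alpha.

    Uses an unweighted least-squares fit; all MSD values must be positive.
    """
    xs = [math.log((k + 1) * frame_interval) for k in range(len(msd))]
    ys = [math.log(value) for value in msd]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Synthetic sanity check: constant-velocity motion has MSD = lag**2, so alpha = 2.
ballistic_track = [(float(t), 0.0) for t in range(50)]
alpha_ballistic = anomalous_exponent(mean_squared_displacement(ballistic_track, 10))
```

In practice the fit would be restricted to short lag times, as in the text, because time-averaged MSD estimates become noisy at lags approaching the track length.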

It was hypothesized that clusters were in close physical proximity to actively transcribed genes that can be visualized by global run-on nascent RNA labeling with ethynyl uridine (EU). The run-on results showed that 2 min after DRB washout, virtually all Mediator clusters observed were proximal or overlapping with nascent RNA accumulations, as imaged by Click labeling of EU in fixed cells. The MS2 endogenous RNA labeling system was employed to investigate whether active transcription could be observed at Esrrb, one of the top SE-controlled genes in mESCs. Bright foci were observed consistent with nascent MS2-labeled gene loci, and the gene loci were confirmed by dual-color RNA fluorescence in situ hybridization (FISH) targeting the MS2 sequence and intronic regions of Esrrb. Intronic FISH on 125 Esrrb loci from 82 fixed cells showed that 93% of Esrrb loci had a stable Mediator cluster nearby (within 1 µm) but only ~22% of the loci colocalized with a stable Mediator cluster, suggesting that the Mediator-bound enhancer only occasionally colocalizes with the gene. The variability in colocalization may be explained by a dynamic 'kissing' model, where a distal Mediator cluster colocalizes with the gene only at certain time points (Cho, 2018).

By dual-color three-dimensional (3D) live-cell imaging with lattice light-sheet microscopy, it was found that some Mediator clusters were up to a micrometer away from the active Esrrb gene locus but in some instances directly colocalized with the gene. In addition, the dynamic interaction between Mediator clusters and the gene locus was directly observed, supporting the dynamic kissing model. Tracking of loci in all six cells indicated that colocalization below the resolution limit of 300 nm occurred at ~30% of the time points. However, even when they were not overlapping, the Mediator cluster and the gene loci moved as a pair through the nucleus, consistent with two adjacent regions anchoring to the same underlying chromatin domain. It is proposed that Mediator clusters form at the Esrrb SE and then interact occasionally and transiently with the transcription apparatus at the Esrrb promoter (Cho, 2018).
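The time-point colocalization measurement described above can be illustrated with a short sketch. The function name, the track format (one 3D position in nanometers per time point), and the toy coordinates are assumptions made for illustration only; the 300 nm default simply mirrors the resolution limit quoted in the text and is not taken from the authors' analysis code.

```python
import math


def colocalized_fraction(cluster_track, locus_track, threshold_nm=300.0):
    """Fraction of time points at which two tracked objects are colocalized.

    Each track is a list of (x, y, z) positions in nanometers, one per time
    point. Two objects count as colocalized when their 3D distance is below
    `threshold_nm`.
    """
    if not cluster_track or len(cluster_track) != len(locus_track):
        raise ValueError("tracks must be non-empty and of equal length")
    hits = sum(
        1
        for cluster_pos, locus_pos in zip(cluster_track, locus_track)
        if math.dist(cluster_pos, locus_pos) < threshold_nm
    )
    return hits / len(cluster_track)


# Toy example (made-up coordinates): the locus is within 300 nm of the
# cluster at two of four time points.
toy_cluster = [(0.0, 0.0, 0.0)] * 4
toy_locus = [(100.0, 0.0, 0.0), (0.0, 250.0, 0.0), (0.0, 0.0, 400.0), (900.0, 0.0, 0.0)]
toy_fraction = colocalized_fraction(toy_cluster, toy_locus)
```

Because the threshold sits at the resolution limit, a fraction computed this way reports apparent overlap rather than molecular contact.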

This study has found that Mediator and Pol II form large stable clusters in living cells and has shown that these clusters have properties expected for biomolecular condensates. The condensate properties were evident through coalescence, rapid recovery in FRAP analysis, and sensitivity to hexanediol. In a model of phase separation on the basis of scaffold-client relationships, it is possible that enhancer-associated Mediator forms a condensate and provides a 'scaffold' for 'client' RNA Pol II molecules. The proposed model, in which large Mediator clusters at enhancers transiently kiss the transcription apparatus at promoters, has a number of implications for gene control mechanisms. The presence of large Mediator clusters at some enhancers may allow Mediator condensates to contact the transcription apparatus at multiple gene promoters simultaneously. The large size of the Mediator clusters may also mean that the effective distance between enhancer and promoter DNA elements can be of the same order as the size of the clusters (>300 nm), larger than the distance requirement for direct contact. It is speculated that such clusters may help explain gaps of hundreds of nanometers that are found in previous studies measuring distances between functional enhancer-promoter DNA elements. Such cluster sizes also imply that some long-range interactions could go undetected in DNA interaction assays that depend on much closer physical proximity of enhancer and promoter DNA elements (Cho, 2018).

Transcription factors activate genes through the phase-separation capacity of their activation domains

Gene expression is controlled by transcription factors (TFs) that consist of DNA-binding domains (DBDs) and activation domains (ADs). The DBDs have been well characterized, but little is known about the mechanisms by which ADs effect gene activation. This study, carried out in murine embryonic stem cells, reports that diverse ADs form phase-separated condensates with the Mediator coactivator. For the OCT4 and GCN4 TFs, this study shows that the ability to form phase-separated droplets with Mediator in vitro and the ability to activate genes in vivo are dependent on the same amino acid residues. For the estrogen receptor (ER), a ligand-dependent activator, it was shown that estrogen enhances phase separation with Mediator, again linking phase separation with gene activation. These results suggest that diverse TFs can interact with Mediator through the phase-separating capacity of their ADs and that formation of condensates with Mediator is involved in gene activation (Boija, 2018).

Regulation of gene expression requires that the transcription apparatus be efficiently assembled at specific genomic sites. DNA-binding transcription factors (TFs) ensure this specificity by occupying specific DNA sequences at enhancers and promoter-proximal elements. TFs typically consist of one or more DNA-binding domains (DBDs) and one or more separate activation domains (ADs). While the structure and function of TF DBDs are well documented, comparatively little is understood about the structure of ADs and how these interact with coactivators to drive gene expression (Boija, 2018).

The structure of TF DBDs and their interaction with cognate DNA sequences has been described at atomic resolution for many TFs, and TFs are generally classified according to the structural features of their DBDs. For example, DBDs can be composed of zinc-coordinating, basic helix-loop-helix, basic-leucine zipper, or helix-turn-helix DNA-binding structures. These DBDs selectively bind specific DNA sequences that range from 4 to 12 bp, and the DNA binding sequences favored by hundreds of TFs have been described. Multiple TF molecules typically bind together at any one enhancer or promoter-proximal element. For example, at least eight different TF molecules bind a 50-bp core component of the interferon (IFN)-β enhancer (Boija, 2018).

Anchored in place by the DBD, the AD interacts with coactivators, which integrate signals from multiple TFs to regulate transcriptional output. In contrast to the structured DBD, the ADs of most TFs are low-complexity amino acid sequences not amenable to crystallography. These intrinsically disordered regions (IDRs) have therefore been classified by their amino acid profile as acidic, proline, serine/threonine, or glutamine rich or by their hypothetical shape as acid blobs, negative noodles, or peptide lassos. Remarkably, hundreds of TFs are thought to interact with the same small set of coactivator complexes, which include Mediator and p300. ADs that share little sequence homology are functionally interchangeable among TFs; this interchangeability is not readily explained by traditional lock-and-key models of protein-protein interaction. Thus, how the diverse ADs of hundreds of different TFs interact with a similar small set of coactivators remains a conundrum. Recent studies have shown that the AD of the yeast TF GCN4 binds to the Mediator subunit MED15 at multiple sites and in multiple orientations and conformations. The products of this type of protein-protein interaction, where the interaction interface cannot be described by a single conformation, have been termed 'fuzzy complexes'. These dynamic interactions are also typical of the IDR-IDR interactions that facilitate formation of phase-separated biomolecular condensates (Boija, 2018).

It has recently been proposed that transcriptional control may be driven by the formation of phase-separated condensates, and it was demonstrated that the coactivator proteins MED1 and BRD4 form phase-separated condensates at super-enhancers (SEs). This study reports that diverse TF ADs phase separate with the Mediator coactivator. The embryonic stem cell (ESC) pluripotency TF OCT4, the estrogen receptor (ER), and the yeast TF GCN4 form phase-separated condensates with Mediator and require the same amino acids or ligands for both activation and phase separation. It is proposed that IDR-mediated phase separation with coactivators is a mechanism by which TF ADs activate genes (Boija, 2018).

The results described in this study support a model whereby TFs interact with Mediator and activate genes by the capacity of their ADs to form phase-separated condensates with this coactivator. For both the mammalian ESC pluripotency TF OCT4 and the yeast TF GCN4, it was found that the AD amino acids required for phase separation with Mediator condensates were also required for gene activation in vivo. For ER, it was found that estrogen stimulates the formation of phase-separated ER-MED1 droplets. ADs and coactivators generally consist of low-complexity amino acid sequences that have been classified as IDRs, and IDR-IDR interactions have been implicated in facilitating the formation of phase-separated condensates. It is proposed that IDR-mediated phase separation with Mediator is a general mechanism by which TF ADs effect gene expression, and evidence is provided that this occurs in vivo at SEs. It is suggested that the ability to phase separate with Mediator, which would employ the high valency and low affinity characteristic of liquid-liquid phase-separated condensates, operates alongside an ability of some TFs to form high-affinity interactions with Mediator (Boija, 2018).

The model that TF ADs function by forming phase-separated condensates with coactivators explains several observations that are difficult to reconcile with classical lock-and-key models of protein-protein interaction. The mammalian genome encodes many hundreds of TFs with diverse ADs that must interact with a small number of coactivators, and ADs that share little sequence homology are functionally interchangeable among TFs. The common feature of ADs, the possession of low-complexity IDRs, is also a feature that is pronounced in coactivators. The model of coactivator interaction and gene activation by phase-separated condensate formation thus more readily explains how many hundreds of mammalian TFs interact with these coactivators (Boija, 2018).

Previous studies have provided important insights that prompted an investigation of the possibility that TF ADs function by forming phase-separated condensates. TF ADs have been classified by their amino acid profile as acidic, proline rich, serine/threonine rich, glutamine rich, or by their hypothetical shape as acid blobs, negative noodles, or peptide lassos. Many of these features have been described for IDRs that are capable of forming phase-separated condensates. Evidence that the GCN4 AD interacts with MED15 in multiple orientations and conformations to form a 'fuzzy complex' is consistent with the notion of dynamic low-affinity interactions characteristic of phase-separated condensates. Likewise, the low complexity domains of the FET (FUS/EWS/TAF15) RNA-binding proteins can form phase-separated hydrogels and interact with the RNA polymerase II C-terminal domain (CTD) in a CTD phosphorylation-dependent manner; this may explain the mechanism by which RNA polymerase II is recruited to active genes in its unphosphorylated state and released for elongation following phosphorylation of the CTD (Boija, 2018).

The model described in this study for TF AD function may explain the function of a class of heretofore poorly understood fusion oncoproteins. Many malignancies bear fusion-protein translocations involving portions of TFs. These abnormal gene products often fuse a DNA or chromatin-binding domain to a wide array of partners, many of which are IDRs. For example, MLL may be fused to 80 different partner genes in AML, the EWS-FLI rearrangement in Ewing's sarcoma causes malignant transformation by recruitment of a disordered domain to oncogenes, and the disordered phase-separating protein FUS is found fused to a DBD in certain sarcomas. Phase separation provides a mechanism by which such gene products result in aberrant gene expression programs; by recruiting a disordered protein to the chromatin, diverse coactivators may form phase-separated condensates to drive oncogene expression. Understanding the interactions that compose these aberrant transcriptional condensates, their structures, and behaviors may open new therapeutic avenues (Boija, 2018).

Nucleosome Positioning around Transcription Start Site Correlates with Gene Expression Only for Active Chromatin State in Drosophila Interphase Chromosomes

This study analyzed the whole-genome experimental maps of nucleosomes in Drosophila melanogaster and classified genes by their expression level in S2 cells (RPKM value, reads per kilobase of transcript per million mapped reads) as well as by the number of tissues in which a gene was expressed (breadth of expression, BoE). Chromatin in the 5'-regions of genes was classified into four states according to a hidden Markov model (4HMM). Only the Aquamarine chromatin state was considered Active, while the remaining three states were defined as Non-Active. Surprisingly, about 20% of genes with 5'-regions mapped to Active chromatin possessed minimal RPKM and BoE, while about 40% of genes with 5'-regions mapped to Non-Active chromatin possessed at least modest RPKM and BoE. Regardless of RPKM/BoE, the genes of Active chromatin possessed a regular nucleosome arrangement in their 5'-regions, while genes of Non-Active chromatin did not show such specificity. Only for genes of Active chromatin does RPKM/BoE correlate positively with the number of nucleosome sites upstream of or around the TSS and negatively with the number downstream of the TSS. It is proposed that for genes of Active chromatin, regardless of RPKM value and BoE, the nucleosome arrangement in 5'-regions potentiates transcription, while for genes of Non-Active chromatin, the transcription machinery does not require substantial support from nucleosome arrangement to influence gene expression (Levitsky, 2020).
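For readers unfamiliar with the RPKM measure used above, the standard definition normalizes a gene's read count by transcript length (in kilobases) and by sequencing depth (in millions of mapped reads). The sketch below shows that arithmetic only; the function name and the example numbers are hypothetical and are not taken from Levitsky (2020).

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads.

    RPKM = reads * 1e9 / (length in bp * total mapped reads),
    which is reads / (length in kb) / (total reads in millions).
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)

# Hypothetical example: 500 reads on a 2,000 bp gene in a library
# of 10 million mapped reads.
example = rpkm(500, 2000, 10_000_000)
print(example)  # 25.0
```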

Quantitative imaging of transcription in living Drosophila embryos reveals the impact of core promoter motifs on promoter state dynamics

Genes are expressed in stochastic transcriptional bursts linked to alternating active and inactive promoter states. A major challenge in transcription is understanding how promoter composition dictates bursting, particularly in multicellular organisms. This study investigated two key Drosophila developmental promoter motifs, the TATA box (TATA) and the Initiator (INR). Using live imaging in Drosophila embryos and new computational methods, it was demonstrated that bursting occurs on multiple timescales ranging from seconds to minutes. TATA-containing promoters and INR-containing promoters exhibit distinct dynamics, with one or two separate rate-limiting steps respectively. A TATA box is associated with long active states, high rates of polymerase initiation, and short-lived, infrequent inactive states. In contrast, the INR motif leads to two inactive states, one of which relates to promoter-proximal polymerase pausing. Surprisingly, the model suggests pausing is not obligatory, but occurs stochastically for a subset of polymerases. Overall, these results provide a rationale for promoter switching during zygotic genome activation (Pimmett, 2021).
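The idea of a promoter alternating between inactive and active states can be illustrated with a minimal two-state ("telegraph") stochastic simulation. This is a generic textbook sketch under stated assumptions, not the computational method of Pimmett (2021): the rate names k_on, k_off and k_init and their values are hypothetical, and the model has a single inactive state, so it is closest to the one-rate-limiting-step behaviour described for TATA-containing promoters. Representing the two inactive states reported for INR-containing promoters would require adding a third state.

```python
import random

def simulate_two_state_promoter(k_on, k_off, k_init, t_end, seed=0):
    """Gillespie simulation of a promoter switching OFF <-> ON.

    k_on   : rate of OFF -> ON switching (per unit time)
    k_off  : rate of ON -> OFF switching
    k_init : rate of polymerase initiation while ON
    Returns (fraction of time spent ON, list of initiation times).
    All rates are hypothetical illustration parameters.
    """
    if min(k_on, k_off, k_init) <= 0 or t_end <= 0:
        raise ValueError("rates and t_end must be positive")
    rng = random.Random(seed)
    t, active, time_active, initiations = 0.0, False, 0.0, []
    while True:
        # Total rate of leaving the current configuration.
        total = (k_off + k_init) if active else k_on
        dt = rng.expovariate(total)
        if t + dt >= t_end:
            # Simulation window ends before the next event.
            if active:
                time_active += t_end - t
            break
        if active:
            time_active += dt
        t += dt
        if active:
            # Choose between an initiation event and switching OFF.
            if rng.random() < k_init / total:
                initiations.append(t)
            else:
                active = False
        else:
            active = True
    return time_active / t_end, initiations

fraction_on, events = simulate_two_state_promoter(
    k_on=0.5, k_off=0.5, k_init=2.0, t_end=20000.0
)
print(f"fraction of time ON: {fraction_on:.2f}, initiations: {len(events)}")
```

In this sketch the long-run fraction of time spent in the active state approaches k_on / (k_on + k_off), so a promoter with long active states and short-lived, infrequent inactive states corresponds to a large k_on relative to k_off.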

Comparison of transcriptional initiation by RNA polymerase II across eukaryotic species

The preinitiation complex (PIC) for transcriptional initiation by RNA polymerase (Pol) II is composed of general transcription factors that are highly conserved. However, analysis of ChIP-seq datasets reveals kinetic and compositional differences in the transcriptional initiation process among eukaryotic species. In yeast, Mediator associates strongly with activator proteins bound to enhancers, but it transiently associates with promoters in a form that lacks the kinase module. In contrast, in human, mouse, and fly cells, Mediator with its kinase module stably associates with promoters, but not with activator-binding sites. This suggests that yeast and metazoans differ in the nature of the dynamic bridge of Mediator between activators and Pol II and the composition of a stable inactive PIC-like entity. As in yeast, occupancies of TATA-binding protein (TBP) and TBP-associated factors (Tafs) at mammalian promoters are not strictly correlated. This suggests that within PICs, TFIID is not a monolithic entity, and multiple forms of TBP affect initiation at different classes of genes. TFIID in flies, but not yeast and mammals, interacts strongly at regions downstream of the initiation site, consistent with the importance of downstream promoter elements in that species. Lastly, Taf7 and the mammalian-specific Med26 subunit of Mediator also interact near the Pol II pause region downstream of the PIC, but only in subsets of genes and often not together. Species-specific differences in PIC structure and function are likely to affect how activators and repressors affect transcriptional activity (Petrenko, 2021).

Transcription factor TFIIEbeta interacts with two exposed positions in helix 2 of the Antennapedia homeodomain to control homeotic function in Drosophila

Homeodomains (HDs) increase their DNA-binding specificity by interacting with additional cofactors, outlining a Hox interactome with a multiplicity of protein-protein interactions. In Drosophila, the first functional contact with a general transcription factor (GTF) was found between Antennapedia (Antp) and BIP2 (TFIID complex). Hox proteins also interact with other components of the Pol II machinery, such as the Med19 subunit of the Mediator (MED) complex, TFIIEbeta and the transcription-pausing factor M1BP. This paper focused on the Antp-TFIIEbeta protein-protein interface to establish the specific contacts as well as its functional role. TFIIEbeta was found to interact with Antp through the HD independently of the YPWM motif, and the direct physical interaction is at helix 2, specifically at amino acid positions I32 and H36 of Antp. These two positions in helix 2 are crucial for Antp homeotic function in head involution, and in thoracic and antenna-to-tarsus transformations. Interestingly, overexpression of Antp and TFIIEbeta in the antennal disc showed that this interaction is required for the antenna-to-tarsus transformation. These results open the possibility of analyzing more broadly the role of the Antp-TFIIEbeta interaction in the transcriptional control of activation and/or repression of target genes in the Hox interactome during Drosophila development (Altamirano-Torres, 2018).

To analyze the interplay between Hox and the general transcription machinery, this study focused on Antp-TFIIEβ protein-protein interface to establish the specific contacts, as well as the functional role of this interaction. The results showed a direct physical interaction of TFIIEβ with the 32 and 36 positions of helix 2 Antp HD in cell culture and in vivo. These two positions on helix 2 HD are required for interaction with TFIIEβ, and this interaction is necessary for homeotic transformation (Altamirano-Torres, 2018).

The results demonstrate that Antp HD was necessary for maintaining the interaction with TFIIEβ. Previous studies have confirmed that the HD is sufficient for interaction with GTFs. For example, it has been found that the AbdA HD was sufficient for TFIIEβ interaction and that when the DNA-binding of the HD is mutated, the interaction is diminished but not abolished. Another example used bimolecular fluorescence complementation (BiFC) in vivo to demonstrate that the Ubx HD and AbdA HD are sufficient for direct interaction with Med19. In addition to the conserved HD affinities for DNA and RNA, several protein-protein interactions also rely on the HD, such as dimerization of Scr and Antp interaction with Eyeless (Altamirano-Torres, 2018).

Although this study found that the Antp-TFIIEβ interaction is YPWM-independent in BiFC cell culture, and that the presence of an intact YPWM motif in the helix 2 Antp mutant yielded neither interaction by BiFC nor functional activity, co-expression of the YPWM mutant and TFIIEβ reduced the interaction signal in embryos. A similar result was found in an earlier study, where a YPWM Antp mutant showed a reduction but not an abolition of TFIIEβ interaction in Drosophila embryos, which could be attributable to the presence of helix 2 in the mutant. Altogether, this suggests that interactions of Antp with TFIIEβ could change from one tissue to another, with complex formation in different tissues using various interfaces (YPWM and/or HD), contributing to the plasticity of Hox interaction properties (Altamirano-Torres, 2018).

Deletional analysis of Antp HD suggested interaction of TFIIEβ through the helix 2 of Antp HD. Based on the reported 3D-structure of the Antp HD-DNA complex, in which helix 2 is on the opposite side of the HD-DNA binding, this study selected the conserved residues 32 and 36, which are exposed and physically available, as candidates for TFIIEβ interaction. To perform a molecular dissection of the Antp-TFIIEβ interaction, the residues I32 and H36 of helix 2, either individually or together, were studied by site-directed mutagenesis in cell culture. BiFC results show a drastic reduction of the interaction by mutation of these two residues, indicating that they are directly involved in the Antp-TFIIEβ interaction. It has been demonstrated that AntpHD is internalized to the nuclei through residues 43-58 of the third helix. Therefore, since the mutations examined in this study are present on helix 2, the Antp NLSs were not affected. To confirm this, immunostaining of Antp helix 2 mutants in cells and embryos was performed, showing very clearly the nuclear localization of the Antp helix 2 single and double mutants. These results indicated that Antp helix 2 mutants retain the NLSs needed for their localization into the nucleus. Moreover, it was also demonstrated that the helix 2 mutant keeps its transactivation activity and is capable of interacting with EXD in cells and embryos, confirming that mutation of these amino acids did not alter the DNA binding affinity or the protein conformation needed to perform essential activities required for in vivo transformation (Altamirano-Torres, 2018).

Since substitutions by either alanines or structurally similar residues affected the Antp-TFIIEβ interaction in cell culture in the same manner, the I32A-H36A HD mutant was selected for the in vivo analysis in Drosophila. In concordance with the BiFC cell culture assay, the results showed no interaction in embryos or in imaginal discs with the Antp mutant I32A-H36A. Therefore, residues 32 and 36 of Antp helix 2 are crucial for the interaction with TFIIEβ in BiFC assays in Drosophila embryos and imaginal discs. This is relevant because residues 32 and 36 on Antp helix 2 are identical and highly conserved within Drosophila Hox proteins, so the finding can be extrapolated to the interaction of TFIIEβ with other homeoproteins due to the high Hox conservation (Altamirano-Torres, 2018).

Although the results very clearly show Antp-TFIIEβ interaction through positions 32 and 36 of helix 2, this does not exclude the possibility that other amino acid positions, either in helix 2 or in the intervening loop, could be involved to a minor extent in the interaction. For example, positions 30 and 33, in addition to the helix 2 amino acids 32 and 36, have also been reported in the interaction of the human POU proteins Oct-1 and Oct-2 with the VP16 transactivator factor of Herpes Simplex Virus (Altamirano-Torres, 2018).

Because the precise molecular mechanisms of Antp in transcriptional regulation remain unclear, attempts were made to shed light on these by determining whether I32 and H36 are important for Antp function. When Antp is ectopically expressed in embryos, it causes inhibition of head involution and transformation of prothoracic segment T1 into T2 and of antennae into mesothoracic (T2) legs. Antp ectopic expression shows that residues 32 and 36 of HD helix 2 are essential for its function in embryo head involution and homeotic transformations of thorax and antenna. The lack of homeotic transformations upon expression of the AntpI32A-H36A double mutant indicates that residues 32 and 36 of HD helix 2 are absolutely required for the Antp ectopic homeotic function in Drosophila. Likewise, Antp mutated in the YPWM motif is not capable of transforming the antenna, and a single exposed residue on helix 1 of Scr HD is necessary for its homeotic function, showing that besides HD DNA-binding, exposed positions on the HD are crucial for Hox functional activity (Altamirano-Torres, 2018).

To determine the functional relevance of the Antp-TFIIEβ interaction, co-expression of TFIIEβ and double mutant AntpI32A-H36A was directed to the antenna, showing a drastic reduction of the antenna transformation. These findings clearly demonstrate that Antp-TFIIEβ interaction (visualized by BiFC in live larvae) is necessary for the Antp homeotic function with a very strong transformation of the antenna into T2 mesothoracic leg. Together, these results imply that very subtle changes of two amino acids in the Antp HD helix 2 can have dramatic effects on protein-protein interaction with TFIIEβ, affecting transcriptional control and the functional properties of antenna-to-tarsus transformation (Altamirano-Torres, 2018).

These results show that the interaction between TFIIEβ and Antp HD contributes to transcriptional regulation and functional activities of Antennapedia. In Pol II PIC formation, TFIIE is a heterodimer with α and β subunits that regulates TFIIH activities such as the RNA Pol II CTD kinase, ATPase and DNA helicase. TFIIEβ binds to both TFIIB and TFIIF in important activities needed for promoter melting and stabilization as well as for the transition to elongation. Thus, Antp-TFIIEβ interaction may represent a key control point for modulation of transcription factors involved in activation or repression functions. Repression activity of the Antp-TFIIEβ interaction may imply destabilization of the PIC complex or the inhibition of TFIIEβ functions modulating TFIIH ATPase, CTD kinase or helicase activities. For example, it has been determined by in vitro transcription and co-immunoprecipitation assays that the zinc-finger TF Kruppel (Kr), a Drosophila segmentation protein for late embryonic development, interacts in a dimeric way with TFIIEβ and that this interaction represses transcription. If it is considered that Antp dictates leg fate by repressing the activity of antenna-determining genes such as Hth and Dll in the leg imaginal discs, it is reasonable that Antp-TFIIEβ could be involved in repression. Co-expression of Antp with TFIIEβ resulted in a reduction to 47% of the expression of Luciferase compared with that of Antp alone; however, further experiments need to be done to evaluate the precise molecular mechanism of this interaction. It is also possible that Antp facilitates the arrival of TFIIEβ at the PIC and subsequently the recruitment and/or activation of TFIIH, allowing efficient transcription elongation. For example, mutation of Med19 in haltere imaginal discs shows that Med19 is required for Ubx target gene activation. Another example is that Kr binds to TFIIB in a monomeric way, and this interaction activates transcription in vitro. Thus, further experiments are needed to determine the fine molecular mechanism by which the interaction between Antp and TFIIEβ contributes to transcriptional regulation through activation or repression activities, or even both (Altamirano-Torres, 2018).

This study has presented a clear interaction of TFIIEβ with two amino acid positions of Antp HD that are important for Antp homeotic function, and this interplay is essential to the Antp antenna-to-tarsus transformation. In conclusion, amino acids 32 and 36 of Antp HD helix 2 play a very important role in determining the specificity of the TFIIEβ interaction. Altogether, these results provide insights into the molecular interface of Antp HD with TFIIEβ to evaluate the extent to which these molecular contacts translate into functional properties in activation or repression of target genes. The role of residues 32 and 36 on Antp helix 2 can be extrapolated to the interaction of TFIIEβ with other homeoproteins, for example Scr, Ubx and AbdA, due to the high Hox conservation. In addition, the Antp-TFIIEβ interaction opens the possibility of exploring more broadly the interplay between Antp and additional transcription factors in the Hox interactome for the genetic control of development in Drosophila (Altamirano-Torres, 2018).

list of proteins involved in messenger RNA synthesis


References

Altamirano-Torres, C., Salinas-Hernandez, J. E., Cardenas-Chavez, D. L., Rodriguez-Padilla, C. and Resendez-Perez, D. (2018). Transcription factor TFIIEbeta interacts with two exposed positions in helix 2 of the Antennapedia homeodomain to control homeotic function in Drosophila. PLoS One 13(10): e0205905. PubMed ID: 30321227

Aoyagi, N. and Wassarman, D. A. (2000). Genes encoding Drosophila melanogaster RNA polymerase II general transcription factors: diversity in TFIIA and TFIID components contributes to gene-specific transcriptional regulation. J Cell Biol 150: F45-50. PubMed ID: 10908585

Arenas-Mena, C. (2017). The origins of developmental gene regulation. Evol Dev 19(2): 96-107. PubMed ID: 28116828

Arnold, C. D., Zabidi, M. A., Pagani, M., Rath, M., Schernhuber, K., Kazmar, T. and Stark, A. (2017). Genome-wide assessment of sequence-intrinsic enhancer responsiveness at single-base-pair resolution. Nat Biotechnol 35(2): 136-144. PubMed ID: 28024147

Barbieri, E., Trizzino, M., Welsh, S. A., Owens, T. A., Calabretta, B., Carroll, M., Sarma, K. and Gardini, A. (2018). Targeted enhancer activation by a subunit of the integrator complex. Mol Cell 71(1): 103-116 e107. PubMed ID: 30008316

Baumann, D. G. and Gilmour, D. S. (2017). A sequence-specific core promoter-binding transcription factor recruits TRF2 to coordinately transcribe ribosomal protein genes. Nucleic Acids Res 45(18): 10481-10491. PubMed ID: 28977400

Boija, A., Klein, I. A., Sabari, B. R., Dall'Agnese, A., Coffey, E. L., Zamudio, A. V., Li, C. H., Shrinivas, K., Manteiga, J. C., Hannett, N. M., Abraham, B. J., Afeyan, L. K., Guo, Y. E., Rimel, J. K., Fant, C. B., Schuijers, J., Lee, T. I., Taatjes, D. J. and Young, R. A. (2018). Transcription factors activate genes through the phase-separation capacity of their activation domains. Cell 175(7): 1842-1855. PubMed ID: 30449618

Bose, D. A., Donahue, G., Reinberg, D., Shiekhattar, R., Bonasio, R. and Berger, S. L. (2017). RNA binding to CBP stimulates histone acetylation and transcription. Cell 168(1-2): 135-149 e122. PubMed ID: 28086087

Cazalla, D., Xie, M. and Steitz, J. A. (2011). A primate herpesvirus uses the integrator complex to generate viral microRNAs. Mol Cell 43(6): 982-992. PubMed ID: 21925386

Cho, H., et al. (1999). A protein phosphatase functions to recycle RNA polymerase II. Genes Dev 13: 1540-52. PubMed ID: 10385623

Cho, W. K., Spille, J. H., Hecht, M., Lee, C., Li, C., Grube, V. and Cisse, II (2018). Mediator and RNA polymerase II clusters associate in transcription-dependent condensates. Science 361(6400): 412-415. PubMed ID: 29930094

Core, L. J., Martins, A. L., Danko, C. G., Waters, C. T., Siepel, A. and Lis, J. T. (2014). Analysis of nascent RNA identifies a unified architecture of initiation regions at mammalian promoters and enhancers. Nat Genet 46(12): 1311-1320. PubMed ID: 25383968

Duttke, S. H. C., Lacadie, S. A., Ibrahim, M. M., Glass, C. K., Corcoran, D. L., Benner, C., Heinz, S., Kadonaga, J. T. and Ohler, U. (2015). Human promoters are intrinsically directional. Mol Cell 57(4): 674-684. PubMed ID: 25639469

Elrod, N. D., Henriques, T., Huang, K. L., Tatomer, D. C., Wilusz, J. E., Wagner, E. J. and Adelman, K. (2019). Mol Cell 76(5):738-752. PubMed ID: 31809743

Fan, W., Lam, S. M., Xin, J., Yang, X., Liu, Z., Liu, Y., Wang, Y., Shui, G. and Huang, X. (2017). Drosophila TRF2 and TAF9 regulate lipid droplet size and phospholipid fatty acid composition. PLoS Genet 13(3): e1006664. PubMed ID: 28273089

Fant, C. B., Levandowski, C. B., Gupta, K., Maas, Z. L., Moir, J., Rubin, J. D., Sawyer, A., Esbin, M. N., Rimel, J. K., Luyties, O., Marr, M. T., Berger, I., Dowell, R. D. and Taatjes, D. J. (2020). TFIID enables RNA polymerase II promoter-proximal pausing. Mol Cell. PubMed ID: 32229306

Gomez-Orte, E., Saenz-Narciso, B., Zheleva, A., Ezcurra, B., de Toro, M., Lopez, R., Gastaca, I., Nilsen, H., Sacristan, M. P., Schnabel, R. and Cabello, J. (2019). Disruption of the Caenorhabditis elegans Integrator complex triggers a non-conventional transcriptional mechanism beyond snRNA genes. PLoS Genet 15(2): e1007981. PubMed ID: 30807579

Isogai, Y, Keles S, Prestel M, Hochheimer A, Tjian R. (2007). Transcription of histone gene cluster by differential core-promoter factors. Genes Dev. 21(22): 2936-49. PubMed ID: 17978101

Jin, Y., Eser, U., Struhl, K. and Churchman, L. S. (2017). The ground state and evolution of promoter region directionality. Cell 170(5): 889-898 e810. PubMed ID: 28803729

Kamieniarz-Gdula, K., Gdula, M. R., Panser, K., Nojima, T., Monks, J., Wisniewski, J. R., Riepsaame, J., Brockdorff, N., Pauli, A. and Proudfoot, N. J. (2019). Selective roles of vertebrate PCF11 in premature and full-length transcript termination. Mol Cell 74(1): 158-172. PubMed ID: 30819644

Kim, M. K., Tranvo, A., Hurlburt, A. M., Verma, N., Phan, P., Luo, J., Ranish, J. and Stumph, W. E. (2020). Assembly of SNAPc, Bdp1, and TBP on the U6 snRNA gene promoter in Drosophila melanogaster. Mol Cell Biol. PubMed ID: 32253345

Kwak, H. and Lis, J. T. (2013). Control of transcriptional elongation. Annu Rev Genet 47: 483-508. PubMed ID: 24050178

Levitsky, V. G., Zykova, T. Y., Moshkin, Y. M. and Zhimulev, I. F. (2020). Nucleosome Positioning around Transcription Start Site Correlates with Gene Expression Only for Active Chromatin State in Drosophila Interphase Chromosomes. Int J Mol Sci 21(23). PubMed ID: 33291385

Louder, R. K., He, Y., Lopez-Blanco, J. R., Fang, J., Chacon, P. and Nogales, E. (2016). Structure of promoter-bound TFIID and model of human pre-initiation complex assembly. Nature 531(7596): 604-609. PubMed ID: 27007846

Lai, F., Gardini, A., Zhang, A. and Shiekhattar, R. (2015). Integrator mediates the biogenesis of enhancer RNAs. Nature 525(7569): 399-403. PubMed ID: 26308897

Xie, M., Zhang, W., Shu, M. D., Xu, A., Lenis, D. A., DiMaio, D. and Steitz, J. A. (2015). The host Integrator complex acts in transcription-independent maturation of herpesvirus microRNA 3' ends. Genes Dev 29(14): 1552-1564. PubMed ID: 26220997

Li, G., Ruan, X., Auerbach, R. K., Sandhu, K. S., Zheng, M., Wang, P., Poh, H. M., Goh, Y., Lim, J., Zhang, J., Sim, H. S., Peh, S. Q., Mulawadi, F. H., Ong, C. T., Orlov, Y. L., Hong, S., Zhang, Z., Landt, S., Raha, D., Euskirchen, G., Wei, C. L., Ge, W., Wang, H., Davis, C., Fisher-Aylor, K. I., Mortazavi, A., Gerstein, M., Gingeras, T., Wold, B., Sun, Y., Fullwood, M. J., Cheung, E., Liu, E., Sung, W. K., Snyder, M. and Ruan, Y. (2012). Extensive promoter-centered chromatin interactions provide a topological basis for transcription regulation. Cell 148(1-2): 84-98. PubMed ID: 22265404

Liu, W. L., et al. (2009). Structures of three distinct activator-TFIID complexes. Genes Dev 23(13): 1510-21. PubMed ID: 19571180

Mahat, D. B., Kwak, H., Booth, G. T., Jonkers, I. H., Danko, C. G., Patel, R. K., Waters, C. T., Munson, K., Core, L. J. and Lis, J. T. (2016). Base-pair-resolution genome-wide mapping of active RNA polymerases using precision nuclear run-on (PRO-seq). Nat Protoc 11(8): 1455-1476. PubMed ID: 27442863

Marr, M. T., Isogai, Y., Wright, K. J. and Tjian, R. (2006). Coactivator cross-talk specifies transcriptional output. Genes Dev. 20(11): 1458-69. PubMed ID: 16751183

Murakami, K., Elmlund, H., Kalisman, N., Bushnell, D. A., Adams, C. M., Azubel, M., Elmlund, D., Levi-Kalisman, Y., Liu, X., Gibbons, B. J., Levitt, M. and Kornberg, R. D. (2013). Architecture of an RNA polymerase II transcription pre-initiation complex. Science 342: 1238724.

Nguyen, T. A., Jones, R. D., Snavely, A. R., Pfenning, A. R., Kirchner, R., Hemberg, M. and Gray, J. M. (2016). High-throughput functional comparison of promoter and enhancer activities. Genome Res 26(8): 1023-1033. PubMed ID: 27311442

Nikolov, D. B. and Burley, S. K. (1997). RNA polymerase II transcription initiation: A structural view. Proc. Natl. Acad. Sci. 94: 15-22. PubMed ID: 8990153

Ohler, U., Liao, G. C., Niemann, H. and Rubin, G. M. (2002). Computational analysis of core promoters in the Drosophila genome. Genome Biol 3(12): RESEARCH0087. PubMed ID: 12537576

Orphanides, G., Lagrange, T. and Reinberg, D. (1996). The general transcription factors of RNA polymerase II. Genes Dev. 10: 2657-83. PubMed ID: 8946909

Pahi, Z., Kiss, Z., Komonyi, O., Borsos, B. N., Tora, L., Boros, I. M. and Pankotai, T. (2015). dTAF10- and dTAF10b-containing complexes are required for ecdysone-driven larval-pupal morphogenesis in Drosophila melanogaster. PLoS One 10: e0142226. PubMed ID: 26556600

Parry, T. J., Theisen, J. W., Hsu, J. Y., Wang, Y. L., Corcoran, D. L., Eustice, M., Ohler, U. and Kadonaga, J. T. (2010). The TCT motif, a key component of an RNA polymerase II transcription system for the translational machinery. Genes Dev 24(18): 2013-2018. PubMed ID: 20801935

Patel, A. B., Louder, R. K., Greber, B. J., Grunberg, S., Luo, J., Fang, J., Liu, Y., Ranish, J., Hahn, S. and Nogales, E. (2018). Structure of human TFIID and mechanism of TBP loading onto promoter DNA. Science 362(6421). PubMed ID: 30442764

Petrenko, N. and Struhl, K. (2021). Comparison of transcriptional initiation by RNA polymerase II across eukaryotic species. Elife 10. PubMed ID: 34515029

Pimmett, V. L., Dejean, M., Fernandez, C., Trullo, A., Bertrand, E., Radulescu, O. and Lagha, M. (2021). Quantitative imaging of transcription in living Drosophila embryos reveals the impact of core promoter motifs on promoter state dynamics. Nat Commun 12(1): 4504. PubMed ID: 34301936

Qiu, Y. and Gilmour, D. S. (2017). Identification of regions in the Spt5 subunit of DSIF that are involved in promoter proximal pausing. J Biol Chem [Epub ahead of print]. PubMed ID: 28213523

Rubtsova, M. P., Vasilkova, D. P., Moshareva, M. A., Malyavko, A. N., Meerson, M. B., Zatsepin, T. S., Naraykina, Y. V., Beletsky, A. V., Ravin, N. V. and Dontsova, O. A. (2019). Integrator is a key component of human telomerase RNA biogenesis. Sci Rep 9(1): 1701. PubMed ID: 30737432

Schor, I. E., Degner, J. F., Harnett, D., Cannavo, E., Casale, F. P., Shim, H., Garfield, D. A., Birney, E., Stephens, M., Stegle, O. and Furlong, E. E. (2017). Promoter shape varies across populations and affects promoter evolution and expression noise. Nat Genet 49(4): 550-558. PubMed ID: 28191888

Shah, N., Maqbool, M. A., Yahia, Y., El Aabidine, A. Z., Esnault, C., Forne, I., Decker, T. M., Martin, D., Schuller, R., Krebs, S., Blum, H., Imhof, A., Eick, D. and Andrau, J. C. (2018). Tyrosine-1 of RNA polymerase II CTD controls global termination of gene transcription in mammals. Mol Cell 69(1): 48-61 e46. PubMed ID: 29304333

Shiraki, T., Kondo, S., Katayama, S., Waki, K., Kasukawa, T., Kawaji, H., Kodzius, R., Watahiki, A., Nakamura, M., Arakawa, T., Fukuda, S., Sasaki, D., Podhajska, A., Harbers, M., Kawai, J., Carninci, P. and Hayashizaki, Y. (2003). Cap analysis gene expression for high-throughput analysis of transcriptional starting point and identification of promoter usage. Proc Natl Acad Sci U S A 100(26): 15776-15781. PubMed ID: 14663149

Sigova, A. A., Abraham, B. J., Ji, X., Molinie, B., Hannett, N. M., Guo, Y. E., Jangi, M., Giallourakis, C. C., Sharp, P. A. and Young, R. A. (2015). Transcription factor trapping by RNA in gene regulatory elements. Science 350(6263): 978-981. PubMed ID: 26516199

Skaar, J. R., Ferris, A. L., Wu, X., Saraf, A., Khanna, K. K., Florens, L., Washburn, M. P., Hughes, S. H. and Pagano, M. (2015). The Integrator complex controls the termination of transcription at diverse classes of gene targets. Cell Res 25(3): 288-305. PubMed ID: 25675981

Tatomer, D. C., Elrod, N. D., Liang, D., Xiao, M. S., Jiang, J. Z., Jonathan, M., Huang, K. L., Wagner, E. J., Cherry, S. and Wilusz, J. E. (2019). The Integrator complex cleaves nascent mRNAs to attenuate transcription. Genes Dev 33(21-22): 1525-1538. PubMed ID: 31530651

Tanaka, A., Akimoto, Y., Kobayashi, S., Hisatake, K., Hanaoka, F. and Ohkuma, Y. (2015). Association of the winged helix motif of the TFIIEalpha subunit of TFIIE with either the TFIIEbeta subunit or TFIIB distinguishes its functions in transcription. Genes Cells 20: 203-216. PubMed ID: 25492609

van Arensbergen, J., FitzPatrick, V. D., de Haas, M., Pagie, L., Sluimer, J., Bussemaker, H. J. and van Steensel, B. (2017). Genome-wide mapping of autonomous promoter activity in human cells. Nat Biotechnol 35(2): 145-153. PubMed ID: 28024146

Verma, N., Hung, K. H., Kang, J. J., Barakat, N. H. and Stumph, W. E. (2013). Differential utilization of TATA box-binding protein (TBP) and TBP-related factor 1 (TRF1) at different classes of RNA polymerase III promoters. J Biol Chem 288(38): 27564-27570. PubMed ID: 23955442

Verma, N., Hurlburt, A. M., Wolfe, A., Kim, M. K., Kang, Y. S., Kang, J. J. and Stumph, W. E. (2018). Bdp1 interacts with SNAPc bound to a U6, but not U1, snRNA gene promoter element to establish a stable protein-DNA complex. FEBS Lett 592(14): 2489-2498. PubMed ID: 29932462

Xie, X., et al. (1996). Structural similarity between TAFs and the heterotetrameric core of the histone octamer. Nature 380: 316-322. PubMed ID: 8598927

Zhang, Z., English, B. P., Grimm, J. B., Kazane, S. A., Hu, W., Tsai, A., Inouye, C., You, C., Piehler, J., Schultz, P. G., Lavis, L. D., Revyakin, A. and Tjian, R. (2016). Rapid dynamics of general transcription factor TFIIB binding during preinitiation complex assembly revealed by single-molecule analysis. Genes Dev 30: 2106-2118. PubMed ID: 27798851


date revised: 15 December 2022

Home page: The Interactive Fly © 1995, 1996 Thomas B. Brody, Ph.D.

The Interactive Fly resides on the Society for Developmental Biology's Web server.