Trithorax-like

Targets of Activity and Protein Interactions (part 1/2)

Trithorax-like functions in regulating the Bithorax complex

The distribution of Polycomb protein has been mapped at high resolution on the bithorax complex of Drosophila tissue culture cells, using an improved formaldehyde cross-linking and immunoprecipitation technique. Sheared chromatin was immunoprecipitated and amplified by linker-modified PCR, and then used as a probe on a Southern blot of the entire BX-C walk. Polycomb protein is not distributed homogeneously on the regulatory regions of the repressed Ultrabithorax and abdominal-A genes, but is highly enriched at discrete sequence elements, many of which coincide with previously mapped Polycomb group response elements (PREs). Among the identified sites are peak F (the bxd PRE) and peak G (the bx PRE), both of which contain GAGA consensus sequences. Three other sites, E, D and C, correspond to iab2, iab3 and iab4. No PC binding is seen in the regulatory domains iab6, iab7 or iab8, indicating that these domains positively regulate Abd-B expression. These results suggest that Polycomb protein spreads locally over a few kilobases of DNA surrounding PREs, perhaps to stabilize silencing complexes. GAGA factor/Trithorax-like, a member of the trithorax group, is also bound at those PREs which contain GAGA consensus-binding sites. Two modes of binding can be distinguished: a high level of binding to elements in the regulatory domain of the expressed Abdominal-B gene, and a low level of binding to Polycomb-bound PREs in the inactive domains of the bithorax complex. The Abd-B sites include the iab7/iab8 regulatory region, and the Fab-7 PRE. The Fab-7 PRE does not bind Polycomb. It is proposed that GAGA factor binds constitutively to regulatory elements in the bithorax complex, which function both as PREs (silencing elements) and as trithorax group response elements. It is suggested that a GAGA site in the Antennapedia promoter is both a PRE (binding PC protein) and a TRE (binding GAGA factor) (Strutt, 1997).

The homeotic genes of the Drosophila bithorax complex are controlled by a large cis-regulatory region that ensures their segmentally restricted pattern of expression. A deletion that removes the Frontabdominal-7 cis-regulatory region (Fab-7¹) dominantly transforms parasegment 11 into parasegment 12. This chromosomal region contains both a boundary element and a silencer. Previous studies have suggested that removal of a domain boundary element on the proximal side of Fab-7¹ is responsible for its dominant gain-of-function phenotype. The Fab-7 boundary element maps to two nuclease hypersensitive sites, HS1 and HS2. This article demonstrates that the Fab-7¹ deletion also removes a silencer element, the iab-7 PRE, which maps to a different DNA segment (the HS3 site) and plays a different role in regulating parasegment-specific expression patterns of the Abdominal-B gene. The iab-7 PRE mediates pairing-sensitive silencing of mini-white, and can maintain the segmentally restricted expression pattern of a bxd, Ubx/lacZ reporter transgene. Both mini-white and Ubx/lacZ silencing activities depend upon Polycomb Group proteins. Pairing-sensitive silencing is relieved by removing the transvection protein Zeste, but is enhanced in a novel pairing-independent manner by the zeste¹ allele. The iab-7 PRE silencer is contained within a 0.8-kb fragment that spans the HS3 nuclease hypersensitive site, and silencing appears to depend on the chromatin remodeling protein, the GAGA factor. It is suggested that PRE-PRE cooperation, either in trans or in cis, may be an important feature of the silencing process within the BX-C and that boundaries may limit PRE-PRE cooperation (Hagstrom, 1997).

Fab-7 is a genetically identified element of the BX-C necessary to regulate spatial transcription of the Abdominal-B (Abd-B) gene. Polycomb group (PcG) and trithorax group (trxG) gene products are responsible for the maintenance of repressed and active expression patterns of many developmentally important regulatory genes, including Abd-B. In Drosophila embryos, Polycomb (Pc) protein and the trxG protein GAGA factor colocalize at the Fab-7 DNA element of the bithorax complex. There is a strong enrichment in GAGA factor and Pc at the Fab-7 site in 11-16 hr old embryos, as compared to mock immunoprecipitations. A major GAGA factor binding site was localized in a 422 bp fragment, which contains six putative GAGA binding sites. This fragment is located within the putative boundary element. A lower level of binding was detected in the flanking 1230 bp fragment, which contains three GAGA target consensus sequences and also the putative PRE. Little or no association with all other flanking sequences is observed. In contrast, Pc is found to be associated with the entire 3.6 kb region, with a rather broad peak centered at the boundary and the PRE regions. The binding peaks for Pc and GAGA factor colocalize. Thus, the physical distribution of these two proteins at Fab-7 does not allow discrimination of the apparent insulator and PRE function identified by transgenic constructs. This might indicate that the chromosomal elements through which PcG and trxG proteins act (the PRE) are located in sequences partially overlapping the putative boundary region (Cavalli, 1998).

In transgenic lines, the Fab-7 element induces extensive silencing on a flanking GAL4-driven lacZ reporter and mini-white genes. The Fab-7 fragment acts as a silencer preventing trans-activators like GAL4 from binding to the UAS-site when Pc protein is bound at the PRE. However, a short single pulse of GAL4 during embryogenesis is sufficient to release PcG-dependent silencing from the transgene. Such an activated state of Fab-7 is mitotically inheritable through development and can be transmitted in a GAL4-independent manner to the subsequent generations through female meiosis. About 25% of the female progeny exhibit an activated state of Fab-7. Red-eyed flies, indicating an activated state, were again selected for two further crossings. In the F3 generation, strong beta-gal staining is still observed in 18% of the embryos, and about 27% of the adult females are red-eyed. Thus, inheritance of the active state can be propagated through multiple subsequent generations. Meiotic transmission of the activated Fab-7 state is a reversible process, however. Embryos from flies with an activated Fab-7 of the F1, F2, and F3 generation were allowed to develop at 28°C until the first instar larval stage and then transferred to 18°C. Silencing is reestablished, resulting in 100% yellow-eyed female progeny. Conversely, in those flies where repression is reestablished upon meiosis, this repressed state can be reactivated again. This complete reversibility strongly suggests that the observed partial efficiency of transmission of the active state does not depend on heterogeneity in the genetic background of the fly line, but on a stochastic process whereby some of the chromatin templates may lose the epigenetic information upon meiotic transmission. Crosses using GAL4-less females strongly suggest that meiotic transmission of the activated Fab-7 state is not dependent on a perdurance of GAL4 protein.
It is concluded that Fab-7 is a switchable chromosomal element, which can convey memory of epigenetically determined active and repressed chromatin states (Cavalli, 1998).

Although in the test system described here only maternal inheritance of Fab-7-dependent epigenetic regulatory states was observed, paternal inheritance of heterochromatin states has also been documented in Drosophila. The molecular nature of this phenomenon is not understood, but the Y chromosome has been shown to be involved in this type of inheritance. Therefore, paternal inheritance of chromatin states is possible, and whether PcG/trxG-mediated meiotic inheritance is truly restricted to the female germline for all of their regulated sequences remains a fascinating question for future investigation. It is proposed that chromosomal elements such as Fab-7, where PcG and trxG proteins perform a coordinate maintenance function, be termed "cellular memory modules" (CMMs). CMMs may be thought of as switchable elements able to induce and heritably propagate both silenced and open chromatin conformations. The respective chromatin status determined by a regulatory cascade of transcription factors during early embryogenesis might be the primary switch. Activated transcription would drive a CMM into the trxG-dependent open chromatin mode, while inactive states would be maintained as silent chromatin (Cavalli, 1998).

The finding of meiotic inheritance in PcG/trxG-dependent regulation is surprising since these proteins control genes involved in developmental decisions. In the embryo, the zygotic genome has to develop different spatial patterns of homeotic gene expression. Therefore, the developing embryo must be able to erase the epigenetic information of its parental gametes in order to allow differentiation of a variety of cell lineages. It could be argued that PcG silencing at CMMs is the default state; that is, in the germ line, all CMMs remain "marked" by certain elements of the PcG silencing complex. As such, in the early zygote the occupation of all CMMs by PcG proteins would be retained, perpetuating silencing as the default state. Recent evidence speaks in favor of such a mechanism. In somatic cells, differential transcription induced by patterning factors would switch CMMs into the active mode, which would then be heritably maintained in subsequent cell generations. In the particular transgene combination used in this study, the strong GAL4 induction might have completely removed the PcG-silencing tag from this PRE, which subsequently remains in the active state through several rounds of mitotic and meiotic divisions until it becomes reinactivated by stochastic processes. The finding that a defined Drosophila chromosomal element can transmit an epigenetic state to the next generations in the absence of any apparent covalent modifications of the DNA suggests that chromatin proteins can faithfully maintain an epigenetic state, and will allow a detailed molecular analysis of this type of inheritance. Several questions can now be addressed: can mitotic and meiotic epigenetic inheritance of active and silenced chromatin states be driven by other known PRE-containing DNA elements, and which of the PcG and trxG proteins are involved in these processes? What are the molecular features of the chromosomal complexes involved in epigenetic inheritance through mitosis and meiosis? 
The answer to these questions should yield important insight into how developmental decisions are faithfully maintained, and has implications for a better understanding of other epigenetic phenomena like mammalian genomic imprinting and paramutation in plants. It could well be that cellular memory modules are used in a variety of mechanisms to maintain regulatory decisions about transcriptional states throughout mitotic and meiotic division (Cavalli, 1998).

Drosophila Iswi, a highly conserved member of the SWI2/SNF2 family of ATPases, is the catalytic subunit of three chromatin-remodeling complexes: NURF, CHRAC, and ACF. To clarify the biological functions of Iswi, null and dominant-negative Iswi mutations were generated and characterized. Iswi mutations affect both cell viability and gene expression during Drosophila development. Iswi mutations also cause striking alterations in the structure of the male X chromosome. The Iswi protein does not colocalize with RNA Pol II on salivary gland polytene chromosomes, suggesting a possible role for Iswi in transcriptional repression. These findings reveal novel functions for the Iswi ATPase and underscore its importance in chromatin remodeling in vivo (Deuring, 2000).

To determine when Iswi is required during development, the lethal phase and phenotype of Iswi null mutants were examined. Individuals heterozygous for ISWI¹ or ISWI² are viable and phenotypically normal. ISWI¹/Df(2R)vg-C individuals die during late larval or early pupal development and display no obvious homeotic transformations or other pattern defects. Similar results were obtained for both ISWI²/Df(2R)vg-C and ISWI¹/ISWI² individuals (Deuring, 2000).

In vitro studies have suggested that Iswi plays an important role in transcription by facilitating the interaction of transcription factors with chromatin. One of the best candidates for a transcription factor that requires Iswi for its activity is the GAGA factor. GAGA factor binds to GA-rich sequences near the promoters of a wide variety of Drosophila genes and is thought to activate transcription by altering local chromatin structure. As the ATPase subunit of NURF, Iswi assists the GAGA factor to remodel chromatin in vitro, suggesting that the two proteins may act in concert to modulate chromatin structure in vivo as well. To examine possible interactions between Iswi and GAGA factor in vivo, the phenotypes of mutations in the two genes have been compared. GAGA factor is encoded by Trithorax-like (Trl), a member of the trithorax group of homeotic gene activators. Trl mutations enhance mutations in trithorax and cause homeotic transformations resulting from the decreased transcription of homeotic genes. Trl mutations also enhance position effect variegation, suggesting that GAGA factor antagonizes the assembly or function of heterochromatin. Unlike Trl mutations, Iswi mutations fail to enhance or suppress position effect variegation. No dominant interactions could be detected between mutations in Iswi and other genes, including Trl, other trithorax group genes (trithorax and brm), and Polycomb, a repressor of homeotic genes that is thought to act at the level of chromatin structure. These data suggest that Iswi and GAGA factor play distinct roles in chromatin remodeling in vivo (Deuring, 2000).

To investigate the role of Iswi in transcriptional activation in vivo, the effect of Iswi mutations on the expression of two targets of the GAGA factor was examined: the segmentation gene engrailed (en) and the homeotic gene Ultrabithorax (Ubx). The expression of En protein is reduced dramatically in imaginal discs of ISWI¹/ISWI² mutant larvae. Similar results are observed for Ubx. These data suggest that Iswi is essential for the expression of both en and Ubx in imaginal discs, although the possibility that this interaction is indirect cannot be ruled out (Deuring, 2000).

To directly observe interactions between Iswi and chromatin in vivo, the distribution of Iswi protein on salivary gland polytene chromosomes in third instar larvae was examined by immunofluorescence microscopy. Consistent with a fairly general role in transcription or other processes, Iswi protein is present at a large number of euchromatic sites in the polytene chromosomes. The same pattern was observed using whole sera and affinity-purified antibodies. The chromosomal distribution of Iswi protein is not appreciably altered following heat shock (Deuring, 2000).

Iswi protein is also associated with a subset of heterochromatin, as evidenced by punctate staining at the chromocenter. It is difficult to analyze the distribution of heterochromatic proteins on salivary gland chromosomes, since heterochromatic sequences are underreplicated in polytene tissues. To more accurately map the regions of heterochromatin with which Iswi interacts, the distribution of Iswi protein on mitotic chromosomes from larval neuroblasts was examined. On mitotic chromosomes, Iswi protein is abundantly present on the euchromatic arms of all chromosomes and is concentrated in regions of heterochromatin enriched with middle-repetitive sequences. For example, on the heterochromatic Y chromosome, Iswi is concentrated in the h11–13 region, which is composed almost entirely of middle repetitive DNA families. By contrast, little Iswi protein is detected in regions containing predominantly satellite DNA. The distributions of Iswi and GAGA factor on polytene and mitotic chromosomes were determined by double-label immunofluorescence microscopy. Both GAGA factor and Iswi are associated with hundreds of sites in the euchromatin of polytene chromosomes, but the distributions of the two proteins do not overlap extensively. Even greater differences in the distributions of the two proteins were observed in mitotic chromosomes where the GAGA factor, but not Iswi, is associated with GAGA-satellite sequences. The lack of extensive colocalization does not rule out an interaction between Iswi and GAGA at specific loci, but it does suggest that Iswi and GAGA are not obligatory partners (Deuring, 2000).

The decrease in en and Ubx expression in Iswi mutant larvae is consistent with reports that Iswi is involved in transcriptional activation in vitro. Consequently, it was not anticipated that the distributions of Iswi and RNA Pol II on salivary gland polytene chromosomes would be mutually exclusive. The preferential association of Iswi with transcriptionally inactive regions suggests that Iswi may create changes in chromatin structure that are not conducive to RNA Pol II transcription in vivo. Although there is no direct evidence that Iswi represses transcription, such a function would be consistent with the proposal that Iswi acts antagonistically toward histone acetyltransferases to compact chromatin structure. Based on these observations, further investigation of the role of Iswi in transcriptional repression is clearly warranted (Deuring, 2000).

How can the distributions of Iswi and RNA Pol II on polytene chromosomes be reconciled with the effect of Iswi mutations on gene expression in imaginal discs and the ability of Iswi complexes to activate transcription in vitro? One possibility is that Iswi has roles in both transcriptional repression and activation. NURF, ACF, and CHRAC were purified from Drosophila embryo extracts, and nothing is known about the nature or relative abundance of Iswi complexes in larvae. Perhaps only one Iswi complex is associated with transcriptionally inactive chromatin in the larval salivary gland, while others are either less abundant or transiently interact with chromatin to activate transcription. It is also possible that the interaction of Iswi with en and Ubx is indirect. For instance, the decreased expression of the two genes may be a secondary consequence of reduced cell viability in Iswi mutant larvae (Deuring, 2000).

Polycomb response elements (PREs) are regulatory sites that mediate the silencing of homeotic and other genes. The bxd PRE region from the Drosophila Ultrabithorax gene can be subdivided into subfragments of 100 to 200 bp that retain different degrees of PRE activity in vivo. In vitro, embryonic nuclear extracts form complexes containing Polycomb group (PcG) proteins with these fragments. PcG binding to some fragments is dependent on consensus sequences for the GAGA factor. Other fragments lack GAGA binding sites but can still bind PcG complexes in vitro. The GAGA factor is a component of at least some types of PcG complexes and may participate in the assembly of PcG complexes at PREs (Horard, 2000).

Dissection of the PRE reveals that it is a compound region containing several sequences that are able (to different extents) to induce variegated expression of the miniwhite gene, respond to PcG mutations, and create new binding sites for PcG proteins on polytene chromosomes. The separate fragments are definitely weaker in activity than the whole. A single copy of a fragment containing different restriction enzyme fragments (BP, AB, and part of HA) silences very effectively, indicating that the different sequences normally cooperate to achieve more complete silencing, to a degree that is not attained by multiple tandem copies of one fragment. The different subfragments most likely contribute complementary functions, but it has not been possible to demonstrate that different PcG proteins interact with different subfragments. As with the entire PRE, the response to different PcG mutations depends strongly on the site of insertion of the transposon construct. The genomic context therefore makes a strong contribution not only to the strength of the silencing but also to the relative importance of the different PcG components of the silencing complex. The activity of PRE-containing transposons inserted at different sites suggests that this contribution is due not only to sequences flanking the insertion site but also to the interaction in trans with other genomic PRE sites (Horard, 2000).

Only one of the three subfragments tested in embryos, BP, was able to maintain repression of the Ubx-lacZ reporter gene. This could be due simply to the relative PRE strengths of the different fragments. That is, increasing the number of copies of the other fragments might achieve the same silencing strength. Another possibility is that the complex formed at the BP fragment is qualitatively different from that recruited by the other fragments; for example, it might be able to recruit PcG proteins sufficiently early in embryonic development to have an effect on the Ubx-lacZ gene, while other PRE fragments might be able to institute silencing only at later stages. Different affinities for PcG complexes could also account for the different abilities to create binding sites for PcG proteins on polytene chromosomes. However, the fact that the PF fragment, though able to induce variegation at a high rate and to bind PcG proteins on polytenes, failed to show any detectable PcG complex formation in the immunoprecipitation assays suggests that the nature and composition of the complexes and/or the mode and timing of their recruitment are likely to differ for the different fragments (Horard, 2000).

The in vitro experiments show that GAGAG-containing sequences (GAGAG is the consensus binding site for GAGA protein) are binding sites for PcG complexes and that the GAGA factor is associated with PcG complexes present in the nuclear extracts. Ion exchange chromatography of nuclear extracts confirms that, while PcG proteins elute over a broad range of salt concentrations, the in vitro binding activity constitutes a small minority and copurifies with the GAGA factor. The multiplicity and heterogeneity of PcG complexes present in nuclear extracts would not be detected in affinity-based purification schemes. In contrast, the GAGA factor, along with Polyhomeotic, is found in a multiprotein complex that binds in vitro to PRE regions corresponding to ours. The possibility that some other PcG protein also recognizes the GAGA consensus sequence cannot be excluded, but the association of the GAGA factor with PcG complexes shows that it is most likely involved in at least one mode of PcG binding to PRE DNA. Does this reflect a role for the GAGA factor in PcG silencing in vivo? The GAGA factor was originally identified as a transcription-stimulating factor both in vivo and in vitro and was classified as a trxG protein because it stimulated the activity of homeotic genes while its mutants had phenotypes indicative of homeotic insufficiency. However, some evidence suggests that it can also be associated with repressive functions. The GAGA factor, together with another activator, NTF-1, also binds to an 11-bp element required for the repression of tailless by the torso-dependent pathway. Evidence that it might be involved in PRE function consists of the fact that GAGA mutations decrease the silencing effected by the Fab-7 PRE. In the Ubx gene, the bxd PRE region contains the largest concentration of GAGA binding sites. 
If each continuous G(AG)n stretch is taken as one binding site, the 1-kb interval containing the core of the PRE contains 13 sites while the next highest concentration (8 sites) is found in a 1-kb region containing the bx PRE (not to be confused with the BX enhancer). Chromatin cross-linking and immunoprecipitation experiments confirm that these regions bind the GAGA factor in vivo. The results suggest that at these sites the GAGA factor is not an antagonist of silencing and is not simply an accessory or a facilitator of PcG complex formation but may, in concert with other factors, contribute to targeting PcG complexes (Horard, 2000).
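The site-counting rule described above (each continuous G(AG)n stretch taken as one binding site) can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not the procedure used by the authors: the function names, the minimum repeat number (default n = 2, i.e. GAGAG or longer) and the decision to scan both DNA strands are assumptions made here, and the example sequence is a toy string, not Ubx sequence.

```python
import re

# Complement table for reverse-complementing a DNA string.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper- or lower-case DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def count_gaga_stretches(seq, min_repeats=2, both_strands=True):
    """Count continuous G(AG)n stretches, each stretch counted as one site.

    min_repeats is the smallest n accepted (assumption: n >= 2, so the
    shortest stretch counted is GAGAG). The regex is greedy, so a long run
    such as GAGAGAGAG is reported once, not as several overlapping sites.
    """
    pattern = re.compile(r"G(?:AG){%d,}" % min_repeats)
    seq = seq.upper()
    total = sum(1 for _ in pattern.finditer(seq))
    if both_strands:
        # Assumption: a stretch on the opposite strand also counts as a site.
        total += sum(1 for _ in pattern.finditer(reverse_complement(seq)))
    return total


# Toy sequence (hypothetical): one GAGAGAG run on the top strand, one
# CTCTC run (GAGAG on the bottom strand) and one isolated GAG.
toy = "TTGAGAGAGTTTTCTCTCAAGAGTT"
sites_both = count_gaga_stretches(toy)
sites_top_only = count_gaga_stretches(toy, both_strands=False)
```

Applied to successive 1-kb windows of a real sequence, a count of this kind would give a site density per interval, but the numbers obtained depend on the choices of minimum stretch length and strand handling noted above.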

It was surprising, in view of the in vitro results, that the effects of Trl mutations on either miniwhite variegation or the silencing of the Ubx-lacZ reporter are sporadic and strongly dependent on the insertion site. One possible explanation is that, in vivo, the GAGA factor is only one of a set of DNA-binding recruiting proteins and that, while it contributes to the assembly of PcG complexes, it is not essential for it. Chromatographic fractionation of nuclear extracts indicates in fact that only a fraction of the PcG complexes present in embryonic extracts are associated with the GAGA factor. Furthermore, embryos contain an important maternal supply of the GAGA factor, which would mask the effect of a reduced zygotic contribution. Later, other recruiting factors might be involved. Finally, the results cannot exclude the possibility that, although the GAGA factor (1) is a component of PcG complexes, (2) can target their binding in vitro, and (3) is apparently important for the function of the Fab-7 PRE, it is not primarily involved in recruitment at the bxd PRE. Instead, its role might be primarily architectural. The GAGA factor binds to DNA as a multimer that recognizes clustered GAGA consensus sequences, and it has been argued that such binding would be expected to bend DNA in a way incompatible with nucleosome assembly. GAGA binding would then clear the PRE core of nucleosomes and bend it to facilitate interactions among other DNA-binding components (Horard, 2000).

The presence of GAGA binding sites alone appears to be sufficient in vitro to bind a PcG complex since not only the PRE fragments but also the Ubx promoter and the hsp70 promoter bind, though they have no known PcG silencing activity in vivo. In addition, a GAGA-containing oligonucleotide also binds efficiently to PcG complexes. Nevertheless, GAGA protein binding to a DNA sequence is not sufficient to recruit PcG complexes in vivo. Clearly the in vitro binding reaction does not reflect the in vivo activity. The most probable explanation of this discrepancy is that the binding detected in vitro is due to complexes that are preassembled in vivo and are then dissociated from the chromatin during the preparation of nuclear extracts. If the nature and composition of PcG complexes are templated by the PREs at which they are assembled, GAGA-containing PcG complexes would be efficiently targeted to GAGA binding sites in vitro while, in vivo, complex formation would require the de novo recruitment and assembly of PcG complexes, involving other DNA binding components or cofactors. This interpretation is favored because it would also explain the variable compositions of PcG complexes detected at different chromosomal sites. In vivo, the large majority of GAGA binding sites visible on polytene chromosomes are not associated with PcG binding, suggesting that only a small fraction of the GAGA protein is involved in PcG complexes. This interpretation also accounts for the fact that the LexA-GAGA protein cannot recruit PcG complexes to LexA binding sites. It is also noted that the target of PcG complexes in vivo is chromatin, not naked DNA. The presence of nucleosomes might normally increase the selectivity, allowing PcG complexes to assemble only at sites where other recruiting or architectural proteins are also bound (Horard, 2000).

In view of these results, the existence of GAGA sites at the Ubx promoter raises other possibilities. In the presence of a PRE, a GAGA factor bound at the Ubx promoter might participate in the silencing activity by interacting with GAGA-containing PcG complexes recruited at the PRE, mediating or contributing to promoter silencing. Both the hsp70 and hsp26 promoters are efficiently repressed by the presence of a PRE in the same transposon construct. The GAGA factor might contribute to silencing in these cases also. The miniwhite gene, which is also silenced by the PRE, does not contain typical clustered GAGA sites in its promoter region but only a few scattered sites in the transcribed region. The expression of the miniwhite gene is strongly dependent on the site of insertion and on distant enhancers within or outside of the transposon construct. The silencing of these enhancers might be in part responsible for the effect of the PRE on miniwhite expression. Alternatively, other proteins binding to the miniwhite promoter region might interact with PcG complexes (Horard, 2000).

The immunoprecipitation experiments also detected binding that is not competed by GAGA oligonucleotides with PRE fragments that do not contain consensus GAGA binding sites. This implies that other recognition sequences and other DNA-binding proteins are involved in these cases. The recent discovery that Pleiohomeotic (a Drosophila PcG protein homolog of the mammalian YY-1 factor) binds to DNA suggests that it might be one such recruiter of PcG complexes. There are in fact a number of putative Pho binding sites with the minimal consensus GCCAT in the PRE region: one in AB, two in BP (a third site is destroyed by the BglI cleavage), and three in the PF fragment. These bind Pho protein in vitro and are important for PRE activity in vivo. However, none are found in the HH or HA fragments; hence these presumably depend on other recruiting proteins. However, the PF fragment, though it contains three putative Pho sites, is conspicuous for its inability to bind PcG complexes in extracts, suggesting that Pho is either not present in the complex containing Pc and Psc or does not interact directly with it. The fact that the mammalian Pho homolog YY-1 causes sharp bends in the DNA raises the possibility that Pho too might serve a primarily architectural role without necessarily interacting directly with PcG complexes (Horard, 2000).
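A tally of putative Pho sites per fragment, of the kind quoted above, amounts to a search for the minimal consensus GCCAT on either strand. The Python sketch below shows one way this could be done. It is a hedged illustration only: the function names are hypothetical, scanning both strands is an assumption, and the fragment sequences are toy strings rather than the actual AB, BP or PF fragments.

```python
# Complement table for reverse-complementing the motif.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def find_motif_sites(seq, motif="GCCAT"):
    """Return (position, strand) for each exact match of motif in seq.

    Positions are 0-based offsets on the top strand. A "-" strand hit means
    the reverse complement of the motif (ATGGC for GCCAT) is present on the
    top strand, i.e. the motif itself lies on the bottom strand.
    """
    seq = seq.upper()
    motif = motif.upper()
    rc_motif = motif.translate(_COMPLEMENT)[::-1]
    sites = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        if window == motif:
            sites.append((i, "+"))
        elif window == rc_motif:
            sites.append((i, "-"))
    return sites


def tally_sites(fragments, motif="GCCAT"):
    """Map each fragment name to its number of motif matches."""
    return {name: len(find_motif_sites(s, motif)) for name, s in fragments.items()}


# Toy fragments (hypothetical sequences, for illustration only).
toy_fragments = {
    "frag1": "AAGCCATTTATGGCAA",  # one top-strand and one bottom-strand match
    "frag2": "ACGTACGT",          # no match
}
toy_sites = find_motif_sites(toy_fragments["frag1"])
toy_tally = tally_sites(toy_fragments)
```

A five-base consensus is expected to occur frequently by chance, so matches of this kind identify only candidate sites; as the text notes, binding has to be confirmed experimentally.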

Although PF does not contain GAGA sites, it is almost as effective in inducing PcG-dependent variegation of the miniwhite gene as the BP fragment and it can generate new PcG binding sites at the site of insertion on polytene chromosomes. Yet PF cannot maintain repression of the Ubx-lacZ reporter gene in embryos. One possible explanation for these results is that PF is the target for yet another PcG recruiting mechanism that either functions poorly under these in vitro binding conditions or depends on proteins that are not present in the embryonic extracts. The fact that the PF fragment can recruit silencing complexes in larval cells but cannot maintain repression in the embryo would be consistent with a requirement for proteins present only at later developmental stages. Another possible explanation is that PF does interact with certain PcG complexes which do not include PC or PSC and hence escaped detection (Horard, 2000).

The picture of the PRE that emerges from these experiments is that of a mosaic of multiple interaction sites which may require different DNA-binding proteins to recruit PcG components. A similar conclusion has been reached, based on deletions that abolish the activity of the bxd PRE and by an in vitro binding approach similar to that reported in this study. GAGA sites are associated with some PREs but not others (e.g., the Mcp PRE). If the GAGA factor acts as a recruiting protein, it is most likely only one of many possible recruiters. Different recruiters might interact specifically with different PcG proteins, accounting for the fact that the binding sites for different PcG proteins on polytene chromosomes do not completely coincide. Nevertheless, the ability of PcG proteins to interact with one another or to enter into a chain of recruitment means that, in most cases, strong binding sites for one PcG protein will be able to recruit at least to some degree the other PcG proteins. The difference between direct and indirect recruitment may be responsible for the fact that a strong chromosomal binding site for one PcG protein is sometimes a weak binding site for another PcG protein (Horard, 2000 and references therein).

A functional dissection of a Polycomb response element (PRE) from the iab-7 cis-regulatory domain of the Drosophila bithorax complex (BX-C) has been undertaken. Previous studies mapped the iab-7 PRE to an 860-bp fragment located just distal to the Fab-7 boundary. Located within this fragment is an ~230-bp chromatin-specific nuclease-hypersensitive region called HS3. HS3 has been shown to be capable of functioning as a Polycomb-dependent silencer in vivo, inducing pairing-dependent silencing of a mini-white reporter. The HS3 sequence contains consensus binding sites for the GAGA factor, a protein implicated in the formation of nucleosome-free regions of chromatin, and Pleiohomeotic (Pho), a Polycomb group protein that is related to the mammalian transcription factor YY1. GAGA and Pho interact with these sequences in vitro, and the consensus binding sites for the two proteins are critical for the silencing activity of the iab-7 PRE in vivo (Mishra, 2001).

The iab-7 PRE was initially identified in transgene assays using fragments from the iab-6 to -7 region of BX-C. These studies showed that an 860-bp iab-7 fragment can establish and maintain Pc-G-dependent silencing complexes in two different assays: the pairing-sensitive silencing of mini-white and the maintenance of parasegmentally restricted patterns of Ubx:LacZ expression. At the proximal end of this 860-bp fragment is the ~230-bp nuclease-hypersensitive region, HS3. Since Pc-G-dependent silencing is generally believed to involve a marked reduction in DNA accessibility, not enhanced accessibility, it is important to determine whether this nucleosome-free region of chromatin plays any role in the silencing activity of the iab-7 PRE. Two lines of evidence argue that sequences in HS3 are critical for silencing activity: (1) it has been shown that a small 260-bp fragment spanning HS3 is sufficient to mediate Pc-G-dependent silencing activity in the mini-white assay; (2) site-directed mutagenesis experiments indicate that sequences essential for silencing activity map to HS3 (Mishra, 2001).

An attractive hypothesis is that HS3 provides accessible target sequences for one or more sequence-specific DNA binding proteins. In this model, these DNA binding proteins would interact with their cognate sequences in HS3 and nucleate the assembly of Pc-G silencing complexes by recruiting Pc-G proteins. It seems likely that nucleosome-free regions of chromatin play a similar role in the functioning of other PREs. For example, the three other known PREs in the Abd-B cis-regulatory region, the iab-8 PRE, the iab-6 PRE, and Mcp, all map to small DNA fragments that contain one or more prominent nuclease-hypersensitive sites. Of these, the Mcp PRE has been characterized in the most detail. Like the iab-7 PRE, the nuclease-hypersensitive region of Mcp is essential for its silencing activity. However, it is not sufficient on its own to direct the assembly of functional silencing complexes, and adjacent proximal or distal flanking sequences are required. The chromatin structure of the Mcp element at ectopic sites has also been examined. (A ftz-LacZ transgene was used in this analysis. Unfortunately, the mini-white transgenes are not suitable for examining the chromatin structure of the iab-7 PRE fragments.) The transgene Mcp element has a nuclease-hypersensitive region of approximately the same size and position as that of the endogenous element (Mishra, 2001).

These experiments also indicate that two DNA binding proteins, the GAGA factor and Pho, interact with target sites in HS3 and play a critical role in the silencing activity of the iab-7 PRE. The GAGA factor was initially identified as a potent activator of transcription in nuclear extracts and has generally been thought to be involved in the activation rather than the repression of gene expression. The stimulatory activity of the GAGA factor appears to be due to its ability to prevent histones and other repressive proteins from associating with promoters that have GAGA binding sites. In in vitro chromatin assembly experiments the GAGA factor facilitates the formation of a nucleosome-free region of chromatin across the hsp70 promoter. In vivo, mutations in the GAGA binding sites of heat shock promoters reduce promoter accessibility and suppress transcription. Further support for a role in transcriptional activation comes from genetic studies on mutations in Trl, the gene encoding the GAGA protein. Trl mutations exhibit genetic interactions with homeotic genes in BX-C that are hallmarks of the trx-G genes, not the Pc-G genes. Additionally, the expression of several pair rule genes that have GAGA binding sites in their promoters is severely reduced in embryos from Trl mutant mothers (Mishra, 2001).

Although it is now well established that the GAGA factor promotes the transcription of many different genes, the results argue that this protein must also play an essential role in the silencing activity of the iab-7 PRE. Several lines of evidence support this conclusion: (1) the silencing activity of the iab-7 PRE is impaired by Trl mutations; (2) the GAGA protein binds to the iab-7 PRE both in vivo and in vitro; (3) mutations in the GAGA binding sites of the iab-7 PRE eliminate GAGA protein binding in nuclear extracts and abrogate silencing activity in vivo (Mishra, 2001).

What role does the GAGA factor play in the silencing activity of the iab-7 PRE? At this point the most plausible hypothesis is that the GAGA factor is required to generate a nucleosome-free region over HS3. In this view, the function of the GAGA factor would be analogous to its presumed role in gene activation, namely, to ensure that sequences in HS3 are accessible for the assembly of large multicomponent protein complexes. When the GAGA protein is reduced as in Trl mutants or when the GAGA binding sites are mutant, it is suggested that the HS3 nucleosome-free region will not be formed properly. As a consequence, target sequences for the DNA binding proteins (such as possibly Pho) that are actually responsible for recruiting the large Pc-G silencing complexes to the PRE would be unavailable. While this hypothesis is consistent with the well-documented activities of the GAGA factor at promoters both in vitro and in vivo, the possibility that GAGA is not only required for the formation of HS3 but also plays a more active role in recruiting Pc-G proteins to the iab-7 PRE cannot be excluded. Supporting this hypothesis, it has been shown that GAGA binding is required for the in vitro association of Pc-G complexes with fragments from the bxd PRE (Mishra, 2001).

Unlike that of Trl, the phenotypes of pho mutants are similar to those seen for other Pc-G genes. Animals homozygous for loss-of-function alleles die at the pupal stage and exhibit homeotic transformations of legs and abdomen. The late lethal phase is due to a substantial maternal contribution, and mutant embryos lacking a maternal source of wild-type Pho die with severe homeotic transformations and other developmental defects. The homeotic transformations evident in mutant animals indicate that pho is likely to have a direct role in Pc-G silencing. For the iab-7 PRE, the results argue that silencing activity depends on the binding of the Pho protein to the two target sites in HS3. Both sites seem to be important, since silencing activity is compromised when one site is deleted. Whereas it is supposed that the major function of the GAGA factor is to ensure that sequences in HS3 are accessible to other proteins, the phenotypic effects of pho mutations suggest that it plays a more active role in silencing. A plausible hypothesis is that it functions (perhaps together with as yet unidentified factors) to recruit components of the silencing machinery to the PRE, such as Polycomb or Sex comb on midleg, which do not appear to interact directly with DNA. Supporting the possibility that other factors besides Pho play a critical role in recruiting Polycomb group complexes, a PRE fragment from iab-2, which contains Pho binding sites and which is able to silence mini-white, has been shown to be insufficient to confer full Pc-G maintenance activity. Moreover, mutations in the two Pho binding sites have only a minor effect on the maintenance activity of the 860-bp iab-7 PRE fragment in an iab-7 Ubx-LacZ assay system. Clearly it will be of interest to identify these other factors (Mishra, 2001).

Silencing of homeotic gene expression requires the function of cis-regulatory elements known as Polycomb Response Elements (PREs). The MCP silencer element of the Drosophila homeotic gene Abdominal-B has been shown to behave as a PRE and to be required for silencing throughout development. Using deletion analysis and reporter gene assays, a 138 bp sequence has been defined within the MCP silencer that is sufficient for silencing of a reporter gene in the imaginal discs. Within the MCP138 fragment, there are four binding sites for the Pleiohomeotic protein (Pho) and two binding sites for the GAGA factor, encoded by the Trithorax-like gene. The Pho and Trl proteins bind to these sites in vitro. Mutational analysis of Pho and Trl binding sequences indicates that these sites are necessary for silencing in vivo. Moreover, silencing by MCP138 depends on the function of Trl, and on the function of the PcG genes, including pleiohomeotic. Deletion and mutational analyses show that, individually, either Pho or Trl binding sites retain only weak silencing activity. However, when both Pho and Trl binding sites are present, they achieve strong silencing. A model is presented in which robust silencing is achieved by sequential and facilitated binding of Pho and Trl (Busturia, 2001).

How does Trl or perhaps another GAGA binding protein contribute to the silencing by MCP, and what is its relationship to the Pho protein function? Two models are suggested to explain the relationship that leads to strong silencing. These models are based on the following observations. (1) Pho binding sites by themselves show little silencing activity (MCP1 and MCP7* constructs). (2) Trl or some other protein that binds to MCP can weakly recruit silencing complexes in the absence of Pho binding (5MPho construct). (3) When present together, Trl and Pho binding sites exhibit robust silencing activity (MCP7 construct). In the first model, Trl and Pho bind to the MCP silencer in a sequential order. One version would be that Trl binding is absolutely required for binding or activity of Pho. Trl may open up chromatin at MCP, allowing binding of Pho. Upon binding, Pho may recruit PcG silencing complexes, although there is still little evidence that this happens. Trl has been shown to induce DNase I hypersensitive sites, or nucleosome-free regions, and this may create a prerequisite condition for Pho to bind to its recognition sites. There is indeed a DNase hypersensitive region associated with MCP that includes the location of the Trl binding site (Busturia, 2001).

In a second version of the model, Pho acts as a facilitator of Trl binding by creating some pre-condition, perhaps by bending DNA as YY1 does. Since Pho binding sites are not absolutely required for MCP silencing activity, Trl presumably can bind weakly to MCP in the absence of Pho. Enhanced binding of Trl leads to increased recruitment of silencing complexes. Trl bound to MCP may recruit PcG silencing complexes by directly interacting with PC or other members of PcG complexes. Alternatively, Trl could first recruit SIN3 histone deacetylation complexes through its interaction with SAP18, which then might generate a chromatin state favorable for PcG complex binding. Whichever version of the model is correct, the important feature of the model is the sequential recruitment of DNA binding proteins, Trl and Pho, to MCP. Binding of one protein creates a condition favorable to the binding of a second protein, eventually leading to the recruitment of PcG complexes. Note that the requirement of Trl and Pho proteins applies to MCP silencing, but not necessarily to all PREs. Other PREs may use other combinations of proteins. This model is analogous to Swi5 protein binding to the yeast HO promoter and recruiting the chromatin remodeling complex Swi/Snf. Swi/Snf in turn recruits the histone acetylase complex SAGA, eventually leading to the binding of the transcription factor SBF to the HO promoter. In such a sequential recruitment model, compromising one step in the sequence may become rate limiting so that combining two mutations that disable two different steps may not necessarily lead to synergistic effects. This may explain why no synergistic effects are observed when Trl and PcG mutations are combined. In the second model, Trl and Pho bind to MCP independently of one another. Each protein may induce a unique chromatin modification that, together, can have a positive synergistic effect on the recruitment of PcG silencing complexes (Busturia, 2001).

During late embryogenesis, the expression domains of homeotic genes are maintained by two groups of ubiquitously expressed regulators: the Polycomb repressors and the Trithorax activators. It is not known how the activities of the two maintenance systems are initially targeted to the correct genes. Zeste and GAGA are sequence-specific DNA-binding proteins that are Trithorax group activators of the homeotic gene Ultrabithorax. Zeste and GAGA DNA-binding sites at the proximal promoter are also required to maintain, but not to initiate, repression of Ubx. Furthermore, the repression mediated by the Zeste DNA-binding sites is abolished in zeste null embryos. These data imply that Zeste and probably GAGA mediate Polycomb repression. A model is presented in which the dual transcriptional activities of Zeste and GAGA are an essential component of the mechanism that chooses which maintenance system is to be targeted to a given promoter (Hur, 2002).

Zeste, GAGA and a third transcription factor, NTF-1 (Grainy head), activate promoter constructs of the Ubx gene in embryos via an intermingled cluster of sites between nucleotides -200 to -31. However, the constructs that were used in these experiments contain only a small subset of the Ubx cis regulatory region, and while they reproduce many features of Ubx expression, they do not respond to Polycomb repression when inserted at many chromosomal locations. Consequently, they have not permitted a rigorous analysis of the role of the proximal promoter factors in maintaining repression. To address this question, larger constructs have been used that contain the 22 kb of DNA upstream of the Ubx mRNA start site. These constructs do not suffer from significant position effect variation; they more closely approximate the expression pattern of the endogenous Ubx gene than the shorter constructs; they maintain efficient repression in late embryos as shown by the lack of β-galactosidase reporter gene expression in more anterior and posterior regions, and they are genetically under the control of PcG genes (Hur, 2002).

Deletion of nucleotides -200 to -31 essentially abolishes transcription from the large Ubx promoter constructs, indicating a crucial role for factors binding to the proximal promoter. To determine the role of each factor separately, three constructs were prepared, each containing binding sites for either Zeste, GAGA or NTF-1 inserted between the deletion end points of the above construct. Importantly, biochemical, in vivo u.v. crosslinking, and genetic experiments strongly suggest that the DNA-binding sites used in these constructs are recognized only by their cognate factor, and not by any other sequence-specific DNA-binding activities. Binding sites for each factor separately activate transcription of the large constructs during late embryogenesis. Strikingly, constructs containing only GAGA- or Zeste-binding sites at the proximal promoter are not expressed in the anterior or posterior of the embryo, whereas constructs bearing only NTF-1 sites are strongly transcribed in these terminal regions (Hur, 2002).

Ectopic expression of Ubx in anterior and posterior regions is generally caused by a failure of the initiating repressors or the Polycomb maintenance system. One interpretation of this result is that Zeste and GAGA are required for at least one form of repression, while NTF-1 is not. It is also possible, however, that Zeste and GAGA are not repressors. Instead, it may be that they are unable to activate expression in anterior or posterior regions, even though they are expressed at similar levels throughout the embryo. To distinguish between these two possibilities, constructs were examined that contained either Zeste and NTF-1 sites or GAGA and NTF-1 sites. These constructs are expressed in the central region of the embryo; but, importantly, they are not significantly expressed in anterior or posterior regions. Since NTF-1 can activate Ubx transcription in these terminal regions, the absence of terminal expression is consistent with GAGA and Zeste directly repressing transcription in addition to their activation function (Hur, 2002).

To establish decisively if Zeste and GAGA are repressors, it was desirable to use a genetic test. Unfortunately, GAGA is a lethal gene and a broadly acting regulator required for expression of transcription factors that regulate Ubx in early embryos. Thus, it has not been possible to determine genetically whether GAGA is a direct repressor of Ubx. By contrast, zeste is a largely redundant gene. zeste null embryos and flies are essentially wild type, and the endogenous Ubx gene is expressed normally in these animals; but because the 22UZ transgenes lack the cis regulatory elements through which factors that redundantly share the function of zeste act, these transgenes should be regulated by zeste (Hur, 2002).

Consistent with this idea, transgenes containing only Zeste sites at the proximal promoter fail to express in zeste mutant embryos, whereas constructs containing only GAGA or NTF-1 binding sites are expressed in this same genetic background. Thus, this genetic experiment confirms that Zeste bound at the proximal promoter is required to activate transcription of the 22UZ constructs in the normal domain of Ubx expression. To test the role of Zeste in repression, constructs containing binding sites for both Zeste and NTF-1 at the proximal promoter were compared in wild type and zeste mutant embryos. In the normal domain of Ubx expression, these constructs are expressed at similar levels in mutant and wild-type embryos. Importantly, these constructs are derepressed in anterior and posterior regions of embryos lacking zeste. Thus, Zeste actively represses transcription in terminal regions of the embryo via binding sites at the proximal promoter (Hur, 2002).

The PcG genes are an essential part of the system that maintains repression of the endogenous Ubx gene. To confirm that these genes also act on these transgenes, the 22UZ Zeste and 22UZ GAGA constructs were crossed into PcG mutant embryos. Both transgenes are derepressed in late stage embryos lacking the Polycomb gene. Similar results were obtained in embryos lacking another PcG gene, extra sex combs. Thus, Zeste -- and probably also GAGA -- act together with the Polycomb system to maintain repression of Ubx (Hur, 2002).

It is suspected that GAGA and Zeste have redundant, overlapping functions in maintaining repression because the 22UZ Native construct, which contains Zeste, GAGA and NTF-1 sites, is not derepressed in zeste mutant embryos, which contrasts with the behavior of the 22UZ ZESTE/NTF-1 construct. Such redundancy in repression would parallel the known redundancy between these two transcription factors in activating Ubx in the central portions of the animal, and helps explain the previous lack of evidence that Zeste and GAGA are repressors (Hur, 2002).

The data presented in this paper are consistent with the earlier genetic data that suggested that some trxG and PcG proteins may have dual activities. Further support for this idea comes from recent biochemical experiments that have shown that GAGA is complexed with two PcG proteins in Drosophila nuclear extracts and Zeste is part of a multisubunit complex that contains Polycomb. In addition, PcG proteins are frequently associated in vivo with promoter regions that include Zeste or GAGA DNA recognition sites, including the Ubx proximal promoter examined in this paper. Most PcG proteins do not recognize specific DNA sequences; thus, the interaction with Zeste and GAGA may serve to recruit PcG proteins to promoters (Hur, 2002).

But is it essential that some proteins, such as Zeste and GAGA, participate in both repression and activation, or is it mere coincidence? This joint participation may be essential. At the transition between the initiating repressors and the Polycomb system, one possibility is that Polycomb proteins are recruited to or activated on only those genes that are bound by initiating repressors; the initiating repressors may physically bind to PcG proteins to recruit them. However, Polycomb repression can be established on Ubx promoter constructs that lack initiating repressor elements, provided that initiating enhancer elements are also absent. In other words, at the transition between the establishment and maintenance of the Ubx expression pattern, the Polycomb system reads the absence of activation, rather than the presence of repression or repressors (Hur, 2002).

Repression and activation of the expression of homeotic genes are maintained by proteins encoded by the Polycomb group (PcG) and trithorax group (trxG) genes. Complexes formed by these proteins are targeted by PcG or trxG response elements (PREs/TREs), which share binding sites for several of the same factors. The repressive class II PcG complex PRC1 has more than 30 protein subunits, including 5 that have been genetically defined as PcG proteins: Polycomb (Pc), Posterior sex combs (PSC), Polyhomeotic (Ph), dRING1, and, at substoichiometric levels, Sex comb on midleg (SCM). GAGA factor and Zeste bind specifically to PREs/TREs and have been shown to act as both activators and repressors. Purified proteins and complexes have been reconstituted from recombinant subunits to characterize the effects of GAGA and Zeste proteins on PcG function using a defined in vitro system. Zeste directly associates with the PRC1 core complex (PCC) and enhances the inhibitory activity of this complex on all templates, with a preference for templates with Zeste binding sites. GAGA does not stably associate with PCC, but nucleosomal templates bound by GAGA are more efficiently bound and more efficiently inhibited by PCC. Thus Zeste and GAGA factor use distinct means to increase repression mediated by PRC1 (Mulholland, 2003).

To demonstrate directly that GAGA enhances template recognition by PCC, a recruitment assay was developed. Ubx5S DNA, consisting of a portion of the Ubx promoter with known high-affinity Zeste binding sites, was biotinylated, assembled into chromatin, and immobilized on streptavidin-coated magnetic beads. Nucleosomal arrays were incubated with either buffer only or GAGA for 15 min at 30°C prior to addition of PCC. Following a 20-min binding period, array-bound beads and all material bound to them were separated magnetically from unbound protein. Western blot analysis demonstrates that PCC and GAGA components bound to the nucleosomal array-bead complex and that PCC association is increased on arrays bound by GAGA. GAGA and PCC bind only minimally to unconjugated beads. Increasing amounts of competitor nucleosomal array result in a loss of PCC association with the array-bead complex, but do not affect PCC association with the GAGA array-bead complex. These results demonstrate that PCC has a higher affinity for nucleosomal templates bound by the GAGA factor. Prebinding the GAGAΔPOZ protein does not lead to increased recruitment of PCC (Mulholland, 2003).

These experiments show that a template prebound by GAGA factor is more efficiently bound and repressed by PCC than an unbound template. GAGA factor might recruit or stabilize PCC binding by directly interacting with its subunits, or GAGA might alter the template in a manner that favors PCC binding. GAGA factor oligomers have been shown to be able to bind multiple templates simultaneously, bringing them together. Binding by GAGA factor might create a network of templates that is more efficiently bound and recognized by PCC than an individual template might be (Mulholland, 2003).

In the defined in vitro system used in this study, both Zeste and GAGA factor can enhance the activity of a PRC1 core complex to repress remodeling of a nucleosomal template. Zeste binds directly to these PcG proteins to generally increase their repressive function, and prebinding GAGA factor to the template recruits the PRC1 core to that template. Previous genetic and mechanistic studies have suggested that regulation of PRC1 repression is a complicated process involving targeting by sequence-specific DNA binding proteins, covalent modification of histone tails, and perhaps targeting by siRNAs. The differences in function of GAGA and Zeste suggest that their role in PcG repression is more complex than previously suspected. One hypothesis for how sequence-specific factors establish PcG repression is that they create a binding surface with greater affinity for PRC1. Although these experiments with GAGA factor are consistent with this hypothesis, the experiments with Zeste suggest that additional mechanisms contribute to targeting by sequence-specific factors. Zeste binds tightly to the core components of PRC1 and enhances their activity even when templates do not contain targeting sequences. This might be important in facilitating the ability of PRC1 repression to spread away from PRE elements, and thus may facilitate the repression of large domains by PRC1 (Mulholland, 2003).

Although originally identified as an activator, Zeste can also function in vivo as a repressor. For example, in zeste mutant flies, a transgene containing the Ubx promoter modified to contain only Zeste binding sites is derepressed in the anterior and posterior segments of the embryo. Two distinct substitution mutants of zeste express proteins that repress rather than activate the white gene but retain activator function required for transvection, suggesting that Zeste has both inherent activation and repression activities that can be separated. It is possible that, when incorporated into PRC1, Zeste is configured so as to only display surfaces responsible for repression (Mulholland, 2003).

It is widely believed that PcG activity is targeted and maintained throughout the course of development by multiple systems. For instance, ESC/E(z) can methylate H3 K27, and PC can bind to this modification, suggesting that a methylation mark might also play a key role in targeting PRC1 and/or in regulating the spread of PRC1 activity. The combined effects of factors such as GAGA that target PRC1 activity, factors such as Zeste that augment PRC1 activity, and other systems such as those for covalent modification of histones might be necessary for faithful maintenance of PRC1 association with a template. It is likely that further mechanisms, such as RNAi, also contribute (Mulholland, 2003).

These multiple mechanisms might be additive or synergistic. Additionally, redundancy between them would provide a fail-safe scheme for maintenance of repression. For instance, if methylation at H3 K27 and increased function by Zeste each were sufficient to establish repression by PRC1, then repression could be established even if one or the other were to fail. Consistent with this hypothesis of redundant function, experiments were performed in which both Zeste and GAGA were present; no significant additive or synergistic effects on PCC function were seen. The establishment of defined in vitro systems, such as used here, will aid in unraveling the connections between the different mechanisms that contribute to regulation of PcG function (Mulholland, 2003).

The role of the GAGA factor [encoded by the Trithorax-like (Trl) gene] in the enhancer-blocking activity of Frontabdominal-7 (Fab-7), a domain boundary element from the Drosophila melanogaster bithorax complex (BX-C), was analyzed. One of the three nuclease hypersensitive sites in the Fab-7 boundary, HS1, contains multiple consensus-binding sequences for the GAGA factor, a protein known to be involved in the formation and/or maintenance of nucleosome-free regions of chromatin. GAGA protein has been shown to localize to the Fab-7 boundary in vivo, and it recognizes sequences from HS1 in vitro. Using two different transgene assays it has been demonstrated that GAGA-factor-binding sites are necessary but not sufficient for full Fab-7 enhancer-blocking activity. Distinct GAGA sites are required for different enhancer-blocking activities at different stages of development. The enhancer-blocking activity of the endogenous Fab-7 boundary is sensitive to mutations in Trithorax-like (Schweinsberg, 2004).

Assuming that GAGA factor interactions with the consensus-binding sites in HS1 are important to Fab-7 activity, one question of interest is whether GAGA plays a direct or indirect role in boundary function. Since GAGA-binding sites have been shown to be required for the enhancer-blocking activity of other fly elements in addition to Fab-7, one must consider the possibility that GAGA plays a direct role in boundary function analogous to that of, for example, the Su(Hw) protein. However, this view is difficult to reconcile with the fact that GAGA is required for the functioning of other elements unrelated to boundaries such as promoters, enhancers, and PREs, not to mention its role in centromeric heterochromatin and chromosome segregation. Also arguing against a direct role in boundary function, it was found that enhancer-blocking activity cannot be reconstituted by multimerizing GAGA-factor-binding sites. In contrast, one known activity of the GAGA factor that would account for its ability to participate in the functioning of such a diverse array of regulatory elements is the formation and/or maintenance of nucleosome-free regions of chromatin. In this model, GAGA factor binding to sites in HS1 would ensure that this 400-bp sequence is nucleosome free and that target sequences within HS1 are readily accessible for the binding of other factors that actually confer boundary function. In this case, mutations in the HS1 GAGA sites would disrupt boundary function indirectly because of difficulties in generating a nucleosome-free region of chromatin, which is fully accessible for these other boundary proteins. While the idea is favored that a key function of the GAGA protein is to ensure DNA accessibility, it is reasonable to think that GAGA may also play a more central role in Fab-7 boundary function because of its ability to participate in protein:protein interactions. The N terminus of the GAGA protein has a BTB/POZ domain that is present in numerous other proteins. Moreover, depending upon the particular protein partners, the GAGA factor appears to have rather different activities. Thus, a plausible hypothesis is that GAGA can have boundary functions when it is combined with one set of proteins, transcriptional activation functions when it is combined with another, and Pc-G-silencing functions when combined with yet a third set of proteins. This would explain why no boundary function was observed with the multimerized GAGAG sites. Moreover, if this hypothesis is correct, then it would be reasonable to think that the partners for GAGA protein bound at sites 3-4 are likely to be different from the partners for GAGA protein bound at sites 1-2 or, presumably, 5-6. Further studies will be required to identify these putative partners and to understand how they function together with the GAGA factor to provide a scaffold for building a boundary element (Schweinsberg, 2004).

Trithorax-like functions in regulating Fushi tarazu

GAGA factor (also known as Trithorax-like) is known to remodel the chromatin structure in concert with nucleosome-remodeling factor NURF in a Drosophila embryonic S150 extract. The promoter region of fushi tarazu carries several binding sites for GAGA factor, which triggers chromatin remodeling. Deletion of the GAGA factor-binding sites in the ftz promoter is known to markedly reduce reporter gene expression. The striped expression of ftz is abolished by a mutation of the Trithorax-like gene, which encodes GAGA factor. Transcriptional activation of the ftz gene is observed when a preassembled chromatin template is incubated with GAGA factor and the S150 extract (Okada, 1998).

GAGA factor does not activate transcription on a naked DNA template. GAGA factor has been reported to activate transcription on naked DNA templates in crude extracts but not in transcription systems reconstituted from purified components. GAGA factor-mediated transcriptional activation on a naked DNA template requires the presence of a nonspecific DNA-binding protein, suggesting that GAGA factor functions as an antirepressor by preventing nonspecific inhibitory proteins such as histone H1 from binding to DNA. It is therefore possible that GAGA factor may activate transcription by an antirepressor mechanism in the transcription assay used in the current study. To test this possibility, experiments were carried out starting from a naked DNA template. Since the amount of S150 extract in the preincubation reactions is one order of magnitude lower than that required for full assembly of the chromatin structure on the template, nucleosomes are barely detectable by supercoiling assay after preincubation. In contrast to the preassembled chromatin, little activation (up to 1.5-fold) of ftz transcription is observed after preincubation of the naked template DNA with GAGA factor and the S150 extract. This indicates that only trace levels of activation may be caused by elimination of nonspecific DNA-binding proteins in the presence of GAGA factor. These results also suggest that the GAGA factor-mediated transcriptional activation occurs specifically on the chromatin template. These observations suggest that GAGA factor-mediated chromatin remodeling is required for the proper expression of ftz in vivo (Okada, 1998).

The S150 extract contains a nucleosome-remodeling factor (NURF) that acts with GAGA factor to disrupt the ordered array of nucleosomes near the GAGA factor-binding sites. The chromatin structure within the ftz promoter is specifically disrupted by incubation of the preassembled chromatin with GAGA factor and the S150 extract. Micrococcal nuclease assays show that the nucleosome structure surrounding nucleotide 350 in front of the TATA element (and the TATA element itself) of ftz are disrupted by incubation with GAGA factor and the S150 extract. A restriction enzyme assay demonstrates that the AvaII site at 9, the FspI sites at 90 and 317, and the PstI site at 267 on the ftz chromatin template are more susceptible to digestion after incubation with GAGA factor and the S150 extract. Base substitutions of all four GAGA sequences in the ftz promoter at 360, 348, 158, and 46 are required to completely suppress the GAGA factor-mediated chromatin remodeling. These results indicate that chromatin is remodeled throughout the proximal region of the ftz promoter. Both transcriptional activation and chromatin disruption are blocked by an antiserum raised against ISWI or by base substitutions in the GAGA factor-binding sites in the ftz promoter region. These results demonstrate that GAGA factor- and ISWI-mediated disruption of the chromatin structure within the promoter region of ftz activates transcription on the chromatin template. In vitro transcription studies have revealed that activation of ftz by FTZ-F1 requires two coactivators, termed MBF1 and MBF2. MBF1 is a bridging molecule that interconnects FTZ-F1 and TATA-binding protein and recruits positive cofactor MBF2 to a promoter carrying the FTZ-F1-binding site. MBF2 activates transcription through its contact with TFIIA. This allows the selective activation of ftz in a FTZ-F1 binding site-dependent manner. 
It is most likely that the GAGA factor-mediated chromatin remodeling in the proximal region of the ftz promoter is a prerequisite for the formation of active complexes containing FTZ-F1, MBF1, MBF2, TFIIA, and TBP (Okada, 1998).

The intrinsic enhancer-promoter specificity and chromatin boundary/insulator function are two general mechanisms that govern enhancer trafficking in complex genetic loci. They have been shown to contribute to gene regulation in the homeotic gene complexes from fly to mouse. The regulatory region of the Scr gene in the Drosophila Antennapedia complex is interrupted by the neighboring ftz transcription unit, yet both genes are specifically activated by their respective enhancers from such juxtaposed positions. A novel insulator, SF1, has been identified in the Scr-ftz intergenic region that restricts promoter selection by the ftz-distal enhancer in transgenic embryos. The enhancer-blocking activity of the full-length SF1, observed in both embryo and adult, is orientation- and enhancer-independent. The core region of the insulator, which contains a cluster of GAGA sites essential for its activity, is highly conserved among other Drosophila species. SF1 may be a member of a conserved family of chromatin boundaries/insulators in the HOM/Hox complexes and may facilitate the independent regulation of the neighboring Scr and ftz genes, by insulating the evolutionarily mobile ftz transcription unit (Belozerov, 2003).

Chromatin boundary function has been shown to be important for gene regulation in the Hox clusters from fly to mouse. However, the protein components involved in the Hox boundary activity, as well as the mechanism of the boundary function are unknown. Multiple GAGA binding sites have been identified that are essential for the enhancer-blocking activity of the SF1 core insulator. Drosophila GAGA factor may be involved in the SF1 boundary function. Similar findings that GAGA sites are critical for the function of Mcp1 and Fab7 boundary elements from the BX-C have been reported recently. These observations suggest that the chromatin insulators from the ANT-C and the BX-C may share common components and mechanisms, and belong to a family of conserved boundary elements that regulate enhancer-promoter interactions in the Hox complexes (Belozerov, 2003).

It is interesting that the GAGA factor is implicated in the boundary activity in the Drosophila Hox clusters. The GAGA factor has been known to regulate transcription by recruiting chromatin remodeling and transcription initiation complexes. However, its role in boundary/insulator activity may not be attributed to its ability to activate transcription but rather to the ability of this protein to forge links among distant DNA elements through its BTB domain. This property of the GAGA factor is consistent with the looping models proposed for the insulator/boundary mechanism (Belozerov, 2003).

Trithorax-like regulation of HSP70

A considerable effort has been expended trying to understand the role of TRL/GAGA in gene activation. Two issues are of interest, the first having to do with the remodeling of the promoter involving GAGA and remodeling factors. A second area of investigation looks at the interaction of GAGA with polymerase. RNA polymerase II is transcriptionally engaged but paused approximately 25 nucleotides from the start site of the hsp70 gene of Drosophila in uninduced (non-heat-shocked) flies. Sequences that reside upstream of the hsp70 TATA element specify the formation of a paused polymerase on the 5' end of this gene. Within this region are multiple copies of the GAGA element, which is known to bind a constitutively expressed factor. This element appears to play a role in generating the pause (although, see Weber, below). In the absence of much of this upstream region, hsp70 sequences in the vicinity of the transcriptional start and pause site participate in specifying the pause. Deletions of the pause site reduce the level of paused polymerase but do not lead to constitutive transcription. However, a connection between transcription and pausing is seen. The level of paused polymerase on the promoter correlates with the promoter's potential to direct heat-induced transcription (Lee, 1992).

Since purified GAGA factor and TFIID interact similarly with the hsp70 and histone H3 promoters, the architecture of the endogenous H3 promoter has been analyzed to determine what interactions might be needed to establish a potentiated state containing a pause, as seen in the hsp70 promoter. Despite the detection of TFIID and GAGA on the H3 promoter, no paused polymerase of the kind seen in the hsp70 promoter is evident. In addition, no proteins appear to interact with the transcription start. These results suggest that the GAGA factor and TFIID are not sufficient to establish a potentiated state containing paused polymerase and that TFIID interactions downstream from the TATA element could be important for pausing (Weber, 1995).

The generation of an accessible heat shock promoter in chromatin in vitro requires the concerted action of the GAGA transcription factor and NURF, an ATP-dependent nucleosome remodeling factor. NURF is composed of four subunits and is biochemically distinct from the SWI2/SNF2 multiprotein complex, a transcriptional activator that also appears to alter nucleosome structure. The 140 kDa subunit of NURF can be identified as ISWI, previously of unknown function but highly related to SWI2/SNF2 only in the ATPase domain. The ISWI protein is localized to the cell nucleus and is expressed throughout Drosophila development at levels as high as 100,000 molecules/cell. The convergence of biochemical and genetic studies on ISWI and SWI2/SNF2 underscores these ATPases and their close relatives as key components of independent systems for chromatin remodeling (Tsukiyama, 1995 a and b).

Three promoter sequences influence the access of HSF to its binding sites: the GAGA element, sequences surrounding the transcription start site, and a region in the leader of hsp70 where RNA polymerase II arrests during early elongation. The GAGA element has been shown to disrupt nucleosome structure. Because the two other critical regions include sequences that are required for stable binding of TFIID in vitro, the in vivo occupancy of the TATA elements was examined in the transgenic promoters. TATA occupancy correlates with HSF binding for some promoters. However, in all cases HSF accessibility correlates with the presence of paused RNA polymerase II. Thus, a complex promoter architecture is established by multiple interdependent factors, including GAGA factor, TFIID, and RNA polymerase II (Shopland, 1995).

GAGA factor, TFIID, and paused polymerase are present on the hsp70 promoter in Drosophila melanogaster prior to transcriptional activation. In order to investigate the interplay between these components, mutant constructs were analyzed after they had been transformed, on P elements, into flies. One construct lacks the TATA box and the other lacks the upstream regulatory region where GAGA factor binds. Transcription of each mutant during heat shock is at least 50-fold less than that of a normal promoter construct. Before and after heat shock, both mutant promoters are found to adopt a DNase I hypersensitive state that includes the region downstream from the transcription start site. High-resolution analysis of the DNase I cutting pattern identifies proteins that could be contributing to the hypersensitivity. GAGA factor footprints are clearly evident in the upstream region of the TATA deletion construct, and a partial footprint possibly caused by TFIID is evident on the TATA box of the upstream deletion construct. Permanganate treatment of intact salivary glands was used to further characterize each promoter construct. Paused polymerase and TFIID are readily detected on the normal promoter construct, whereas both deletions exhibit reduced levels of each of these factors. Hence both the TATA box and the upstream region are required to efficiently recruit TFIID and a paused polymerase to the promoter prior to transcriptional activation. In contrast, GAGA factor appears to be capable of binding and establishing a DNase I hypersensitive region in the absence of TFIID and polymerase. However, GAGA factor could not be detected on the downstream region of the TATA deletion, and thus there is no direct proof that the GAGA factor interacts with the core promoter region in vivo. Nevertheless, purified GAGA factor is found to bind near the transcription start site; the strength of this interaction is increased by the presence of the upstream region. 
It is concluded that GAGA factor alone might be capable of establishing an open chromatin structure that encompasses the upstream regulatory region as well as the core promoter region, thus facilitating the binding of TFIID (Weber, 1997).

Genome-wide prediction of Polycomb/Trithorax response elements

Polycomb/Trithorax response elements (PRE/TREs) maintain transcriptional decisions to ensure correct cell identity during development and differentiation. There are thought to be over 100 PRE/TREs in the Drosophila genome, but only very few have been identified due to the lack of a defining consensus sequence. The definition of sequence criteria that distinguish PRE/TREs from non-PRE/TREs is reported in this study. Using this approach for genome-wide PRE/TRE prediction, 167 candidate PRE/TREs are reported, that map to genes involved in development and cell proliferation. Candidate PRE/TREs are shown to be bound and regulated by Polycomb proteins in vivo, thus demonstrating the validity of PRE/TRE prediction. Using the larger data set thus generated, three sequence motifs that are conserved in PRE/TRE sequences have been identified (Ringrose, 2003).

The detection of PRE/TREs by prediction generates a large data set that can be used to search for further common sequence features. To this end, the 30 highest scoring PRE/TRE hits were scanned for motifs that occur significantly more often in PRE/TREs than in randomly generated sequence. Five significant motifs were found. Not surprisingly, but reassuringly, two known motifs, the GAF and PHO binding sites, were found. The Zeste binding motif was not found by this analysis, although it occurs as frequently as the GAGA factor motif in the 30 sequences analyzed. This is probably due to the shortness and degeneracy of the Zeste motif, and suggests that other such short motifs will also be missed by this approach (Ringrose, 2003).
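The idea of testing whether a motif occurs more often in a sequence set than in random sequence can be illustrated with a minimal sketch. This is not the procedure used in the study: the shuffling null model, the function names and the toy sequences are all assumptions made for illustration, and the sketch tests a motif supplied by the user rather than discovering motifs de novo.

```python
import random

def count_motif(seq, motif):
    """Count occurrences of motif in seq, allowing overlaps."""
    count = 0
    start = seq.find(motif)
    while start != -1:
        count += 1
        start = seq.find(motif, start + 1)
    return count

def shuffled(seq, rng):
    """Return a random permutation of seq (keeps base composition)."""
    chars = list(seq)
    rng.shuffle(chars)
    return "".join(chars)

def enrichment_p_value(seqs, motif, n_shuffles=200, seed=0):
    """Empirical one-sided p-value for motif over-representation.

    Assumption: the null model is a per-sequence shuffle; the study
    only states that 'randomly generated sequence' was used.
    """
    rng = random.Random(seed)
    observed = sum(count_motif(s, motif) for s in seqs)
    hits = 0
    for _ in range(n_shuffles):
        total = sum(count_motif(shuffled(s, rng), motif) for s in seqs)
        if total >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_shuffles + 1)

# Hypothetical toy sequences, not real PRE/TRE data.
toy_seqs = [
    "GAGAGAGAGTTTTCGCCATAGAGAGCC",
    "TTGAGAGAGCCATTTTGAGAGTTGCGC",
]
observed_count, p_value = enrichment_p_value(toy_seqs, "GAGAG")
```

A short, degenerate motif gives many chance matches under any such null model, which is consistent with the explanation offered above for why the Zeste motif was missed.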

Nevertheless, three additional motifs were found. The first, called GTGT, is found several times in 14 of the sequences. The second motif, poly T, is found several times in almost all 30 PRE/TRE sequences analyzed. Some variants of this site match the binding consensus for the Hunchback protein, which has been shown to be an early regulator at some PRE/TREs. The third motif, TGC triplets, occurs several times in 13 of the PRE/TRE sequences. No binding factor for this sequence has yet been identified (Ringrose, 2003).

To further examine these three motifs, motif occurrence was evaluated in all 167 predicted PRE/TREs and in the promoter peaks described above. In contrast to the known GAF, Z, and PHO motifs, the three motifs each occur in only a subset of predicted and known PRE/TREs, and do not occur significantly together. These motifs may thus each define a subclass of PRE/TREs. Consistent with this idea, some of the lowest scoring known PRE/TRE sequences indeed contain one or more of the three motifs (Ringrose, 2003).

Although no correlation between particular sites and high scores was found, a negative correlation was found between numbers of GAF/Z and PHO sites (a correlation coefficient of -0.78, indicating that when many GAF/Z sites are present, there are few PHO sites, and vice versa). This suggests that each PRE/TRE may have a preferred ground state, in which it is either predisposed to silencing (many PHO sites) or to activation (many GAF/Z sites) (Ringrose, 2003).
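For readers unfamiliar with the statistic, the following sketch computes a correlation coefficient between per-element site counts. It assumes a Pearson coefficient, which the summary above does not specify, and the counts are invented for illustration; they are not the published data and are not meant to reproduce the value of -0.78.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical site counts for five elements (illustration only).
gaf_z_sites = [12, 9, 7, 4, 2]
pho_sites = [1, 2, 4, 6, 8]
r = pearson_r(gaf_z_sites, pho_sites)  # negative: many GAF/Z sites, few PHO sites
```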

In summary, this analysis identifies three motifs that occur significantly in association with known PRE/TRE motifs. Further functional characterization of these motifs and the proteins that bind them may contribute to a more complete definition of the sequence requirement for PRE/TRE function, and of subclasses of PRE/TREs (Ringrose, 2003).

This study offers four main contributions to the understanding of PRE/TRE function. First, a larger set of sequences has been defined that will facilitate the more complete definition of PRE/TRE sequence requirements. Three motifs have been identified that may contribute to this goal. The definition of the minimal requirement for PRE/TRE function will not be a trivial task. Analysis of motif composition and order in the 167 predicted PRE/TREs reveals that there is a great diversity of patterns, with no preferred linear order. It is possible that each different pattern of motifs reflects a subtly different function. However, the concept of a linear order of motifs may well be irrelevant, because these elements operate in the three-dimensional context of chromatin. The fact that such a diversity of PRE/TRE designs exists indicates that the vast majority of them would defy detection by conventional pattern-finding algorithms, and underlines the advantages of the approach described in this study (Ringrose, 2003).

Although no linear constraints on motif order were found, the fact that only motif pairs, and not single motifs, are able to identify PRE/TREs strongly suggests that this close spacing of sites has functional significance. Multiple sites may work in concert, to promote cooperative binding of similar proteins (e.g., repeated PHO sites) or to provoke competition between dissimilar proteins (e.g., closely spaced GAGA factor and PHO sites). In addition, in chromatin, only a subset of sites will be exposed and optimally available for binding at any one time, while others will be occluded by nucleosomes. The trxG includes nucleosome remodeling machines, raising the intriguing possibility that remodeling of PRE/TREs in chromatin may contribute to epigenetic switching by exposing different sets of protein binding sites (Ringrose, 2003).

Second, a PRE/TRE peak is observed at the promoter of all the genes examined. This strongly suggests that promoter binding is a general principle of PRE/TRE function. It has been reported that PcG proteins can interact with general transcription factors. It has hitherto been unclear whether the observed PcG/trxG binding at promoters of the genes they regulate is mediated indirectly via such an interaction, or whether the PcG and trxG bind directly to PRE/TREs at the promoters. The high scores observed at promoters favor the latter interpretation (Ringrose, 2003).

Third, it has been shown that in most cases, PRE/TREs do not occur in isolation, but are accompanied by one or more other peaks nearby. These grouped PRE/TREs may create multiple attachment sites for PcG and trxG proteins, which come together to build a fully operational complex at the promoter. Alternatively, grouped PRE/TREs may be individually regulated by tissue-specific enhancers as in the BX-C. Thus, each of the many PRE/TREs of the homothorax gene may interact with the promoter PRE/TRE in different tissues. This idea is consistent with the fact that Homothorax has specific roles in diverse developmental processes (Ringrose, 2003).

Finally, the current list of about ten PcG/trxG target genes has been expanded to over 150 genes, identifying candidates for epigenetic regulation. The genes thus identified encompass every stage of development, suggesting that the PcG/trxG are global regulators of cellular memory. Experiments to further investigate and compare this regulation for individual genes are currently underway (Ringrose, 2003).

Genomewide analysis of GAGA factor target genes reveals context-dependent DNA binding

The association of sequence-specific DNA-binding factors with their cognate target sequences in vivo depends on the local molecular context, yet this context is poorly understood. To address this issue, genomewide mapping was performed of in vivo target genes of Drosophila GAGA factor (GAF). The resulting list of approx. 250 target genes indicates that GAF regulates many cellular pathways. Unbiased motif-based regression analysis was applied to identify the sequence context that determines GAF binding. The results confirm that GAF selectively associates with (GA)n repeat elements in vivo. GAF binding occurs in upstream regulatory regions, but less in downstream regions. Surprisingly, GAF binds abundantly to introns but is virtually absent from exons, even though the density of (GA)n is roughly the same. Intron binding occurs equally frequently in last introns compared with first introns, suggesting that GAF may not only regulate transcription initiation, but possibly also elongation. Evidence is provided for cooperative binding of GAF to closely spaced (GA)n elements and the lack of GAF binding to exons is explained by the absence of such closely spaced GA repeats. This approach for revealing determinants of context-dependent DNA binding will be applicable to many other transcription factors (van Steensel, 2003).

DamID chromatin profiling has previously been used to screen approx 300 Drosophila genes for GAF binding. This approach was extended to >6,000 genes. Briefly, a fusion protein consisting of Dam methyltransferase and the 519-aa isoform of GAF was expressed in Drosophila Kc cells. This leads to preferential methylation of GAF binding sites in the genome. Methylated genomic DNA fragments were purified, fluorescently labeled, and used to probe microarrays containing 6,280 unique cDNA fragments. Methylated DNA purified from cells expressing unfused Dam was labeled with a different fluorochrome and used as a reference probe. Normalized fluorescence ratios represent the targeted/untargeted methylation ratios, and therefore the relative GAF binding to the probed loci (van Steensel, 2003).
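The notion of a normalized targeted/untargeted ratio can be sketched as follows. The normalization scheme shown (log2 ratios centered on their median) is an assumption chosen for illustration, since the summary above does not describe how the fluorescence ratios were normalized; the function name and intensities are hypothetical.

```python
from math import log2
from statistics import median

def normalized_log_ratios(gaf_dam, dam_only):
    """Median-centered log2(GAF-Dam / Dam) ratio per array probe.

    gaf_dam, dam_only: positive fluorescence intensities, one per probe.
    Assumption: median centering stands in for the unspecified
    normalization used in the study.
    """
    raw = [log2(g / d) for g, d in zip(gaf_dam, dam_only)]
    center = median(raw)
    return [r - center for r in raw]

# Hypothetical intensities for three probes.
ratios = normalized_log_ratios([100, 200, 800], [100, 100, 100])
# The third probe has the highest relative methylation in this toy set.
```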

Because cDNA microarrays were used in this assay, it was possible to directly measure methylation levels only in exons in the probed loci. However, targeted methylation 'spreads' in cis over approx 2-5 kb. Binding of GAF-Dam to upstream, downstream, and intronic sequences may therefore be detected as increased methylation of the nearby exon sequences, provided that the GAF binding sites are located within a few kilobases from a probed exon. A 'target gene' was defined as a gene for which the corresponding cDNA probe on the array detects a significantly elevated GAF-Dam/Dam methylation ratio (van Steensel, 2003).

At an estimated false discovery rate of 0.05, 262 cDNA probes were identified for which GAF-targeted methylation levels were significantly elevated. Of these, 219 probes corresponded to 208 unique previously annotated genes and three repetitive elements (some genes were represented by two cDNAs). The remaining 43 positive cDNAs matched genomic loci that had not been annotated. Importantly, the GAF binding patterns are strikingly different from the binding patterns of six other Drosophila proteins [HP1, HP1c, Su(var)3-9, dMyc, dMad, and an ortholog of mammalian Max], supporting the specificity of the chromatin profiling technique (van Steensel, 2003).

The identified GAF target genes appear to cover a broad variety of functions and include genes that encode proteins involved in growth and development, signaling, heat shock response, and metabolic pathways. Thus, GAF may regulate a wide range of cellular processes and pathways. The data confirm the previously reported GAF binding to heat shock protein genes, but not the reported very weak binding to the 28S rDNA and histone gene loci (van Steensel, 2003).

To confirm the in vivo binding sequence of GAF, an unbiased bioinformatics method was used. REDUCE is a motif-based regression analysis method originally designed for the discovery of regulatory elements based on microarray expression data. The same algorithm was applied to find sequence motifs whose occurrence correlates with the chromatin profiling data for GAF. A major advantage of REDUCE is that it analyzes the entire set of probed loci and does not rely on clustering or prior partitioning into 'target' and 'nontarget' loci. Instead, REDUCE uses the full quantitative dataset obtained from one or more chromatin profiling experiments. Moreover, the output of REDUCE includes statistical parameters that indicate the correlation strength (represented as a t value) and statistical significance (P value) for each sequence motif, taking into account corrections because of the parallel testing of many motifs (van Steensel, 2003).

To account for the cis-spreading of targeted methylation, REDUCE was performed by using the sequences of the genomic regions corresponding to the cDNAs on the microarray, including introns and 2 kb of flanking sequence added to both the 5' and 3' ends. Thus, no prior assumptions were made with regard to the location of GAF binding sites relative to the transcribed regions, but instead the complete genomic regions were analyzed where binding of GAF-Dam would in principle be detectable (van Steensel, 2003).

For all possible sequence motifs up to 7 nucleotides, tests were performed to see whether their occurrence in the probed genomic regions correlates with GAF binding. The results show that (GA)n repeats indeed are strongly correlated with GAF binding. When ranked by correlation, all of the top 20 motifs contain (GA)n repeats. This finding demonstrates the specificity of the chromatin profiling technique and confirms that GAF binds selectively to (GA)n motifs in the native chromatin context (van Steensel, 2003).
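The motif-by-motif test described here can be approximated by a much simpler procedure than REDUCE itself: for each candidate motif, regress the binding signal on the motif count per locus and rank motifs by the t value of the slope. The sketch below does only that. It is not the REDUCE algorithm (it fits one motif at a time and applies no multiple-testing correction), and the function names, toy sequences and binding values are assumptions for illustration.

```python
from itertools import product
from math import copysign, inf, sqrt

def count_overlapping(seq, motif):
    """Count occurrences of motif in seq, allowing overlaps."""
    return sum(seq.startswith(motif, i) for i in range(len(seq)))

def slope_t_value(counts, binding):
    """t value of the slope in a simple linear regression
    binding ~ intercept + slope * counts (needs more than 2 loci)."""
    n = len(counts)
    mean_x = sum(counts) / n
    mean_y = sum(binding) / n
    sxx = sum((x - mean_x) ** 2 for x in counts)
    if sxx == 0:  # motif count is constant: no information
        return 0.0
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(counts, binding))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - intercept - slope * x) ** 2
              for x, y in zip(counts, binding))
    if rss == 0:  # perfect fit: t value is unbounded
        return 0.0 if slope == 0 else copysign(inf, slope)
    return slope / sqrt(rss / (n - 2) / sxx)

def rank_motifs(seqs, binding, max_len=3):
    """Score every motif up to max_len nucleotides, best t value first."""
    scored = []
    for k in range(1, max_len + 1):
        for letters in product("ACGT", repeat=k):
            motif = "".join(letters)
            counts = [count_overlapping(s, motif) for s in seqs]
            scored.append((slope_t_value(counts, binding), motif))
    scored.sort(reverse=True)
    return scored

# Hypothetical loci and binding ratios (illustration only).
toy_loci = ["GAGAGAGA", "GAGATTTT", "CCCCGATT", "CCCCTTTT", "ACGTACGT"]
toy_binding = [3.1, 1.9, 1.2, 0.1, 0.4]
ranked = rank_motifs(toy_loci, toy_binding, max_len=2)
```

Like REDUCE, this kind of regression uses the full quantitative data set rather than a prior split into target and nontarget loci.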

A variety of (GA)n motifs displayed highly significant correlation with GAF binding. No significant correlation was found for the trinucleotide motif GAG. Thus, in the native chromatin context, at least two GA repeats are necessary for GAF recruitment, and 2.5 or 3 repeats appear to be optimal. Interestingly, the REDUCE algorithm found roughly equal correlation values for (GA)n and (CT)n motifs. This finding demonstrates that GAF binds with approximately the same frequency in either orientation relative to the direction of transcription (van Steensel, 2003).

Because the previous estimate of cis-spreading of targeted methylation was of limited accuracy, a test was performed to see how the correlation between GAF binding and the presence of (GA)n elements was affected by including more or less flanking sequence in the REDUCE analysis. The results show that inclusion of approx 2 kb of flanking sequence results in a maximum value for GAGAG. This finding indicates that GAF associates with GAGAG elements that are located upstream or downstream of the probed exons. Addition of >2 kb of flanking sequence leads to a weaker correlation, presumably because binding of GAF to sites >2 kb away from the probed regions does not add significantly to the methylation levels of the probed exons. This finding is in agreement with the estimate of 2- to 5-kb cis-spreading of targeted methylation (van Steensel, 2003).

Whether GAF binds preferentially to upstream or downstream nontranscribed regions was investigated by comparing the respective contributions of these regions to the observed correlation. All intergenic regions within 2 kb from probed loci were separated into three categories: between two divergent genes (exclusively upstream), between two convergent genes (exclusively downstream), and between two tandem genes (mixed upstream/downstream). The results show that GAF preferentially associates with GAGAG elements in upstream intergenic regions, compared with downstream intergenic regions (van Steensel, 2003).

It is important to note that the observed correlations may be interpreted as an indication of the relative average binding of GAF per GAGAG element. The observations therefore imply that GAGAG elements in downstream regions are occupied less frequently than in upstream regions. Because upstream and downstream noncoding regions harbor GAGAG elements at almost equal density, it is concluded that GAF preferentially binds to upstream regions (van Steensel, 2003).

Using the same approach, a test was performed to see whether GAF preferentially binds to GAGAG elements located in introns or exons. Strikingly, a clear correlation between GAF binding and the occurrence of GAGAG in introns was found, yet no such correlation was detectable in exons. Thus, GAF binds significantly to GAGAG elements in introns, yet fails to interact with GAGAG elements in exons. A more detailed multivariate analysis suggests that this exclusion from exons is particularly strong in relatively long exons (van Steensel, 2003).

GAF is often bound near promoter regions, where it can facilitate initiation of transcription. Enhancer elements can be located within introns, and it is therefore possible that GAF associated with introns facilitates transcriptional initiation. However, one report has suggested that GAF binds in some genes throughout the transcribed region and perhaps may control transcript elongation. If intron-associated GAF plays a role in elongation, then it may be expected that GAF binds to introns irrespective of the distance to the promoter. This was tested by comparing the binding of GAF to all first and last introns of the probed loci. Strikingly, the results show that GAF binding in last introns is at least as great as in first introns. This finding indicates that GAF binding to introns is not limited to promoter-proximal introns, which is in agreement with a role for GAGA factor in transcript elongation (van Steensel, 2003).

The striking difference in GAF binding between exons and introns argued that the association of GAF with GAGAG is modulated by additional molecular cues. In theory, GAF binding could be either selectively inhibited in exons (for example, by a chromatin folding rendering GAGAG elements inaccessible) or selectively enhanced in introns and upstream intergenic regions (by cooperative interactions). In vitro, GAF is able to form oligomeric complexes and displays cooperative binding to closely spaced (GA)n elements. Therefore whether such cooperative binding could explain the observed regional differences in GAF binding was tested (van Steensel, 2003).

To test whether GAF preferentially binds to clustered GAGAG elements in vivo, loci were ranked by their level of GAF binding and the spacing of GAGAG elements in the 500 loci with strongest GAF binding was compared to the GAGAG spacing in the 500 loci with weakest GAF binding. The results reveal that GAF target loci are indeed enriched in GAGAG elements that are spaced by less than approx 20 bp. The degree of clustering of GAGAG elements in target loci is much higher than can be attributed to random spacing, and the 500 control loci with no GAF binding do not show clustered GAGAG elements. Taken together with previously reported in vitro binding studies, this result strongly suggests that cooperative GAF binding occurs in the native chromatin context. Note that the clustering of GAGAG pairs is only significant at odd distances, suggesting that there is evolutionary pressure to preserve the even/odd character of (GA)n repeats even over distances up to at least 10 bp (van Steensel, 2003).
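The spacing comparison can be sketched as follows, assuming that spacing is measured start to start between neighboring elements and that both orientations (GAGAG and its reverse complement CTCTC) are counted. Neither assumption is stated in the summary above, and the function names and sequences are hypothetical.

```python
import re

def motif_starts(seq, motifs=("GAGAG", "CTCTC")):
    """Sorted start positions of the motifs, allowing overlaps."""
    starts = set()
    for motif in motifs:
        # A lookahead finds overlapping matches such as GAGAGAG.
        for match in re.finditer("(?=" + motif + ")", seq):
            starts.add(match.start())
    return sorted(starts)

def neighbor_spacings(seq):
    """Start-to-start distances between neighboring elements."""
    starts = motif_starts(seq)
    return [b - a for a, b in zip(starts, starts[1:])]

def fraction_close(seqs, max_spacing=20):
    """Fraction of neighboring element pairs closer than max_spacing bp."""
    spacings = [d for s in seqs for d in neighbor_spacings(s)]
    if not spacings:
        return 0.0
    return sum(d < max_spacing for d in spacings) / len(spacings)

# Hypothetical loci: one with clustered elements, one with distant ones.
clustered = ["GAGAG" + "T" * 5 + "GAGAG"]
dispersed = ["GAGAG" + "T" * 60 + "GAGAG"]
close_in_clustered = fraction_close(clustered)
close_in_dispersed = fraction_close(dispersed)
```

Applied to loci ranked by binding, comparing `fraction_close` for the strongest and weakest groups would give a rough analogue of the comparison described above.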

Comparative analysis shows that in the 500 probed loci with high GAF binding, intergenic regions and introns contain 40.3% and 43.3%, respectively, of all 641 pairs of GAGAG elements spaced <10 bp apart, whereas exons harbor only 16.4%. Because 45% of the DNA in these loci consists of exon sequences, closely spaced GAGAG elements are significantly underrepresented in exons. It is possible that this lack of clustering of GAGAG motifs explains in large part the absence of GAF binding to exons (van Steensel, 2003).
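The arithmetic behind the underrepresentation can be made explicit. The sketch below assumes that, if pairs were placed without regard to region, each of the 641 pairs would fall in an exon independently with probability 0.45; this binomial model is an assumption for illustration and is not necessarily the test used in the study.

```python
from math import exp, lgamma, log

def binomial_lower_tail(k_max, n, p):
    """P(X <= k_max) for X ~ Binomial(n, p), summed in log space."""
    total = 0.0
    for k in range(k_max + 1):
        log_pmf = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
                   + k * log(p) + (n - k) * log(1 - p))
        total += exp(log_pmf)
    return total

total_pairs = 641        # closely spaced GAGAG pairs in the 500 loci
exon_fraction = 0.45     # share of DNA in these loci that is exon
exon_pair_share = 0.164  # share of pairs actually found in exons

expected_exon_pairs = exon_fraction * total_pairs          # about 288
observed_exon_pairs = round(exon_pair_share * total_pairs)  # about 105
tail = binomial_lower_tail(observed_exon_pairs, total_pairs, exon_fraction)
```

Under this simple model, exons would be expected to hold roughly 288 of the pairs, whereas about 105 are observed, and the probability of a count that low by chance is vanishingly small.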

Replacement of a Drosophila Polycomb response element core, and in situ analysis of its DNA motifs

Long-term repression of homeotic genes in the fruit fly is accomplished by proteins of the Polycomb Group, acting at Polycomb response elements (PREs). This study used gene conversion to mutate specific DNA motifs within a PRE to test their relevance, and PREs were exchanged to test their specificity. Previously it was shown that removal of a 185 bp core sequence from the bithoraxoid PRE of the bithorax complex results in posteriorly directed segmental transformations. Mutating multiple binding sites for either the Pleiohomeotic (Pho) or the GAGA factor (Gaf) proteins separately in the core bithoraxoid PRE resulted in only rare and subtle transformations in adult flies. However, when both sets of sites were mutated, the transformations were similar in strength and penetrance to those caused by the deletion of the 185 bp core region. In contrast, mutating the singly occurring binding site of another DNA-binding protein, DSP1 (Dorsal switch protein 1; reportedly essential for PRE-activity), had no similar effect in combination with mutated Pho or Gaf sites. Two minimal PREs from other segment-specific regulatory domains of the bithorax complex could substitute for the bithoraxoid PRE core. In situ analysis suggests that core PREs are interchangeable, and the cooperation between Pho and Gaf binding sites is indispensable for silencing (Kozma, 2008).

This study used gene conversion to test variant forms and substitutes of a PRE core in its normal chromosomal context. The conversion strategy retained the wild type PRE sequence in the initial convertant, so that the convertant animals were normal in their segmental identities. The wild type PRE sequence could then be removed, at will, to test the remaining function from the mutated PRE. The ability to recreate the mutant multiple times proved essential to phenotypic assays, since the strength and penetrance of the phenotypes faded with successive generations of heterozygous or homozygous mutant flies (Kozma, 2008).

In case of larger deletions, such as the 665 bp Δ1-2 or the 280 bp Δ10, the penetrance of posteriorly directed transformations stays stably high in homozygous stocks. In stocks homozygous for the 280 bp deletion, only the strength of transformations gets weaker, while in the 185 bp deletion homozygous stock both the strength and the penetrance of transformations decline. Very likely, sequences neighboring the bxd PRE core are involved in the decline of the penetrance in the 185 bp deletion Δ17 homozygotes. The neighboring sequences appear to be somewhat redundant with the bxd PRE core, as suggested by the 3.5-fold higher penetrance observed in the 280 bp deletion heterozygotes. In addition, neighboring PREs on the mutated chromosome (in the bx and the iab-2 cis-regulatory regions) and PREs on the wild type homolog may also compensate for the loss of the bxd PRE core (Kozma, 2008).

The adult transformations must be due to misexpression of the Ubx gene, but no corresponding misexpression of Ubx was detected in embryonic or larval stages. Most cells in the embryo do not give rise to adult tissues, and the adult lineages may have higher sensitivity to Ubx levels. Cells of the adult tissues have gone through one (or two) more cell division(s) than their ancestral cells in the larval imaginal discs. Consequently, adult cells can accumulate more mistakes in the maintenance of the cellular memory. It is also possible that clones expressing Ubx in wing discs escaped detection because of the low penetrance (Kozma, 2008).

The Drosophila Pho and Pho-like proteins are the only PcG proteins known to bind to specific DNA-sequences, and they recognize the DNA motifs GCCAT, ACCAT and GCCAC. The first two CCAT motifs in the 185 bp core bxd PRE are separated from each other by only a single base, which may be important for the PRE-function. Similar pairs of CCAT-sites can also be found twice in the iab-7 PRE. The tested iab-7 PRE-fragment, with one such CCAT-pair, could fully replace the core bxd PRE. However, mutating all five CCAT motifs in the core bxd PRE results in only a low penetrance of posterior transformations (gain-of-function phenotypes, GOF), partly due to the compensating effect of neighboring sequences. Although the effect of point mutations in CCAT motifs is weak, it shows a GOF penetrance at least two times higher than in the case of mutated GAGA motifs. This difference is even more pronounced (6.5 times) when the neighboring distal 228 bp region is removed. These data correlate well with a prior study of a 567 bp fragment from the bxd PRE, which found no effect of mutating GAGA sites, using transgene reporter assays in imaginal discs. However, another transgene study of the 138 bp Mcp PRE fragment suggested that GAGA motifs were more important for silencing than CCAT motifs (Kozma, 2008).
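
A minimal sketch of how the three Pho/Pho-like motifs named above could be located in a sequence. Scanning both strands is an assumption made here for illustration, and the function names and toy sequence are hypothetical rather than taken from the study.

```python
PHO_MOTIFS = ("GCCAT", "ACCAT", "GCCAC")  # motifs listed in the text
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]

def find_pho_sites(seq):
    """Return sorted (start, strand, motif) hits on both strands.

    Assumption: a '-' hit is reported at the forward-strand coordinate
    where the motif's reverse complement begins.
    """
    seq = seq.upper()
    hits = []
    for motif in PHO_MOTIFS:
        for strand, probe in (("+", motif), ("-", reverse_complement(motif))):
            start = seq.find(probe)
            while start != -1:
                hits.append((start, strand, motif))
                start = seq.find(probe, start + 1)
    return sorted(hits)

# Toy sequence (not the bxd PRE): one forward GCCAT and one reverse-strand ACCAT.
toy = "AAGCCATTTTATGGTAA"
print(find_pho_sites(toy))
```

Differences between the start positions of neighboring hits could then be used to look for closely paired sites of the kind described for the bxd and iab-7 PREs.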

Three different proteins, GAF, PIPSQUEAK and BATMAN, were suggested to act in concert, binding to the same GA-repeats, at least in the bxd PRE. As the N-terminal BTB/POZ domain of the GAF protein self-associates, the C-terminal Zn-finger of GAF favors paired or clustered binding sites. However, GAF was also demonstrated to bind even GAG triplets with only a slightly reduced affinity as compared to GAGAG pentamers. Therefore, GAGA tetramers, not only the canonical GAGAG sequences, usually claimed as the minimal binding site of GAF, were mutated. Despite the suggested importance of GA-repeats in silencing, the effect of destroying only GAGA motifs is extremely weak, even when the distal 228 bp neighboring sequence is removed. It is possible that the proximal neighboring sequences can also compensate for the loss of the mutated GAGA motifs in the bxd core PRE. Indeed, this proximal ~200 bp region contains a 70 bp fragment (named MHS-70) with multiple, non-overlapping d(GA)₃ repeats, which were found to be important for silencing in embryos in transgenic assays. This prior study also showed that the ~20 bp GA-repeats in the core bxd PRE compete strongly with MHS-70 in gelshift assays, suggesting the binding of the same proteins (Kozma, 2008).

Although mutating either the CCAT or the GAGA motifs alone had only a modest phenotypic effect, simultaneous mutations in both types of motifs completely eliminated the core PRE function. The resulting penetrance of GOF phenotypes is equivalent to that of a deletion of the 185 bp core PRE. This suggests a highly cooperative effect between factors binding to the two different DNA motifs. In biochemical studies, GAF was found to facilitate Pho binding to chromatin, perhaps by interacting with NURF to create nucleosome-free regions or DNAse I hypersensitive sites. There are indeed such hypersensitive sites in the Mcp, iab-7 and bxd PRE regions. Both the Mcp and the iab-7 PREs required the presence of both Pho and GAF binding sites for efficient silencing in transgenic assays (Kozma, 2008).

Although the combination of mutated CCAT and GAGA motifs eliminates the core PRE function, it is still possible that other protein binding sites are also required for silencing. For example, the HMG-group protein DSP1, which was known previously to bind without sequence-specificity to the minor groove of DNA, was recently reported to bind to the GAAAA DNA-motif. Little or no effect was found of mutating the singly occurring putative DSP1 binding site in the core bxd PRE. Even when four other putative DSP1 binding sites were mutated by the removal of the neighboring 228 bp DNA region, distal to the reintroduced core PRE with mutated GAGA motifs, only a very weak GOF phenotype was observed. Thus, in contrast to the results of the transgenic assay studying the iab-7 PRE, it seems that the putative DSP1 sites have little or no role in silencing in the 413 bp region of the bxd PRE studied in situ (Kozma, 2008).

The 191 bp fragment of the iab-7 PRE was able to fully substitute for the 185 bp core bxd PRE, irrespective of its orientation. No posterior or anterior transformations were observed in flies with this PRE replacement, in either hetero- or homozygous animals. There is a pronounced similarity between the bxd core and iab-7 fragment in the pattern of CCAT and GAGA DNA-motifs, although there is no other apparent homology between these sequences. The perfect substitution of these PREs demonstrates that core PREs alone do not carry positional information; they act only as simple silencers. This finding is in agreement with previous studies, which used bigger PRE-fragments fused to different enhancers in transgenic assays (Kozma, 2008).

The 189 bp iab-5 PRE-fragment, which shows much less similarity in the pattern of DNA-motifs to the bxd PRE, is also a perfect substitute. This observation reinforces the notion that only the CCAT and GAGA sites matter for PRE function. In contrast, the 263 bp human H1 and the 222 bp human H2 fragments completely fail to substitute the core bxd PRE. H1 has only three CCAT motifs, perhaps one less than necessary (the bxd, iab-5 and iab-7 core PREs each have at least 4 CCAT motifs), or perhaps the CCAT and GAGA motifs in the human DNA-fragment are not sufficiently close-packed. Alternatively, the BX-C PREs might share some other protein binding sequence, yet unrecognized, that is also necessary for the core PRE function. In any case, the negative results with human sequences exclude the possibility that the 185 bp Drosophila bxd PRE-fragment has only a spacer function. Indeed, sequences without CCAT and GAGA motifs (such as H2) can clearly interfere with the impaired PRE-activity (Kozma, 2008).

These experiments demonstrated that three core PREs from different regulatory regions of the bithorax complex are functionally equivalent in situ, suggesting the interchangeability of PRE cores; CCAT and GAGA DNA-motifs act in concert and, unlike GAAAA/GATAA motifs, they are absolutely necessary for the function of the bxd PRE. The versatile gene conversion strategy and the sensitive phenotypic assay developed in this study can now be used to ask more detailed questions about the number and spacing of the GAGA and CCAT motifs, and about the function of other sequence regions in the core bxd PRE. Molecular definition and artificial assembly of a functional PRE core may also become possible with the help of these further studies. The marker gene, Gal4-VP16, used in these gene conversion events, can also be used to monitor subtle changes in the local chromatin structure, which may not result in any detectable phenotypic changes. This system should also be useful to test PREs from other Drosophila loci, like engrailed or polyhomeotic, and to assay potential PREs predicted by computational analyses of mammalian genomes (Kozma, 2008).

Rapid, transcription-independent loss of nucleosomes over a large chromatin domain at Hsp70 loci

To efficiently transcribe genes, RNA Polymerase II (Pol II) must overcome barriers imposed by nucleosomes and higher-order chromatin structure. Many genes, including Drosophila Hsp70, undergo changes in chromatin structure upon activation. To characterize these changes, the nucleosome landscape of Hsp70 was mapped after an instantaneous heat shock at high spatial and temporal resolution. Surprisingly, an initial disruption of nucleosomes was found across the entire gene within 30 s after activation, faster than the rate of Pol II transcription, followed by a second further disruption within 2 min. This initial change occurs independently of Pol II transcription. Furthermore, the rapid loss of nucleosomes extends beyond Hsp70 and halts at the scs and scs' insulating elements. An RNAi screen of 28 transcription and chromatin-related factors reveals that depletion of heat shock factor, GAGA Factor, or Poly(ADP)-Ribose Polymerase or its activity abolishes the loss of nucleosomes upon Hsp70 activation (Petesch, 2009; full text of article).

Using a high resolution in vivo approach to map changes in the chromatin structure of the rapidly induced Hsp70 gene, a broad disruption of nucleosome structure was observed that occurred at a rate faster than transcribing Pol II and broader than a single transcription unit, ceasing at the natural insulating elements. Furthermore, it was found that the initial changes in chromatin architecture at Hsp70 can be decoupled from transcription of the gene, whereas the second disruption by 2 minutes is transcription-dependent. A selective RNAi screen identified HSF, GAF, and PARP as each being necessary for the changes in chromatin landscape at Hsp70 (Petesch, 2009).

Before HS, the Hsp70 gene contains a chromatin landscape that has many general, as well as some distinct features. Like many other TATA containing genes, a highly positioned nucleosome exists downstream of the promoter region and the adjacent nucleosomes on the body of Hsp70 gradually lose their positioning. Likewise, as seen with many genome-wide studies, the promoter, and a region at the 3' end of the gene, is relatively nucleosome free. It is yet to be determined why 3' ends of genes are hypersensitive to nucleases. However, while many genes in yeast contain a positioned nucleosome starting within the first 100 bp of the transcription unit, Hsp70 contains a nucleosome free region that extends further, with the first nucleosome centered 330 bp following the TSS. This extended nucleosome-free region may be a more general feature of genes containing a paused polymerase (Petesch, 2009).

The HS time course shows that within 2 minutes following HS, the chromatin landscape of Hsp70 drastically changes. Following 2 minutes of HS, there no longer exists appreciable protection of a contiguous 100 bp piece of DNA that would normally be provided from a histone octamer. However, there are still detectable levels of histone H3 on the body of the gene, albeit three-fold less than NHS levels. Although these results differ from early observations that histone levels on Hsp70 do not change following HS, the 3-fold decrease measured by qPCR agrees with more recent quantifications of histone levels following HS and may have gone undetected in the qualitative analysis of these early experiments. Early electron microscopy spreads of native transcribing Pol II complexes with a growing RNA chain from D. melanogaster indicate that the bulk of transcribing Pol II in vivo appears to have nucleosomes flanking its path. The current results, however, suggest that at least for the rapidly induced Hsp70 gene, the nucleosomal structure present before HS no longer exists following activation of the gene (Petesch, 2009).

Changes found in chromatin upon Hsp70 induction extend well beyond the transcription unit of Hsp70 and halt at the scs and scs' insulating elements. Previous studies of scs and scs' have shown that these insulators are capable of blocking enhancer functions and establishing chromatin domains that are resistant to position effects. However, the scs and scs' regions have been located by DNA FISH on squashed polytene chromosomes to be within a HS puff at the endogenous 87A HS locus. This indicates that the scs and scs' regions by themselves are not absolute boundaries to changes in chromosome architecture, and supports the observation that puffing is maximal at a time well after nucleosome disruption and therefore denotes additional structural alterations beyond those observed here. Although transcription of CG31211, CG3281, and Aurora did not change following HS, and no factor targeted for RNAi permitted the disruption of nucleosomes beyond scs or scs', both of these regions include a TSS with detectable amounts of Pol II. It is therefore possible that the promoter architecture with Pol II present at these genes may be responsible for establishing a barrier at these sites. Overall, the results show that scs and scs' provide a primary barrier to the spread of chromatin decondensation, at least at the nucleosomal level, and add to the limited knowledge of the chromatin architecture of a puff (Petesch, 2009).

The results indicate that transcription-independent chromatin decondensation may prove more general than previously believed. Changes in chromatin structure independent of transcription have been implicated at Hsp70 in humans and also at developmentally regulated puffs in Drosophila. Furthermore, the current results indicate that the changes in chromatin at D. melanogaster Hsp70 do not depend on many different transcription factors. In Saccharomyces cerevisiae, many HS genes also lose histone density within the body of the gene by 2 minutes of HS, and as in the current study, these changes are independent of SWI/SNF, Gcn5, and Paf1. Overall, transcription-independent chromatin decondensation might allow cells to rapidly activate genes by clearing the obstacles in the path of Pol II prior to its movement, together with its entourage of elongation factors, through the gene (Petesch, 2009).

The current results show that in addition to HSF and GAF, which have previously been implicated in the decondensation at Hsp70 loci, PARP is also necessary for rapid changes in the nucleosome architecture of Hsp70. This is consistent with the finding that reduction of PARP expression results in decreased HS puff sizes. The results go further in demonstrating that PARP aids the rapid removal of nucleosomes within 2 minutes of HS. Poly(ADP-ribose) (PAR) polymers are the enzymatic product of PARP and have chemical and structural features similar to those of a nucleic acid. Upon activation, PARP polyribosylates itself, which results in PARP's release from chromatin. The result of this could be twofold. First, since PARP binds to nucleosomes in a similarly repressive manner as linker histone H1, the activation of PARP could result in its release from chromatin to reverse any repressive effects on the chromatin structure at Hsp70. Second, the ADP-ribosylation of histones may destabilize the nucleosome, and the creation of these PAR polymers could act locally as a nucleic acid that attracts and removes histones from the body of the Hsp70 gene. Alternatively, PARP could covalently modify another protein to activate its role in removal of nucleosomes (Petesch, 2009).

In addition to histones, PAR could also attract transcription factors that bind nucleic acids. This could explain the rapid recruitment of Pol II and other important transcription factors to the site of active HS transcription. Likewise, PAR could also provide a means through which transcription factors recruited to the gene are then retained locally. The activation of PARP could thus provide a rapid, transcription-independent method to deplete histones and promote transcription of the Hsp70 gene (Petesch, 2009).

A BEAF dependent chromatin domain boundary separates myoglianin and eyeless genes of Drosophila melanogaster

Precise transcriptional control is dependent on specific interactions of a number of regulatory elements such as promoters, enhancers and silencers. Several studies indicate that the genome in higher eukaryotes is divided into chromatin domains with functional autonomy. Chromatin domain boundaries are a class of regulatory elements that restrict enhancers to interact with appropriate promoters and prevent misregulation of genes. While several boundary elements have been identified, a rational approach to search for such elements is lacking. With a view to identifying new chromatin domain boundary elements, genomic regions were examined between closely spaced but differentially expressed genes of Drosophila melanogaster. A new boundary element between myoglianin and eyeless, the ME boundary, was identified that separates these two differentially expressed genes. The ME boundary maps to a DNaseI hypersensitive site and acts as an enhancer blocker in both embryonic and adult stages in a transgenic context. It is also reported that BEAF and GAF are the two major proteins responsible for the ME boundary function. These studies demonstrate a rational approach to search for potential boundaries in genomic regions that are well annotated (Sultana, 2011).

BEAF is the major player in the boundary function of ME boundary as evident from genetic data and the effect that BEAF has on the ME boundary is by direct binding to the ME region as is evident from the ImmunoFISH and ChIP data. It is already known that BEAF binds to the scs boundary as a heterotrimer at the CGATA sites and ME boundary has similar arrangement of the CGATA sites. Some scattered CGATA motifs are also present in the ME boundary. ChIP data shows that BEAF binds to the core region of ME where two palindromic CGATA sites and one additional CGATA site are present. The importance of these BEAF binding sites is also evident from the fact that when these sites were mutated, boundary activity of ME is lost (Sultana, 2011).

The ME boundary also contains binding sites for GAF. The pattern of GAF binding sites in the ME boundary is similar to that seen in the case of the Fab-7 boundary present in the bithorax complex of D. melanogaster. This prompted an examination of whether GAF has any effect on the boundary activity of ME. The results show that GAF is also a positive regulator of the ME boundary function, as loss of a single copy of GAF results in partial loss of the boundary function of ME. This effect is by direct binding of GAF to the ME sequence, as seen in the ImmunoFISH and ChIP experiments. In the case of GAF, it was observed that the effect of loss of GAF was more dramatic in female flies, which was opposite to what was seen in the case of BEAF. Since both these proteins, especially GAF, regulate a large number of loci and GAF has also been implicated in dosage compensation, it is likely that the sex-specific effect seen here in the case of the ME boundary may be a result of complex and indirect interaction of multiple factors (Sultana, 2011).

It is shown that both BEAF and GAF are needed for ME boundary activity. However, either BEAF or GAF (Trl) mutations alone were not sufficient for the complete loss of the boundary function. Since flies with BEAF^AB-KO/BEAF^AB-KO;P/Trl^R85 genotype were lethal, it remains an open question whether BEAF and GAF can account for the complete boundary function of ME. Synthetic lethality in the double mutant BEAF^AB-KO/BEAF^AB-KO;P/Trl^R85 does, however, suggest that these two proteins act in combination at the key loci and that this combination is essential for viability. There might be several such loci working as boundary elements and the double mutant combination, by abolishing or weakening a number of such boundaries, would cause misregulation of associated genes and lead to lethality (Sultana, 2011).

ME boundary function is by recruitment of BEAF and GAF along with, perhaps, several other proteins, although BEAF appears to be the major player as mutation in BEAF binding sites abolishes boundary function. The relatively lower level of GAF enrichment at ME, as seen in ChIP experiments, may also indicate an indirect role of this protein at this locus. A minor but distinct effect of Polycomb and trithorax group mutations on ME boundary function was observed. The data, although suggestive and preliminary, indicate that the ME boundary functions by recruiting multiple proteins, mutants of which lead to a partial loss of the boundary function. This mode of boundary function is similar to that of the well-studied gypsy boundary, which depends on a large number of factors including Su(Hw), Mod(mdg4), CP190 and dTopors that associate with lamina. Boundary function of gypsy was also shown to depend on Polycomb and trithorax group proteins. While no prominent effect was seen of CTCF or CP190 on ME boundary activity, which is expected as the ME region does not contain binding sites for these proteins, genome-wide ChIP studies do detect association of these factors with ME. It is possible that ME may be part of nuclear structures where multiple boundaries cluster and a number of factors participate even if not by direct binding to each boundary (Sultana, 2011).

In conclusion, a rationale to look for boundary elements in short intergenic regions that separate differentially expressed genes can be applied successfully. Although expression pattern of a number of genes has not been analyzed in many organisms, analysis in other model organisms and human can be used and by homology criteria, large part of a genome can be mapped for potential boundary elements. Once a boundary region has been identified, the precise mapping of the functional boundary element can be accomplished by DNaseI hypersensitivity and transgene based assays available in model systems. Such studies will help in understanding the genomic organization and regulatory environment of genes (Sultana, 2011).

Enhancer--core-promoter specificity separates developmental and housekeeping gene regulation

Gene transcription in animals involves the assembly of RNA polymerase II at core promoters and its cell-type-specific activation by enhancers that can be located more distally. However, how ubiquitous expression of housekeeping genes is achieved has been less clear. In particular, it is unknown whether ubiquitously active enhancers exist and how developmental and housekeeping gene regulation is separated. An attractive hypothesis is that different core promoters might exhibit an intrinsic specificity to certain enhancers. This is conceivable, as various core promoter sequence elements are differentially distributed between genes of different functions, including elements that are predominantly found at either developmentally regulated or at housekeeping genes. This study shows that thousands of enhancers in Drosophila melanogaster S2 and ovarian somatic cells (OSCs) exhibit a marked specificity to one of two core promoters (one derived from a ubiquitously expressed ribosomal protein gene and another from a developmentally regulated transcription factor) and confirms the existence of these two classes for five additional core promoters from genes with diverse functions. Housekeeping enhancers are active across the two cell types, while developmental enhancers exhibit strong cell-type specificity. Both enhancer classes differ in their genomic distribution, the functions of neighbouring genes, and the core promoter elements of these neighbouring genes. In addition, two transcription factors -- Dref and Trl -- were identified that bind and activate housekeeping versus developmental enhancers, respectively. These results provide evidence for a sequence-encoded enhancer-core-promoter specificity that separates developmental and housekeeping gene regulatory programs for thousands of enhancers and their target genes across the entire genome (Zabidi, 2015).

The core promoter of Ribosomal protein gene 12 (RpS12) and a synthetic core promoter derived from the even skipped transcription factor were chosen as representative 'housekeeping' and 'developmental' core promoters, respectively (hereafter termed hkCP and dCP), and the ability of all candidate enhancers genome wide to activate transcription from these core promoters was tested using self-transcribing active regulatory region sequencing (STARR-seq) in D. melanogaster S2 cells. This set-up allows the testing of all candidates in a defined sequence environment, which differs only in the core promoter sequences but is otherwise constant (Zabidi, 2015).

Two hkCP STARR-seq replicates were highly similar [genome-wide Pearson correlation coefficient (PCC) 0.98] and yielded 5,956 enhancers, compared with 5,408 enhancers obtained when dCP STARR-seq data was reanalyzed. Interestingly, the hkCP and dCP enhancers were largely non-overlapping and the genome-wide enhancer activity profiles differed (PCC 0.38), as did the individual enhancer strengths: of the 11,364 enhancers, 8,144 (72%) activated one core promoter at least twofold more strongly than the other, a difference rarely seen in the replicate experiments for each of the core promoters. Indeed, 21 out of 24 hkCP-specific enhancers activated luciferase expression (>1.5-fold) from the hkCP versus 1 out of 24 from the dCP. Consistently, 10 out of 12 dCP-specific enhancers were positive with the dCP but only 2 out of 12 with the hkCP, a highly significant difference that confirms the enhancer–core-promoter specificity observed for thousands of enhancers across the entire genome (Zabidi, 2015).
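
The counts quoted above can be cross-checked with a few lines of arithmetic. This sketch uses only the figures in the text; the variable names are illustrative, and no statistical test is implied since the test used in the study is not stated here.

```python
hkcp_enhancers = 5956
dcp_enhancers = 5408
total_enhancers = hkcp_enhancers + dcp_enhancers    # combined set of enhancers

specific = 8144                                      # at least twofold preference
specific_percent = 100 * specific / total_enhancers

# Luciferase validation: (positives, tested) per enhancer class and promoter.
validation = {
    ("hkCP-specific", "hkCP"): (21, 24),
    ("hkCP-specific", "dCP"): (1, 24),
    ("dCP-specific", "dCP"): (10, 12),
    ("dCP-specific", "hkCP"): (2, 12),
}
rates = {key: pos / n for key, (pos, n) in validation.items()}

print(total_enhancers, round(specific_percent))
for key, rate in rates.items():
    print(key, f"{rate:.1%}")
```

The sum reproduces the 11,364 enhancers and the 72% figure, and the validation rates make the contrast between matched and mismatched enhancer and core promoter pairs easy to see.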

Enhancers that were specific to either the hkCP or the dCP showed markedly different genomic distributions: whereas the majority (58.4%) of hkCP-specific enhancers overlapped with a transcription start site (TSS) or were proximal to a TSS (<200 bp upstream), dCP-specific enhancers located predominantly to introns (56.5%) and intergenic regions (26.9%). Importantly, despite the TSS-proximal location of most hkCP-specific enhancers, they activated transcription from a distal core promoter in STARR-seq. Luciferase assays confirmed that they function from a distal position (>2 kb from the TSS) downstream of the luciferase gene and independently of their orientation towards the luciferase TSS. These results show that TSS-proximal sequences can act as bona fide enhancers and that developmental and housekeeping genes are both regulated through core promoters and enhancers, yet with a substantially different fraction of TSS-proximal enhancers (3.4% versus 58.4%; Zabidi, 2015).

hkCP and dCP enhancers were also located next to functionally distinct classes of genes according to gene ontology (GO) analyses: genes next to hkCP enhancers were enriched in diverse housekeeping functions including metabolism, RNA processing and the cell cycle, whereas genes next to dCP enhancers were enriched for terms associated with developmental regulation and cell-type-specific functions. Consistently, hkCP enhancers were preferentially near ubiquitously expressed genes and dCP enhancers were near genes with tissue-specific expression (Zabidi, 2015).

The core promoters of the putative endogenous target genes of hkCP and dCP enhancers were also differentially enriched in known core promoter elements: TSSs next to hkCP enhancers were enriched in Ohler motifs 1, 5, 6 and 7, consistent with the ubiquitous expression and housekeeping functions of these genes. In contrast, TSSs next to dCP enhancers were enriched in TATA box, initiator (Inr), motif ten element (MTE) and downstream promoter element (DPE) motifs, which are associated with cell-type-specific gene expression (Zabidi, 2015).

Whether the specificity that hkCP and dCP show to the two enhancer classes applies more generally was tested. Three additional core promoters were chosen from housekeeping genes with different functions: from the eukaryotic translation elongation factor 1δ (eEF1δ), the putative splicing factor x16, and the cohesin loader Nipped-B (NipB). Importantly, all three contained combinations of core promoter elements that differed from that of hkCP, namely TCT and DNA-replication-related element (DRE) motifs (eEF1δ), and Ohler motifs 1 and 6 (x16 and NipB). In addition, a DPE-containing core promoter of the transcription factor pannier (pnr) and the TATA-box core promoter of Heat shock protein 70 (Hsp70), which can be activated by tissue-specific enhancers, were tested, thus covering the two most prominent core promoter types of regulated genes (Zabidi, 2015).

To assess whether the marked core promoter specificities of the hkCP and dCP enhancers are encoded in their sequences, the cis-regulatory motif content of both classes of enhancers was examined. This revealed a strong enrichment of the DRE motif in hkCP enhancers, whereas dCP enhancers were strongly enriched in the GAGA motif of Trithorax-like (Trl) and other motifs previously described to be important for dCP enhancers. Published genome-wide chromatin immunoprecipitation (ChIP) data confirmed that DRE-binding factor (Dref) bound significantly more strongly to hkCP enhancers than to dCP enhancers, while the opposite was true for Trl. Considering only distal enhancers (>500 bp from the closest TSS) yielded the same results, suggesting that the differential occupancy is a property of both classes of enhancers rather than a consequence of the different extents to which they overlap with TSSs. Disrupting the DRE motifs in four different hkCP enhancers substantially reduced the activities of the enhancers as measured by luciferase assays in S2 cells (between 2.3- and 24.5-fold reduction), while dCP enhancers depend on GAGA motifs. Adding DRE motifs to 11 different dCP enhancers significantly increased luciferase expression from the hkCP for 9 of them (82%), and changing the GAGA motifs of two dCP enhancers to DRE motifs significantly increased the activities of both enhancers towards the hkCP but decreased their activities towards the dCP. Furthermore, an array of six DRE motifs was sufficient to activate luciferase expression from the hkCP but not the dCP. Together, these results show that hkCP and dCP enhancers depend on DRE and GAGA motifs, respectively, and demonstrate that DRE motifs are required and sufficient for hkCP enhancer function (Zabidi, 2015).

These results show that developmental and housekeeping gene regulation is separated genome wide by sequence-encoded specificities of thousands of enhancers to one of two types of core promoter, supporting the longstanding 'enhancer–core-promoter specificity' hypothesis. The findings indicate that these specificities are probably mediated by defined biochemical compatibilities between different trans-acting factors such as Dref versus Trl (at enhancers) and the different paralogues that exist for several components of the general transcription apparatus (at core promoters), presumably including the TATA-box-binding protein-related factor 2 (Trf2) at housekeeping core promoters. As such paralogues can have tissue-specific expression and stage-specific or promoter-selective functions, sequence-encoded enhancer-core-promoter specificities could be used more widely to define and separate different transcriptional programs (Zabidi, 2015).

Zelda is differentially required for chromatin accessibility, transcription-factor binding and gene expression in the early Drosophila embryo

The transition from a specified germ cell to a population of pluripotent cells occurs rapidly following fertilization. During this developmental transition, the zygotic genome is largely transcriptionally quiescent and undergoes significant chromatin remodeling. In Drosophila, the DNA-binding protein Zelda (also known as Vielfaltig) is required for this transition and for transcriptional activation of the zygotic genome. Open chromatin is associated with Zelda-bound loci as well as more generally with regions of active transcription. Nonetheless, the extent to which Zelda influences chromatin accessibility across the genome is largely unknown. This study used Formaldehyde Assisted Isolation of Regulatory Elements (FAIRE) to determine the role of Zelda in regulating regions of open chromatin in the early embryo. Zelda was shown to be essential for hundreds of regions of open chromatin. This Zelda-mediated chromatin accessibility facilitates transcription-factor recruitment and early gene expression. Thus, Zelda possesses some key characteristics of a pioneer factor. Unexpectedly, chromatin at a large subset of Zelda-bound regions remains open even in the absence of Zelda. The GAGA factor-binding motif and embryonic GAGA factor binding are specifically enriched in these regions. It is proposed that both Zelda and GAGA factor function to specify sites of open chromatin and together facilitate the remodeling of the early embryonic genome (Schulz, 2015).

This study used FAIRE to identify regions of open chromatin in the early embryo and determine the role of ZLD in establishing or maintaining chromatin accessibility. It was demonstrated on a genome-wide level that ZLD is instrumental in defining specific regions of open chromatin. Furthermore, this ZLD-mediated chromatin accessibility dictates both transcription factor binding and early gene expression. Unexpectedly, most open chromatin regions to which ZLD is bound do not absolutely require ZLD for chromatin accessibility. At these regions ZLD may function redundantly with GAF to determine the chromatin state. It is suggested that ZLD directly mediates the very earliest gene expression by facilitating chromatin accessibility. At cycle 14, when thousands of genes are transcribed, ZLD and GAF may coordinate to determine both regions of open chromatin and levels of gene expression (Schulz, 2015).

ZLD is known to be instrumental in regulating expression of both the very first set of zygotic genes transcribed after fertilization and a large set of genes transcribed at cycle 14. ZLD is already bound to thousands of loci at cycle 10, including those that will not be activated until four nuclear cycles later during the major wave of genome activation. This suggests that early ZLD binding is poising genes for later activation. Nonetheless, it remains unclear what differentiates the small subset of ZLD-bound loci that are transcribed early from the hundreds of ZLD-bound genes activated at cycle 14. This study demonstrates that regions that require ZLD for chromatin accessibility are correlated with the subset of genes transcribed prior to cycle 14 and with histone acetylation. However, not all ZLD-bound regions are equally dependent on ZLD for chromatin accessibility. It is therefore proposed that ZLD is essential for creating regions of open chromatin that drive expression of the subset of earliest expressed genes. This may be mediated, in part, by local histone acetylation. At cycle 14, other factors likely function with ZLD to determine chromatin accessibility (Schulz, 2015).

It has been shown that ZLD is required for the DNA binding of three different transcription factors: TWI, DL, and BCD. Additionally, transgenic versions of the brinker (brk) and sog enhancers show a correlation between the number of ZLD-binding sites and both DL binding and DNase I accessibility. Thus, prior work has clearly demonstrated a role for ZLD in mediating transcription factor binding, but the mechanism by which ZLD serves this function has been unclear. This study demonstrates that BCD binding is lost in zld minus embryos preferentially at those regions that depend on ZLD for chromatin accessibility. The data show that ZLD potentiates transcription-factor binding through the establishment or maintenance of open chromatin, and this is likely to be important for ZLD-mediated transcriptional activation. The mechanism by which ZLD establishes or maintains chromatin accessibility remains unknown. Unlike the pioneer factor FoxA1, which can open chromatin by binding chromatin through a winged-helix domain, the ZLD DNA-binding domain does not resemble that of a linker histone. Instead, ZLD binds DNA through a cluster of four zinc fingers in the C-terminus. In addition, ZLD is a large protein with no recognizable enzymatic domains that activates transcription through a low-complexity protein domain. Thus, ZLD likely facilitates open chromatin through interactions with cofactors, and it is possible that recruitment of different cofactors to distinct ZLD-bound loci could partially explain the differential requirement for ZLD for chromatin accessibility in the early embryo (Schulz, 2015).

ZLD binding but not ZLD-mediated chromatin accessibility is a defining feature of HOT regions

HOT regions, loci that are bound by a large number of different transcription factors, have been identified in multiple organisms, including worms, flies and humans. Unexpectedly, these HOT regions are not strongly enriched for the DNA-sequence motifs bound by the transcription factors that define them. Instead, HOT regions are associated with open chromatin, suggesting that chromatin accessibility along with sequence motif enrichment drives the high transcription factor occupancy. In Drosophila, HOT regions are enriched for developmental enhancers that contain the canonical ZLD-binding site, CAGGTAG, as well as for in vivo ZLD binding. Early ZLD binding is a robust predictor of where multiple additional transcription factors will later bind (Schulz, 2015).
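
As an illustration of what it means for a sequence to contain the canonical ZLD-binding site, the following minimal Python sketch reports every occurrence of CAGGTAG on either strand of a DNA string. It is not the motif analysis used in the study; the function names `reverse_complement` and `find_motif` and the example sequence are hypothetical, and only exact matches are reported.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    complement = str.maketrans("ACGT", "TGCA")
    return seq.upper().translate(complement)[::-1]


def find_motif(seq: str, motif: str = "CAGGTAG") -> list[tuple[int, str]]:
    """List (0-based start, strand) for exact motif matches on both strands.

    A "-" hit means the reverse complement of the motif occurs at that
    position on the given (forward) strand.
    """
    seq = seq.upper()
    motif = motif.upper()
    rc_motif = reverse_complement(motif)
    hits = []
    for start in range(len(seq) - len(motif) + 1):
        window = seq[start:start + len(motif)]
        if window == motif:
            hits.append((start, "+"))
        elif window == rc_motif:
            hits.append((start, "-"))
    return hits


# Hypothetical example sequence with one site on each strand.
example = "AACAGGTAGTTCTACCTGAA"
print(find_motif(example))
```

Counting such hits in a set of regions and in matched control sequences is one simple way to ask whether a motif is enriched, although real analyses typically use position weight matrices and statistical tests rather than exact string matching.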

By analyzing the 5000 regions with the highest FAIRE signal, this study demonstrates that high transcription factor occupancy is correlated with ZLD-bound regions of accessible chromatin and not with open chromatin more generally. Furthermore, this association was not specific for those regions that require ZLD for accessibility. Thus, HOT regions overlap with ZLD-bound regions of open chromatin regardless of whether these loci require ZLD for accessibility. The data suggest that, while ZLD-mediated chromatin accessibility may facilitate gene expression, it is not this function of ZLD alone that defines HOT regions (Schulz, 2015).

The FAIRE data showed that more than 400 regions are bound by ZLD and require ZLD for chromatin accessibility. However, at least three times as many regions are bound by ZLD but remain open even in its absence. The data predict that GAF functions at many of these constitutively open chromatin regions to maintain chromatin accessibility, even in the absence of ZLD. Along with the CAGGTAG element, GAF-binding motifs are enriched in HOT regions. Like ZLD, GAF is maternally deposited into embryos. Furthermore, GAF is known to facilitate nuclease-hypersensitive regions and interact with members of the NURF ATP-dependent chromatin-remodeling complex. The data show that at early expressed genes there is a correlation between regions that require ZLD for chromatin accessibility (differential, ZLD-bound) and ZLD-dependent gene expression. However, this association is not found for genes expressed during cycle 14. Instead, the data suggest that at loci associated with this later gene expression, GAF is functioning together with ZLD to regulate chromatin accessibility and gene expression. Maternally deposited GAF is required for robust transcription and nuclear divisions during the MZT. GAF is thought to mediate transcription, at least in part, through a role in the establishment of poised polymerase. The fact that poised polymerase is not established until cycle 13 supports the model that GAF is required specifically for gene expression at cycles 13-14. Thus, it is suggested that ZLD-dependent early embryonic enhancers may be unique in that they rely only on ZLD for chromatin accessibility. Although there are likely additional factors involved, the data demonstrate that later in development ZLD and GAF likely function together to define the chromatin landscape of the early embryo (Schulz, 2015).
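
The distinction drawn above, between ZLD-bound regions that lose accessibility without ZLD and ZLD-bound regions that stay open, can be sketched as a simple classification. The Python example below is only an illustration under stated assumptions: the `Region` fields, the `classify` function, the two-fold cutoff and the example values are all hypothetical and are not the thresholds or statistics used in the study.

```python
from dataclasses import dataclass


@dataclass
class Region:
    name: str
    faire_wt: float          # FAIRE signal in wild-type embryos (arbitrary units)
    faire_zld_minus: float   # FAIRE signal in embryos lacking ZLD
    zld_bound: bool          # whether ZLD binding was detected at the region


def classify(region: Region, min_fold_loss: float = 2.0) -> str:
    """Label a region by ZLD binding and loss of accessibility without ZLD.

    min_fold_loss is a hypothetical cutoff: a ZLD-bound region is called
    "differential" when its wild-type signal is at least this many times
    the signal measured in the absence of ZLD.
    """
    if not region.zld_bound:
        return "not ZLD-bound"
    lost = (region.faire_wt > 0
            and region.faire_wt >= min_fold_loss * region.faire_zld_minus)
    return "differential, ZLD-bound" if lost else "constitutive, ZLD-bound"


# Hypothetical example values, for illustration only.
regions = [
    Region("region_a", faire_wt=10.0, faire_zld_minus=2.0, zld_bound=True),
    Region("region_b", faire_wt=10.0, faire_zld_minus=9.0, zld_bound=True),
    Region("region_c", faire_wt=10.0, faire_zld_minus=1.0, zld_bound=False),
]
for r in regions:
    print(r.name, classify(r))
```

In this framing, the constitutive, ZLD-bound class corresponds to the regions at which GAF is predicted to maintain accessibility.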

Pioneer factors are a specialized class of transcription factors that bind nucleosomal DNA and initiate chromatin remodeling, allowing the recruitment of additional transcription factors. ZLD binding is strongly driven by DNA sequence, much more so than the binding of other transcription factors. This observation, combined with the FAIRE data and analyses, demonstrates that ZLD exhibits many of the characteristics of a pioneer factor: 1) engaging chromatin prior to gene activity; 2) establishing or maintaining chromatin accessibility to facilitate transcription factor binding; and 3) playing a primary role in cell reprogramming. Additional properties have been shown for classical pioneer factors, including remaining bound to the mitotic chromosomes (i.e. bookmarking) and binding to nucleosomal DNA. It will be important to determine whether ZLD shares these characteristics with other pioneer factors. Pioneer factors, such as FoxA1, can bind to closed chromatin and subsequently increase accessibility of the target site. However, the chromatin of the early embryo may provide a unique environment with little compacted chromatin. Heterochromatin formation is not observed until the 14th nuclear cycle. Chromatin-bound H3 levels increase through the MZT, and histone modifications indicative of silent genes, such as H3K27 trimethylation, are not evident until there is widespread activation of the zygotic genome (Schulz, 2015).

Thus, while ZLD binds to genes prior to zygotic genome activation, this activity may not require binding to compacted chromatin. It may be that ZLD is distinctive in the timing of its expression rather than in its chromatin-binding properties and that the sequence-driven binding of ZLD is a property of the open chromatin and rapid nuclear divisions that characterize the earliest stages of embryonic development. Although this study has demonstrated a critical role for ZLD in determining chromatin accessibility at hundreds of genomic regions, the data show that this role is limited to specific regions associated with the earliest-expressed embryonic genes. Other factors, such as GAF, likely work redundantly with ZLD to define chromatin accessibility during the MZT (Schulz, 2015).

The coordinated function of multiple factors in determining chromatin structure and genome activation is not without precedent. It has recently been demonstrated that homologs of the core pluripotency factors, Nanog, Pou5f3 (also known as Pou5f1 and Oct4), and Sox19B (a member of the SoxB1 family), act analogously to ZLD during the zebrafish MZT to drive genome activation. Furthermore, Oct4 and Sox2 are known to be pioneer factors instrumental in reprogramming differentiated cells to a pluripotent state. Together, these data suggest that chromatin remodeling in the early embryo requires the function of multiple factors, and this activity facilitates the transition from the specified germ cells to the pluripotent cells of the early embryo (Schulz, 2015).

Transcription factors GAF and HSF act at distinct regulatory steps to modulate stress-induced gene activation

The coordinated regulation of gene expression at the transcriptional level is fundamental to development and homeostasis. Inducible systems are invaluable when studying transcription because the regulatory process can be triggered instantaneously, allowing the tracking of ordered mechanistic events. This study used precision run-on sequencing (PRO-seq) to examine the genome-wide heat shock (HS) response in Drosophila and the function of two key transcription factors on the immediate transcription activation or repression of all genes regulated by HS. The primary HS response genes and the rate-limiting steps in the transcription cycle were identified that are regulated by GAGA-associated factor (GAF) and HS factor (HSF). GAF acts upstream of promoter-proximally paused RNA polymerase II (Pol II) formation (likely at the step of chromatin opening), and GAF-facilitated Pol II pausing is critical for HS activation. In contrast, HSF is dispensable for establishing or maintaining Pol II pausing but is critical for the release of paused Pol II into the gene body at a subset of highly activated genes. Additionally, HSF has no detectable role in the rapid HS repression of thousands of genes (Duarte, 2016).

Establishment and maintenance of heritable chromatin structure during early Drosophila embryogenesis

During embryogenesis, the initial chromatin state is established during a period of rapid proliferative activity. This study measured with three-minute time resolution how heritable patterns of chromatin structure are initially established and maintained during the midblastula transition (MBT). Regions of accessibility are established sequentially, where enhancers are opened in advance of promoters and insulators. These open states are stably maintained in highly condensed mitotic chromatin to ensure faithful inheritance of prior accessibility status across cell divisions. The temporal progression of establishment is controlled by the biological timers that control the onset of the MBT. In general, acquisition of promoter accessibility is controlled by the biological timer that measures the nucleo-cytoplasmic (N:C) ratio whereas timing of enhancer accessibility is regulated independently of the N:C ratio. These different timing classes each associate with binding sites for two transcription factors, GAGA-factor and Zelda, previously implicated in controlling chromatin accessibility at ZGA (Blythe, 2016).

Other Trithorax-like functions and targets

Continued: see Trithorax-like Targets of Activity part 2/2

The Interactive Fly resides on the
Society for Developmental Biology's Web server.