The Interactive Fly
Zygotically transcribed genes
What are Polycomb and trithorax group proteins?
Pathways that mediate gene activation and silencing through chromatin
Genome-wide prediction of Polycomb/Trithorax response elements
Genome-wide analysis of Polycomb targets in Drosophila melanogaster
Histone H3 variants specify modes of chromatin assembly
General transcriptional silencing by a Polycomb response element in Drosophila
What are Polycomb and trithorax group proteins?
Chromatin consists of DNA together with the proteins that serve as its structural organizers, packaging DNA into higher-order structures and ultimately forming the chromosome itself. Chromatin restricts the access of transcription factors to DNA. Both Polycomb and trithorax group proteins act to remodel chromatin, altering the accessibility of DNA to factors required for gene transcription. Polycomb group genes are involved in chromatin-based gene silencing, while trithorax group genes counteract the silencing effects of chromatin to maintain gene activity.
Pathways that mediate gene activation and silencing through chromatin
There is an evolving understanding of the enzymes that function to remodel chromatin. At least two systems related to yeast SWI/SNF proteins function to open up chromatin, permitting access to transcription factors. Information on SWI/SNF homologs can be found at the ISWI and Brahma sites. Information about the potential role of the origin recognition complex in chromatin remodeling can be found at the Origin recognition complex 2 (ORC2) site.
Nucleosome assembly protein-1 (NAP1) and Chromatin assembly factor 1 subunit (CAF1) play the role of histone chaperones in establishing an ordered nucleosome structure on newly synthesized DNA. Drosophila CAF-1 appears to comprise four subunits of 180, 105, 75 and 55 kDa. The smallest subunit of Drosophila CAF-1, p55, is homologous to the mammalian RbAp-48 protein, which is associated with the HD1 histone deacetylase. A model for the role of core histone chaperones in chromatin assembly is as follows: CAF-1 binds to newly synthesized H3 and (acetylated) H4 and mediates the deposition of the H3-H4 tetramer onto newly replicated DNA; histones H2A-H2B are subsequently incorporated with the assistance of other histone chaperones, such as nucleoplasmin or NAP1, to give the complete histone octamer. The initial acetylation of the histones may be required to neutralize their high positive charge, allowing them to be assembled into chromatin. Deacetylation of histones, carried out by histone deacetylase, could be a prerequisite to maturation of chromatin. In any case, it is now clear that chromatin assembly and maturation involve histone acetylation, and that this process begins in the cytoplasm: histones are subsequently transferred to the nucleus and then deacetylated (Tyler, 1996).
Regulatory elements called enhancers, or locus control regions, are capable of exerting their influence over long distances and in an orientation-independent manner to orchestrate the complex gene expression patterns required for embryonic development. How are the effects of enhancers confined to the genes they regulate? In recent years the concept of chromatin-based domain boundaries, or insulator elements, has developed, based on the genetic properties of several eukaryotic genes. One example of an insulator element is the Drosophila gypsy insulator. For discussion of the gypsy insulator, and the role of two proteins, Suppressor of Hairy wing and MOD(MDG4), in its regulation, see the su(Hw) site.
One other aspect of gene silencing has been established for mammalian and yeast systems. Whereas histone acetylation is known to be involved in gene activation in Drosophila dosage compensation (see Male-specific lethal 2), a role for deacetylation in gene silencing has not yet been established in Drosophila. Two examples of the role of histone deacetylation in gene silencing in mammals will be described briefly here. Histone deacetylation plays a role in mammalian Myc-mediated silencing (see the Drosophila Myc Evolutionary Homologs section for more information) and in mammalian nuclear receptor-mediated silencing (see the Ecdysone receptor Evolutionary Homologs section for more information).
Myc family proteins function through heterodimerization with the stable, constitutively expressed bHLH-Zip protein, Max. Human Mad protein homodimerizes poorly but binds Max in vitro, forming a sequence-specific DNA binding complex with properties very similar to those of Myc-Max. Both Myc-Max and Mad-Max heterocomplexes are favored over Max homodimers. Mad does not associate with Myc or with representative bHLH, bZip, or bHLH-Zip proteins. On the other hand, Myc-Max and Mad-Max complexes carry out opposing functions in transcription and Max plays a central role in this network of transcription factors (Ayer, 1993).
Members of the Mad family of bHLH-Zip proteins heterodimerize with Max to repress transcription in a sequence-specific manner. Transcriptional repression by Mad:Max heterodimers is mediated by ternary complex formation with either of the corepressors mSin3A or mSin3B. mSin3A is an in vivo component of large, heterogeneous multiprotein complexes and is tightly and specifically associated with at least seven polypeptides. Two of the mSin3A-associated proteins, p50 and p55, are highly related to the histone deacetylase HDAC1. The mSin3A immunocomplexes possess histone deacetylase activity that is sensitive to the specific deacetylase inhibitor trapoxin. mSin3A-targeted repression is reduced by trapoxin treatment, suggesting that histone deacetylation mediates transcriptional repression through Mad-Max-mSin3A multimeric complexes (Hassig, 1997).
The same proteins that mediate transcriptional silencing by Mad-Max also mediate transcriptional silencing by nuclear hormone receptors that are bound to DNA but free of ligand. Whereas liganded nuclear receptors serve as transcriptional activators, unliganded nuclear receptors serve as repressors. How does the unliganded nuclear receptor transmit a repressive signal to the transcriptional apparatus, and what is the nature of this signal? In fact, the target of the unliganded nuclear receptor is not RNA polymerase but chromatin. Repression is mediated by corepressors, proteins that associate with unliganded nuclear receptors and assemble a macromolecular complex that modifies chromatin so as to silence gene activity. The macromolecular complex acts to deacetylate histones. The transcriptional corepressors SMRT and N-CoR function as silencing mediators for retinoid and thyroid hormone receptors. SMRT and N-CoR directly interact with unliganded nuclear receptors, and these corepressors in turn directly interact with mSin3A, a corepressor for the Mad-Max heterodimer and a homolog of the yeast global transcriptional repressor Sin3p. The recently characterized histone deacetylase 1 (HDAC1) interacts with Sin3A and SMRT to form a multisubunit, ternary repressor complex. Histone deacetylase in turn targets chromatin, converting it into a form that is inaccessible to the transcriptional apparatus. Consistent with this model, it is found that HDAC inhibitors synergize with retinoic acid to stimulate hormone-responsive genes and the differentiation of myeloid leukemia (HL-60) cells. Addition of a deacetylase inhibitor such as Trichostatin A relieves transcriptional repression, resulting in a promoter that is sensitive to the addition of activating hormone. This work establishes a convergence of repression pathways for bHLH-Zip proteins and nuclear receptors and suggests that this type of regulation may be more widely conserved than previously suspected (Nagy, 1997).
Genome-wide prediction of Polycomb/Trithorax response elements
Polycomb/Trithorax response elements (PRE/TREs) maintain transcriptional decisions to ensure correct cell identity during development and differentiation. There are thought to be over 100 PRE/TREs in the Drosophila genome, but only very few have been identified due to the lack of a defining consensus sequence. The definition of sequence criteria that distinguish PRE/TREs from non-PRE/TREs is reported in this study. Using this approach for genome-wide PRE/TRE prediction, 167 candidate PRE/TREs are reported, that map to genes involved in development and cell proliferation. Candidate PRE/TREs are shown to be bound and regulated by Polycomb proteins in vivo, thus demonstrating the validity of PRE/TRE prediction. Using the larger data set thus generated, three sequence motifs that are conserved in PRE/TRE sequences have been identified (Ringrose, 2003).
The detection of PRE/TREs by prediction generates a large data set that can be used to search for further common sequence features. To this end, the 30 highest scoring PRE/TRE hits were scanned for motifs that occur significantly more often in PRE/TREs than in randomly generated sequence. Five significant motifs were found. Not surprisingly, but reassuringly, two known motifs, the GAF and PHO binding sites were found. The Zeste binding motif was not found by this analysis, although it occurs as frequently as GAGA factor in the 30 sequences analyzed. This is probably due to the shortness and degeneracy of the Zeste motif, and suggests that other such short motifs will also be missed by this approach (Ringrose, 2003).
Nevertheless, three additional motifs were found. The first, called GTGT, is found several times in 14 of the sequences. The second motif, poly T, is found several times in almost all 30 PRE/TRE sequences analyzed. Some variants of this site match the binding consensus for the Hunchback protein, which has been shown to be an early regulator at some PRE/TREs. The third motif, TGC triplets, occurs several times in 13 of the PRE/TRE sequences. No binding factor for this sequence has yet been identified (Ringrose, 2003).
To further examine these three motifs, motif occurrence was evaluated in all 167 predicted PRE/TREs and in the promoter peaks described above. In contrast to the known GAF, Z, and PHO motifs, the three motifs each occur in only a subset of predicted and known PRE/TREs, and do not occur significantly together. These motifs may thus each define a subclass of PRE/TREs. Consistent with this idea, some of the lowest scoring known PRE/TRE sequences indeed contain one or more of the three motifs (Ringrose, 2003).
Although no correlation between particular sites and high scores was found, a negative correlation was found between numbers of GAF/Z and PHO sites (a correlation coefficient of -0.78, indicating that when many GAF/Z sites are present, there are few PHO sites, and vice versa). This suggests that each PRE/TRE may have a preferred ground state, in which it is either predisposed to silencing (many PHO sites) or to activation (many GAF/Z sites) (Ringrose, 2003).
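The negative correlation described above can be illustrated with a minimal sketch of a Pearson correlation coefficient. The site counts below are hypothetical values chosen only to show the calculation; they are not data from the study, and the function name `pearson_r` is an assumption made for this example.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical per-element motif counts (NOT taken from the study):
# elements rich in GAF/Z sites carry few PHO sites, and vice versa.
gaf_z_sites = [12, 9, 7, 4, 2, 1]
pho_sites = [1, 2, 4, 6, 9, 10]

r = pearson_r(gaf_z_sites, pho_sites)
print(f"r = {r:.2f}")  # a strongly negative value for this toy data
```

A coefficient near -1 means the two counts vary in opposite directions, which is the pattern the text interprets as a preferred ground state for each element.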
In summary, this analysis identifies three motifs that occur significantly in association with known PRE/TRE motifs. Further functional characterization of these motifs and the proteins that bind them may contribute to a more complete definition of the sequence requirement for PRE/TRE function, and of subclasses of PRE/TREs (Ringrose, 2003).
This study offers four main contributions to the understanding of PRE/TRE function. First, a larger set of sequences has been defined that will facilitate the more complete definition of PRE/TRE sequence requirements. Three motifs have been identified that may contribute to this goal. The definition of the minimal requirement for PRE/TRE function will not be a trivial task. Analysis of motif composition and order in the 167 predicted PRE/TREs reveals that there is a great diversity of patterns, with no preferred linear order. It is possible that each different pattern of motifs reflects a subtly different function. However, the concept of a linear order of motifs may well be irrelevant, because these elements operate in the three-dimensional context of chromatin. The fact that such a diversity of PRE/TRE designs exists indicates that the vast majority of them would defy detection by conventional pattern-finding algorithms, and underlines the advantages of the approach described in this study (Ringrose, 2003).
Although no linear constraints on motif order were found, the fact that only motif pairs, and not single motifs, are able to identify PRE/TREs strongly suggests that this close spacing of sites has functional significance. Multiple sites may work in concert, to promote cooperative binding of similar proteins (e.g., repeated PHO sites) or to provoke competition between dissimilar proteins (e.g., closely spaced GAGA factor and PHO sites). In addition, in chromatin, only a subset of sites will be exposed and optimally available for binding at any one time, while others will be occluded by nucleosomes. The trxG includes nucleosome remodeling machines, raising the intriguing possibility that remodeling of PRE/TREs in chromatin may contribute to epigenetic switching by exposing different sets of protein binding sites (Ringrose, 2003).
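The idea that closely spaced motif pairs, rather than single motifs, carry the signal can be sketched as a simple pair count. This is an illustrative toy, not the published prediction algorithm: the consensus strings, the `max_gap` value, and the function names are all assumptions made for the example, and only the forward strand is scanned.

```python
import re
from itertools import combinations

# Simplified consensus strings, assumed here for illustration only.
MOTIFS = {
    "GAF": "GAGAG",
    "PHO": "GCCAT",
}

def motif_positions(seq):
    """Return sorted (start, name) tuples for every forward-strand match."""
    hits = []
    for name, consensus in MOTIFS.items():
        # A lookahead lets overlapping occurrences be reported as well.
        for match in re.finditer(f"(?={consensus})", seq):
            hits.append((match.start(), name))
    return sorted(hits)

def count_close_pairs(seq, max_gap=50):
    """Count motif pairs whose start positions lie within max_gap bases."""
    pairs = {}
    for (pos_a, name_a), (pos_b, name_b) in combinations(motif_positions(seq.upper()), 2):
        if pos_b - pos_a <= max_gap:
            key = tuple(sorted((name_a, name_b)))
            pairs[key] = pairs.get(key, 0) + 1
    return pairs

# Toy sequence: one GAF site, a nearby PHO site, and a distant PHO site.
toy = "GAGAG" + "A" * 10 + "GCCAT" + "A" * 100 + "GCCAT"
print(count_close_pairs(toy, max_gap=50))
print(count_close_pairs(toy, max_gap=200))
```

With the smaller window only the adjacent GAF and PHO sites form a pair; widening the window also pairs the distant PHO site, which shows how the spacing constraint controls what counts as a clustered element.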
Second, a PRE/TRE peak is observed at the promoter of all the genes examined. This strongly suggests that promoter binding is a general principle of PRE/TRE function. It has been reported that PcG proteins can interact with general transcription factors. It has hitherto been unclear whether the observed PcG/trxG binding at promoters of the genes they regulate is mediated indirectly via such an interaction, or whether the PcG and trxG bind directly to PRE/TREs at the promoters. The high scores observed at promoters favor the latter interpretation (Ringrose, 2003).
Third, it has been shown that in most cases, PRE/TREs do not occur in isolation, but are accompanied by one or more other peaks nearby. These grouped PRE/TREs may create multiple attachment sites for PcG and trxG proteins, which come together to build a fully operational complex at the promoter. Alternatively, grouped PRE/TREs may be individually regulated by tissue-specific enhancers as in the BX-C. Thus, each of the many PRE/TREs of the homothorax gene may interact with the promoter PRE/TRE in different tissues. This idea is consistent with the fact that Homothorax has specific roles in diverse developmental processes (Ringrose, 2003).
Finally, the current list of about ten PcG/trxG target genes has been expanded to over 150 genes, identifying candidates for epigenetic regulation. The genes thus identified encompass every stage of development, suggesting that the PcG/trxG are global regulators of cellular memory. Experiments to further investigate and compare this regulation for individual genes are currently underway (Ringrose, 2003).
Genome-wide analysis of Polycomb targets in Drosophila melanogaster
Polycomb group (PcG) complexes are multiprotein assemblages that bind to chromatin and establish chromatin states leading to epigenetic silencing. PcG proteins regulate homeotic genes in flies and vertebrates, but little is known about other PcG targets and the role of the PcG in development, differentiation and disease. This study determined the distribution of the PcG proteins PC, E(Z) and PSC and of trimethylation of histone H3 Lys27 (me3K27) in the Drosophila genome using chromatin immunoprecipitation (ChIP) coupled with analysis of immunoprecipitated DNA with a high-density genomic tiling microarray. At more than 200 PcG target genes, binding sites for the three PcG proteins colocalize to presumptive Polycomb response elements (PREs). In contrast, H3 me3K27 forms broad domains including the entire transcription unit and regulatory regions. PcG targets are highly enriched in genes encoding transcription factors, but they also include genes coding for receptors, signaling proteins, morphogens and regulators representing all major developmental pathways (Schwartz, 2006).
The components of PcG complexes are products of PcG genes, first discovered as crucial regulators of homeotic genes in Drosophila. Immunostaining of Drosophila polytene chromosomes, however, showed PcG proteins at about 100 cytological loci, implying a much larger number of target genes. Functional analysis has identified PREs as DNA sequences able to recruit PcG proteins and establish PcG silencing of neighboring genes. Two types of PcG complexes bind to PREs. PRC1-type complexes include a core quartet of proteins: PC, PSC, PH and dRing. PRC2-type complexes include E(Z), which methylates histone H3 Lys27. Mono- and dimethylated Lys27 are widely distributed in the genome, but PcG sites characteristically contain trimethylated Lys27 (me3K27). The activity of the E(Z) complex is essential for stable silencing, and it has been proposed that H3 me3K27 recruits the PRC1 complex through the specific affinity of the PC chromodomain for me3K27. But the relationships between PRC1 and PRC2 complexes, between their binding sites and histone methylation, and between binding, methylation and gene expression are not well understood and remain the subject of debate. The genomic distribution of three PcG proteins [PC, PSC and E(Z)] and of histone H3 me3K27 was examined using chromatin immunoprecipitation (ChIP). Since PcG target genes may be repressed in some tissues and active in others, a cultured cell line was used to minimize heterogeneity (Schwartz, 2006).
Viewed at the scale of a chromosome arm, the distributions of PC, PSC, E(Z) and me3K27 coincide at a number of distinct binding peaks (which are referred to as 'PcG sites') that correspond to 70% of the bands reported in salivary gland polytene chromosomes stained with the corresponding antibodies. To minimize false positives, the analysis focused on the PcG sites that showed simultaneous binding of two or more proteins, each above twofold enrichment. Of the 149 PcG sites detected (see the supplemental figure), 95 showed strong binding of all four proteins ('strong' PcG sites), whereas in 54 sites the binding was lower and below threshold for one of the proteins ('weak' PcG sites). At higher resolution, most PcG sites involve two or more genes, often sharing structural or functional similarities. Thus, PcG sites involve the following: engrailed (en) and invected (inv); the PcG genes ph-p and ph-d; the Dorsocross T-box gene cluster; the muscle NK homeobox gene cluster; the wingless cluster; and the two homeotic complexes ANT-C and BX-C (Schwartz, 2006).
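The site-calling rule described above can be sketched as a small classifier. This is a simplified reading of the text under stated assumptions, not the authors' pipeline: the four signals are taken to be PC, PSC, E(Z) and me3K27, the threshold is the twofold enrichment mentioned above, and the function name and the enrichment values are hypothetical.

```python
# The four profiled signals, as assumed for this sketch.
SIGNALS = ("PC", "PSC", "E(Z)", "me3K27")

def classify_site(enrichment, threshold=2.0):
    """Classify a candidate site from its fold-enrichment values.

    'strong' : every signal exceeds the threshold
    'weak'   : at least two signals, but not all, exceed the threshold
    'none'   : fewer than two signals exceed the threshold
    """
    above = [s for s in SIGNALS if enrichment.get(s, 0.0) > threshold]
    if len(above) == len(SIGNALS):
        return "strong"
    if len(above) >= 2:
        return "weak"
    return "none"

# Hypothetical fold-enrichment values for three candidate sites.
sites = {
    "site_a": {"PC": 6.1, "PSC": 4.8, "E(Z)": 3.9, "me3K27": 5.0},
    "site_b": {"PC": 3.2, "PSC": 2.4, "E(Z)": 1.6, "me3K27": 2.9},
    "site_c": {"PC": 2.3, "PSC": 1.1, "E(Z)": 0.9, "me3K27": 1.4},
}
calls = {name: classify_site(values) for name, values in sites.items()}
print(calls)
```

Requiring more than one signal above threshold is what guards against false positives from a single noisy profile; the strong and weak counts reported in the text (95 and 54) sum to the 149 sites detected.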
The Bithorax complex (BX-C) is a cluster of three homeotic genes (Ubx, abd-A and Abd-B) responsible for segmental identity in the abdomen and posterior thorax. The most prominent features are two sharp binding peaks for all three PcG proteins at the sites of the bx and bxd PREs that control Ubx. No peak was detected over the Ubx proximal promoter, although the entire gene shows a low but significant level of PC. A series of lower peaks emerged in the abd-A region and part of the Abd-B gene. Some of these correspond to the known PREs iab-2. In contrast, the distribution of H3 me3K27 oscillated rapidly above a high plateau that covers Ubx and abd-A but not Abd-B. RT-PCR was used to determine the mRNA levels corresponding to these three genes. Transcription of Ubx and abd-A in these cells was very low but distinctly above background. Abd-B was highly transcribed, at levels 300 times higher than Ubx. This pattern of activity was reflected by the distribution of both PcG proteins and me3K27. It is noted that in the Abd-B regulatory region, the previously characterized Fab-7 and Fab-8 PREs neither bound PcG proteins nor were methylated in these cells. The Abd-B gene has five distinct promoters. A sharp resurgence of both methylation and PcG protein binding in the region of the most upstream Abd-B promoter suggests that, in contrast to the other four promoters, this one might be repressed in the cultured cells. RT-PCR analysis using primers specific for mRNAs initiating from each promoter confirmed that the most upstream promoter is silent and that the other four are active. These results support the view that binding of PcG proteins to PREs is associated with transcriptional quiescence, whereas robust transcriptional activity is accompanied by lack of binding to the PREs and lack of Lys27 methylation over the transcription unit (Schwartz, 2006).
Strong genomic sites bind all three PcG proteins. The PSC and E(Z) peaks generally rise sharply and are contained within less than 2 kb, whereas PC frequently forms a broader peak that may include shoulders or subsidiary peaks absent for E(Z) and PSC and subsides to background more gradually. These peak binding regions are thought of as corresponding to PREs, which they in fact do in the cases where these are known. Additional binding peaks may be found within or downstream of the transcription unit. In contrast, distribution of H3 me3K27 at each site is very broad, forming a domain of tens or even hundreds of kilobases encompassing the transcription unit and regulatory regions of one or more genes but, rather than a level plateau, it consists of a series of deep oscillations (Schwartz, 2006).
The strong binding peaks or putative PREs are often associated with low values or troughs in the methylation profile and at secondary peaks the PC distribution frequently echoes methylation peaks. Overall, their relationship does not support the idea that methylation of Lys27 suffices to recruit binding of PC. It is proposed instead that PC bound to the strong binding peaks, the presumptive PREs, is recruited by proteins that bind specifically to those sequences. The weaker PC binding peaks and tails that mirror the methylation profile near PREs may represent a second mode of PC binding mediated by the interaction of the chromodomain with H3 me3K27 (Schwartz, 2006).
It is supposed that methylation domains initiated by a PRE might spread bidirectionally until they encounter 'active' chromatin, characterized by histone acetylation or methylation of H3 Lys4, marks typical of transcriptionally active genes. Alternatively, specific features might shape the methylation domain either positively, by attracting the methyltransferase complex, or negatively, by blocking productive interactions with the PRE. As in the case of the Abd-B gene or of CG7922 and CG7956 genes, sudden drops in levels of me3K27 are generally associated with transcriptional activity. Are insulators involved in protecting CG7922 and CG7956 from silencing, or is the activity of these two genes simply epigenetically maintained from the time the cell line was originally established? Further work is required to answer this question (Schwartz, 2006).
In many cases, the presumptive PRE lies between divergently transcribed genes such as dco and Sox100B. Which of the two is the PRE target? As PREs can act at distances of 20-30 kb, the proximity of PcG peaks to a promoter is not a reliable guide. It is proposed that the methylation domain is the clue to the target of PcG regulation. A PcG peak is not considered to regulate a promoter if the gene is not included in the methylation domain. When multiple genes are included in the methylation domain, it is likely that they are all affected by PcG regulation. However, this study distinguishes between genes that contain methylation as well as one or more PcG proteins and genes that contain only methylation (Schwartz, 2006).
The 95 'strong' binding sites in the genome encompass a total of 392 genes. Of these 392 genes, 186 contain both PcG binding and methylation, and the remainder are found within broad methylation domains associated with PcG protein binding but do not bind PcG proteins over their own promoter or transcription unit. They may represent genes not directly targeted but affected by the spread of methylation. An analysis of their ontology indicates that these two classes are in fact very different. Transcription regulators constitute 64.5% of the first set, compared to 4.3% for the full annotation set. In contrast, they constitute only 4.0% of those genes that contain only me3K27. These comparisons strongly suggest that (1) genes that regulate transcription are preferred PcG targets, and (2) genes that only include the tails of a methylation domain are probably not primary targets of PcG regulation. A similar preference is also seen among the 'weak' binding sites. These include a total of 74 genes containing both PcG proteins and methylation, 28.4% of which encode transcription regulators. Flanking genes containing only methylation include only 5.7% transcription regulators. Although transcription regulators are preferred PcG targets, secreted proteins, growth factors or their receptors, and signaling proteins are also targeted. PcG target genes include components of all the major differentiation and morphogenetic pathways in Drosophila (Schwartz, 2006).
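The percentages quoted above can be restated as fold enrichment over the genome-wide background, which makes the contrast between the gene classes easier to see. The sketch below uses only the figures given in the text; the dictionary keys and variable names are labels invented for this example.

```python
# Share of transcription regulators in the full annotation set (from the text).
background_pct = 4.3

# Share of transcription regulators in each gene class (from the text).
observed_pct = {
    "strong sites, PcG binding + me3K27": 64.5,
    "strong sites, me3K27 only": 4.0,
    "weak sites, PcG binding + me3K27": 28.4,
    "weak sites, me3K27 only": 5.7,
}

# Fold enrichment relative to the background share.
fold_enrichment = {label: pct / background_pct for label, pct in observed_pct.items()}

# Genes at strong sites that carry methylation but no PcG binding.
methylation_only_genes = 392 - 186

for label, fold in fold_enrichment.items():
    print(f"{label}: {fold:.1f}x background")
print(f"methylation-only genes at strong sites: {methylation_only_genes}")
```

On these figures, directly bound genes at strong sites are about fifteenfold enriched for transcription regulators, whereas methylation-only genes sit close to the background level.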
The major features of PcG binding shown by this work are that, although the proteins themselves are highly localized at presumptive PREs, the domain of histone methylation they produce is much broader. If the E(Z) methyltransferase is localized at the PRE, how is the extensive methylation domain produced? A looping mechanism is proposed in which interaction of PRE-bound complexes with flanking chromatin is mediated by the PC chromodomain. The observed broader distribution of PC might result from crosslinking of the chromodomain to methylated H3, reflecting this mechanism (Schwartz, 2006).
Are PREs defined by characteristic sequence motifs? Although the analysis of the sequences underlying the binding peaks will be presented elsewhere, it is noted that Ringrose (2003) devised an algorithm based on GAGA factor, PHO and Zeste binding motifs to identify sequences likely to represent PREs. This algorithm correctly predicts a number of the strong PcG binding sites (27%) and a few of the weaker sites (7%), overall 20%; however, it does not predict the majority of the PcG sites. The reverse is also true: only 22% of the PREs predicted by Ringrose bind PcG proteins in these experiments. Together, these data suggest that additional criteria are necessary to predict most PREs reliably (Schwartz, 2006).
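The overall figure of 20% follows from the two per-class percentages once they are weighted by the number of sites in each class. A short check, using the site counts reported earlier in this summary (95 strong and 54 weak sites); the variable names are chosen for this example.

```python
strong_sites, weak_sites = 95, 54
strong_predicted_frac, weak_predicted_frac = 0.27, 0.07

# Estimated number of sites predicted by the algorithm in each class.
# These are non-integer because the published percentages are rounded.
predicted = strong_sites * strong_predicted_frac + weak_sites * weak_predicted_frac

overall_frac = predicted / (strong_sites + weak_sites)
print(f"overall: {overall_frac:.1%}")
```

The weighted average comes out just under 20%, matching the overall value quoted in the text.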
As expected, PcG proteins and me3K27 are associated with transcriptional quiescence, but the data suggest that this is not an absolute condition. Low but significant transcription levels are detected even for the repressed Ubx and abd-A genes. Two target sites, polyhomeotic and the Psc-Su(z)2 site, contain PcG genes, which must be active to ensure the functioning of the PcG mechanism. The polyhomeotic locus is one of two sites in the entire genome that bind PC but lack appreciable levels of E(Z) and of Lys27 methylation. In contrast, the Psc-Su(z)2 region is well methylated and binds both PC and E(Z) at multiple peaks. It is concluded that PcG mechanisms do not invariably lead to transcriptional silencing and are compatible with moderate levels of transcription (Schwartz, 2006).
Another point of interest is the number and kind of genes that are PcG targets. Considering the developmental difference between salivary gland cells and the embryo-derived tissue culture cells, the substantial number of shared PcG sites suggests that a majority of target sites are occupied in a large percent of cells. Target genes are in fact predominantly regulatory genes that control major differentiation and morphogenetic pathways. These pathways and their genes are highly conserved, and recent work shows that they are also regulated by PcG in mammals. It might be expected that in a given cell type most alternative genomic programs would be repressed save the subset required in that cell type. The emerging picture from these studies is that PcG regulation is a key mechanism in genomic programming (Schwartz, 2006).
Histone H3 variants specify modes of chromatin assembly
Histone variants have been known for 30 years, but their functions and the mechanism of their deposition are still largely unknown. Drosophila has three versions of histone H3. H3.3 marks active chromatin and may be essential for gene regulation, and Cid is the characteristic structural component of centromeric chromatin. The properties of these histones have been characterized by using a Drosophila cell-line system that allows precise analysis of both DNA replication and histone deposition. The deposition of H3 is restricted to replicating DNA. In striking contrast, H3.3 and Cid deposit throughout the cell cycle. Deposition of H3.3 occurs without any corresponding DNA replication. To confirm that the deposition of Cid is also replication-independent (RI), centromere replication was examined in cultured cells and neuroblasts. It was found that centromeres replicate out of phase with heterochromatin and display replication patterns that may limit H3 deposition. This confirms that both variants undergo RI deposition, but at different locations in the nucleus. How variant histones accomplish RI deposition is unknown, and raises basic questions about the stability of nucleosomes, the machinery that accomplishes nucleosome assembly, and the functional organization of the nucleus. The different in vivo properties of H3, H3.3, and Cid set the stage for identifying the mechanisms by which they are differentially targeted. It is suggested that local effects of 'open' chromatin and broader effects of nuclear organization help to guide the two different H3 variants to their target sites (Ahmad, 2002).
Nucleosomes are the fundamental units of chromatin, consisting of 146 bp of DNA wrapped around an octamer made up of two copies of each of the four core histones. Histone deposition occurs primarily as DNA replicates to complete chromatin doubling. During S phase of the cell cycle, new histones are produced in abundance for immediate replication-coupled deposition. In most metazoans, this abundant S-phase synthesis results from the tight regulation of tens to hundreds of intronless histone genes that have special 3' untranslated regions instead of poly(A) tails. However, some histones are produced from orphan genes outside of S phase. In Drosophila, orphan genes encode two H3 variants: one encodes Cid, the centromeric histone, and two encode H3.3, the replacement variant. These variants have equivalents in many other eukaryotes. The H3.3 histone is nearly identical to H3, differing at only four amino acid positions. Cid differs profoundly from H3 in sequence, showing some significant identity only within the histone fold domain. Surprisingly, these three histones have different deposition properties. H3 and H3.3 are deposited as DNA replicates, but both H3.3 and Cid can be deposited at sites that are not undergoing DNA replication. Whereas only a minor fraction of the bulk genome is packaged into Cid- and H3.3-containing nucleosomes, each variant is targeted to different specialized sites, with Cid localizing to centromeres and H3.3 to transcriptionally active genes. Specific localization of centromeric H3-like histones (CenH3s) has been observed in various animals, fungi, and plants. Also, an H3.3-like histone targets the transcriptionally active macronucleus in ciliates. Thus, the targeting of H3 variants is likely a feature of every eukaryotic cell, where centromeres and transcribed regions are the major loci of activity in metaphase and interphase, respectively. Both kinds of loci use a distinct pathway for nucleosome assembly, and this study explores the properties of this process (Ahmad, 2002).
Studies of histone deposition have generally been done using crude extracts, purified components or pools of cells from which bulk chromatin is extracted. These methods reveal the average properties of chromatin, and have shown that the bulk of chromatin doubles as DNA replicates. Extensive in vitro work has demonstrated that the assembly of nucleosomes is a stepwise process in which deposition of an (H3:H4)2 tetramer is followed by addition of two H2A:H2B dimers. The new histones are brought to the replication fork in a complex with chromatin assembly factor 1 (CAF1). CAF1 appears to be recruited to the replication fork by binding to the ring-shaped proliferating cell nuclear antigen (PCNA) that encircles the DNA template at each replication fork. Histones from the parent DNA are distributively segregated to the two sister chromatids behind the replication fork, and the gaps in their nucleosomal arrays are rapidly filled by stepwise assembly of new nucleosomes. These nucleosomes are then matured by addition of linker histones and covalent modification of histone tails to complete chromatin (Ahmad, 2002).
Nucleosomes containing H3 variants comprise only a small
proportion of bulk chromatin, and thus their properties have
been generally undetectable. However, replacement H3 variants
can become enriched in the chromatin of nonreplicating cells. This means that other ways of depositing histones must
exist; but because such variant enrichment has been detectable only
in unusual cell types (such as long-lived neurons or spermatocytes), studies of the phenomenon have been limited. The ability
to tag histones and examine their deposition properties in single
cells has provided new insight into chromatin assembly
processes (Ahmad, 2002).
A cytological assay system was developed for studying replication
and chromatin assembly by using Drosophila Kc cells, a cell
line that displays a regular cell division schedule and
a consistent tetraploid karyotype. Organization of the Drosophila
nucleus is visually simple, because the late-replicating heterochromatin
typically coalesces into a compartment in the
nucleus, termed the chromocenter. This provides both
a temporal and spatial distinction between the early replicating,
gene-rich euchromatin, and the late-replicating heterochromatin (Ahmad, 2002).
DNA replication can be tracked either by pulse-labeling with
nucleotide analogs or by using anti-PCNA antibody. Furthermore,
by introducing histone-GFP fusion constructs and producing
a pulse of the tagged protein, histone deposition can be tracked during the cell cycle. Using this system, it has been possible to quantitatively examine DNA replication and histone
deposition in unsynchronized populations of cells (Ahmad, 2002).
GFP-tagged H3 shows exclusively replication-coupled deposition,
displaying co-localization with replication markers and
showing no detectable deposition in cells in which replication has
been blocked. The N-terminal tail of H3 is required, suggesting
that the H3 tails of tetramer particles interact with
accessory factors at some early step in nucleosome assembly in
vivo (Ahmad, 2002).
In contrast to the properties of GFP-tagged H3 in cells, tagged
H3.3 deposits in a replication-independent manner at actively
transcribing loci. Deposition can occur in any stage of the cell
cycle, and it is not accompanied by
unscheduled DNA synthesis. Incorporation of H4 also occurs at
these target sites, as expected for deposition of (H3.3:H4)2
tetramers; but how replication-independent (RI) histone deposition
occurs is virtually unknown. Tagged Cid can also deposit throughout the cell cycle, suggesting that its deposition is also replication-independent (Ahmad, 2002).
However, this conclusion depends on knowing the timing of
centromere replication. Centromeres replicate
within a defined portion of S phase and Drosophila centromeres replicate as isolated domains within
later-replicating heterochromatin (Ahmad, 2002).
Historically, centromeres have been thought to replicate very
late in the cell cycle. This is because they are embedded within
pericentric heterochromatin, which replicates late. Analysis has
usually relied on visualization at mitosis; but mitotic chromosomes
have inherently low resolution because they are highly
condensed. Indeed, a recent study showed that Drosophila
centromeres cannot be resolved from heterochromatin in 44% of
spread mitotic chromosomes. Despite this limitation, it has been concluded that Cid-containing chromatin replicates
on the same late schedule as pericentric heterochromatin. However,
this could be late replication in pericentric heterochromatin
that was mis-scored as replication of centromeres (Ahmad, 2002).
This uncertainty has been addressed by analyzing mitotic
chromosome replication patterns, providing brief 15-min pulses
to Kc cells and examining mitotic figures after a chase. This
provides a 'snapshot' of replication at single points in the cell
cycle. Heterochromatin replication
patterns similar to those previously reported, in which labeling
overlaps Cid spots, are observed. However, unambiguous
examples of chromosomes that were intensely labeled
throughout the euchromatic arms, with foci directly coinciding
with centromeres, are also observed. These centromeric foci are surrounded by heterochromatin that did not replicate during the
labeling pulse (Ahmad, 2002).
Experiments using interphase Kc cells revealed
that ~90% of centromere replication occurs when euchromatin
is replicating. The remaining 10% may be late replication
in centromeric regions, but is more likely the result of nearby
heterochromatic replication foci that cannot be resolved from
sites with Cid. Such early replication of centromeres is not
limited to tetraploid Kc cells --
similar replication patterns are observed in diploid larval neuroblasts -- although the much shorter cell cycle time and the more
irregular chromocenter limit quantitative analysis. Therefore,
this early timing of centromere replication appears to be general
for Drosophila cells (Ahmad, 2002).
A series of progressively more direct experiments have provided
insight into the fine structure in the centromere region. A model
for the centromeric constriction has suggested that loops of DNA
coil through the constriction, with centromeric nucleosomes
lying in the outward parts of these coils, and conventional
nucleosomes in the interior portions. This would
account for the polar structure of the entire centromere if
centromeric nucleosomes nucleate kinetochore formation (and
thus microtubule capture) and conventional nucleosomes recruit
cohesins (and thus centromeric cohesion). The linear arrangement
of nucleosomes along centromeric DNA would then be
alternating blocks of centromeric and conventional nucleosomes
within the centromeric domain. A study using
stretched chromatin fibers has demonstrated that Cid and H3 are
interspersed in Drosophila, although these are not included in the
same nucleosome. Apparently, blocks packaged in one kind
of nucleosome alternate with blocks packaged in the other (Ahmad, 2002).
How could the duplication of such regular but discontinuous arrays of
nucleosomes occur? The alternating pattern of nucleosomes on stretched chromatin fibers is reminiscent of replication patterns on fibers from
normal chromatin. Replication origins within a chromatin
domain often appear to be regularly spaced with an interval of
50-100 kb, and these origins fire synchronously. Perhaps the
nucleosome blocks in the centromeric regions correspond to an
underlying regular arrangement of replication origins throughout
the entire centromeric domain. If Cid-containing blocks
include the origins for these domains, and if replication initiates
at a time when H3 is not available, ultimately only the RI
deposition of Cid will package these blocks. The later replicating
stretches would incorporate H3 as it becomes available. In this
way, the fine pattern of replication would maintain the discontinuous
Cid arrays over an extended region (Ahmad, 2002).
The model for maintaining the higher-order chromatin structure
of the entire centromere has precise requirements for
replication patterns in this region: a discontinuously spaced
arrangement of origins must correspond to the blocks of Cid-containing
chromatin. At least two other
patterns of replication in this region can be imagined: (1) all
Cid- and H3-containing blocks might replicate simultaneously
(pattern 2); (2) a single origin might replicate the entire
domain (pattern 3) (Ahmad, 2002).
Whether discontinuous replication tracks correspond to blocks of centromeric chromatin was investigated by pulse-labeling cells for only
15 min. To prepare stretched chromatin fibers,
nuclei spread on a glass slide were disrupted in a high-salt buffer. As the buffer
runs off the slide, it pulls chromatin fibers behind it. Stretched centromeres were identified and those fibers were examined in
which nucleotide incorporation was unambiguous. In each of
these cases it was clear that replication was occurring in discrete
patches scattered throughout the centromeric domain. These replication tracks must arise from multiple origins, and
thus the possibilities that the entire domain
replicates from a single origin, or that the whole domain
replicates simultaneously can be ruled out (Ahmad, 2002).
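The logic of this test can be sketched in code. The following is a minimal illustration, not the authors' analysis: it assumes that a stretched fiber has been scored as an ordered list of segments, each marked as labeled (True) or unlabeled (False) by the nucleotide pulse, and it counts the separate labeled tracks. The function names and the scoring scheme are hypothetical.

```python
from itertools import groupby
from typing import List, Sequence, Tuple


def labeled_tracks(segments: Sequence[bool]) -> List[Tuple[int, int]]:
    """Return (start, length) for each contiguous run of labeled segments."""
    tracks = []
    position = 0
    for is_labeled, run in groupby(segments):
        length = len(list(run))
        if is_labeled:
            tracks.append((position, length))
        position += length
    return tracks


def describe_pattern(segments: Sequence[bool]) -> str:
    """Crude classification of a pulse-labeled fiber by its labeled tracks."""
    tracks = labeled_tracks(segments)
    if not tracks:
        return "unlabeled"
    if len(tracks) == 1 and tracks[0][1] == len(segments):
        return "whole domain labeled"
    if len(tracks) == 1:
        return "single track"
    return "multiple discrete patches"
```

In this scheme, several separated patches on a fiber labeled with a short pulse are what argue for multiple origins. A single track, or a uniformly labeled domain, would be more consistent with the other two patterns.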
These patches correspond significantly with the segments
between Cid-containing chromatin. Thus, from published experiments
and the experiments described here it appears that
replication occurs in two discrete phases: all CenH3-containing
chromatin within a domain replicates, and at a different time all
H3-containing chromatin replicates. Therefore, replication
within this domain is discontinuous and initiates from multiple
origins (Ahmad, 2002).
Given that deposition of any H3 must occur in the form of
(H3:H4)2 tetramers, there must be discrimination of H3-
containing tetramers from tetramers containing variants. Thus
analysis of RI assembly was used to initiate the mapping of discriminating sites within the histone variants. It was found that one type of discrimination is a cluster of three residues within the histone
fold domain (HFD) of H3 that limits it to replication-coupled
deposition. Furthermore, because both Cid and H3.3 undergo
RI deposition but have mutually exclusive targets, there must be
additional discrimination between these variants (Ahmad, 2002).
Replication-coupled nucleosome assembly is aided by accessory
factors that are recruited to the replication fork by binding
to PCNA. However, the process of RI deposition must be
different, because RI deposition of H3.3 does not require
portions of the histone that are required for replication-coupled
deposition. Furthermore, the lack of PCNA during gap phase
deposition raises the question of what is recruiting histones to the
sites. The phenomenon of CenH3 targeting has raised expectations
that a specific, localized chromatin assembly factor or
histone modification will be involved in the targeting of CenH3s. Indeed, a chromatin remodeler of the RSC family,
PBAF, localizes to kinetochores during mitosis of mammalian
cells. Furthermore, RSC mutations in budding yeast alter chromatin structure
specifically around centromeres, and perhaps RSC activity
is involved in assembly of centromeric nucleosomes. Mutations
in CAF and Hir genes also give centromere defects, and it has
been suggested that these factors are involved in loading the
yeast CenH3 Cse4p. However, a role for any of these factors
does little to explain the specific targeting of CenH3s, because
these factors are all widely distributed in the nucleus (Ahmad, 2002).
The best candidate for a uniquely centromere-localized chromatin
assembly factor is the Mis6 protein in fission yeast.
This protein is required for centromeric localization of the
CenH3 SpCENP-A, but Mis6 homologs in budding yeast (Ctf3) and in mammals (CENP-I) localize to centromeres
but are not required for targeting CenH3s. Thus, Mis6
proteins appear to be structural components of centromeres, not
histone assembly factors (Ahmad, 2002).
An alternative model is that some feature of centromeric
chromatin facilitates the targeting of its specialized histones. An
obvious candidate for this feature is that centromeric nucleosomes
themselves bind to and thereby recruit new CenH3
tetramers for future deposition. Such an interaction is a possible
molecular mechanism for direct templating of centromere duplication. Regardless of whether CenH3 targeting involves
specialized co-factors, templating, or both, the question remains
as to why it should use an RI pathway (Ahmad, 2002).
The targeted deposition of H3.3 to active genes is likewise
replication-independent, although transcription-coupled assembly
may facilitate (H3.3:H4)2 deposition. Perhaps H3.3 targeting
is mediated by a component of RNA polymerase complexes (Ahmad, 2002).
Because RNA polymerases move processively along the DNA
during transcription, a contiguous transcribed segment of DNA
might incorporate the H3.3 variant. Alternatively, RI deposition
of H3.3 may be facilitated by any of a number of ATP-dependent
chromatin remodeling complexes to target specific sites near
transcription units. Any candidate factor might be expected to
preferentially use H3.3 instead of H3, but whether there is any
such discriminating factor is unknown, because all in vitro studies
of higher eukaryotic chromatin assembly have been performed
with H3. It is anticipated that this will soon be addressed. However,
the prospects for identifying a unique remodeler that is
required for RI deposition are uncertain, because budding yeast
mutants that eliminate any known chromatin assembly factors do
not eliminate chromatin assembly. Thus
the possibility has to be considered that RI deposition at active genes and at centromeres uses generic remodeling activities, and that components or structural aspects common to both centromeres and actively
transcribed genes may result in RI histone deposition at both
kinds of sites (Ahmad, 2002).
The deposition of histones throughout the cell cycle by a
replication-independent process implies that previously existing
nucleosomes are unraveled, and their histones released. It is
known that the process of transcription results in a local unfolding
of the chromatin fiber and an 'open' chromatin configuration. Although transcription of nucleosomal templates with bacterial polymerases can occur in vitro without displacing histone octamers from DNA, in vivo assays demonstrated that a measurable amount of transcription-dependent
histone displacement does occur in eukaryotic nuclei. In fact, even in vitro, RNA polymerase II is virtually unable to transcribe nucleosomal
DNA under physiological conditions. Transcription requires
that histone-DNA contacts be broken for polymerase to
transit the nucleosomal DNA. Although transcription can occur
without histone displacement if the histone octamer releases
some contacts with DNA and maintains others, at some
frequency all contacts might be released. The histone octamer
would then simply fall off. Additionally, localized remodeling
factors will disrupt nucleosome structure as they act. The in vitro
and in vivo observations can be reconciled if histone displacement
occurs occasionally as nucleosomes are disrupted. Constraints on nucleosomes in a compacted chromatin fiber (i.e., 'closed' chromatin) would limit histone displacement (Ahmad, 2002).
Although internucleosome forces within inactive chromatin are
uncharacterized, they have been inferred from numerous experiments,
including the tendency of nucleosomes within heterochromatin
to form extremely regular and fixed arrays. A
likely constraint in heterochromatin arises from the multimeric
associations that occur between heterochromatin-specific non-histone
chromatin proteins. Attention has focused on the heterochromatin
protein-1 (HP1). HP1 is recruited to heterochromatic
DNA by binding, through its chromodomain, to the H3 tail
when it is methylated at lysine-9 (H3-K9me). The chromo
shadow domain of HP1 mediates associations between HP1
molecules, and multimers of HP1 bound to methylated histone
tails provide one basis for constraining arrays of nucleosomes (Ahmad, 2002).
Although the state of chromatin in heterochromatin and in
actively transcribed regions is well known, less is known about
the chromatin fiber packaged by centromeric nucleosomes. However, these regions appear to be open. Centromeric DNA is
sensitive to micrococcal nuclease digestion both in budding yeast and in the central core region of fission yeast centromeres
where SpCENP-A-containing nucleosomes reside, and
plant meiotic centromeres appear decondensed. In addition,
early replication is a feature of open chromatin, and
centromeric chromatin replicates before surrounding heterochromatin (Ahmad, 2002).
An open configuration may arise from at least three
sources. (1) All CenH3s lack a canonical H3 tail. Because
methyl-modification of lysine-9 appears to be the key epitope to
maintain heterochromatin, the lack of this site in centromeric
nucleosomes means that such regions cannot become heterochromatic. Indeed, the heterochromatin protein HP1 is not associated with chromatin packaged by CenH3s. (2) A recent study of Cid homologs in drosophilids has uncovered DNA minor-groove binding motifs in the Cid tail outside of the nucleosome core. Extension of the Cid tail along linker DNA between nucleosomes may inhibit compaction of the
nucleosome strand, thus maintaining these regions in an open
configuration. (3) Chromatin remodeling factors that
destabilize nucleosomes are found both at active genes and centromeres, and their activity will promote histone replacement. It is suggested that an open chromatin configuration is the common basis for RI deposition at centromeres and at actively transcribed genes (Ahmad, 2002).
If open chromatin were the sole basis for RI deposition, then we
would expect that active genes and centromeres would incorporate
both H3.3 and CenH3s. However, their deposition is
mutually exclusive. This exclusivity is likely to rely on multiple
mechanisms that act on all steps in nucleosome assembly. Factors that discriminate between H3.3 and Cid would be the
best candidates for directing these variants to their targets. However, the organization of the nucleus provides a clue as to
another way in which exclusive targeting may be accomplished. Centromeric DNA in Drosophila is flanked by repeated sequences
that are packaged into heterochromatin, and this forms
a compartment at interphase in which centromeres are embedded
in heterochromatin. The active rDNA genes are the
primary sites of H3.3 deposition and they are also found in a
distinct nuclear compartment, the nucleolus, next to the chromocenter (Ahmad, 2002).
This functional nuclear organization is very simple to
see in Drosophila, where all heterochromatin typically associates
into one large chromocenter, and the active rDNA arrays also
often associate to present one large nucleolus. In fact, this
general compartmentalization is almost invariant in eukaryotes,
and has led to the idea that heterochromatin somehow protects
centromeres and NORs. Although both Cid and H3.3
undergo RI deposition, their exclusive targeting could in part be
accomplished by restricting one or both variants within the
nucleus. For example, unincorporated (Cid:H4)2 tetramers
might be sequestered within the heterochromatic chromocenter. Cid deposition would then appear targeted to the centromere,
because this is the only site within the chromocenter with open
chromatin (Ahmad, 2002).
Whether (Cid:H4)2 tetramers are actually sequestered in this
way is unknown. Indeed, whether sequestering substrates can
have any effect on reactions within the nucleus has become a
pressing issue. Many nuclear components remain mobile,
but functional experiments argue that certain effects in the
nucleus actually only occur when components are sequestered. It is likely that some reactions in the nucleus are relatively
independent of localization because they associate efficiently
with their partners and their reactions proceed quickly. Conversely,
reactions that involve weak interactions or multiple steps
may require raising the effective concentration of their substrates
by nuclear sequestration (Ahmad, 2002).
It has been suggested that the heterochromatic compartment
is involved in histone traffic within the nucleus. The
basis of this hypothesis was the realization that Cid-containing
chromatin behaves unusually during S phase. Generally, the
deposition of H3 quickly follows DNA replication. However, the
replication of Cid-containing centromeric DNA occurs without
H3 deposition, implying that the normal coupling between
replication components and nucleosome assembly components
must be broken. Because this coupling is thought to result from
an interaction between chromatin assembly factor 1 histone complexes and PCNA, the simplest explanation for
uncoupling the two processes would be to sequester replicative
nucleosome assembly factors away from centromeres. It is imagined
that unincorporated H3-containing tetramers might be
sequestered in euchromatin in the first half of S phase, and would
thus never (productively) see the replication forks at centromeres
within the heterochromatic compartment. This uncoupling
might be necessary to prevent dilution of centromeric
nucleosomes by conventional nucleosomes that would assemble
after replication-coupled deposition. Genetic experiments in
budding yeast and Drosophila suggest that CenH3s and H3 do
compete for assembly (Ahmad, 2002).
One way that a competition between CenH3 and H3 histones
can be probed is to change their relative concentration. A tagged Cid protein exclusively deposits at centromeres when it is ectopically expressed at low levels
from a heat-shock-inducible promoter. However, it is
apparent that expression from this construct remains low. Re-engineering the transcriptional start region of the construct
to include a translational initiation consensus site now allows
overproduction of Cid in cells (Ahmad, 2002).
To analyze the behavior of excess quantities of Cid protein, an overexpression construct was introduced into Drosophila Kc cells. Cells receive varying amounts of transfected DNA, and
thus express Cid over a wide range of levels. In cells that express
low amounts of the ectopic protein, Cid localizes to centromeres,
as expected. However, a new localization pattern for Cid is seen
at high expression levels: the tagged protein localizes to centromeres
and throughout euchromatin. The incorporation pattern
of ectopic Cid is especially clear on mitotic chromosomes from
these transfections, where the tagged protein is incorporated
throughout the euchromatic arms as well as at centromeres. It is concluded from this result that excess Cid can be
deposited at sites other than centromeres. Normal cells must
have mechanisms to prevent euchromatic deposition, but overexpression
is sufficient, by itself, to overcome this restriction (Ahmad, 2002).
The mis-incorporation pattern of Cid shows an interesting
specificity: Cid can deposit at centromeres and euchromatin but
not in heterochromatin. Therefore, heterochromatin
must either lack the feature that tolerates mis-incorporation, or must
actively exclude Cid. It is argued that centromeres and
euchromatin share the feature of open chromatin, which is
proposed to be the first prerequisite for RI deposition of histone
variants. Indeed, the mis-incorporation of Cid into euchromatin
is replication-independent, because it occurs both when euchromatin
is replicating in early S phase, and in late S phase
when euchromatic replication is complete. It is suggested
that Cid is contaminating open chromatin in the euchromatic
compartment when it is overexpressed (Ahmad, 2002).
What normally prevents the deposition of Cid into euchromatin?
Endogenous Cid is present only at low levels, and
mis-incorporation could be avoided if Cid were sequestered
away from euchromatin in the nucleus. If unincorporated Cid
were sequestered in the heterochromatic chromocenter, it would
be unable to deposit in the closed chromatin of this compartment. Thus, sequestration might serve two purposes: deposition
in euchromatin would be prevented and deposition at centromeres
would be promoted. Overexpressed CenpA in
mammalian cells also mis-incorporates into euchromatin (Ahmad, 2002).
Although it has not been examined whether CenpA mis-incorporation
is replication-independent, this is expected to be the
case, because this is how CenpA deposits at centromeres (Ahmad, 2002).
The idea that histone variants may respect nuclear compartments
was first raised by experiments expressing heterologous
CenH3s in Drosophila Kc and human HeLa cells. These extremely diverged heterologous histones do not localize to centromeres in these cells, implying that there is some kind of specificity for depositing the correct CenH3 at centromeres (Ahmad, 2002).
Surprisingly, heterologous histones are preferentially enriched
in the heterochromatic blocks. It has been suggested that there is a default ability of cells to enrich diverged H3 variants in the heterochromatic compartment. Perhaps heterochromatic enrichment is a
normal first step in the deposition of the endogenous CenH3s (Ahmad, 2002).
Those experiments and overexpression results encourage the
view that nuclear compartments may guide histone variants to
the correct subset of their potential deposition sites. Compartment
effects may also affect the RI deposition of H3.3 in an
inverse way to Cid: i.e., sequestering to promote H3.3 deposition
at active genes, and preventing its deposition at centromeres (Ahmad, 2002).
Because H3.3 is largely identical to H3, the hypothetical element
that is recognized in H3 and results in its exclusion from
chromocenters during centromere replication may also be
present in H3.3. Perhaps this discrimination against canonical
H3 histones also serves to prevent the RI deposition of H3.3 at
centromeres (Ahmad, 2002).
RI assembly permits immediate chromatin repair. The unfolding
of chromatin during transcription may be damaging, in that the
forces RNA polymerases apply to their template DNA should at
least occasionally displace histone octamers from DNA. Additionally, histone octamers may sometimes be displaced by
chromatin remodeling factors associated with transcriptional
activity. In either case, these regions must be repackaged into
nucleosomes. Similarly, replacement of CenH3s may be required
to maintain the nucleosomal configuration of centromeres after
mitosis. Bundles of microtubules drag a chromosome to the pole
during anaphase, and the forces they apply may be sufficient
to occasionally pull off histone octamers. Chromatin would then
be stripped of some CenH3 histone octamers. RI deposition
allows repair of this damage. In fact, the RI deposition of CenpA
in mammalian cells seems to occur around the time of mitosis. The deposition of Cid in Drosophila cells occurs throughout
the cell cycle, but may only be required at two points: as
centromeric DNA replicates to double its chromatin, and after
mitosis to repair stripped chromatin (Ahmad, 2002).
The process of RI assembly at active genes provides a novel
level of control over histone modifications. Replacement of
nucleosomes in one modification state by new histones could
switch chromatin to an active state. Initiation of transcription
would start this process, and successive transits of RNA polymerases
would promote RI assembly. The replacement H3
histone in alfalfa is hyperacetylated, and RI assembly with
acetylated histones could enrich such modifications in active
chromatin. However, histone modification by methylation has
appeared more problematic. A number of histone methyltransferases
(HMTs) have been characterized, but no
histone demethylase is known. Methylated lysine-9 in the H3 tail
(H3K9me) is a critical epitope for recruiting heterochromatic
chromatin proteins, because this is the binding site for HP1. HP1
recruits additional heterochromatic proteins including the Su(var)3-9 HMT. Therefore, it is straightforward to imagine how
these recruited proteins could perpetuate a heterochromatic
state through replication-coupled nucleosome assembly and cell
division (Ahmad, 2002).
Because an irreversible methyl modification appears to specify
the heterochromatic state, it has been unknown how a heterochromatic
site could switch to an active state. One route for
switching might be to prevent the methylation of nucleosomes
assembled during replication. Successive cell cycles could then
dilute methylated nucleosomes, allowing eventual activation (Ahmad, 2002).
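The dilution route can be made concrete with a little arithmetic. As a minimal sketch, assume (as an idealization, not a measurement from the study) that parental nucleosomes segregate evenly to the two sister chromatids, that every newly assembled nucleosome stays unmethylated, and that no methylation is restored. The methylated fraction then halves with each round of replication. The function names and the 10% threshold used in the note below are hypothetical.

```python
def methylated_fraction(initial_fraction: float, cell_cycles: int) -> float:
    """Fraction of nucleosomes still carrying H3K9me after passive dilution.

    Assumes even segregation of parental nucleosomes and no re-methylation
    of newly assembled nucleosomes (an idealization for illustration).
    """
    if not 0.0 <= initial_fraction <= 1.0:
        raise ValueError("initial_fraction must be between 0 and 1")
    if cell_cycles < 0:
        raise ValueError("cell_cycles must be non-negative")
    return initial_fraction / (2 ** cell_cycles)


def cycles_until_below(initial_fraction: float, threshold: float) -> int:
    """Number of cell cycles needed before the fraction drops below threshold."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    cycles = 0
    while methylated_fraction(initial_fraction, cycles) >= threshold:
        cycles += 1
    return cycles


if __name__ == "__main__":
    # Print the remaining methylated fraction for the first few divisions.
    for n in range(5):
        print(n, methylated_fraction(1.0, n))
```

Under these assumptions a fully methylated domain needs four divisions to fall below an arbitrary 10% threshold, which illustrates why passive dilution is a slow route to activation.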
However, more rapid mechanisms for activating silenced chromatin
must exist. Induction of silenced genes can occur within a
single cell cycle; for example, X chromosomes become reactivated
and lose H3K9me during diplotene in the Caenorhabditis
ovary. Work using a reporter for heterochromatic
gene silencing suggests that switching to an active
state can occur in somatic cells without cell division. Thus,
H3K9me can be removed without replication-coupled nucleosome
assembly (Ahmad, 2002).
RI deposition implies that the entire heterochromatic nucleosome
may be unraveled and replaced. The process of
transcriptional activation may force the disassembly of H3K9me-containing
nucleosomes, followed by RI assembly of an unmarked
nucleosome. Although the fate of the
displaced methylated H3 is not known, it is known that RI deposition can
occur at any time in the cell cycle, and thus should be able to
rapidly derepress silencing. Conversely, an active gene
could be silenced by methylating the tail of H3.3, which presents
the same lysine-9 epitope. The stability of histone methylation
gives it a distinct advantage over other histone modifications for
heritable effects on chromatin. The possibility of RI deposition
circumvents the irreversible nature of methylation, thus retaining
the potential to switch the heritable chromatin state at a later
time (Ahmad, 2002).
It is concluded that H3 variants are used to package functionally specialized chromatin,
where they play vital functional roles. Localizing these
variants to centromeres and to transcriptionally active regions
utilizes an RI process that is distinct from the nonspecific,
replication-coupled method of packaging the bulk genome. It is
argued that RI deposition is the consequence of the
activities that impinge on these sites in the genome and create an
open chromatin structure. This flexibility in histone deposition
may be necessary to maintain the nucleosomal structure of these
regions. In higher eukaryotes, the RI deposition process allows
specialized chromatin to be distinguished at the most basic level,
where histone variants are incorporated into chromatin. The
differences between the generic H3 -- which packages the bulk of
the genome -- and the H3 variants may contribute to the physical
properties of specialized regions and recruit particular non-histone
chromatin proteins. Because histones remain associated
with DNA through mitosis, these variants establish heritable
distinctions in chromatin (Ahmad, 2002).
Centromeres are a defining feature of eukaryotes, and all are
likely to have a CenH3. However, the utilization of two conserved
versions like H3 and H3.3 is not universal. For example,
budding yeast has only one canonical H3 histone, which undergoes
both replication-coupled and RI deposition. Surprisingly,
this is H3.3: phylogenetic analysis reveals that ascomycetes
have lost H3, whereas their sister clade basidiomycetes have both
H3 and H3.3, as do animals. Therefore, an H3.3
gene performs all general functions in some organisms. The
extraordinary conservation of H3.3, which is identical from
mollusks to mammals, speaks to its fundamental role in the
eukaryotic nucleus (Ahmad, 2002).
Acetylation and methylation: Covalent modifications of chromatin and DNA that establish and maintain the heterochromatin-induced silenced state
A self-reinforcing network of interactions among the three best-characterized covalent modifications that mark heterochromatin (histone hypoacetylation, histone H3-Lys9 methylation, and cytosine methylation) suggests a mechanistic basis for spreading of heterochromatin over large domains and for stable epigenetic inheritance of the silent state. Early cytological studies have distinguished two types of chromatin: euchromatin and heterochromatin. Heterochromatin was originally defined as that portion of the genome that remains condensed and deeply staining (heteropycnotic) as the cell makes the transition from metaphase to interphase; such material is generally associated with the telomeres and pericentric regions of chromosomes. Subsequent work has identified a cluster of structural features that characterizes heterochromatin. While heterochromatic regions are rich in repetitive sequences and have a low gene density, they are not devoid of genes; it is estimated that there are ~40-50 genes within the pericentric heterochromatin of Drosophila. An altered packaging of heterochromatin, to a less-accessible form, has been demonstrated by probing with nucleases and other reagents such as prokaryotic DNA methyltransferases. The data suggest that while nucleosome arrays in euchromatin are irregular, punctuated by the nucleosome-free hypersensitive sites (HS sites) characteristic of active genes, the nucleosomes in heterochromatin have a regular spacing over large arrays, with a higher proportion of the DNA associated with the histone core rather than in the linker. Euchromatic regions silenced by nucleosome packaging are referred to as 'silent chromatin,' reserving the term 'heterochromatin' for the classically defined heterochromatin (Richards, 2002).
It is an interesting paradox that while the histones are among the most conserved proteins known in evolution, they are also among the most variable in posttranslational modification. The pattern of modifications has been suggested to act as an information code (the histone code), dictating both nucleosomal interactions and the association of nonhistone chromosomal proteins that collectively influence packaging and gene regulation. Modifications include acetylation, methylation, phosphorylation, ubiquitination, and ADP-ribosylation. Given the number of sites of posttranslational modification for each of the four core histones, an imposing number of differently modified nucleosomes is possible. The modification states of the N-terminal tails of histones H3 and H4 appear to play a major role in heterochromatin formation (Richards, 2002).
One modification of histones, hypoacetylation of lysine residues, is associated with both formation of heterochromatin and gene silencing. Early attempts to fractionate chromatin and characterize the components led to the suggestion that heterochromatic domains were associated with hypoacetylated histones, while euchromatic domains were associated with hyperacetylated histones. This distinction is observed not only between constitutive heterochromatin and euchromatin, but also in mapping studies comparing an active or inducible gene to flanking regions (Richards, 2002).
Histone H3 methylated at lysine 9 (H3-mLys9), a second modification of histones, has been identified as characteristic of the heterochromatic state. Immunofluorescent staining of Drosophila polytene chromosomes shows that the bulk of the H3-mLys9 is present in the pericentric heterochromatin and in a banded pattern on the fourth chromosome, known sites of repetitive DNA (Jacobs, 2001). Similarly, chromatin immunoprecipitation (ChIP) experiments demonstrate that H3-mLys9 is a prominent component of the silent mating type locus in fission yeast (Schizosaccharomyces pombe), while essentially absent from flanking regions containing inducible genes. Methylation of histone H3-Lys9 has also been associated with the silencing of euchromatic genes (Richards, 2002).
A third biochemical marker of heterochromatin is the most common form of DNA modification in eukaryotes, namely cytosine methylation. Although absent in some eukaryotes, this DNA modification is widely distributed in the eukaryotic kingdom. It is particularly prevalent in plants and mammals where it is an important epigenetic mark that contributes to the stability of pericentromeric heterochromatin and plays a central role in cementing and maintaining epigenetic expression states, not only in heterochromatin but in silenced euchromatic domains (Richards, 2002).
Hypoacetylation, particularly of histones H3 and H4, associated with heterochromatic domains from a range of organisms, has been studied in greatest detail in Saccharomyces cerevisiae. Many of the cis- and trans-acting factors necessary to establish and maintain the silent state at the telomeres and HML/HMR loci have been identified. These studies have demonstrated the need for hypoacetylated histones. Silencing is mediated by the multiprotein, nucleosome binding SIR(1-4) complex, recruited by interaction with specific DNA binding proteins. Sir3 and Sir4 interact specifically with the N-terminal tails of histones H3 and H4 in the hypoacetylated state. While the N-terminal tails of the histones are not required individually for growth in yeast, they do play an essential role in silencing, amino acids 4-20 of H3 and 16-29 of H4 being required. Certain sir3 alleles can suppress the silencing defect of histone H4 tail mutations, and Sir3 and Sir4 can bind to the amino termini of histones H3 and H4 in vitro, suggesting direct interaction. Recent studies using antibodies against different histone acetylated isoforms indicate that histones in the telomeric and HML/HMR heterochromatin are hypoacetylated at all modification sites (Richards, 2002 and references therein).
What is the mechanism for histone hypoacetylation specifically at the heterochromatic domains? This function is apparently provided, at least in part, by Sir2 (Drosophila homolog: Sir2), shown to have a NAD-dependent protein deacetylase activity. Sir2 can efficiently deacetylate histones in vitro, preferentially deacetylating histone H4 at Lys16, although direct action in vivo has not yet been reported. Enzymatic activity of Sir2 is required for silencing in the heterochromatic domains. The acetylation status of H4-Lys16 may be of particular importance. Lys16 is the preferred site of acetylation in monoacetylated H4 of euchromatin in yeast, and this is the only acetylatable H4 site whose mutation strongly affects Sir3 binding in heterochromatin. Deletion of sir3 results in increased histone acetylation in heterochromatic domains, as well as a loss in silencing. The results suggest an assembly model in which interaction of the Sir2-Sir4 complex with specific DNA binding proteins leads to local histone deacetylation, permitting binding of Sir3. It appears that binding of Sir3 to the hypoacetylated histone blocks reacetylation. Given the interactions between Sir2, Sir4, and Sir3, once initiated, such a complex could spread along the nucleosome array, generating and maintaining the altered modification state (Richards, 2002 and references therein).
In addition to the above, studies in S. pombe, Drosophila, and other organisms suggest that the histone acetylation level is used as a heritable mark of the chromatin state. Mutations in HDACs or treatment with trichostatin A (TSA), an inhibitor of some HDACs, frequently results in a loss of function in heterochromatic domains and a relaxation of silencing. For example, treatment with TSA results in functionally deficient centromeres and chromosome loss in S. pombe, concomitant with a loss of silencing for test genes within the centromeric heterochromatin. The hyperacetylated state is heritable following removal of TSA; it is linked in cis to the treated centromere locus, and correlates with inheritance of functionally defective centromeres, demonstrating an epigenetic phenomenon based on the chromatin structure. In contrast, acetylation is used as an inherited mark of activity. Histone H4-aLys16 is prominently associated with the dosage-compensated, 2-fold active X chromosome in males of Drosophila. This specific modification is due to MOF, an essential acetyltransferase of the dosage compensation complex that coats the male X chromosome. The complex remains associated with its target DNA throughout the cell cycle, providing the means to replicate the modification state. Recruitment of HATs to chromosome regions showing histone acetylation patterns corresponding to their own catalytic specificity has been observed, e.g., the histone acetyltransferase P/CAF binds preferentially to acetylated H4 and H3 peptides via a bromo domain. These observations provide evidence for use of the histone acetylation state as an epigenetic mark (Richards, 2002 and references therein).
How does hypoacetylation impact chromatin structure? In the case described above, the hypoacetylated histone tails interact specifically with the SIR complex. While Sir2 orthologs have been identified, few proteins with similarity to the other Sir proteins have been found in multicellular eukaryotes. Nonetheless, there may be an equivalent of the SIR complex that makes similar use of the histone hypoacetylation signal. However, a significant effect might be realized through the interaction of the histone H3/H4 tails with the DNA and/or other nucleosomes in the chromatin fiber. The regions of the histone H3 and H4 tails that contribute to DNA binding, as observed in the crystal structure, are necessary for silencing of basal transcription in vivo. However, these regions are distinct from those critical for repression at the HM loci and telomeres. It appears unlikely that simply weakening intranucleosomal histone-DNA interactions by histone acetylation could alleviate the inhibitory effect of heterochromatin structure on transcription. An alternative possibility was suggested by the original crystals of the nucleosome (using histones from Xenopus), where histone H4 amino acids 16-24 were observed to interact with the acidic region formed by histones H2A and H2B on the surface of the adjacent histone octamer. The eight H2A/H2B amino acids involved in forming this negatively charged patch are highly conserved. Acetylation of the H4 tail might disrupt this interaction, leading to a loss of compaction along the chromatin fiber. However, this disposition of the histone H4 tail is not seen in crystals of the nucleosome made using yeast histones, and additional studies are needed to resolve this interesting question (Richards, 2002 and references therein).
A key role for a second histone modification in the specification of heterochromatin is shown by the recent demonstration that mammalian homologs of Drosophila Su(var)3-9, including human SUV39H1 and murine Suv39h1, encode enzymes that specifically methylate histone H3 on lysine 9 (Rea, 2000). Su(var)3-9 was originally identified as a suppressor of PEV in Drosophila, indicating that the wild-type gene product is involved in heterochromatin formation (Tschiersch, 1994). A homolog in S. pombe, Clr4, is also a specific histone H3-Lys9 methyltransferase, suggesting that this activity is widely distributed and well conserved. clr4 mutants exhibit reduced heterochromatin formation at centromeres, with elevated mitotic chromosome loss and reduced silencing within both pericentromeric heterochromatin and the silent mating type locus. Similarly, mammalian Su(var)3-9-like proteins have been implicated in both centromere activity and gene silencing. Disruption of the murine Suv39h1 and Suv39h2 paralogs causes genome instability, chromosome mis-segregation, and male meiotic defects (Richards, 2002 and references therein).
Further, the Suv39h1/SUV39H1 proteins are found in association with M31, a mouse Heterochromatin protein 1 (HP1) homolog. HP1, perhaps the best-characterized protein found in heterochromatin, was identified in Drosophila melanogaster in a screen of monoclonal antibodies prepared against proteins tightly bound in the nucleus. Immunofluorescent staining of the polytene chromosomes shows HP1 concentrated in the pericentric heterochromatin, the telomeres, and a banded pattern across the small fourth chromosome, known sites of repetitive DNA with characteristics of heterochromatin. A few prominent HP1 sites are observed within the euchromatic arms (e.g., region 31). Homologs of HP1 are associated with pericentric heterochromatin in organisms from S. pombe to humans. The protein (206 amino acids in Drosophila) has a conserved N-terminal chromo domain (CD) followed by a variable hinge region and a conserved C-terminal chromo shadow domain (CSD). The chromo domain was first recognized by similarity with a domain in Polycomb, a protein associated with silencing of the homeotic genes during development; this domain has now been identified in many other chromosomal proteins. Both point mutations in the chromo domain and presumed null mutations (early truncation of the translation product) in the gene encoding HP1 [Su(var)2-5] result in a loss of silencing, while an additional dose will increase silencing of a variegating euchromatic gene, i.e., one placed in a heterochromatic environment. Interestingly, the converse is true for those few genes normally resident within the pericentric heterochromatin (e.g., light), which appear to be dependent on HP1 for normal activity. The conserved structure of HP1 suggests that it might serve as a bifunctional reagent, helping to organize and maintain heterochromatin structure. HP1 interacts with a number of other chromosomal proteins, including several involved in nuclear assembly, replication, and gene regulation. 
These interactions have generally been mapped to the chromo shadow domain. The chromo shadow domain can homodimerize, and the dimer has been suggested to be the interactive species (Richards, 2002 and references therein).
The HP1 chromo domain specifically binds histone H3 N-terminal tails methylated on lysine 9, and a variety of data suggest that this interaction is essential for maintenance of heterochromatin. The interaction appears quite specific; neither the chromo domain of Polycomb nor the chromo shadow domain of HP1 shows this interaction. The H3 tail fits within a groove established by conserved chromo domain residues; Su(var) mutation V26M results in an alteration of the structure and loss of H3-mLys9 binding. Studies in mammalian cells suggest that localization of HP1 in heterochromatin is dependent on the presence of histone H3-mLys9. However, HP1 association with heterochromatin in Drosophila can be driven either by the N-terminal portion (with the chromo domain) or the C-terminal portion (with the shadow domain), emphasizing the bifunctional nature of the protein. The above results argue that an interaction between the specifically modified histone H3 and HP1 is essential for maintaining a stable heterochromatin structure (Richards, 2002 and references therein).
Histone H3-Lys9 methylation is influenced by preexisting modifications of histone H3 and affects other histone modifications, implying a set of functional interactions. The relationship between hypoacetylation of H3/H4 and methylation of H3 has been clarified by studies of heterochromatin formation in S. pombe. clr1-clr4, clr6, swi6, and rik1 mutations all identify trans-acting factors necessary for silencing at the S. pombe mating type locus. Swi6 is a homolog of HP1, while clr1 and rik1 code for putative DNA binding proteins. The products of clr3 and clr6 are homologs of HDACs. Clr4 is the H3-Lys9 methyltransferase. These genes work together, acting on the entire silent mating type domain to maintain it in the repressed state. Clr3, an H3-specific deacetylase, and Rik1 are required for histone H3-Lys9 methylation by Clr4, and Swi6 localization is dependent on Clr4 and Rik1 (Richards, 2002 and references therein).
These observations suggest a progression of events leading to establishment of a distinctive heterochromatic structure based on the histone modification pattern. Deacetylation of histone H3 by Clr6 and/or Clr3 creates conditions favoring methylation at H3 Lys9 by the Clr4/Rik1 complex; methylation leads to binding of Swi6, establishing a chromatin configuration that is refractory to transcription and stably maintained. Mapping studies using chromatin immunoprecipitation show H3-mLys9 and Swi6 found throughout, and limited to, the 20 kb silent mating type domain. This 20 kb region is flanked by inverted repeats IR-L and IR-R, which appear to serve as barriers to the spread of silencing; removal of these repeats results in the appearance of H3-mLys9 and Swi6 on neighboring sequences. Silencing is dependent on the dosage of Swi6, which remains bound to the mating type region throughout the cell cycle and may itself be a marker for heterochromatin formation (Richards, 2002 and references therein).
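The barrier behavior described above can be illustrated with a toy model. The following Python sketch is only an illustration under stated assumptions, not a method from Richards (2002): it treats the domain as a row of nucleosomes, lets the H3-mLys9/Swi6 mark spread stepwise to adjacent nucleosomes, and stops at positions designated as barriers (standing in for IR-L and IR-R). The function name, array length, and positions are hypothetical.

```python
def spread_mark(length, seed, barriers):
    """Spread a silencing mark outward from `seed` until barriers block it.

    length   -- number of nucleosomes in the toy array (hypothetical)
    seed     -- index where the mark is first placed (must not be a barrier)
    barriers -- set of indices that the mark cannot enter or cross
    Returns a list of booleans: True where the nucleosome carries the mark.
    """
    marked = [False] * length
    marked[seed] = True
    changed = True
    while changed:
        changed = False
        for i in range(length):
            if not marked[i]:
                continue
            # A marked nucleosome passes the mark to each unmarked neighbor
            # unless that neighbor is a barrier element.
            for j in (i - 1, i + 1):
                if 0 <= j < length and j not in barriers and not marked[j]:
                    marked[j] = True
                    changed = True
    return marked


# With barriers, the mark stays between them; without, it covers the array.
confined = spread_mark(25, 10, {5, 15})
unconfined = spread_mark(25, 10, set())
```

In this sketch, removing the barriers lets the mark reach neighboring sequences, which mirrors the observation that deleting the inverted repeats leads to H3-mLys9 and Swi6 appearing outside the silent domain.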
The findings suggest a mechanism for maintaining heterochromatin structure following replication and for driving the spread of heterochromatin. During replication, the DNA must be 'unpackaged' and the daughter DNA molecules repackaged into nucleosomes. Parental histones are efficiently reutilized, distributed randomly to the two daughter DNA molecules; an equal amount of newly synthesized histone is required to complete assembly. Assuming that the histone H3-mLys9 in a heterochromatic domain is stable (no histone demethylases have been identified as yet), it will associate with HP1 through the chromo domain. The presence of HP1 will result in assembly of a modifying complex, presumably through the chromo shadow domain, that will deacetylate and specifically methylate the newly arrived histone, perpetuating the pattern of modification and HP1 binding to establish a heterochromatic structure. Recovery of a SU(VAR)3-9-HDAC1 complex from Drosophila embryo extracts that can methylate preacetylated histones supports such a model (Czermin, 2001). Formation of complexes that both recognize a particular pattern of histone modification and have the ability to achieve that pattern provides a mechanism for epigenetic inheritance of chromatin structure. The same machinery could account for spreading of heterochromatin, requiring that boundaries to such spread be established (Richards, 2002 and references therein).
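The replication-coupled inheritance scheme in the preceding paragraph can be sketched as a small simulation. This is a minimal illustration under simplifying assumptions, not a model taken from the source: each position in a marked domain retains its parental (marked) histone with probability 1/2 and otherwise receives a new, unmarked histone, and a marked nucleosome is then assumed to direct modification of its immediate neighbors. Function names and the domain size are hypothetical.

```python
import random


def replicate_domain(parental, rng):
    """Return one daughter array after replication of a domain.

    Each position keeps its parental histone (and mark) with probability
    1/2; otherwise it receives a newly synthesized, unmarked histone.
    """
    return [mark and rng.random() < 0.5 for mark in parental]


def restore_marks(daughter):
    """Copy the mark from marked nucleosomes onto unmarked neighbors.

    Stands in for an HP1-recruited modifying complex acting on adjacent
    nucleosomes; repeats until no further change occurs.
    """
    marks = list(daughter)
    changed = True
    while changed:
        changed = False
        for i, marked in enumerate(marks):
            if marked:
                continue
            left = i > 0 and marks[i - 1]
            right = i < len(marks) - 1 and marks[i + 1]
            if left or right:
                marks[i] = True
                changed = True
    return marks


rng = random.Random(0)  # fixed seed so the run is reproducible
daughter = replicate_domain([True] * 20, rng)
restored = restore_marks(daughter)
```

In this toy model the domain is fully restored whenever at least one parental marked histone lands on the daughter, and the mark is lost only if none does. The array represents the domain alone, so boundaries are implicit.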
Genetic analyses in S. pombe and Drosophila indicate that while the H3-mLys9/HP1 system is critical for heterochromatin formation and silencing in pericentric heterochromatin, it is of less importance at the telomeres, suggesting that an additional mechanism is used in those domains. Association of HP1 and a dependence on Su(var)3-9 activity have also been identified as critical in silencing particular euchromatic genes, both in mammalian systems and in Drosophila (Hwang, 2001). Interestingly, it appears that the histone H3-mLys9 modification at Rb-associated genes is quite limited; one nucleosome at the promoter is so modified, while an immediately upstream nucleosome is not, suggesting a difference in the capacity of the modified structure to spread. Histone H3-mLys9 is also associated with the inactive X chromosome in human cells, but no HP1 homologs have been identified preferentially associated with this domain. Whether differences in the degree of histone methylation or other modifications of histone H3 are important in determining any partner of H3-mLys9 in this case remains to be seen (Richards, 2002 and references therein).
A third silent chromatin mark, 5-methylcytosine (5mC), affects the DNA itself. Postreplicative methylation of cytosine is carried out by a diverse group of cytosine DNA methyltransferases (Dnmt's). Beyond this, little is known about the mechanisms that establish, maintain, and modify cytosine methylation patterns. At the whole genome level, it is clear that cytosine methylation patterns can be quite dynamic. The best example is the erasure and resetting of cytosine methylation in early mammalian development. However, large swings in cytosine methylation levels have not been detected during zebrafish development, and the evidence in plants is contradictory. Regardless of whether de novo methylation occurs every generation or in rare initiating events, certain DNA sequences must be targeted for cytosine methylation. At present, little is understood about the primary DNA sequence determinants for targeting, if any. Analysis of a Neurospora sequence prone to de novo methylation indicates the presence of redundant elements promoting methylation and suggests that TpA-rich sequences may be important. Unfortunately, similar detailed studies are not available in other organisms. Certain cytosine methyltransferases, such as mouse Dnmt3a and Dnmt3b, are specialized to carry out de novo methylation. However, these enzymes do not appear to have the intrinsic capacity for discrimination among primary nucleotide sequences, nor among higher-order structures. These considerations suggest that de novo cytosine methyltransferases might be taking cues from another epigenetic mark (Richards, 2002 and references therein).
Communication between the histone code and cytosine methylation may provide at least a partial answer to the long-standing question of how cytosine methylation patterns are established. The most direct evidence for a connection with histone methylation comes from genetic screens for cytosine hypomethylation mutants in Neurospora. The genome of this filamentous fungus contains 5-methylcytosine (~1.5 % of total C) concentrated in repetitive DNA (e.g., rRNA genes) and remnants from RIP activity (repeat induced point mutation, a hypermutation surveillance system that detects sequence duplications). Two Neurospora mutations completely abolish cytosine methylation in vegetative cells. One of these, dim-2, disrupts a gene encoding a cytosine methyltransferase. The other, dim-5, maps to a gene encoding a histone H3 methyltransferase. The predicted DIM-5 gene product contains a SET domain flanked by cysteine-rich elements and has sequence similarity to the histone methyltransferases Clr4 and Su(var)3-9, although it lacks a chromo domain. Recombinant DIM-5 protein exhibits histone methyltransferase activity in vitro. Strikingly, transformation of Neurospora with modified histone H3 genes with a substituted amino acid at Lys9 (the probable site of methylation by DIM-5) reduces cytosine methylation and relieves 5mC mediated gene silencing. Given that the dim-5 mutation appears to abolish all cytosine methylation, the results suggest that all DNA methylation in Neurospora takes its cue from histone H3-mLys9. It will be important to determine whether the histone methylation-DNA methylation connection is also found in other organisms, and if so, whether all cytosine methylation lies downstream of histone methylation. The dim-5 mutation causes more phenotypic defects than the dim-2 cytosine methyltransferase mutation, suggesting that a histone methylation deficiency has effects beyond those that result from loss of cytosine methylation (Richards, 2002 and references therein).
A connection between the histone code and the 5mC code is supported by other findings. The presence in flowering plants of cytosine methyltransferases that contain a chromo domain is particularly intriguing. Such 'chromo methyltransferases' (CMTs) might be recruited to a genomic region by nucleosomes containing histone H3-mLys9; thus, histone modification would provide a foundation for establishing DNA methylation patterns. However, the CMTs have not yet been demonstrated to bind methylated histone H3, nor is it clear that these methyltransferases possess de novo methyltransferase activity. Moreover, chromo methyltransferases have not been documented outside of plant species. Consequently, chromo methyltransferases are unlikely to be solely responsible for translating the histone methylation code into the 5mC epigenetic mark (Richards, 2002 and references therein).
Indirect models for the flow of information from histone H3-mLys9 to 5mC also need to be considered. The H3-mLys9 mark creates a foundation for HP1 interaction and subsequent heterochromatin formation. Cytosine methylation may be targeted to heterochromatin due to any number of characteristics, including nonhistone chromosomal protein content, subnuclear localization, or DNA replication timing. Disruption of heterochromatin by loss of the H3-mLys9 mark may lead to loss of 5mC through a number of intermediary steps. A 'chromatin first/cytosine methylation second' model is consistent with the demonstration that loss or alteration of cytosine methylation can be caused by mutations in SWI2/SNF2-like proteins in Arabidopsis, mice, and humans (Richards, 2002 and references therein).
Once 5mC patterns have been established, they must be maintained in order to serve as an inherited epigenetic code. The potential of cytosine methylation as a mitotic memory device was first described in the 'maintenance methylation' model. The essential feature of the model is clonal inheritance of the 5mC patterns through mitotic, and possibly meiotic, divisions based on the symmetrical nature of the sequences modified (e.g., CpG) and the specificity of 'maintenance' DNA methyltransferases for hemimethylated DNA. The basic tenets of the maintenance methylation model have been supported by a wealth of evidence. The bulk of cytosine methylation occurs very shortly after DNA replication, catalyzed by methyltransferases that have hemimethylated substrate preferences, recruited to the vicinity of the replication fork by interaction with PCNA. However, the classic maintenance methylation model is inadequate to explain the variability of 5mC patterns within individuals and omits some of the known components of the cytosine methylation system. Not all cytosine methylation occurs at short symmetrical sequences, so a simple maintenance methyltransferase, making reference solely to cytosine methylation on the template strand, cannot perpetuate methylation patterns. Maintenance of 5mC patterns at nonsymmetrical sites might involve reiterated de novo methylation and may represent an additional tier of DNA methylation superimposed on the pattern of 5mC at symmetrical sites. The machinery necessary to maintain 5mC at asymmetric sites has not been firmly established, but clues are emerging. Dnmt3a has been implicated in the synthesis of 5mC at asymmetric sites in mice. In Neurospora, a single cytosine methyltransferase, DIM-2, is responsible for all vegetative 5mC, including both symmetrical and asymmetrical sites. In plants, 5mC in asymmetrical sequences has been associated with chromo methyltransferases and RNA-dependent DNA methylation (Richards, 2002 and references therein).
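The logic of the maintenance methylation model, and its limitation at asymmetric sites, can be made concrete with a short sketch. The representation below is an assumption made for illustration only: each site is a tuple of (symmetric, top strand methylated, bottom strand methylated), and an asymmetric site is taken to carry its only methylatable cytosine on the top strand. The function names are hypothetical.

```python
def replicate(duplex):
    """Semiconservative replication: return the two daughter duplexes.

    One daughter keeps the parental top strand, the other keeps the
    parental bottom strand; the newly made strand is unmethylated.
    """
    keeps_top = [(sym, top, False) for sym, top, _ in duplex]
    keeps_bottom = [(sym, False, bottom) for sym, _, bottom in duplex]
    return keeps_top, keeps_bottom


def maintain(duplex):
    """Maintenance methylation: complete hemimethylated symmetric sites.

    Asymmetric sites are left untouched, because the template strand
    carries no methylation information for them.
    """
    result = []
    for sym, top, bottom in duplex:
        if sym and (top or bottom):
            result.append((sym, True, True))
        else:
            result.append((sym, top, bottom))
    return result


parent = [
    (True, True, True),    # methylated symmetric site (e.g., CpG)
    (False, True, False),  # methylated asymmetric site
    (True, False, False),  # unmethylated symmetric site
]
daughter_a, daughter_b = (maintain(d) for d in replicate(parent))
```

Under these assumptions, both daughters recover the symmetric methylated site and neither gains methylation at the unmethylated site, while the asymmetric mark survives only in the daughter that inherited the parental top strand. This is the sense in which a simple maintenance activity cannot perpetuate methylation at nonsymmetrical sites.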
The classical methylation maintenance model accounts for loss of 5mC through a passive mechanism: DNA replication in the absence of maintenance methylation. Cytological data using immuno-detection of 5mC argue that passive demethylation causes the dramatic erasure of DNA methylation patterns in early mammalian development. However, observation of 5mC loss in the absence of DNA replication has suggested an active demethylation mechanism as well. A 5mC-DNA glycosylase might also contribute to the dramatic swings in cytosine methylation seen in mammalian development (Richards, 2002 and references therein).
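The passive mechanism can be expressed as a simple dilution. As an idealized sketch, assume no maintenance or de novo methylation and no active removal, and let f_0 be the fraction of DNA strands methylated at a given site before replication. Because every replication round pairs each parental strand with a new unmethylated strand, the methylated fraction halves per round:

```latex
f_n = \frac{f_0}{2^{\,n}}
```

Here n is the number of replication rounds and f_n the methylated strand fraction afterward; the symbols are introduced for illustration and do not appear in the source. For example, a fully methylated site (f_0 = 1) would fall to 1/8 after three rounds.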
The execution of gene silencing from the 5mC mark involves modulation of another epigenetic mark: hypoacetylation of histones. Two independent pathways have been discovered in vertebrates connecting 5mC to histone deacetylation. The first uses methyl cytosine binding proteins, MeCP, or MBD (methyl binding domain) proteins as adaptors connecting 5mC to histone deacetylase complexes. Several MBD/MeCP protein-HDAC complexes have been identified in mammalian cells. These complexes act to reduce local histone acetylation levels using the 5mC marks on the DNA as a guide (Richards, 2002 and references therein).
A second pathway, also uncovered in mammals, operates through a physical interaction between the maintenance cytosine methyltransferase DNMT1 and HDACs. The catalytic domain of DNMT1 is not necessary for this interaction, suggesting that this cytosine methyltransferase is actually a transcriptional corepressor independent of its ability to methylate DNA. This interaction could act to reinforce inheritance of silent chromatin by facilitating histone deacetylation at the replication forks, where DNMT1 acts to maintain the 5mC epigenetic mark on methylated DNA sequences (Richards, 2002 and references therein).
Epigenetic information may also flow from the histone acetylation state back to cytosine methylation. The HDAC inhibitor TSA leads to cytosine hypomethylation at specific sequences in Neurospora, and a similar effect has been noted in mammalian cells. The loss of DNA methylation may be related to transcriptional activation, but other mechanisms have been proposed, including activation of cytosine demethylases. Inhibition of histone deacetylation does not lead to global loss of DNA methylation, however. For example, disruption of a histone deacetylase gene in plants did not lead to a generalized loss of 5mC despite a 10-fold elevation in histone H4 acetylation. Regardless of the significance of the retrograde signaling, the well-established flow of information from 5mC to histone deacetylation closes the loop of a self-reinforcing cycle for those organisms that utilize cytosine methylation (Richards, 2002).
The cycle of epigenetic marks discussed here suggests that initiation of heterochromatin formation, or similar silencing of euchromatic domains, requires acquisition of at least one epigenetic mark. What is known about entry into the cycle? In S. cerevisiae, protein interactions with specific cis-acting DNA sequences, such as E and I at the HM loci, or telomeric repeats, provide the foundation to recruit the SIR silencing complexes. The E2F-Rb-SUV39H1-HP1 interaction in mammals also implicates specific DNA sequences (binding sites for E2F) as initiation sites for silencing. Silencing within the mating type locus of S. pombe appears to be controlled both by local elements (REII and mat3 silencer) operating similarly to E and I in S. cerevisiae and by packaging of the domain as a whole, dependent on a block of repetitive DNA. In other organisms, the repetitive nature of the locus, rather than the primary DNA sequence, may be a trigger. The mechanisms at work are not clear, but hints can be derived from the repeat sensing/silencing phenomena in filamentous fungi, MIP (methylation induced premeiotically) in Ascobolus and RIP in Neurospora. In these systems, repeats appear to be recognized by a DNA-DNA pairing mechanism. In Ascobolus, cytosine methylation can be transferred between alleles, accompanying meiotic pairing and recombination events. RNA signals may provide another entrée into the cycle of epigenetic silencing. Two noncoding RNA species, Xist and Tsix, are pivotal for initiation and choice in X chromosome inactivation in mice, where H3-Lys9 methylation is an early event. RNA may also have a role in initiating silent chromatin formation by directing the acquisition of cytosine methylation marks. Resolution of this question will be one of the major goals of future research (Richards, 2002 and references therein).
Once a genomic region has been targeted for silencing by acquisition of one or more covalent epigenetic marks, a silent chromatin identity can be propagated. The general features of the system include (1) positive signaling between the different covalent epigenetic marks and (2) enzymatic complexes/pathways that recognize each mark and catalyze the formation of the same mark. For example, in yeast, the histone H3/H4 deacetylation mark is recognized by Sir3, leading to recruitment of the Sir2 histone deacetylase. The histone H3-mLys9 mark is recognized by HP1, which can apparently recruit the histone methyltransferase activity of Su(var)3-9 homologs. The third self-reinforcing loop is carried out by maintenance cytosine methyltransferases, which have a substrate preference for hemimethylated DNA. The modification pathways operating on each covalent mark also interact and reinforce each other (Richards, 2002 and references therein).
In organisms lacking 5mC, a histone modification code appears to be sufficient to mark and perpetuate silent chromatin domains. The feedback loop between histone methylation and histone deacetylation, coupled with mechanisms to maintain these modifications, apparently provides stable silencing. In fact, S. cerevisiae appears to utilize neither DNA modification nor the HP1/histone H3-mLys9 complex, relying solely on deacetylation of histones H3/H4 as an epigenetic mark to maintain silencing. The transmission of chromatin states requires that at least one of the covalent marks be inherited through mitotic, and possibly meiotic, cell divisions. All three of these marks meet the criteria of persistence through mitosis (Richards, 2002).
While self-reinforcing mechanisms may be advantageous to ensure maintenance of silencing for genomic sequences to be archived for the long-term in a nonexpressed state (e.g., transposons, pericentromeric repeats), there may be a need to reconfigure silenced chromatin as a prerequisite to expression of specific genes (e.g., mating type switching). In this case, what general mechanisms can be used to break the heterochromatin reinforcing cycle? Removal of the histone H3-mLys9 mark may require turnover of the entire protein, since no histone demethylase has yet been identified. Histones, however, are generally very stable. In comparison, the 5mC mark is more easily erased by passive or active demethylation mechanisms. The most malleable mark is the deacetylation of histones, the levels of which are set by the competing activities of histone acetylases and histone deacetylases (Richards, 2002).
General transcriptional silencing by a Polycomb response element in Drosophila
Polycomb response elements (PREs) are cis-regulatory sequences required for Polycomb repression of Hox genes in Drosophila. PREs function as potent silencers in the context of Hox reporter genes and they have been shown to partially repress a linked miniwhite reporter gene. The silencing capacity of PREs has not been systematically tested and, therefore, it has remained unclear whether only specific enhancers and promoters can respond to Polycomb silencing. Using a reporter gene assay in imaginal discs, it has been shown that a PRE from the Drosophila Hox gene Ultrabithorax potently silences different heterologous enhancers and promoters that are normally not subject to Polycomb repression. Silencing of these reporter genes is abolished in PcG mutants and excision of the PRE from the reporter gene during development results in loss of silencing within one cell generation. Together, these results suggest that PREs function as general silencer elements through which PcG proteins mediate transcriptional repression (Sengupta, 2004).
A 1.6 kb fragment encompassing the PRE from the Ubx upstream control region was tested for its capacity to prevent transcriptional activation by enhancers from genes that are normally not under PcG control. For this purpose, three different enhancers were tested in a lacZ reporter gene assay in imaginal discs: dppWE, the imaginal disc enhancer from the decapentaplegic (dpp) gene; vgQE, the quadrant enhancer from the vestigial (vg) gene; and vgBE, the vg D/V boundary enhancer. When linked to a reporter gene, each of these enhancers directs a distinct pattern of expression in the wing imaginal disc, and activation by each enhancer is regulated by transcription factors that are controlled by a different signaling pathway. Specifically, the dpp enhancer contains binding sites for the Ci protein and is activated in response to hedgehog signaling; the vg quadrant enhancer contains binding sites for the Mad transcriptional regulator and is activated in response to dpp signaling; and the vg boundary enhancer contains binding sites for the Su(H) transcription factor and is regulated by Notch signaling.
The dppWE, vgQE and vgBE enhancers were individually inserted into a lacZ reporter gene construct that contained the PRE fragment and either a TATA box minimal promoter from the hsp70 gene (here referred to as TATA) or a 4.1 kb fragment of the proximal Ubx promoter (here referred to as UbxP), fused to lacZ. In each construct, the PRE fragment was flanked by FRT sites that permit excision of the PRE fragment by flp recombinase. Several independent transgenic lines were generated for each of the six PRE transgenes. From individual transgene insertions, derivative transgenic lines were then generated by flp-mediated excision of the PRE in the germline. Thus, expression of individual transgene insertions could be compared in the presence and absence of the PRE by staining wing imaginal discs for β-galactosidase (β-gal) activity. In the absence of the PRE, each of the three enhancers tested directs β-gal expression in its characteristic, previously described pattern. Each enhancer activated expression in the same pattern from either the TATA box minimal promoter or the Ubx promoter, with some minor, promoter-specific differences in expression levels. By contrast, in most of the parental transformant lines, i.e., those carrying the corresponding reporter gene with the PRE, β-gal expression is completely suppressed. These observations suggest that the PRE fragment very potently silences each of the six reporter genes. It is noted, however, that at some transgene insertion sites the efficiency of silencing by the PRE fragment appeared to be impeded by flanking chromosomal sequences; in these cases, β-gal expression is activated even in the presence of the PRE (Sengupta, 2004).
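The combinatorial layout of the reporter constructs described above (three enhancers, two promoters, each with and without the PRE) can be written out as a small enumeration. This is only an illustrative sketch: the enhancer and promoter labels come from the text, but the function names and the naming pattern for the excised derivatives (a single ">" standing for the one FRT site left after excision) are assumptions made here for illustration, not the authors' nomenclature.

```python
from itertools import product

# Labels taken from the text above.
ENHANCERS = ["dppWE", "vgQE", "vgBE"]
PROMOTERS = ["TATA", "Ubx"]


def construct_name(enhancer, promoter, with_pre=True):
    """Build a construct label; ">" marks an FRT site.

    The ">PRE>" form follows names used in the text; the excised form
    (one remaining FRT, written ">") is a hypothetical label.
    """
    prefix = ">PRE>" if with_pre else ">"
    return f"{prefix}{enhancer}-{promoter}-lacZ"


def reporter_design():
    """Enumerate every enhancer/promoter pair with its two variants."""
    design = []
    for enhancer, promoter in product(ENHANCERS, PROMOTERS):
        design.append({
            "enhancer": enhancer,
            "promoter": promoter,
            "parental": construct_name(enhancer, promoter, with_pre=True),
            "excised": construct_name(enhancer, promoter, with_pre=False),
        })
    return design


if __name__ == "__main__":
    for row in reporter_design():
        print(row["parental"], "vs.", row["excised"])
```

Each row pairs a parental PRE-containing transgene with its PRE-excised derivative at the same insertion site, which is the comparison the staining experiments rely on.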
To test whether silencing of the reporter genes by the PRE depends on PcG gene function, the PRE-containing transgenes >PRE>dppWE-TATA-lacZ and >PRE>vgQE-Ubx-lacZ were introduced into larvae that carried mutations in the PcG gene Suppressor of zeste 12 [Su(z)12]. Su(z)12 encodes a core component of the Esc-E(z) histone methyltransferase complex. Silencing of both transgenes is lost in Su(z)12^2/Su(z)12^3 mutant larvae, and the transgenes express β-gal at levels comparable with those of the transgene derivatives that lack the PRE fragment. Taken together, these observations suggest that the 1.6 kb PRE fragment from Ubx is a very potent general transcriptional silencer element that represses transcription in a PcG protein-dependent manner. Thus, it appears that this PRE acts indiscriminately to block transcriptional activation by a variety of different activator proteins (Sengupta, 2004).
To test the long-term requirement for the PRE for silencing of these reporter genes, the PRE was excised during larval development and β-gal expression was then monitored at different time points after excision. Forty-eight hours after induction of flp expression, all six reporter genes showed robust derepression of β-gal, suggesting that, in each case, removal of the PRE results in the loss of PcG silencing. Among the different enhancer-promoter combinations used in this study, the dppW enhancer fused to the TATA box minimal promoter appears to direct the highest levels of lacZ expression; >PRE>dppW-TZ transformant lines consistently show the strongest β-gal staining after excision of the PRE. Therefore, >PRE>dppW-TZ transformants were analyzed at 4, 8, 12 and 24 hours after induction of flp expression to study the kinetics of this derepression. No β-gal signal was detected at 4 hours or even at 8 hours after flp induction, but 12 hours after flp induction all discs showed robust β-gal expression. Thus, even in the case of the most potent enhancer-promoter combination used (i.e., dppW enhancer and TATA box minimal promoter), a delay of 12 hours between flp induction and β-gal expression was observed. Since the average cell cycle length of imaginal disc cells in third instar larvae is 12 hours, this implies that most disc cells have undergone a full division cycle within this period. Derepression of the reporter gene in this experiment requires several steps: (1) excision of the PRE by the flp recombinase; (2) dissociation of the PRE and the PcG proteins attached to it, possibly by disrupting PcG protein complexes formed between the PRE and factors bound at the promoter; and (3) transcriptional activation by factors binding to the enhancer in the construct. It is possible that one or several steps in this process require a specific process during the cell cycle (e.g., passage through S phase) (Sengupta, 2004).
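The timing argument above amounts to a short piece of arithmetic, sketched below using only the values quoted in the text. The variable and function names are hypothetical, and the 24-hour time point is left as unreported because the text does not state its outcome.

```python
# Average cell cycle length of third instar imaginal disc cells (from the text).
CELL_CYCLE_HOURS = 12

# Hours after flp induction -> beta-gal signal seen?
# True = robust signal, False = no signal, None = outcome not stated in the text.
OBSERVED_SIGNAL = {4: False, 8: False, 12: True, 24: None}


def first_signal_time(observations):
    """Return the earliest sampled time point with a reported signal."""
    positives = [t for t, seen in sorted(observations.items()) if seen]
    return positives[0] if positives else None


def delay_in_cell_cycles(delay_hours, cycle_hours=CELL_CYCLE_HOURS):
    """Express a delay as a multiple of the average cell cycle length."""
    return delay_hours / cycle_hours


if __name__ == "__main__":
    delay = first_signal_time(OBSERVED_SIGNAL)
    print(f"first signal at {delay} h = "
          f"{delay_in_cell_cycles(delay):.1f} cell cycle(s)")
```

Because sampling was at discrete time points, the true onset of derepression lies somewhere between 8 and 12 hours; the calculation gives only the upper bound of roughly one cell cycle.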
These experiments show that three reporter genes, each containing a different enhancer linked to a canonical TATA box promoter, are completely silenced by a PRE placed upstream of the enhancer. The data suggest that PcG proteins that act through this PRE indiscriminately prevent activation by a variety of different transcription factors. The PcG machinery thus does not seem to require any specific enhancer and/or promoter sequences for repression (Sengupta, 2004).
Two points deserve to be discussed in more detail. The first concerns the stability of silencing imposed by a PRE. Previous studies have suggested that transcriptional activation in the early embryo could prevent the establishment of PcG silencing by PREs. More specifically, early transcriptional activation of Hox genes by blastoderm enhancers may play an important role in preventing the establishment of permanent PcG silencing in segment primordia in which Hox genes need to be expressed at later developmental stages. Importantly, none of the three enhancers used in this study is active in the early embryo. Moreover, these enhancers probably do not contain binding sites for specific transcriptional repressors, such as the gap repressors, which are required for establishment of PcG silencing at some PREs in the early embryo. It is therefore imagined that, in these constructs, PcG silencing complexes assemble by default on the 1.6 kb Ubx PRE in the early embryo and that PcG silencing is thus firmly established by the stage when the imaginal disc enhancers would become active. Silencing by the PRE during larval stages therefore appears to be dominant over activation and cannot be overcome by any of the enhancers used in this study. There is other evidence in support of the idea that PcG silencing during larval development is more stable than in embryos. In particular, a PRE reporter gene that contains a Gal4-inducible promoter is only transiently activated if a pulse of the transcriptional activator Gal4 is supplied during larval development; by contrast, a pulse of Gal4 during embryogenesis switches the PRE into an 'active mode' that supports transcriptional activation throughout development. Furthermore, recent studies in imaginal discs suggest that there is a distinction between transcriptional repression and the inheritance of the silenced state; the silenced state can be propagated for some period even if repression is lost. Specifically, loss of Hox gene silencing after removal of PcG proteins in proliferating cells can be reversed if the depleted PcG protein is resupplied within a few cell generations. Taken together, it thus appears that PcG silencing during postembryonic development is a remarkably stable process. Finally, the results reported in this study also imply that, once PcG silencing is established, Hox genes can make use of virtually any type of transcriptional activator to maintain their expression; PcG silencing will ensure that activation by these factors only occurs in cells in which the Hox gene should be active. The analysis of Ubx control sequences supports this view; if individually linked to a reporter gene, most late-acting enhancers direct expression both within and outside of the normal Ubx expression domain (Sengupta, 2004).
The second point to discuss concerns the repression mechanism used by PcG proteins. Biochemical purification of PRC1 has revealed that several TFIID components co-purify with the PcG proteins that constitute the core of PRC1. Moreover, formaldehyde crosslinking experiments in tissue culture cells showed that TFIID components are associated with promoters, even if these are repressed by PcG proteins. This suggests that PcG protein complexes anchored at the PRE interact with general transcription factors bound at the promoter. One possibility would be that PcG repressors directly target components of the general transcription machinery to prevent transcriptional activation by enhancer-binding factors. Three distinct activators act through the three enhancers used in this study and, according to these results, none of them is able to overcome the block imposed by the PcG machinery. But how do the known activities of PcG protein complexes [i.e., histone methylation by the Esc-E(z) complex and inhibition of chromatin remodeling by PRC1] fit into this scenario? Both these activities may be required for the repression process by altering the structure of chromatin around the transcription start site, thus preventing the formation of productive RNA Pol II complexes. Other scenarios are possible. For example, histone methylation may primarily serve to mark the chromatin for binding of PRC1 through Pc, and PRC1 components such as Psc then perform the actual repression process. Whatever the exact repression mechanism may be, the PRE-excision experiment shows that this repression is lost within one cell generation after removal of the PRE. This implies that changes in the chromatin generated by the action of PcG proteins cannot be propagated by the flanking chromatin (Sengupta, 2004).
References
Ahmad, K. and Henikoff, S. (2002). Histone H3 variants specify modes of chromatin assembly. Proc. Natl. Acad. Sci. 99 Suppl 4: 16477-84. 1217744
Ayer, D. E., Kretzner, L. and Eisenman, R. N. (1993). Mad: a heterodimeric partner for Max that antagonizes Myc transcriptional activity. Cell 72: 211-222. 8425218
Czermin, B., Schotta, G., Hulsmann, B. B., Brehm, A., Becker, P. B., Reuter, G. and Imhof, A. (2001). Physical and functional association of SU(VAR)3-9 and HDAC1 in Drosophila. EMBO Rep. 2: 915-919. 11571273
Hassig, C. A., et al. (1997). Histone deacetylase activity is required for full transcriptional repression by mSin3A. Cell 89(3): 341-347
Hwang, K. K., Eissenberg, J. C. and Worman, H. J. (2001). Transcriptional repression of euchromatic genes by Drosophila heterochromatin protein 1 and histone modifiers. Proc. Natl. Acad. Sci. 98: 11423-11427. 11562500
Jacobs, S. A., Taverna, S. D., Zhang, Y., Briggs, S. D., Li, J., Eissenberg, J. C., Allis, C. D. and Khorasanizadeh, S. (2001). Specificity of the HP1 chromo domain for the methylated N-terminus of histone H3. EMBO J. 20: 5232-5241. 11566886
Nagy, L., et al. (1997). Nuclear receptor repression mediated by a complex containing SMRT, mSin3A, and histone deacetylase. Cell 89(3): 373-380
Rea, S., Eisenhaber, F., O'Carroll, D., Strahl, B. D., Sun, Z. W., Schmid, M., Opravil, S., Mechtler, K., Ponting, C. P., Allis, C. D. and Jenuwein, T. (2000). Regulation of chromatin structure by site-specific histone H3 methyltransferases. Nature 406: 593-599. 10949293
Richards, E. J. and Elgin, S. C. R. (2002). Epigenetic codes for heterochromatin formation and silencing: rounding up the usual suspects. Cell 108: 489-500. 11909520
Ringrose, L., Rehmsmeier, M., Dura, J. M. and Paro, R. (2003). Genome-wide prediction of Polycomb/Trithorax response elements in Drosophila melanogaster. Dev. Cell 5: 759-771. 14602076
Schwartz, Y. B., et al. (2006). Genome-wide analysis of Polycomb targets in Drosophila melanogaster. Nat. Genet. 38(6): 700-5. 16732288
Sengupta, A. K., Kuhrs, A. and Müller, J. (2004). General transcriptional silencing by a Polycomb response element in Drosophila. Development 131: 1959-1965. 15056613
Tschiersch, B., Hofmann, A., Krauss, V., Dorn, R., Korge, G. and Reuter, G. (1994). The protein encoded by the Drosophila position-effect variegation suppressor gene Su(var)3-9 combines domains of antagonistic regulators of homeotic gene complexes. EMBO J. 13: 3822-3831. 7915232
Tyler, J. K., et al. (1996). The p55 subunit of Drosophila chromatin assembly factor 1 is homologous to a Histone deacetylase-associated protein. Mol. Cell. Biol. 16: 6149-6159. 8887645
Home page: The Interactive Fly © 1995, 1996 Thomas B. Brody, Ph.D.
The Interactive Fly resides on the Society for Developmental Biology's Web server.