Male-specific lethal 3


REGULATION

Targeting determinants of dosage compensation in Drosophila

The dosage compensation complex (DCC) in Drosophila melanogaster is responsible for up-regulating transcription from the single male X chromosome to equal the transcription from the two X chromosomes in females. Visualization of the DCC, a large ribonucleoprotein complex, on male larval polytene chromosomes reveals that the complex binds selectively to many interbands on the X chromosome. The targeting of the DCC is thought to be in part determined by DNA sequences that are enriched on the X. So far, lack of knowledge about DCC binding sites has prevented the identification of sequence determinants. Only three binding sites have been identified to date, but analysis of their DNA sequence did not allow the prediction of further binding sites. Chromatin immunoprecipitation was used to identify a number of new DCC binding fragments, which were characterized in vivo by visualizing DCC binding to autosomal insertions of these fragments. These fragments were shown to possess a wide range of potential to recruit the DCC. By varying the in vivo concentration of the DCC, evidence is provided that this range of recruitment potential is due to differences in affinity of the complex to these sites. It was also established that DCC binding to ectopic high-affinity sites can allow nearby low-affinity sites to recruit the complex. Using the sequences of the newly identified and previously characterized binding fragments, a number of short sequence motifs have been uncovered that in combination may contribute to DCC recruitment. These findings suggest that the DCC is recruited to the X via a number of binding sites of decreasing affinities, and that the presence of high- and moderate-affinity sites on the X may ensure that lower-affinity sites are occupied in a context-dependent manner. Bioinformatics analysis suggests that DCC binding sites may be composed of variable combinations of degenerate motifs (Dahlsveen, 2006).

Using a ChIP strategy, several new DCC binding fragments have been identified and it has been demonstrated that they possess a wide range of potential to recruit the DCC. Because the majority of the isolated candidate fragments co-map with endogenous DCC binding sites at the resolution afforded by staining of polytene chromosomes, it is believed that the ChIP selection procedure is appropriate. By tuning DCC levels in vivo, it was concluded that the difference in recruitment ability is due to different affinity of the DCC for these fragments. At limiting concentrations of complex, only the sites of highest affinity are occupied. Conversely, at non-physiologically high concentrations of DCC, even 'cryptic' binding sites on autosomes are recognized by the complex. This suggests, in accord with previous observations, that selective interaction of the DCC with the X chromosome is a function of tightly controlled levels of complex components that are adjusted to assure interaction with binding sites of varying affinity clustered on the X, but insufficient to occupy cryptic sequences on autosomes. These data are also in broad agreement with observations that numerous sites on the X chromosomes contain DCC binding determinants. These determinants are not all equal, but represent a diverse set of DCC targets that differ by a wide range of affinities for the complex, as expected from a sequence determinant that during evolution became gradually enriched on the X chromosome (Dahlsveen, 2006).
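The concentration dependence described above can be illustrated with a minimal sketch. It assumes simple one-site equilibrium binding, which is an illustrative simplification and not a model taken from the study; the site names, dissociation constants and the 0.5 occupancy threshold are all hypothetical values in arbitrary units.

```python
def occupancy(concentration, kd):
    """Fractional occupancy of a site under simple one-site equilibrium binding."""
    return concentration / (concentration + kd)

# Hypothetical dissociation constants (arbitrary units); not measured values.
SITES = {
    "high": 1.0,
    "moderate": 10.0,
    "low": 100.0,
    "cryptic_autosomal": 1000.0,
}

def occupied_sites(concentration, threshold=0.5):
    """Names of sites whose occupancy reaches the (arbitrary) threshold."""
    return [name for name, kd in SITES.items()
            if occupancy(concentration, kd) >= threshold]

for c in (1.0, 100.0, 5000.0):
    print(c, occupied_sites(c))
```

Under these assumptions, a limiting concentration fills only the highest-affinity class, while a very high concentration also fills the cryptic autosomal class, which mirrors the qualitative behavior reported for reduced and over-expressed complex.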

The use of the term 'chromatin entry sites' for the subset of DCC binding sites that are still occupied by partial complexes in the absence of MSL3, implies that these sites were somehow qualitatively and perhaps functionally distinct from the remaining sites that only attract the intact complex. Although it is possible that not all DCC binding sites are functionally equivalent, the characterization of several new examples of both types of DCC binding sites suggests support for the 'affinities model'. According to this model, 'chromatin entry sites' are not qualitatively different from other sites, but only represent those sites with the highest affinity for the complex. A prediction from this model that is further substantiated by the results is that non-functional complexes that lack MSL3 or the acetyltransferase activity of MOF have lower affinity for target sites. Only those determinants with highest affinity for the DCC are able to recruit partial complexes in the absence of MSL3. Sites with slightly lower affinity are still able to recruit the complex in the mof¹ mutant. Because the interaction of the DCC with the X chromosome is thought to be largely mediated by MSL1 and MSL2, it remains to be explored whether MSL3 and the acetylase activity of MOF affect the active concentration of MSL1 and MSL2 or lead instead to the adoption of a high-affinity conformation of the complex. Conversely, it remains to be seen if over-expression of MSL1 and MSL2 in the msl-3¹ and mof¹ mutants would allow partial complexes to bind additional sites. In this respect it is intriguing that the mutation of both roX RNAs, which is presumed to lead to incomplete and non-functional complexes, can be partially rescued by the over-expression of MSL1 and MSL2 (Dahlsveen, 2006).

During analysis of DCC recruitment to high-affinity sites inserted into autosomes of wild-type males, an additional band of DCC binding was observed close to the insertion site in three independent cases (one insert each of DBF9, DBF5, and DBF7). Such minimal and rare 'spreading' has previously been observed for ectopic insertions of the 18D high-affinity site and from roX transgenes in the wild-type male background. This study now reveals that these additional DCC binding sites are not a result of random spreading, but are most likely due to interaction of the DCC with one of the low-affinity sites on autosomes that happened to reside close to the insertion site. These sites are usually observed only when the DCC concentrations are globally increased by over-expression of MSL1 and MSL2. Accordingly, it is suggested that the autosomal insertion of a high-affinity DCC binding site leads to a local rise in complex concentration, which allows these low-affinity sites to be recognized by the DCC even in wild-type males. However, additional requirements must clearly be met to allow low-affinity sites to profit from local increases in complex concentration, since not all ectopic high-affinity sites support the phenomenon. Permissive conditions may include active transcription or the presence of specific epigenetic marks (Dahlsveen, 2006).

It is envisioned that the clustering of DCC binding determinants of high and intermediate affinity on the X chromosome (combined with the transcription of the roX RNAs) elevates the concentration of the DCC within the X chromosomal territory and ensures the occupancy of lower-affinity sites in a context-dependent manner. This may explain the observation that autosomally derived transgenes often acquire dosage compensation. The transgenes may contain cryptic DCC binding determinants and may thus acquire binding if placed in the context of the X chromosomal territory. Conversely, an X chromosomal fragment that harbors only low-affinity sites may not be recognized if translocated to an autosomal context, and the fragment DBF3 may be an example for such a scenario. The presence of a large number of low-affinity sites may also contribute significantly to restricting the binding of the DCC to the X chromosome (Dahlsveen, 2006).

The term 'spreading' has been used to describe the appearance of additional bands of DCC binding around autosomal insertions of roX cDNAs or fragments derived thereof. However, extensive, long-range spreading from roX transgenes, which leads to the appearance of many ectopic DCC bands at greater distances from the insertion sites, occurs only under unusual conditions and depends on the transcription of the roX RNA rather than the DCC binding sites on DNA. Long-range spreading of the complex also does not occur into autosomal chromatin translocated to the X chromosome. It is suggested that large translocations maintain their original chromosomal context (DCC enriched or not), and therefore no redistribution of DCC over the new chromosomal junction is observable at the resolution of the polytene chromosomes. Importantly, this study does not address the higher-resolution distribution of the DCC within a chromosomal band. It is possible that such a band contains many individual binding sites, also of varying affinity. At this resolution, the term 'spreading' may characterize the local diffusion of the DCC from high- to low-affinity sites. This study does not exclude this type of spreading, or indeed any other kind of complex distribution within a chromosomal band. High-resolution ChIP analyses will be necessary to resolve the detailed nature of DCC distribution (Dahlsveen, 2006).

Previously, only three high-affinity binding sites for DCC were known. This study identified nine more fragments, and this encouraged investigation of common features within a larger pool. Interestingly, all new DBFs were found to map to gene-rich regions and either overlap with or lie close to essential genes. Three high-affinity fragments (DBF12, DBF9, and DBF6) reside entirely within genes. It is possible that specific recruitment sites, such as those inferred to reside within the DBFs, have been enriched in and around genes that require dosage compensation during evolution, and consequently, high-affinity sites may represent loci that are particularly dosage sensitive. Previous experiments indicated that the DCC tends to bind to the coding regions of genes, and it was suggested that this was linked to transcriptional activity. Although recent observations suggest that transcriptional activity alone is not sufficient to attract DCC binding, it is possible that transcription influences DCC recruitment to specific sites. For example, high-affinity sites, which show consistent and strong recruitment of the DCC at many chromosomal positions, may not be influenced by transcription. However, sites with lower affinity and variable recruitment ability may profit from transcriptional activity. Developmental differences in transcriptional activity may therefore also explain the lack of DCC recruitment in salivary glands to fragments isolated by ChIP from embryos (Dahlsveen, 2006).

This study has attempted to identify common sequence elements within previously characterized and new high-affinity DCC binding fragments and has uncovered a number of short sequence elements, whose clustering in combinations could contribute to DCC recruitment. Clearly, the importance of these elements remains to be tested experimentally. Previous analysis of the roX DCC binding sites identified a 110 bp sequence containing several blocks of conservation between roX1 and roX2. DCC binding was affected by mutation in several of the conserved blocks, indicating that DCC binding sites may be made up of combinations of shorter elements. Such combinations have been sought by defining pairs of elements found within a 200 bp window in the high-affinity DCC binding fragments. Those pairs that are significantly enriched on the X chromosome compared to other chromosomes are presented. Importantly, these X-enriched pairs often occur in multiple copies in the high-affinity fragments and at higher frequencies compared to the lower-affinity fragments DBF9-A, DBF1, DBF11, DBF13, and DBF3. Nonetheless, there is no obvious correlation between the location of individual pairs on the X and any specific features such as predicted genes. It is hypothesized that the elements that define these pairs (and other such elements that may have escaped attention) correspond to building blocks of DCC binding sites. Accordingly, a DCC binding site of given affinity for the complex would not be determined by a unique DNA sequence, but by clustering of variable combinations of short, degenerate sequence motifs. Individual low-affinity binding sites may not be unique to the X, but their clustering on the X may contribute to high-affinity binding. There are already indications that the DCC binds to several sites in close proximity. The two parts of DBF9, DBF9-A and DBF9-B, are both able to recruit the DCC, albeit with different affinity. The analysis of the 18D high-affinity fragment also suggested that multiple elements over 8.8 kb contribute to the binding of the complex (Dahlsveen, 2006).
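The search for pairs of elements within a 200 bp window can be sketched as follows. This is only an illustration of the general idea, not the algorithm used in the study: the motif strings and toy sequence are hypothetical, exact string matching stands in for degenerate motif matching, and "within a window" is assumed here to mean that the two start positions lie at most 200 bp apart.

```python
from itertools import combinations

def find_all(seq, motif):
    """Start positions of every (possibly overlapping) occurrence of motif."""
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions

def count_pairs(seq, motifs, window=200):
    """Count co-occurrences of each motif pair whose starts lie within `window` bp."""
    hits = {m: find_all(seq, m) for m in motifs}
    counts = {}
    for a, b in combinations(sorted(motifs), 2):
        counts[(a, b)] = sum(
            1 for i in hits[a] for j in hits[b] if abs(i - j) <= window
        )
    return counts

# Toy sequence (hypothetical): one GAGA/CACA pair 54 bp apart, one distant GAGA.
toy = "GAGA" + "T" * 50 + "CACA" + "T" * 300 + "GAGA"
print(count_pairs(toy, ["GAGA", "CACA"]))
```

Comparing such pair counts, normalized by sequence length, between the X chromosome and the autosomes would give a crude enrichment measure; any statistical test applied on top of that is a separate choice not specified here.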

The pairs have been ordered according to sequence similarity. Interestingly, a large family of elements contain GAGA-related motifs. Mutation of GAGA or CTCT motifs in the 110 bp roX1/roX2 consensus severely affects DCC recruitment to that sequence, indicating that GAGA motifs are involved in DCC binding. The fact that these elements are enriched in several independently identified high-affinity fragments demonstrates the appropriateness of the algorithms used to find them. Besides elements with a clear relationship to GAGA motifs, several other element families, defined by sequence similarity, were identified. In order to visualize the element families, the related words may be aligned such that sequence logos representing degenerate motifs can be derived using the WebLogo software (http://weblogo.cbr.nrc.ca). It is considered possible that some of these degenerate motifs may contribute to DCC binding sites. Evaluation of the contributions of these novel motifs to the targeting of the complex will require increased resolution analysis and systematic evaluation of candidate sequences in the in vivo recruitment assay (Dahlsveen, 2006).
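The quantity that a sequence logo displays can be illustrated with a small sketch that computes per-column information content from a set of aligned words. The aligned words below are hypothetical, and the calculation omits the small-sample correction that logo software typically applies, so it should be read as an approximation of the idea rather than a reproduction of the WebLogo output.

```python
import math
from collections import Counter

def column_information(words):
    """Per-column information content in bits (2 - Shannon entropy) for aligned DNA words."""
    info = []
    for column in zip(*words):
        total = len(column)
        counts = Counter(column)
        entropy = -sum(
            (n / total) * math.log2(n / total) for n in counts.values()
        )
        info.append(2.0 - entropy)
    return info

# Hypothetical family of aligned, equal-length words.
aligned = ["GAGA", "GAGA", "GACA", "GAGA"]
print([round(bits, 3) for bits in column_information(aligned)])
```

Fully conserved columns carry 2 bits, while degenerate positions carry less, which is what makes a family of related words appear as a degenerate motif in the logo.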

This study suggests that high-affinity DCC binding sites are composed of variable combinations of clustered, degenerate sequence motifs. The degeneracy of the sequence motifs indicates that many individual elements may have low affinity. Therefore, the interaction of the DCC with each individual site should be in dynamic equilibrium. However, it was recently observed by photobleaching techniques that the DCC components most likely involved in chromatin binding, MSL2 and MSL1, interact with the X chromosomal territory in cultured cells in an unusually stable manner, which is not compatible with binding equilibria involving off-rates that commonly characterize protein-DNA interactions. Several hypotheses can be formulated, whose evaluation may lead to resolution of this apparent contradiction. (1) Formation of higher-order structures involving many DCC components engaged in numerous simultaneous DNA interactions may lead to a trapping of the DCC within the X chromosome territory. (2) An initial sequence-directed targeting event may be followed by a stabilization of the interaction through positive reinforcement involving additional principles, such as epigenetic marks or a topological linkage. (3) It is considered that the arrangement of the interphase genome in polytene chromosomes may differ in a relevant aspect from the more compact chromosomal territories of diploid cultured cells. Ultimately, the identification of the DNA-binding domains of DCC components and analysis of their mode of DNA interaction will be required to solve the targeting issue (Dahlsveen, 2006).

Long-range spreading of dosage compensation in Drosophila captures transcribed autosomal genes inserted on X

Dosage compensation in Drosophila males is achieved via targeting of male-specific lethal (MSL) complex to X-linked genes. This is proposed to involve sequence-specific recognition of the X at approximately 150-300 chromatin entry sites, and subsequent spreading to active genes. This study asked whether the spreading step requires transcription and is sequence-independent. It was found that MSL complex binds, acetylates, and up-regulates autosomal genes inserted on X, but only if transcriptionally active. It is concluded that a long-sought specific DNA sequence within X-linked genes is not obligatory for MSL binding. Instead, linkage and transcription play the pivotal roles in MSL targeting irrespective of gene origin and DNA sequence (Gorchakov, 2009).

To ask whether autosomal genes can effectively recruit MSL complex, a construct was created that was named TrojanHorse, consisting of a 14-kb fragment from chromosome arm 2L encompassing two small genes, Rpl40 and cg3702. These genes were chosen because they are expressed at all stages of development and are associated with H3K36me3, both hallmarks of typical MSL targets on the X. The sequence of the 14-kb fragment was altered by incorporating five 0.2-kb tags of non-Drosophila origin into the 3'-untranslated regions (UTRs) of Rpl40 and cg3702, and the nontranscribed regions of TrojanHorse. This design allowed clear discrimination between transcripts derived from the endogenous Rpl40 and cg3702 genes and their transposed copies on the X. Additionally, these tags enabled direct assessment of MSL protein and histone modification profiles across the TrojanHorse insertion on the X chromosome. The TrojanHorse construct was inserted in two precise genomic positions using the recently developed attB/attP phage phiC31 integration system. One insertion site (attP18 at cytological position 6C12) was in a gene-rich region encompassing numerous MSL gene targets. In contrast, the second landing site (attP3 at 19C4) was intentionally chosen to be distant from MSL targets and active genes, with the nearest MSL target 60 kb proximal to the TrojanHorse insertion. Thus, in the second case, the two constitutively expressed autosomal genes were inserted on X within a large, MSL-depleted region (Gorchakov, 2009).

If specific DNA sequences at each gene are required to complete the MSL pattern on X, then TrojanHorse genes, originating from an autosome, should not acquire MSL binding. Alternatively, if linkage to the X chromosome and the active state are sufficient, TrojanHorse genes should be bound and acquire H4K16ac. Therefore, ChIP assays were performed to determine the binding profiles for two MSL subunits, MSL2 and TAP-tagged MSL3, within TrojanHorse placed at either attP18 or attP3. Independent of the integration site, pronounced binding of MSL proteins was observed at the tags located at the 3' ends of both cg3702 and Rpl40 of TrojanHorse in mixed-sex embryos and male larvae. In contrast, the untranscribed regions of TrojanHorse did not attract the MSL complex. Furthermore, MSL complex binding led to H4K16 acetylation of TrojanHorse chromatin. In agreement with the binding profiles for MSL2 and MSL3-TAP, H4K16ac was found to peak at the 3'-UTRs of cg3702 and Rpl40 in both embryos and male larvae. Notably, the histone H4 acetylation profile was broader than that of the MSL complex, consistent with recent studies. Finally, it was confirmed that the two genes within TrojanHorse were indeed transcribed in larvae and male and female embryos by mapping RNA polymerase II and histone methylation marks expected for active genes: dimethylated H3K4 (H3K4me2) at the 5' ends, and H3K36me3 biased toward the 3' ends. Taken together, these results demonstrate that transcribed and H3K36me3-marked autosomal genes can become typical MSL targets when placed on the male X, favoring a sequence-independent model for spreading. These data also indicate that targeting can occur in the apparent absence of a nearby CES, as the closest identified entry sites are located ~54 kb and 80 kb proximal to attP18 and attP3, respectively. Furthermore, attP3 is 60 kb away from the nearest active gene cluster or MSL-bound region, indicating that targeting can occur over long distances in cis (Gorchakov, 2009).

Despite the clear association of transposed autosomal genes with MSL protein in the male X environment, it remained to be determined whether their transcription or some other features were required for the observed complex binding. In order to address this question, a 'promoterless' version of TrojanHorse was created where the 1.2-kb region, including the promoters and 5' ends of both cg3702 and Rpl40, was deleted. This TrojanHorseδ, otherwise identical to TrojanHorse, was placed in the same attP18 site by attP/attB recombination. Quantitative PCR (qPCR) analysis confirmed that transcription of both genes in TrojanHorseδ was dramatically reduced (at least 25-fold), approaching the detection limit. In addition, the genes exhibited background levels of H3K36me3 in TrojanHorseδ. It was then asked whether transcriptionally inactive genes showed any changes in binding patterns of MSL and other proteins. Indeed, MSL3-TAP no longer bound nontranscribed cg3702 and Rpl40, resulting in decreased H4K16ac levels throughout TrojanHorseδ. This direct comparison clearly supports the importance of transcription for MSL recognition of its targets (Gorchakov, 2009).

To test whether the genes within the intact TrojanHorse were not only associated with MSL complex and marked with H4K16ac but also were subject to dosage compensation, the degree of up-regulation of cg3702 and Rpl40 was directly measured in the TrojanHorse context. These genes are transcribed and dosage-compensated in male larvae so that expression of one copy in males is approximately equal to two copies in females, and about twofold higher than in heterozygous females. Likewise, up-regulation is observed in male embryos. However, in this case the dosage compensation appears less complete, perhaps due to maternal deposition of the transcripts leading to an underestimate of the male/female zygotic transcription ratio. Taken together, these data indicate that the MSL complex bound to TrojanHorse retains all its functions: It acetylates the underlying chromatin and increases transcriptional output, resulting in dosage compensation of both Rpl40 and cg3702 (Gorchakov, 2009).
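The arithmetic behind these comparisons can be made explicit with a short sketch. The expression values are hypothetical placeholders in arbitrary units, not measurements from the study; the function simply expresses output per gene copy so that males carrying one copy can be compared with females carrying one or two copies.

```python
def per_copy_ratio(male_expr, male_copies, female_expr, female_copies):
    """Ratio of male to female expression after normalizing each to gene copy number."""
    return (male_expr / male_copies) / (female_expr / female_copies)

# Hypothetical expression values (arbitrary units).
male_one_copy = 100.0
female_two_copies = 100.0   # homozygous female, total output equal to the male
female_one_copy = 50.0      # heterozygous female, half the male output

print(per_copy_ratio(male_one_copy, 1, female_two_copies, 2))
print(per_copy_ratio(male_one_copy, 1, female_one_copy, 1))
```

With full compensation, the per-copy ratio is about 2 in both comparisons, whereas the total male output matches that of a female with two copies. A per-copy ratio below 2, as described for embryos, would indicate apparently incomplete compensation.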

To exclude the possibility that the two genes in the TrojanHorse construct were somehow exceptional in their ability to attract MSL complex, a more extensive set of active autosomal genes was tested by engineering a larger A-to-X transposition. TrojanElephant was created by inserting a 65-kb region from chromosome 2L, containing 20 genes from cg13773 to snRNP70K into the attB-P[acman] vector via recombineering. None of the TrojanHorse or TrojanElephant genes attracted MSL complex when located in their endogenous positions on autosomes. Therefore, the TrojanElephant construct was inserted at the previously characterized attP3 site on the X, which is a gene-poor region. Several possible results were hypothesized: (1) Within the 65-kb segment, the MSL complex might target all H3K36me3-positive genes. (2) MSL binding might be detected over the active genes closest to the ends of the 65-kb TrojanElephant, with a gradual decrease in binding closer to the center. (3) The MSL complex might skip genes in TrojanElephant altogether (Gorchakov, 2009).

To determine which of these scenarios occurs in vivo, MSL binding across TrojanElephant was measured in third-instar larvae by ChIP followed by qPCR. The data obtained from both experiments clearly demonstrated the ability of the MSL complex to faithfully recognize the transcribed and H3K36me3-marked genes within the transposed material, with no evidence for skipping any of the active genes. Therefore, if the MSL complex spreads linearly from the closest identified flanking CES, it can travel at least 83 kb. The fact that all active autosome-derived genes were bound by MSL complex argues strongly against the idea that each X-chromosomal gene possesses special MSL recruiting signals. Instead, it is proposed that after the initial attraction of the MSL complex by a chromatin entry site (CES) or roX RNA gene, transcription of active genes serves as the main guiding feature for MSL complex binding (Gorchakov, 2009).

These results are consistent with early transgenic studies indicating that single genes could become dosage-compensated when inserted on X. However, the work still needs to be reconciled with results from established transposition stocks, in which much larger inserts of autosomal material onto X failed to acquire dosage compensation or cytologically visible MSL binding, or showed binding only a few kilobases into the insertion by ChIP analysis. One possibility is that stable stocks carrying large spontaneous or X-ray-induced rearrangements may display exceptional behavior, as they have been preselected for viability and thus possibly for the maintenance of chromosome of origin regulation. Alternatively, these results may indicate that distance to the nearest functional CES is critical for MSL targeting, but that this critical distance cannot be reached in the current experiments. It may be that once a certain size is attained, any DNA insertion will tend to associate with its chromosome of origin rather than be generally localized within the X-chromosome three-dimensional territory, thus excluding spreading. Testing these possibilities may be within reach in the near future, as improvements in transgenic technology allow larger insertions at predefined breakpoints to be obtained (Gorchakov, 2009).

The data allow further refinement of the two-step model for MSL complex recruitment to the male X chromosome. In the first step, MSLs are thought to bind 150-300 CES containing MRE motifs on X, and ignore autosomes. In the second step, MSLs recognize the active genes on the X irrespective of their sequence and origin through a transcription-dependent mechanism. Trimethylation of H3K36 is partially responsible for this sequence-independent step. Here, it is speculated that two modes of MSL spreading coexist on the X: (1) long-range spreading from the roX genes, and (2) local distribution of the MSL complex from the CES scattered throughout the X (with a median distance of ~100 kb). It is only in the context of the X chromosome that both roX genes and CES are found in cis to each other, making possible both correct chromosome identification and efficient spreading. The enigmatic nature of roX spreading remains to be understood, including whether features of active genes might be recognized directly by roX RNAs (Gorchakov, 2009).

High-resolution ChIP-chip analysis reveals that the Drosophila MSL complex selectively identifies active genes on the male X chromosome

X-chromosome dosage compensation in Drosophila requires the male-specific lethal (MSL) complex, which up-regulates gene expression from the single male X chromosome. This study defines X-chromosome-specific MSL binding at high resolution in two male cell lines and in late-stage embryos. The MSL complex is highly enriched over most expressed genes, with binding biased toward the 3' end of transcription units. The binding patterns are largely similar in the distinct cell types, with ~600 genes clearly bound in all three cases. Genes identified as clearly bound in one cell type and not in another indicate that attraction of MSL complex correlates with expression state. Thus, sequence alone is not sufficient to explain MSL targeting. It is proposed that the MSL complex recognizes most X-linked genes, but only in the context of chromatin factors or modifications indicative of active transcription. Distinguishing expressed genes from the bulk of the genome is likely to be an important function common to many chromatin organizing and modifying activities (Alekseyenko, 2006).

To precisely map MSL binding along the X chromosome at high resolution, a TAP-tagged MSL complex was created that could be isolated with high affinity. The MSL3 chromodomain protein was TAP-tagged at its C terminus and expressed from its native promoter without any loss of protein-coding information. To test the construct for MSL3 function, transgenic flies were created that contain a single copy of MSL3-TAP; this construct fully rescued msl3 mutant males. Furthermore, the MSL3-TAP protein was detected along the length of the polytene male X chromosome in the normal MSL pattern. MSL3-TAP immunostaining of the X was not diminished in a wild-type background by the presence of endogenous MSL3 protein, suggesting that the epitope-tagged protein competes well with the native protein (Alekseyenko, 2006).

The MSL-binding profile correlates well with that of its targeted modification, H4K16ac, on selected X-linked genes. The skew toward the 3' end of genes is unlike the profile of transcription initiation factors and instead reminiscent of factors that function in transcription elongation or termination. Together these results suggest that the MSL complex is unlikely to function directly at the promoter like a typical transcription factor. An appealing idea is that an improvement of transcription elongation might improve ultimate mRNA production, perhaps by local recycling of RNA polymerase or other components of the general transcriptional machinery. Recently, the genomic distribution of histone H3.3, a histone variant associated with transcription, showed increased enrichment on X-linked genes in Drosophila SL2 cells when compared with autosomal genes. This enrichment favors the 5' ends of transcription units and so might reflect a stimulation of transcription initiation or elongation due to MSL action (Alekseyenko, 2006).

The results suggest that the MSL complex targets genes predominantly in the context of active transcription. This is consistent with the predominance of MSL complex in interband regions of polytene chromosomes, and with experiments in which enhancer sequences responsive to the Gal4 activator protein were able to create new MSL-binding sites that required the expression of Gal4. At the same time, the results are also consistent with recent cytological comparisons of the elongating form of RNA polymerase II with the MSL pattern on the polytene X chromosome, in which colocalization was observed but was clearly incomplete (Kotlikova, 2006). For example, many genes that were identified as differentially transcribed between two lines of cultured cells were unbound in both. This type of gene would show a lack of colocalization of RNA polymerase and MSL complex when transcribed. Consistent with the largely invariant pattern of MSL binding seen on polytene chromosomes by Kotlikova (2006), it was found that the majority of MSL targets are commonly expressed genes. Differentially regulated genes may have been less likely to evolve the ability to attract MSL complex and perhaps may have other mechanisms to compensate for dosage differences. The results suggest intrinsic recognition of many, but not all, X-linked genes, within the context of transcription (Alekseyenko, 2006).

Recognition of expressed genes makes excellent biological sense for the MSL complex in two ways. The most obvious is that only expressed genes need to be up-regulated. In this regard, it is notable that binding is independent of the absolute transcription level of individual genes, as dosage compensation must be able to operate on genes with a wide range of intrinsic expression levels. Another important reason to link binding to transcription may be to prevent MSL complex from ectopically influencing genes that should not be expressed. When roX genes are inserted into P transposons and mislocalized at random positions in the genome, they attract MSL complexes that can spread from the site of insertion into flanking chromatin. In several instances, such insertions have occurred in regions where the mini-white reporter gene is silenced in females, but activated in males through action of the MSL complex. MSL action appeared to have the capacity to overcome Polycomb, HP1, and unidentified modes of silencing. Clearly MSL complex must normally be limited in its targeting to avoid potentially catastrophic male-specific activation of silent genes (Alekseyenko, 2006).

How does MSL complex locate its target genes? Studies of roX genes suggest that spreading in cis can occur from high local concentrations of MSL complex. An interesting extension of this idea is that the covering of large segments of transcription units may occur by a very local spreading mechanism related to the much longer range spreading that can be seen from roX transgenes inserted on autosomes. Both long-range and local spreading could be the consequence of attraction of the MSL complex to chromatin modifications that mark RNA polymerase II transcription units, such as histone H3 methylated at Lys 36. Distinguishing expressed genes from the bulk of the genome is likely to be an important function common to many chromatin organizing and modifying activities (Alekseyenko, 2006).

The Drosophila dosage compensation complex binds to polytene chromosomes independently of developmental changes in transcription

In Drosophila, the dosage compensation complex (DCC) mediates upregulation of transcription from the single male X chromosome. Despite coating the polytene male X, the DCC pattern looks discontinuous and probably reflects DCC dynamic associations with genes active at a given moment of development in a salivary gland. To test this hypothesis, binding patterns of the DCC and of the elongating form of RNA polymerase II (PolIIo) were compared. Unlike PolIIo, the DCC demonstrates a stable banded pattern throughout larval development and escapes binding to a subset of transcriptionally active areas, including developmental puffs. Moreover, these proteins are not completely colocalized at the electron microscopy level. These data combined imply that simple recognition of PolII machinery or of general features of active chromatin is either insufficient or not involved in DCC recruitment to its targets. It is proposed that DCC-mediated site-specific upregulation of transcription is not the fate of all active X-linked genes in males. Additionally, it was found that DCC subunit MLE associates dynamically with developmental and heat-shock-induced puffs and, surprisingly, with those developing within DCC-devoid regions of the male X, thus resembling the PolIIo pattern. These data imply that, independently of other MSL proteins, the RNA-helicase MLE might participate in general transcriptional regulation or RNA processing (Kotlikova, 2006).

The nature of targets for the DCC is a long-standing problem. Originally, it was proposed that specific enhancer-like sequences might reside close to individual X-linked genes, serving as targets for DCC. In contrast, the 'spreading' model postulates that the male X is marked by a quite limited number of DNA sequences (35) that recruit the DCC, which accumulates locally at high levels and in turn associates with numerous sites of low affinity. Nevertheless, the modern view of the problem assumes that there might be many more DNA sequences required both for the initial recruitment/assembly of DCC (CES) and for the association of a functional complex with additional sites (non-CES). In contrast to these postulated DNA sequences, most sites on the X, which are targets for functional DCC, are thought to mark genes actively transcribed in a given tissue and time of development. This idea implies that DCC mediates transcription enhancement via direct involvement in transcription regulation of each active gene. In this article, an effort was made to further test this model by precisely investigating the relative localization of DCC and PolIIo along the male X in the course of larval development (Kotlikova, 2006).

Previously it was demonstrated that in vivo PolIIo and various elongation factors, as well as the H3.3 histone variant, dynamically associate with active genes, accompany their expression, and look colocalized in Drosophila polytene chromosomes. This overlap is most obvious as diffuse labeling of developmental and heat-shock-induced puffs. Intriguingly, it was found that DCC demonstrates striking stability in both the number and intensity of binding sites along the X throughout larval development. Moreover, despite being targets for MLE, the sites of the most intensive gene expression both on the male X and within an autosomal DCC-spreading area appear to not be targets for DCC at all. DCC gaps were demonstrated in regions associated with active genes. Additionally, DCC skips over a number of transcriptionally active regions when it inappropriately spreads in cis from an autosomal roX1 transgene. It is therefore suggested that active transcriptional status of the chromosomal region or association with MLE is not sufficient for DCC targeting and that, if DCC binds to actively transcribed regions, it does so very selectively (Kotlikova, 2006).

The findings on MLE localization on the polytene chromosomes might reflect dual functioning of MLE on the male X. In addition to being a subunit of DCC, MLE probably accomplishes some unrelated functions, which are neither X nor sex specific. On the basis of the observed MLE association with a large number of sites of active transcription in both sexes, it is believed that this RNA-helicase might be important not only for splicing certain genes, as was shown for the gene para, but also for playing some general role in the transcription process. In support of this idea, the mammalian MLE homolog, RHA, was demonstrated to contribute to various steps of transcription -- from initiation to processing of nascent transcripts (Zhang, 2004). Assuming that MLE apparently is able to bridge DCC with transcriptionally active regions, this might occur only if some additional requirements for DCC binding are realized (Kotlikova, 2006).

Earlier, it was reported that a partial MSL complex lacking MSL2 protein is present in normal female nuclei. Accordingly, mutations in various msl genes except mof disassociate all MSLs from the chromosomes in females. Nevertheless, the data on MLE distribution in polytene chromosomes of females homozygous for msl1 or msl3 null alleles indicate that even if MSLs form a partial complex in females, MLE binds the chromatin in an MSL-independent manner (Kotlikova, 2006).

These findings raise questions as to what are the reasons for exclusion of DCC from some active X-linked regions and whether dosage compensation does take place there. One can speculate that highly active chromatin in puffing regions turns into a poor substrate for the DCC due to drastic changes in packaging, possibly, up to nucleosome removal. However, a cluster of CESs bound by DCC is detected in the puffed 2B region throughout larval development. Moreover, strong transcription induced in EP transposons on the male X sometimes results in ectopic DCC recruitment, suggesting that the complex is able to recognize very active chromatin (Kotlikova, 2006).

Revealing active genes within each of the cytologically extensive DCC gaps provides yet another puzzle. In the neo X chromosome of D. miranda, the blocks of chromatin escaping dosage compensation do alternate with other blocks that are dosage compensated and therefore bind DCC. However, no data indicate that such clustering takes place in D. melanogaster. If X-linked genes actually possess still unknown features needed for DCC targeting, then DCC gaps might reflect evolutionary incompleteness of this process in D. melanogaster. It should be noted that, in contrast, 70 autosomal regions are competent to recruit functional DCC in wild-type males. Alternatively, the active genes located within the DCC gaps might serve as targets for the complex but cannot realize this ability probably due to the chromatin environment (Kotlikova, 2006).

Whether the genes within puffs and DCC gaps undergo dosage compensation remains to be answered. It seems plausible to suggest that transcription upregulation could be not so essential for a subset of highly expressed genes. Nevertheless, whatever the reasons for highly expressed loci to escape association with DCC, these genes are probably dosage compensated, which was shown at least for Sgs4 and the Broad-Complex. If many active genes lack DCC-binding sites in the immediate vicinity, they might achieve dosage compensation by a yet unknown pathway. It is very possible that upregulation of active genes within DCC gaps and puffing regions might be achieved, at least to some extent, via DCC-mediated establishment of a more open chromatin structure of the whole male X, suggesting that DCC affects transcription indirectly. Site-specific localization of H4Ac16 probably initiates a cascade of molecular remodeling events resulting in diffuse appearance of the whole male X chromosome. Generally, such a chromatin state would facilitate the access of various transcription and replication factors. Accordingly, in females having ectopic dosage compensation induced, the DCC gap corresponding to the intercalary heterochromatin region on the polytene X demonstrated a greater extent of both polytenization and replication than in the wild type. This clearly correlated with the higher local concentrations of the DCC in neighboring areas. Thus, despite the fact that the DCC-mediated site-specific histone acetylation pattern correlates with an increase in transcription of the underlying sequences, it would be more accurate to suggest that there is no common scenario of dosage compensation for all the X-linked genes. 
Also, the DCC pattern appears essentially permanent and displays only negligible variations, both in the course of larval development and in different tissues, which might point to the contribution of yet unidentified epigenetic factors in the establishment and maintenance of DCC binding. For example, the transcriptional activity of X-linked genes could govern DCC settling on the male X in early embryogenesis, and this pattern might be subsequently reproduced epigenetically. Hence, this scenario would imply high stability of the DCC pattern at least for the housekeeping genes, rather than dramatic changes in DCC distribution resulting from fine-tuned transcriptional programs further in development (Kotlikova, 2006).

Future molecular studies of the dosage compensation status of active genes mapping to the DCC gaps could determine whether dosage compensation can also utilize some unknown mechanisms other than site-specific acetylation of H4 at lysine 16 leading to site-specific transcription enhancement. Alternatively, there may be many more X-linked genes whose expression does not require dosage compensation than was expected to date. Regardless, the important question remains how functional DCC recognizes its targets among the active genes on the male X chromosome (Kotlikova, 2006).

Modulation of heterochromatin by male specific lethal proteins and roX RNA in Drosophila melanogaster males

The ribonucleoprotein Male Specific Lethal (MSL) complex is required for X chromosome dosage compensation in Drosophila males. Beginning at 3 h of development the MSL complex binds transcribed X-linked genes and modifies chromatin. A subset of MSL complex proteins, including MSL1 and MSL3, is also necessary for full expression of autosomal heterochromatic genes in males, but not females. Loss of the non-coding roX RNAs, essential components of the MSL complex, lowers the expression of heterochromatic genes and suppresses position effect variegation (PEV) only in males, revealing a sex-limited disruption of heterochromatin. MLE, but not Jil-1 kinase, was found to contribute to heterochromatic gene expression. To determine if identical regions of roX RNA are required for dosage compensation and heterochromatic silencing, a panel of roX1 transgenes and deletions was tested; the X chromosome and heterochromatin functions were found to be separable by some mutations. Widespread autosomal binding of MSL3 occurs before and after localization of the MSL complex to the X chromosome at 3 h AEL. Autosomal MSL3 binding was dependent on MSL1, supporting the idea that a subset of MSL proteins associates with chromatin throughout the genome during early development. It is postulated that this binding may contribute to the sex-specific differences in heterochromatin that have been noted (Koya, 2015).

A central question raised by this study is how factors known for their role in X chromosome dosage compensation also modulate autosomal heterochromatin. Although the MSL proteins were first identified by their role in X chromosome compensation, homologues of these proteins participate in chromatin organization, DNA repair, gene expression, cell metabolism and neural function throughout the eukaryotes. Furthermore, flies contain a distinct complex, the Non-Sex specific Lethal (NSL) complex, containing MOF and the MSL orthologs NSL1, NSL2 and NSL3. The essential NSL complex is broadly associated with promoters throughout the fly genome, where it acetylates multiple H4 residues. In light of the discovery that the MSL proteins represent an ancient lineage of chromatin regulators, it is unsurprising that members of this complex fulfill additional functions (Koya, 2015).

An alternative hypothesis for the dosage compensation of male X-linked genes proposes that the MSL proteins are general transcription regulators, and recruitment of these factors to the male X chromosome reduces autosomal gene expression, thus equalizing the X:A expression ratio. Arguing against this idea are ChIP studies finding that the MSL complex, and engaged RNA polymerase II, are increased within the bodies of compensated X-linked genes. In agreement with this, a study that normalized expression to genomic DNA concluded that compensation increases the expression of male X-linked genes. The current study now reveals that autosomal heterochromatic genes are indeed dependent on a subset of MSL proteins for full expression. However, native heterochromatic genes make up only 4% of autosomal genes, and their misregulation is not expected to compromise genome-wide expression studies normalized to autosomal expression (Koya, 2015).

Expression of heterochromatic genes is thought to involve mechanisms to overcome the repressive chromatin environment. It is possible that a complex composed of roX RNA and a subset of MSL proteins participates in this process. This would explain why heterochromatic genes are particularly sensitive to the loss of these factors. Alternatively, it is possible that roX and MSL proteins participate in heterochromatin assembly. This would explain the simultaneous disruption of heterochromatic gene expression and suppression of PEV at transgene insertions (Koya, 2015).

Heterochromatin assembly is first detected at 3-4 h AEL, a time when MSL3 is bound throughout the genome. Intriguingly, studies from yeast identify a role for H3K4 and H4K16 acetylation in formation of heterochromatin. Active deacetylation of H4K16ac is necessary for spreading of chromatin-based silencing in yeast, demonstrating the need for a sequential and ordered series of histone modifications (Koya, 2015).

As MOF is responsible for the majority of H4K16ac in the fly, a MOF-containing complex could fulfill a similar role during heterochromatin formation. While this study found a significant effect of MOF in expression only on the X and 4th chromosomes, it is possible that examination of a larger number of genes would reveal a more widespread autosomal effect (Koya, 2015).

In roX1 roX2 males the 4th chromosome displays stronger suppression of PEV and more profound gene misregulation than do other heterochromatic regions. This is consistent with the observation that heterochromatin on the 4th chromosome is genetically and biochemically different from that on other chromosomes. Loss of roX RNA leads to misregulation of genes in distinct genomic regions, the dosage compensated X chromosome and autosomal heterochromatin. This study found that the regulation of these two groups is, to some extent, genetically separable. MSL2, which binds roX1 RNA and is an essential member of the dosage compensation complex, is not required for full expression of heterochromatic genes in males. Ectopic expression of MSL2 in females induces formation of MSL complexes that localize to both X chromosomes, inducing inappropriate dosage compensation. As would be expected from the lack of a role for MSL2 in autosomal heterochromatin in males, ectopic expression of this protein in females has no effect on PEV (Koya, 2015).

Elegant, high-resolution studies reveal that MLE and MSL2 bind essentially indistinguishable regions of roX1. Three prominent regions of MLE/MSL2 binding have been identified, one overlapping the 3' stem loop. This stem loop incorporates a short 'roX box' consensus sequence that is present in D. melanogaster roX1 and roX2, and conserved in roX RNAs in related species (Koya, 2015).

An experimentally supported explanation for the concurrence of MLE and MSL2 binding at the 3' stem loop is that MLE, an ATP-dependent RNA/DNA helicase, remodels this structure to permit MSL2 binding. The finding that disruption of this stem blocks dosage compensation but does not influence heterochromatic integrity is consistent with participation of roX1 in two processes that differ in MSL2 involvement. However, a region surrounding the stem loop is required for the heterochromatic function of roX1, as roX1^Δ10, removing the stem loop and upstream regions, is deficient in both dosage compensation and heterochromatic silencing. Further differentiating these processes is the finding that low levels of roX RNA from a repressed transgene fully rescue heterochromatic silencing, but not dosage compensation. An intriguing question raised by this study is why the sexes display differences in autosomal heterochromatin (Koya, 2015).

The chromatin content of males and females is substantially different, as XY males have a single X and a large, heterochromatic Y chromosome. It is speculated that this has driven changes in how heterochromatin is established or maintained in one sex. A search for the genetic regulators of the sex difference in autosomal heterochromatin eliminated the Y chromosome and the conventional sex determination pathway, suggesting that the number of X chromosomes determines the sensitivity of autosomal heterochromatin to loss of roX activity. Interestingly, the amount of pericentromeric X heterochromatin, rather than the euchromatic 'numerator' elements, appears to be the critical factor. The recognition that heterochromatin displays differences in the sexes, and that a specific set of proteins is required for normal function of autosomal heterochromatin in males, suggests a useful paradigm for the evolution of chromatin in response to genomic content (Koya, 2015).

m6A potentiates Sxl alternative pre-mRNA splicing for robust Drosophila sex determination

N6-methyladenosine (m6A) is the most common internal modification of eukaryotic messenger RNA (mRNA) and is decoded by YTH domain proteins. The Drosophila mRNA m6A methylosome consists of Ime4 and KAR4 (Inducer of meiosis 4 and Karyogamy protein 4), and Female-lethal (2)d (Fl(2)d) and Virilizer (Vir). In Drosophila, fl(2)d and vir are required for sex-dependent regulation of alternative splicing of the sex determination factor Sex lethal (Sxl). However, the functions of m6A in introns in the regulation of alternative splicing remain uncertain. This study shows that m6A is absent in the mRNA of Drosophila lacking Ime4. In contrast to mouse and plant knockout models, Drosophila Ime4-null mutants remain viable, though flightless, and show a sex bias towards maleness. This is because m6A is required for female-specific alternative splicing of Sxl, which determines female physiognomy, but also translationally represses male-specific lethal 2 (msl-2) to prevent dosage compensation in females. The m6A reader protein YT521-B decodes m6A in the sex-specifically spliced intron of Sxl, as its absence phenocopies Ime4 mutants. Loss of m6A also affects alternative splicing of additional genes, predominantly in the 5' untranslated region, and has global effects on the expression of metabolic genes. The requirement of m6A and its reader YT521-B for female-specific Sxl alternative splicing reveals that this hitherto enigmatic mRNA modification constitutes an ancient and specific mechanism to adjust levels of gene expression (Haussmann, 2016).

Hrp48 and eIF3d contribute to msl-2 mRNA translational repression

Translational repression of msl-2 mRNA in females of Drosophila melanogaster is an essential step in the regulation of X-chromosome dosage compensation. Repression is orchestrated by Sex-lethal (SXL), which binds to both untranslated regions (UTRs) of msl-2 and inhibits translation initiation by poorly understood mechanisms. This study identified Heterogeneous nuclear ribonucleoprotein at 27C (Hrp48) as a SXL co-factor. Hrp48 binds to the 3' UTR of msl-2 and is required for optimal repression by SXL. Hrp48 interacts with eIF3d, a subunit of the eIF3 translation initiation complex. Reporter and RNA chromatography assays showed that eIF3d binds to the msl-2 5' UTR and is required for efficient translation and translational repression of msl-2 mRNA. In line with these results, eIF3d depletion, but not depletion of other eIF3 subunits, de-represses msl-2 expression in female flies. These data are consistent with a model where Hrp48 inhibits msl-2 translation by targeting eIF3d. These results uncover an important step in the mechanism of msl-2 translation regulation, and illustrate how general translation initiation factors can be co-opted by RNA binding proteins to achieve mRNA-specific control (Szostak, 2018).

Valsecchi, C. I. K., Basilicata, M. F., Semplicio, G., Georgiev, P., Gutierrez, N. M. and Akhtar, A. (2018). Facultative dosage compensation of developmental genes on autosomes in Drosophila and mouse embryonic stem cells. Nat Commun 9(1): 3626. PubMed ID: 30194291

Facultative dosage compensation of developmental genes on autosomes in Drosophila and mouse embryonic stem cells

Haploinsufficiency and aneuploidy are two phenomena, where gene dosage alterations cause severe defects ultimately resulting in developmental failures and disease. One remarkable exception is the X chromosome, where copy number differences between sexes are buffered by dosage compensation systems. In Drosophila, the Male-Specific Lethal complex (MSLc) mediates upregulation of the single male X chromosome. The evolutionary origin and conservation of this process orchestrated by MSL2, the only male-specific protein within the fly MSLc, have remained unclear. This study reports that MSL2, in addition to regulating the X chromosome, targets autosomal genes involved in patterning and morphogenesis. Precise regulation of these genes by MSL2 is required for proper development. This set of dosage-sensitive genes maintains such regulation during evolution, as MSL2 binds and similarly regulates mouse orthologues via Histone H4 lysine 16 acetylation. It is proposed that this gene-by-gene dosage compensation mechanism was co-opted during evolution for chromosome-wide regulation of the Drosophila male X (Valsecchi, 2018).

Protein Interactions

MSL1 plays a central role in assembly of the MSL complex, essential for dosage compensation in Drosophila

In male Drosophila, histone H4 acetylated at Lys16 is enriched on the X chromosome, and most X-linked genes are transcribed at a higher rate than in females (thus achieving dosage compensation). Five proteins, collectively called the MSLs, are required for dosage compensation and male viability. Here it has been shown that one of these proteins, MSL1, interacts with three others, MSL2, MSL3 and MOF. The latter is a putative histone acetyl transferase. Overexpression of either the N- or C-terminal domain of MSL1 has dominant-negative effects, i.e. causes male-specific lethality. The lethality due to expression of the N-terminal domain is reduced if msl2 is co-overexpressed. MSL2 co-purifies over a FLAG affinity column with the tagged region of MSL1, and both MSL3 and MOF co-purify with the FLAG-tagged MSL1 C-terminal domain. Furthermore, the MSL1 C-terminal domain binds specifically to a GST-MOF fusion protein and co-immunoprecipitates with HA-tagged MSL3. The MSL1 C-terminal domain shows similarity to a region of mouse CBP, a transcription co-activator. It is concluded that a main role of MSL1 is to serve as the backbone for assembly of the MSL complex (Scott, 2000).

In general, the amino acid sequences of the MSLs suggest regions or domains within the proteins that could be important for function in vivo. Indeed, this has been confirmed by mapping loss-of-function mutations to the domain, such as the helicase domain of MLE, the putative acetylase domain of MOF and the RING finger region of MSL2. The amino acid sequence of MSL1 is the least informative, containing no recognizable domains, although regions rich in acidic amino acids and possible PEST sequences have been identified. To identify regions within MSL1 that are important for function in vivo, it was determined which regions have dominant-negative effects when overexpressed. Two regions of MSL1, one near the N-terminus and the other at the C-terminus, are likely to be important for assembly of the MSL complex in vivo, because overexpression of either region causes male-specific lethality. Genetic evidence, decreased male viability of msl2 heterozygotes and increased male viability by co-overexpression of MSL2, suggests that the region of MSL1 at the N-terminus interacts with MSL2. This has been confirmed by co-purification of MSL2 with FLAG-tagged versions of MSL1 over FLAG affinity columns. Similarly, the C-terminal region of MSL1 interacts with both MOF and MSL3. Furthermore, expression of the C-terminal domain results in significant loss of MOF from the male X chromosome (Scott, 2000).

The N-terminal FN region of MSL1 that binds to MSL2 was chosen originally for expression in flies because it was predicted that almost half of FN (amino acids 96-172) would form a two-stranded, alpha-helical, coiled-coil structure. Coiled-coil structures are composed of a heptad repeat (abcdefg)n where hydrophobic residues occupy positions a and d on the same side of the alpha-helix. The coiled-coil motif of GCN4 mediates dimerization. If a similar structure mediates the formation of the MSL1-MSL2 heterodimer, then part of the region of MSL2 that interacts with MSL1 should form a coiled-coil structure. The RING finger domain region of MSL2 interacts with MSL1. It is predicted that the region immediately preceding the RING finger could form a coiled-coil structure. It is particularly significant that several of the mutations that disrupt the interaction with MSL1 in yeast introduce amino acid changes that either significantly disrupt the alpha-helix (leucine to proline) or introduce a charged amino acid into the predicted hydrophobic face of the alpha-helix. The RING domain is found in a number of proteins, including the V(D)J recombination-activating protein RAG1. The crystal structure of the RAG1 dimerization domain, which includes the RING finger, reveals that dimerization is stabilized by interaction between alpha-helices that form a hydrophobic core. The RING finger is thought to form the structural scaffold upon which the dimer interface is formed. It is tempting to speculate, by analogy with RAG1, that the association of MSL1 and MSL2 involves the interaction of amphipathic alpha-helices that depend on the RING finger domain. This could best be addressed by determining the crystal structure of the MSL1-MSL2 complex (Scott, 2000).
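
The heptad-repeat rule described above can be illustrated with a minimal sketch. It is not the prediction method used by Scott (2000), and the peptide is a made-up toy sequence, not MSL1 or MSL2. The function names and the choice of hydrophobic residues are assumptions made for illustration only. The sketch assigns heptad registers (a to g) along a sequence and reports what fraction of the a and d positions is hydrophobic, which is the feature that a leucine-to-proline or charged substitution in the hydrophobic face would reduce.

```python
# Illustrative sketch only: hypothetical helper names, toy peptide,
# and an assumed hydrophobic residue set (A, I, L, M, F, V, W, Y).
HYDROPHOBIC = set("AILMFVWY")
HEPTAD = "abcdefg"


def heptad_registers(seq, start_register="a"):
    """Assign a heptad position (a..g) to each residue, starting at start_register."""
    offset = HEPTAD.index(start_register)
    return [HEPTAD[(offset + i) % 7] for i in range(len(seq))]


def core_hydrophobic_fraction(seq, start_register="a"):
    """Fraction of residues at heptad positions a and d that are hydrophobic."""
    registers = heptad_registers(seq, start_register)
    core = [res for res, reg in zip(seq.upper(), registers) if reg in "ad"]
    if not core:
        return 0.0
    return sum(res in HYDROPHOBIC for res in core) / len(core)


def best_register(seq):
    """Starting register that places the most hydrophobic residues at a and d."""
    return max(HEPTAD, key=lambda reg: core_hydrophobic_fraction(seq, reg))


# Toy peptide: three repeats with leucine at positions a and d.
toy = "LEELKKK" * 3
# Toy mutant: leucine at the first d position replaced by proline.
toy_mutant = toy[:3] + "P" + toy[4:]

print(core_hydrophobic_fraction(toy))
print(core_hydrophobic_fraction(toy_mutant))
print(best_register(toy))
```

In this toy case every a and d position of the intact peptide is hydrophobic, while the proline substitution removes one of the six core positions. Real coiled-coil prediction also weighs helix propensity and the residues at the other heptad positions, which this sketch ignores.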

In vitro translated MSL1 C-terminal domain co-immunoprecipitates with in vitro translated HA·MSL3 but not HA·MOF. Thus, the C-terminal domain interacts directly with MSL3, but the interaction with MOF requires either another factor present in fly extracts or post-translational modification of MSL1 or MOF. While the possibility of a nucleic acid component of the FC-MOF complex cannot be ruled out, the latter possibility (post-translational modification of MSL1 or MOF) is favored, since a silver stain of FLAG affinity-purified FC-MOF complex separated by SDS-PAGE shows only two main bands corresponding to the sizes expected for FC and MOF. The C-terminal domain of MSL1 is rich in serine and threonine residues, and contains several potential phosphorylation sites and a predicted PEST sequence. PEST sequences have been suggested to contribute to the instability of the MSL1 protein. However, the role of these sequences in MSL1 has not been determined. Indeed, an alternative function for the PEST sequences is suggested by the observations that the PEST domains of PU.1 and IκB are required for their respective interactions with Pip and c-Rel. In both cases, phosphorylation of a serine residue within the PEST sequence is required for the respective protein-protein interactions. The recent finding that a serine/threonine kinase is associated preferentially with the male X chromosome raises the possibility that MSL1 or another MSL is phosphorylated by this enzyme (Scott, 2000).

In the sequential model for assembly of the MSL complex, the first step involves the binding of the MSL1-MSL2 complex to several 'high affinity' sites on the male X chromosome. Since the localization of both MOF and MSL3 to the X chromosome requires mle+ function, this suggests that the association of MOF and MSL3 with the MSL1-MSL2 complex is MLE dependent. MLE could either bind directly to MOF and/or MSL3, or somehow stabilize the MSL complex together with roX RNA. In support of the latter model, MOF and MSL3 bind directly to the C-terminal domain of MSL1. Furthermore, MLE did not co-purify with an FC-MOF-MSL3 complex over an affinity column. However, the affinity chromatography experiments were designed to maximize the likelihood of detecting protein-protein association and are not quantitative. It is possible that MOF and MSL3 may have a higher affinity for the C-terminal domain of MSL1 than full-length MSL1. Thus, one possible mechanism is that in vivo the C-terminal domain of MSL1 is not freely available to bind to MOF and/or MSL3, and that the binding of MLE to the MSL1-MSL2 complex causes a conformational change in MSL1, such that the C-terminal domain becomes more accessible (Scott, 2000).

Previous searches of the protein sequence database with the complete MSL1 sequence have failed to identify any significant similarities. However, when a search is carried out with just the C-terminal domain sequence, some similarity is found to a 254 amino acid region of mouse CBP. Although the similarity is not high, given that the similarity extends across almost the entire C-terminal domain of MSL1, and that both CBP and the MSL1 C-terminal domain bind to histone acetyl transferases (or putative histone acetyl transferases), it is thought that this homology may be significant. If this similarity reflects a conserved function, then it would be predicted that the MSL1-similar region of CBP, which has no known function, would associate with either an MOF-like histone acetyl transferase or an MSL3-like protein in mammalian cells (Scott, 2000).

It is not known how the MSL complex binds to the male X chromosome. None of the MSLs contain a recognizable DNA-binding motif. The F84 version of MSL1, lacking the first 84 amino acids, binds to MSL2, MSL3 and MOF but does not bind preferentially to the male X chromosome. This suggests that the male lethality that results from overexpression of F84 is due to this protein being able to bind to three MSLs, but not being able to bind to the X chromosome because the first 84 amino acids of MSL1 are required for recognition of the X chromosome. Alternatively, the lack of binding of F84 to the male X chromosome could be because the beginning of MSL1 is required for assembly of the MSL complex in vivo. However, if so, then it would be expected that F84 would have bound to the 'high affinity' sites since F84 does bind to MSL2. Assuming that MSL1 and MSL2 are the only components of the high affinity complex, it would then appear more likely that the first 84 amino acids of MSL1 are required for X chromosome binding rather than complex formation. However, there are several lines of evidence that suggest that the roX RNAs are part of the MSL complex, which raises the possibility that one or both of the roX RNAs could be part of the high affinity complex. Thus it will be of interest to determine if the MSL complex containing the F84 protein binds to roX RNA with a lower affinity than the complex containing full-length MSL1 (Scott, 2000).

Interaction of Mof and Msl-3 with RNA

In Drosophila, compensation for the reduced dosage of genes located on the single male X chromosome involves doubling their expression in relation to their counterparts on female X chromosomes. Dosage compensation is an epigenetic process involving the specific acetylation of histone H4 at lysine 16 by the histone acetyltransferase Mof. Although Mof is expressed in both sexes, it only associates with the X chromosome in males. Its absence causes male-specific lethality. Mof is part of a chromosome-associated complex comprising male-specific lethal (MSL) proteins and at least one non-coding roX RNA. How Mof is integrated into the dosage compensation complex is unknown. Association of Mof with the male X chromosome is shown in this study to depend on its interaction with RNA. Mof specifically binds through its chromodomain to roX2 RNA in vivo. In vitro analyses of the Mof and Msl-3 chromodomains indicate that these chromodomains may function as RNA interaction modules. Their interaction with non-coding RNA may target regulators to specific chromosomal sites (Akhtar, 2000).

The association of MSL proteins (Msl-1, Msl-2, Msl-3, Mof, Mle) and roX RNA with the male X chromosome has been visualized by immunofluorescence analysis of larval polytene chromosomes. Drosophila SL-2 cells can act as a model system for male features, because functional dosage compensation complex (DCC) has been purified from them, and they have also been used to study the post-transcriptional regulation of dosage compensation. The nuclear territory of the X chromosome in SL-2 cells can be visualized by immunostaining with antisera against Msl-1, Mof, the histone H4 isoform acetylated at Lys 16 (K16) and Mle. Msl-1 remains localized to the X chromosome after permeabilization of the cells and RNase treatment. In contrast, the bulk of Mof staining disappears from the X chromosome after RNase treatment. Mle, which interacts with larval polytene chromosomes in an RNase-sensitive manner, dissociates from the chromosome upon permeabilization of the cells; therefore, the RNase sensitivity of Mle's chromosomal association in this system could not be confirmed. Loss of Mof correlates with a reduction of the H4 acetyl-K16 histone isoform at the X chromosome, suggesting a high turnover of the modification under these conditions. These results suggest that the stable integration of Mof into chromosome-bound DCC involves an RNase-sensitive structure, and that the continued association of Mof with DCC is independent of Mle (Akhtar, 2000).

Candidate RNAs that may contribute to the assembly and chromosomal association of the DCC are the roX1 and roX2 RNAs that are stably expressed in male but not female flies, and that colocalize with MSL proteins on the male X chromosome. The two RNAs have no effect on the integrity of DCC, because deletion of a single roX RNA has no phenotype, but mutation of both genes abolishes the interaction of MSL proteins with the chromosome. SL-2 cells express only roX2 RNA, which forms a complex with MSL proteins and is therefore an excellent candidate for an RNA involved in the targeting of Mof. To test whether Mof interacts with roX2 RNA in vivo, nuclear extracts were prepared from SL-2 cells and the soluble DCC was isolated by immunoprecipitation with antibodies specific for either Mof or Mle. The RNase sensitivity of interactions was tested by RNase treatment of extracts before immunoprecipitation. The immunoprecipitated complexes were characterized by western blot analysis. Under these conditions, Mof, Mle, Msl-3 and Msl-2 co-immunoprecipitate independently of RNase treatment, indicating that the DCC is held together by protein-protein interactions and/or that bridging RNA is protected from the nuclease attack. In the absence of RNase, roX2 RNA is readily detected in the immunopurified complex by reverse transcription of RNA followed by polymerase chain reaction with roX2-specific primers, whether the immunoprecipitation has been performed with antibodies specific for Mof or Mle (Akhtar, 2000).

These experiments confirm earlier reports that roX2 RNA can be part of soluble DCC, but they raise the issues of how soluble DCC may differ from the chromosome-associated complex and whether Mof interacts with roX2 RNA directly or indirectly (for example, through Mle). To clarify the latter issue, use was made of the observation that the association of Mle with DCC is sensitive to elevated ionic strength. Under these stringent immunoprecipitation conditions, Mle is no longer associated with Mof, although Msl-2 and Msl-3 are still detectable. In the absence of Mle, a significant amount of roX2 RNA remains associated with Mof (Akhtar, 2000).

Although these experiments rule out that Mle, to date the best candidate for an RNA interacting factor, bridges between Mof and roX2, the involvement of Msl-3 or other unknown protein(s) remain possible. To determine whether Mof interacts with RNA directly, the recombinant enzyme was studied in vitro. Electrophoretic mobility shift assay (EMSA) with fragments derived from roX1 reveals a nonspecific interaction of Mof with RNA. Competition experiments show that Mof interacts with RNA with high preference over DNA. Mof is unable to interact with DNA efficiently. A lack of specificity of the RNA interaction may be due to misfolding of the in vitro transcribed RNAs in the absence of chaperones or, most likely, to the use of arbitrary fragments of roX RNA, which lack optimum binding sites (Akhtar, 2000).

The specific interaction of Mof with RNA should be distinguished from insignificant 'sticking' by determining whether Mof has a specific domain for RNA interaction. The RNA interaction domain was mapped by creating a series of recombinant Mof derivatives that are all active in chromatin binding and histone acetylation, and analysing their potential to interact with RNA. A Mof protein truncated at its amino terminus (N352) still interacts with RNA, and further deletion of the chromodomain (N518) abolishes this interaction. Since this result suggests that the chromodomain is involved in RNA binding, hydrophobic residues that are conserved in chromodomains from various origins were mutated. Mof derivatives with single amino-acid exchanges (W426G and Y416D) are unable to interact with RNA, whereas a point mutation in the acetyl CoA-binding site, G691E, does not affect RNA binding. The W426G and Y416D mutations affect the structure of the chromodomain only locally because the mutant enzymes are still active histone acetyltransferases (HATs). Confirmatory results were obtained with an alternative binding assay involving roX1 RNA immobilized on streptavidin-sepharose beads (Akhtar, 2000).

To establish the physiological relevance of the chromodomain for RNA interaction in vivo, Mof derivatives were transiently expressed from a metallothionein promoter in SL-2 cells. The addition of an N-terminal haemagglutinin (HA) epitope allowed ectopic enzymes to be distinguished from endogenous Mof. Whole-cell extracts were prepared before or after induction of transgene expression with copper. Under stringent immunoprecipitation conditions with antibody directed against HA, roX2 RNA associates with HA-Mof. When expressed to the same level as intact Mof, both chromodomain mutants W426G and Y416D show an impaired interaction with roX2 RNA. Almost no roX2 RNA could be detected after immunoprecipitation of Y416D, whereas the effect of the W426G mutation was less severe, but still clear (Akhtar, 2000).

To establish whether RNA interaction is a specific feature of the Mof chromodomain or whether it is a more general property of chromodomains, Msl-3, another dosage compensation protein containing two chromodomains, was tested for interaction with RNA. Recombinant Msl-3 interacts efficiently with RNA, forming several complexes in the EMSA at higher concentrations. Competition experiments have established that Msl-3 interacts far better with RNA than with DNA. Even the carboxy-terminal chromodomain of Msl-3 (CD2), when fused to glutathione-S-transferase (GST), is able to interact with RNA but not DNA, whereas the GST moiety alone is inactive (Akhtar, 2000).

These results suggest that the Mof chromodomain interacts with roX RNA in vivo, which may contribute to the integration of Mof into DCC at the male X chromosome. The association of Mof with DCC is a rather late step in the assembly of the complex and does not occur in the absence of Mle. The earlier incorporation of roX2 RNA also depends on Mle. The stable association of any one subunit with DCC may rely on multiple interactions with protein and/or RNA subunits, and an additional direct contact between Mof and Mle remains possible. Whether RNA interaction is a general property of chromodomains or restricted to a subfamily of the chromodomain superfamily remains to be seen. Chromodomains are important for the function of a number of chromatin regulators, but their modes of action have remained enigmatic. Although the related 'chromo shadow domain' of heterochromatin protein 1 (HP1) mediates interactions with several proteins and an interacting peptide has been identified, chromodomains have so far not been shown to contact proteins or peptides. Mutations of the clr protein analogous to the ones made in Mof abolish its silencing capacity. Small deletions of the polycomb protein containing these residues lead to its delocalization in SL-2 cells. Non-coding RNAs may be more commonly involved in organizing regulatory complexes than has been appreciated to date. Identification of the RNA structure motif that determines the specific interaction with the chromodomain remains a challenge for the future. Interestingly, dosage compensation in mammals also involves a non-coding RNA, Xist, coating the inactive X. It is tempting to speculate that roX and Xist RNAs may target regulatory proteins to the X chromosomes of Drosophila and humans (Akhtar, 2000).

MOF acetylates MSL-3 in the dosage compensation complex

Dosage compensation ensures equal expression of X-linked genes in males and females. In Drosophila, equalization is achieved by hypertranscription of the male X chromosome. This process requires an RNA/protein containing dosage compensation complex (DCC). RNA interference of individual DCC components has been used to define the order of complex assembly in Schneider cells. Interaction of MOF with MSL-3 leads to specific acetylation of MSL-3 at a single lysine residue adjacent to one of its chromodomains. Localization of MSL-3 to the X chromosome is RNA dependent and acetylation sensitive. The acetylation status of MSL-3 determines its interaction with roX2 RNA. Furthermore, RPD3 interacts with MSL-3 and MSL-3 can be deacetylated by the RPD3 complex. It is proposed that regulated acetylation of MSL-3 may provide a mechanistic explanation for spreading of the dosage compensation complex along the male X chromosome (Buscaino, 2003).

As in male flies, MSL-2 is central to the assembly of the dosage compensation complex in SL-2 cells, since its depletion by RNAi leads to disassembly of the complex. MOF protein requires prior assembly of MSL-1, MSL-2, and MSL-3, while the MSL-3 protein requires prior assembly of at least MSL-1 and MSL-2 proteins. Since in MOF dsRNA-treated cells approximately 10% of MOF protein can still be detected by Western blot analysis, it remains possible that a small amount of MOF enzyme may be sufficient for MSL-3 localization to the X chromosome. In MOF mutant flies, MSL-3 protein localization on polytene chromosomes is restricted to chromatin entry sites, as detected by immunostaining of the polytene chromosomes, suggesting that incomplete knockdown of MOF may be a plausible explanation for the apparently unaffected MSL-3 localization in these cells. However, it is also important to note that due to the limited size of SL-2 cells, it is difficult to resolve entry sites in SL-2 cells in comparison to the polytene chromosomes. Alternatively, MSL-3 localization to the X chromosome in MOF dsRNA-treated cells could also be a feature specific to SL-2 cells. Interestingly, depletion of MSL-2, MSL-3, or MOF led to dissociation of MLE from the X chromosome. It is therefore suggested that MLE localization is sensitive to the assembly of the rest of the complex in SL-2 cells (Buscaino, 2003).

MOF protein is associated with the X chromosome in an RNase-sensitive manner. This observation has been extended, and it has been shown that MSL-3 protein is also tethered to the X chromosome via RNA. MSL-3 interacts with roX2 in immunoprecipitation experiments and not with another nuclear RNA, suggesting that roX2 is a likely candidate for mediating this interaction in SL-2 cells. However, it remains plausible that an as yet unidentified RNA (or protein) may act as a bridge between MSL-3 and the X chromosome in vivo (Buscaino, 2003).

Surprisingly, it was found that association of MSL-3 with the X chromosome is sensitive not only to RNase treatment but also to TSA treatment. These results suggest that the acetylation status of MSL-3 protein is likely to be dynamic and that a slight imbalance in cellular acetylation levels leads to dramatic consequences for the MSL-3 protein in vivo. This result is intriguing, since MOF, which is also capable of autoacetylation, did not show the same phenotype under these conditions. It remains possible that autoacetylation of MOF leads to other effects (Buscaino, 2003).

In the case where the MSL-3 protein levels are reduced by MSL-3 dsRNA treatment, the localization of MOF protein to the X chromosome is severely compromised. This result at first sight appears contradictory to the observation that MOF localization to the X chromosome is unaffected upon TSA treatment for 30 min. However, it is important to note that incubation of SL-2 cells with TSA for periods longer than 4 hr also affects localization of the rest of the complex, including MOF, suggesting that the DCC as a whole is sensitive to overall acetylation levels within the cells. Furthermore, it remains possible that additional modifications of DCC factors contribute to the stability and dynamics of the complex as a whole. However, the findings strongly suggest that MSL-3 is particularly sensitive to acetylation changes and therefore follows a more rapid dissociation than the rest of the complex members in the conditions tested (Buscaino, 2003).

The results demonstrate that the MSL-3 protein is regulated by acetylation and that MOF acetylates a single lysine residue in MSL-3 in vitro. This finding underscores the stringent substrate specificity of MOF, consistent with the fact that MOF also acetylates only a single lysine in a nucleosomal substrate. A striking consequence of MSL-3 acetylation is the loss of its interaction with RNA and its failure to localize to the X chromosome. The sensitivity of acetylated MSL-3 protein seems to be specific for roX2 in vitro since neither nonspecific DNA nor nonspecific RNA binding was affected. The MSL-3 protein contains two chromodomains, and MSL-3 has been shown to bind RNA in vitro via its chromodomain; remarkably, the acetylation occurs next to one of its chromodomains (Buscaino, 2003).

Acetylation of the X chromosome seems to have a high turnover, and the dosage compensation complex members, particularly MSL-3, appear to be sensitive to changes in endogenous acetylation levels. It is therefore proposed that regulated acetylation of MSL-3 may cause a conformational change that leads to temporary loss of interaction with RNA from one of the chromodomains. A cycle of deacetylation may follow that will allow MSL-3 to contact RNA again on a nearby affinity site. By a continuous cycle of acetylation and deacetylation, MSL-3, along with the rest of the complex, may be able to spread from a chromatin entry site. These findings also provide further insight into the essential nature of MOF histone acetyltransferase, which is required not only for acetylation of the X chromosome but also for regulation of other members of the complex. Association of RPD3 with the dosage compensation members provides strong supporting evidence for this hypothesis. Interestingly, RPD3 hypomorphic mutants show a reduced male to female ratio. Furthermore, the S. cerevisiae homolog of MSL-3, EAF3, part of the NuA4 complex that contains ESA1, copurifies with yeast RPD3 using the tandem affinity purification (TAP) procedure. It therefore appears likely that this property of MSL-3/EAF3 is conserved and that MSL-3/EAF3 may act as a bridge between two different histone-modifying activities. The transient interaction of dosage compensation complex members with a histone deacetylase may be required for fine tuning the effects of hyperacetylation by MOF protein to achieve the proper level of dosage compensation (Buscaino, 2003).

Functional integration of the histone acetyltransferase MOF into the dosage compensation complex

Dosage compensation in flies involves doubling the transcription of genes on the single male X chromosome to match the combined expression level of the two female X chromosomes. Crucial for this activation is the acetylation of histone H4 by the histone acetyltransferase (HAT) MOF. In male cells, MOF resides in a complex (dosage compensation complex, DCC) with MSL proteins and noncoding roX RNA. Previous studies suggested that MOF's localization to the X chromosome was largely RNA-mediated. Contact of the MOF chromo-related domain with roX RNA has now been found to play only a minor role in correct targeting to the X chromosome in vivo. Instead, a strong, direct interaction between a conserved MSL1 domain and a zinc finger within MOF's HAT domain is crucial. The functional consequences of this interaction were studied in vitro. Simultaneous contact of MOF with MSL1 and MSL3 leads to its recruitment to chromatin, a dramatic stimulation of HAT activity and improved substrate specificity. Activation of MOF's HAT activity upon integration into the DCC may serve to restrict the critical histone modification to the male X chromosome (Morales, 2004).

Activation of the male X chromosome in Drosophila requires acetylation of H4K16 by MOF. In vitro, untargeted acetylation of H4K16 is sufficient to activate any chromatin template. Targeting MOF to a promoter in yeast via fusion to a heterologous DNA-binding domain also leads to derepression of transcription (Akhtar, 2000). Given the potential of MOF to activate transcription, fine-tuning the expression of X-linked genes crucially relies on restricting MOF activity to the X chromosome (Morales, 2004).

This study highlights the protein-protein interactions that dictate the incorporation of MOF into the DCC. More importantly, it illustrates a novel principle of conditional activation of the enzyme. Because MSL proteins are limiting in male cells and their association with the X chromosome is stable, essentially all MSL complexes are chromosomal. Activation of MOF requires interaction with MSL1, which initiates the DCC assembly, and MSL3, which is thought to associate with the complex after MSL1 and MSL2, suggesting that MOF may sense completed complex assembly. Integration into the complex unleashes MOF activity thereby restricting acetylation of H4K16 to the X chromosome. Rendering a regulatory acetylase activity dependent on the appropriate molecular context may be a more widespread principle. Recombinant Tip60 acetylase, like MOF a MYST family member, is unable to acetylate its physiological nucleosome substrate unless incorporated into a native complex. The HAT activity of the MYST member Sas2 absolutely requires Sas4 and is stimulated by Sas5 (Morales, 2004 and references therein).

Faithful association of MOF with the X chromosomal territory has been shown to be lost upon RNase treatment of nuclei in permeabilized cells and the chromo-related domain of MOF has been shown to be important for interaction with roX RNA in vivo, suggesting that targeting of MOF relies heavily on RNA interactions (Akhtar, 2000). The current results suggest that chromo-related domain-RNA interactions contribute to targeting but are not the primary targeting determinants for MOF. The RNase treatment not only results in displacement of MOF but also of MLE and MSL3 and might also affect other, yet unknown factors in the complex. RNA degradation thus leads to the simultaneous disruption of many protein-RNA interactions, which collectively may be required for complex integrity. Although roX RNA improves the assembly of the DCC and its distribution over the X chromosome under conditions of limiting MSL proteins in wild-type flies, the deficiency due to the absence of both roX RNAs can be partially overcome by overexpressing MSL1 and MSL2 in flies. This finding is consistent with the observation that protein-protein interactions are essential for targeting MOF to the X chromosome. How the incorporation of roX RNA into the DCC modulates the protein interactions discussed in this study and the dynamics of chromatin association remains to be explored (Morales, 2004).

Extending previous observations, this study emphasizes the central role of MSL1 in the DCC complex formation and chromatin recruitment. In addition to its well-documented association with MSL2, MSL1 directly interacts with MOF and MSL3 via two distinct surfaces. Interestingly, two phylogenetically conserved regions of MSL1 have recently been identified (Marin, 2003). The first one corresponds to an N-terminal coiled-coil domain involved in the interaction of MSL1 with the ring-finger domain of MSL2. The second one, called PEHE domain, overlaps with the fragment E (covering aa 766-939 and containing the conserved PEHE domain). It is therefore likely that a MOF interaction surface is present within the N-terminal part of this sequence conservation. The conservation of the PEHE region and the existence of MOF and MSL3 homologues in yeast and humans underline the functional importance of the observed interactions (Morales, 2004).

MOF interacts with MSL1 via the zinc-finger domain, a hallmark of MYST-type HAT domains. While this interaction is necessary for targeting MOF to the X chromosome, it is not known whether it is sufficient. Molecular modelling suggests that the zinc finger is an integral part of the HAT domain and hence additional surfaces may be involved in the contact. Interestingly, the very same mutations that abolish this interaction with MSL1 also led to reduced acetylation of histones (Akhtar, 2001). Since the zinc finger is not close to the substrate-binding pocket, it is considered that modulation of the zinc-finger structure, either through mutation or MSL1-MSL3 interaction, may have an allosteric negative or positive effect, respectively, on the ability of the catalytic site to interact with the histone tail substrate productively. Although the interactions of MSL3 and MOF are weak by comparison, MSL3 regulates MOF activity quantitatively and qualitatively. MSL3 and MOF interact with adjacent regions in the C-terminus of MSL1, which may promote their direct interaction. Alternatively, MSL3 may modulate MOF activity indirectly, through changes in MSL1 conformation (Morales, 2004).

In order to explore whether the activation of MOF's HAT activity by association with MSL1-MSL3 is due to enhanced binding to the chromatin substrate or an allosteric activation of catalysis, chromatin binding experiments were carried out. MSL1 interacts with chromatin and free DNA particularly well, and it helps both MSL3 and MOF to associate with chromatin. This observation lends additional support to the earlier notion of a central 'platform' function of MSL1. MSL1 interacts with MSL2, MSL3, MOF and chromatin and thus is ideally suited to function as a nucleation factor for the DCC complex assembly on chromatin. Interestingly, MSL3 also assists MOF's chromatin association, in keeping with the functional interactions between the two proteins observed in vitro and in vivo (Buscaino, 2003). Since the magnitude of the stimulation of chromatin binding is still an order of magnitude less than the observed stimulation of HAT activity, it is quite possible that allosteric effects of MSL protein association on the catalytic center of MOF contribute to activation of the HAT upon incorporation into the complex (Morales, 2004).

Interestingly, interaction of MOF with MSL1 and MSL3 also leads to a change in substrate specificity. In the absence of MSL3, MOF activity is mainly directed towards MSL1, even though the nucleosomal substrate is present. In reactions containing only MSL3 and MOF, acetylation of MSL3 can also be detected (Buscaino, 2003). In the presence of both MSL1 and MSL3, MOF does not acetylate either protein significantly, and histone H4 is the exclusive substrate. Whether acetylation of MSL1 occurs in the context of DCC assembly or its distribution over the X chromosome in vivo remains to be seen. The sensitivity of the metabolic labelling strategy did not suffice to detect acetylation of endogenous MSL1 in cells. The interaction of the DCC subunits is envisioned to be dynamic during the initial assembly of the complex, its propagation over the X chromosome and its perpetuation through replication and mitosis. Conceivably, MSL1 acetylation may occur transiently at one stage, may signal a particular functional status or be involved in feedback loops fine-tuning the two-fold enhancement of transcription from the male X chromosome (Morales, 2004).

The MRG domain mediates the functional integration of MSL3 into the dosage compensation complex

Clarification of the mechanism by which Drosophila melanogaster MSL3 activates the histone acetyltransferase activity of MOF (Morales, 2004) requires knowledge about potential interactions of MSL3 with nucleic acids as well as with the components of the DCC. The functional significance of either the chromo-related domain (CRD) or the MRG signature has not been explored so far. A set of MSL3 deletion mutants was generated and their ability to bind nucleic acids and MSL1 was tested (Morales, 2005).

It has been shown that MSL3 is able to bind DNA and RNA (Akhtar, 2000; Morales, 2004), but the domains involved in this interaction were not defined. A C-terminal fragment with limited similarity to chromo-related domains has been shown to bind RNA, but the affinity appears far too low to explain the RNA binding potential of the intact protein (Akhtar, 2000). In order to analyze the interactions of MSL3 with nucleic acids, linear double-stranded or single-stranded roX cDNA, roX sense or antisense RNA, as well as the unrelated Hsp26 RNA were immobilized on paramagnetic beads. These beads were incubated with baculovirus-expressed MSL3, and bound protein was separated from unbound protein and detected by Western blotting. The experiment revealed that MSL3 is able to bind to all nucleic acids, but binding to RNA and single-stranded DNA is consistently better than binding to double-stranded DNA of the same sequence and concentration. It is possible that the binding of MSL3 to RNA in comparison with ssDNA is underestimated in this experiment due to different immobilization strategies: ssDNA was immobilized by a terminal biotin group, whereas the RNA contains biotin groups throughout, which may interfere with MSL3 binding. The assay monitors only nonspecific RNA binding, since MSL3 binds different RNAs equally well. Since it is known that MSL3 binds RNA in vivo (Buscaino, 2003), binding to RNA and dsDNA was compared in a competition assay. Bound roX RNA or dsDNA was mixed with excess RNA or DNA, and the partitioning of MSL3 between bead-bound or soluble nucleic acid was monitored. The experiment confirms the conclusion that MSL3 binds significantly better to RNA than to dsDNA (Morales, 2005).

In order to map the part of MSL3 that mediates binding to nucleic acids, various MSL3 derivatives lacking parts of the protein were tested for RNA binding. The purity and integrity of these proteins were confirmed, and binding was assayed with carefully matched input concentrations. Deletion of the N-terminal 140 amino acids (MSL3_141-512), including the CRD, largely abolished the RNA binding potential of MSL3. In contrast, deletion of the C-terminal MRG sequences only slightly affected the RNA binding. However, an N-terminal MSL3 fragment consisting of only the first 140 amino acids, and therefore including the CRD, did not bind RNA, demonstrating that the CRD was not sufficient for RNA binding but that sequences included in the first 259 amino acids of MSL3 are critical (Morales, 2005).

In summary, the binding analysis documents that the nucleic acid binding surfaces of MSL3 reside in the N terminus of the protein and do not involve MRG sequences (Morales, 2005).

MSL3 directly interacts with the 66 C-terminal amino acids of MSL1 (Morales, 2004). Which part of MSL3 is involved in this interaction? A panel of flag-tagged MSL3 derivatives was co-expressed with full-length, untagged MSL1 in Sf9 cells using the baculovirus system. The MSL3 proteins were purified from total cell extracts by flag-mediated pull-down, then washed stringently. Associated MSL1 was detected by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) and Coomassie blue staining. Intact MSL3 and MSL3 derivatives lacking the N-terminal 140 amino acids interact efficiently with MSL1, which was readily detected on the gel. However, deletion of any part of the MRG similarity essentially abolished MSL1 binding. Complementary experiments monitoring the interaction of immobilized MSL1 with in vitro-translated MSL3 fragments confirmed these results. Additional experiments also revealed that deletion of MSL3-specific sequences between the MRG similarities (Delta328-433) also destroyed the interaction of MSL3 with MSL1. In summary, the deletion analysis failed to identify a small MSL1 interaction surface in MSL3 and suggests that the three separate regions of similarity to the MRG domain contribute to folding the MSL3 C terminus into a single, large domain (Morales, 2005).

The interaction studies above led to clear separation of the structures in MSL3 required for interaction with nucleic acids and for MSL1 and allowed the design of MSL3 derivatives that lacked one function without affecting the other. Different MSL3 deletion mutants were thus tested for their potential to activate the histone H4-directed HAT activity of MOF. HAT reaction mixtures contained recombinant MOF, radiolabeled acetyl coenzyme A (acetyl-CoA), nucleosome substrate, GST-tagged MSL1, and flag-tagged full-length or truncated MSL3. In the absence of MSL3, MOF activity was poor and acetylation of the proper H4 substrate was not detected (Morales, 2004), whereas in the presence of intact MSL3, robust acetylation of H4 was observed. A faint band at the position of GST-MSL1 demonstrates acetylation of MSL1 in this reaction (Morales, 2004). Deletion of the CRD had no effect on MOF activity. In contrast, two independent deletions in the C terminus of MSL3 that abolish interaction with MSL1 rendered MSL3 inactive as a modulator of MOF function. Nucleosomal H4 was no longer recognized as a substrate, and the remaining acetylase activity of MOF was directed toward MSL1. Taken together, the data suggest that MSL3 affects MOF's activity indirectly through its association with the common interaction partner and that the N-terminal CRD is not required for this activation in vitro (Morales, 2005).

Having documented separate interaction domains within MSL3 for nucleic acids and MSL1, the relevance of these interactions in vivo was evaluated. In the first series of experiments, it was asked whether these interactions contribute to targeting of MSL3 to the X-chromosomal territory. Male Drosophila SF4 cells were transiently transfected with vectors encoding MSL3-GFP fusion proteins, which were detected by immunofluorescence using antibodies directed against GFP. It was concluded that MSL3-GFP reliably localized to the X chromosome based on its colocalization with endogenous MSL1. MSL3 still colocalized if the N-terminal 180 amino acids containing the nucleic acid binding determinant were absent, although slightly higher background staining in the nucleoplasm was observed. In contrast, proteins carrying either of two internal deletions that abolish MSL1 binding did not localize at all to the X-chromosome territory. It is concluded that neither interaction of MSL3 with DNA nor that with RNA is a primary determinant of MSL3 recruitment to the X chromosome. The data are consistent with the idea that MSL3 associates with the DCC mainly through a direct interaction between its MRG domain and the C terminus of MSL1 (Morales, 2005).

Deletion of the N terminus of MSL3, including its CRD, does not abolish targeting to the X-chromosomal territory. This truncated MSL3 is also still able to stimulate the HAT activity of MOF in vitro. In order to assess its functionality in vivo in the absence of endogenous MSL3, SL2 cells were depleted of endogenous MSL3 by RNAi. The dsRNA chosen corresponded to the first 500 nucleotides of msl3. Because these sequences are absent from the RNA encoding the N-terminally truncated MSL3_181-512, the dsRNA does not interfere with transient expression of this protein. It was verified that the MSL3_181-512 was able to interact with MSL1 as well as the longer MSL3_141-512 derivative by pull-down experiments with in vitro-translated proteins. As a control for nonspecific effects of dsRNA, cells were treated in parallel with dsRNA corresponding to GST, an alien protein to these cells (Morales, 2005).

After 4 days of RNAi against GST, 75% of the cells exhibited clear X-territory staining for MSL3 and the acetylated lysine 16 of histone H4 (H4K16ac). In contrast, 4 days of RNAi against MSL3 led to loss of X-territory staining for MSL3 and H4K16ac in 77% of the cells. Only 12.5% of the cells in the culture escaped from RNAi. The fact that 9.5% of the cells still showed localized H4K16 acetylation although MSL3 was not detectable by immunofluorescence may be due to low levels of MOF and MSL3 that suffice to maintain the territory or to variability in the turnover of the modification. To test for the ability of the MSL3 derivatives to rescue the depletion of the endogenous MSL3, cells were transfected with vectors expressing MSL3, MSL3_181-512, or MSL3_Delta260-393 after 2 days of RNAi. Two days later, the X-territory staining of GFP-positive cells was monitored. All GFP-fused proteins were expressed properly after RNAi of GST with a transfection efficiency of about 15%. In contrast, only the MSL3_181-512-GFP fusion protein was expressed after RNAi of MSL3, showing that the dsRNA corresponding to the 5' end of msl3 interfered only with expression of MSL3-GFP and MSL3_Delta260-393-GFP (Morales, 2005).

Remarkably, most cells (68%) with a medium-to-low GFP level not only showed a clear accumulation of the MSL3_181-512-GFP fusion at the X territory but also showed an obvious enrichment of MOF and H4K16ac. Since the percentage of H4K16ac territories was only 12.5% in the absence of MSL3_181-512-GFP expression, it is concluded that the N-terminally truncated protein can functionally rescue the depletion of endogenous MSL3 to a significant degree, which shows that the nucleic acid binding properties of MSL3 are dispensable (Morales, 2005).

Nuclear pore components are involved in the transcriptional regulation of dosage compensation in Drosophila

Dosage compensation in Drosophila is dependent on MSL proteins and involves hypertranscription of the male X chromosome, which ensures equal X-linked gene expression in both sexes. This paper reports the purification of enzymatically active MSL complexes from Drosophila embryos, Schneider cells, and human HeLa cells. A stable association of the histone H4 lysine 16-specific acetyltransferase MOF was found with the RNA/protein-containing MSL complex as well as with an evolutionarily conserved complex. The MSL complex interacts with several components of the nuclear pore, in particular Mtor/TPR and Nup153. Strikingly, knockdown of Mtor or Nup153 results in loss of the typical MSL X-chromosomal staining and dosage compensation in Drosophila male cells but not in female cells. These results reveal an unexpected physical and functional connection between nuclear pore components and chromatin regulation through MSL proteins, highlighting the role of nucleoporins in gene regulation in higher eukaryotes (Mendjan, 2006).

All Drosophila MSL proteins have mammalian orthologs. To address the evolutionary conservation, the human hMOF-containing complexes were purified from a stable HeLa cell line expressing hMOF tagged with one haemagglutinin (HA) and two FLAG epitopes (HA-2xFLAG-hMOF). The characterization of the interacting proteins revealed striking similarities in the complex composition between flies and humans (Mendjan, 2006).

Copurification of mammalian MSL orthologs showed that the DCC is an evolutionarily conserved protein complex. hMSL1, hMSL2, and hMSL3 were all present in the hMOF complex. Similar to Drosophila DCC, RNA helicase A (the ortholog of MLE) was not present in the complex, which is consistent with previous observations. Furthermore, two isoforms of hMSL3, hMSL3a and hMSL3c, were identified, copurifying with hMOF. The former represents the full-length protein, while the latter is an alternative splice isoform lacking the N-terminal chromobarrel domain (Mendjan, 2006).

In addition to the MSL proteins, most of the other proteins copurifying with TAP-MOF were also found in the hMOF complex. Z4 and Chriz/Chromator (Chr) lack clear mammalian orthologs, which could explain their absence. However, the Mtor ortholog TPR was identified in the HA-2xFLAG-hMOF purification. Human-specific proteins included the transcriptional coactivator HCF-1, O-linked N-acetylglucosaminetransferase OGT, and the forkhead and FHA domain containing transcription factor ILF-1/FOXK2. Interaction of hMSL3, hNSL1, hNSL2, hNSL3, and HCF-1 was further confirmed by Western blot analysis of eluted complex. Similar to the TAP-MOF and MSL-3FLAG complexes, the HA-2xFLAG-hMOF complex specifically acetylated histone H4 at lysine 16 on mononucleosomes (Mendjan, 2006).

Taken together, the data demonstrate that MOF interactions are evolutionarily conserved and that the DCC is an evolutionarily ancient complex that acetylates histone H4 at lysine 16 (Mendjan, 2006).

The purification of the MSL complex revealed quite an unusual complex composition. One would expect that a complex thought to modulate transcription and/or chromatin structure would contain a significant number of classical transcription factors, some of the numerous components associated with RNA polymerase II, or at least subunits of the ubiquitous chromatin remodeling and modifier complexes. However, none of these components was found. Instead, there seems to be a core MSL complex that interacts substoichiometrically with nucleoporins (Mtor, Nup153, Nup160, Nup98, and Nup154), interband binding proteins (Z4, Chromator/Chriz), and exosome components (Rrp6, Dis3) (Mendjan, 2006).

The results suggest that MOF is a subunit of two independent complexes in mammals and fruit flies. Several lines of evidence support this notion. These include coimmunoprecipitation experiments and glycerol gradient centrifugation. Furthermore, hMOF was recently found in the MLL1 methyltransferase complex together with HCF-1, MCRS2, WDR5, NSL1, and PHF20, but this complex did not contain hMSL1. Finally, purification of the hMSL3 complex provides further evidence that hMSL3 does not associate with many of the MOF-interacting proteins. Therefore, it is suggested that the NSL complex contains at least MOF, NSL1, NSL2, NSL3, MCRS2, MBD-R2, and WDS, and in humans also HCF-1 and OGT (Mendjan, 2006).

The results presented here also suggest a molecular mechanism as to how the MOF complexes bifurcate. Both MSL-1 and NSL1 contain a PEHE domain in their C terminus. The NSL1 PEHE domain interacts directly with hMOF in vitro, and Drosophila MSL-1 has been shown to interact directly with MOF through the same domain. Furthermore, MSL-1 is required for full activity of MOF in vitro and for the assembly of the DCC on the male X chromosome. MSL-1 and NSL1 are the only two genes in the Drosophila genome that encode a PEHE domain, suggesting that it is an evolutionarily conserved MOF-interacting domain. It is postulated that MSL1 and NSL1 serve as mutually exclusive bridging factors that assemble two different complexes around MOF, a histone H4 lysine 16-specific acetyltransferase (Mendjan, 2006).

In the current study, focus was placed on the mechanism of DCC function in Drosophila. All three purifications resulted in enzymatically active complexes with consistent copurification of MSL-1, MSL-2, MSL-3, MOF, roX1, and roX2 but not of MLE or JIL-1. The absence of MLE was expected, since its interaction with MSLs has been reported to be salt and detergent sensitive. It is likely that JIL-1, like MLE, is sensitive to the purification conditions used in this study (Mendjan, 2006).

To examine the function of the new interacting proteins in dosage compensation, mutant flies were studied and RNAi was used in cell culture. In Z4 mutants or in MBD-R2-depleted SL-2 cells, MSL localization on the X chromosome was not affected. Consequently, these proteins are not required for MSL recruitment, or they have an alternative function with MOF that is independent of its role in dosage compensation (Mendjan, 2006).

However, an unexpected link was discovered between dosage compensation and the nuclear pore. Depletion of either Mtor or Nup153 but not of other nucleoporins or NXF1 delocalized MSL proteins from the X chromosome. The effects observed were not due to a general transport defect, since all five MSL proteins and roX2 RNA remained nuclear in Mtor- and Nup153-depleted cells, and no accumulation of bulk mRNA was observed in these cells. Consistent with these observations, Mtor and Nup153 are required for proper dosage compensation of several classical MSL-dependent dosage-compensated genes in SL-2 cells. The expression of these genes was not affected in female Kc cells (Mendjan, 2006).

An important question raised from this study is whether the observed effects are due to a soluble fraction of Mtor and Nup153 in the nucleus or due to their function as components of the NPC. The latter is favored: (1) Nup153 staining is exclusively peripheral; (2) depletion of Nup153 delocalizes Mtor from the nuclear periphery and increases the soluble pool of Mtor in the nucleoplasm, but MSL proteins still remained delocalized in Nup153-depleted cells; (3) the fact that several nucleoporins, which exist together only at the nuclear pore, were copurified with the MSL complexes strongly favors the idea that there is an interaction between the DCC and the intact NPC. This interaction is substoichiometric but with clear functional importance for DCC assembly or maintenance on the X chromosome (Mendjan, 2006).

A wealth of information has been generated in budding yeast regarding nuclear organization and gene regulation. For instance, yeast telomeres associate with the nuclear periphery and form a transcriptionally silenced chromatin domain. However, a number of recent studies have shown that nuclear periphery is not just a domain of gene inactivation but also of activation. Consistent with these observations, yeast MLP1 and MLP2 (Mtor orthologs in yeast) associate with transcriptionally active genes and are involved in relocalization of active genes to the nuclear periphery. Furthermore, MLPs are involved in chromatin domain formation and pre-mRNA quality control (Mendjan, 2006 and references therein).

Interestingly, in Schneider cells, male embryos, salivary glands, and imaginal discs, the Drosophila male X chromosome appears localized at or near the nuclear periphery and in most cases even follows the nuclear rim curvature. The inactive X in mammals also localizes close to the nuclear periphery as the Barr body. Like the Drosophila male X chromosome, the inactive X has to be globally controlled (inactivated) and is characterized by a special histone modification (trimethylation of lysine 27 of histone H3). Another common feature between mammals and Drosophila is that noncoding RNAs play an essential role. A possible model that can account for these intriguing similarities is that the nuclear periphery is used to generate transcriptional domains that can be transcriptionally active or inactive in order to achieve coregulation of gene expression for a subset of genes. In the case of the Drosophila male X chromosome, hundreds of genes with different basal transcriptional properties need to be coactivated by a factor of two. This kind of a subtle transcriptional coregulation of a whole chromosome may be achieved by partial compartmentalization of the X chromosome mediated by the nucleoporin-MSL interaction, allowing the formation of hyperacetylated chromatin domains with unique transcriptional and/or posttranscriptional properties (Mendjan, 2006).

It is important to emphasize that Mtor and Nup153 may be required for general chromatin organization (not just individual chromosomes) through their interaction with chromatin-associated proteins. The DCC might mediate X-chromosomal tethering to the nuclear pore as a mechanism to coregulate a large set of genes by creating chromosomal loops or domains. This could happen by direct or indirect interactions of MSLs with Mtor/Nup153 located at or near high-affinity sites along the X chromosome, which are the binding sites of the DCC. Interactions with nuclear pore components may also be used to 'economize resources' and/or for efficient coupling of transcription to processing of the newly transcribed coregulated messages (Mendjan, 2006).

In summary, the purification of the MSL complex has revealed an unexpected link between dosage compensation and the NPC. In the context of data from other systems, this allows formulation of new hypotheses about the mechanism of dosage compensation that will be exciting to test in the future (Mendjan, 2006).

MSL complex is attracted to genes marked by H3K36 trimethylation using a sequence-independent mechanism

In Drosophila, X chromosome dosage compensation requires the male-specific lethal (MSL) complex, which associates with actively transcribed genes on the single male X chromosome to upregulate transcription 2-fold. On the male X chromosome, or when MSL complex is ectopically localized to an autosome, histone H3K36 trimethylation (H3K36me3) is a strong predictor of MSL binding. Mutants lacking Set2, the H3K36me3 methyltransferase, were isolated, and it was found that Set2 is an essential gene in both sexes of Drosophila. In set2 mutant males, MSL complex maintains X specificity but exhibits reduced binding to target genes. Furthermore, recombinant MSL3 protein preferentially binds nucleosomes marked by H3K36me3 in vitro. These results support a model in which MSL complex uses high-affinity sites to initially recognize the X chromosome and then associates with many of its targets through sequence-independent features of transcribed genes (Larschan, 2007).

MSL complex colocalizes with H3K36 trimethylation on X-linked genes: To investigate the relationship between MSL complex recruitment and histone methylation, ChIP-on-chip analysis of SL2 cells was performed with antibodies that recognize H3 trimethylated at K36 (H3K36me3) or dimethylated at K4 (H3K4me2). The SL2 cell line exhibits a male phenotype with respect to dosage compensation. NimbleGen tiling arrays were used; these contain the entire X chromosome and left arm of chromosome 2, tiled at 100 bp resolution. A general histone H3 antibody was used as a control for histone occupancy, and three biological replicates for tiling arrays indicated a high degree of reproducibility. As expected, the H3K36me3 and H3K4me2 modifications were associated with the 3' and 5' ends of transcribed genes, respectively, as previously reported for S. cerevisiae, mammals, and chicken. Close to 100% of transcribed genes on the X and 2L chromosomes were methylated at H3K36 and H3K4, largely independent of transcript level as previously reported for other organisms. Similar results were observed for MSL3-TAP, specifically on the X chromosome, but a lower fraction of transcribed genes on the X was bound (approximately 80%). With improved computational analysis, 1014 genes on the X chromosome scored positive for MSL binding in SL2 cells (up from the previous estimate of 675 genes). Of the newly scored MSL-bound genes in SL2 cells, 67% had been identified previously as clearly bound in at least one cell type (Larschan, 2007).

To determine whether MSL binding colocalizes with H3K36me3 or H3K4me2, the correlation was examined between the data sets at the gene level. Of the 1014 MSL-bound genes in SL2 cells, 93% were positive for H3K36me3, and 83% were positive for H3K4me2. Interestingly, it was previously reported that a small percentage of untranscribed genes were bound by MSL3-TAP (7%), and the current study found that these genes also carried the H3K36me3 histone modification. In addition, untranscribed genes bound by MSL have significantly higher levels of H3K36me3 than untranscribed genes that are unbound by MSL complex. A likely explanation is that some nontranscribed genes are located near transcribed genes with very extensive H3K36me3 and MSL signals or within domains that have continuous strong signal over many kilobases. Specifically, 82% of MSL3-TAP-bound genes are transcribed, while 93% of MSL3-TAP-bound genes carry the H3K36me3 modification. Therefore, H3K36me3 is an even better predictor of MSL binding on the X than transcription state as defined by Affymetrix expression arrays. Similar results were observed for clone 8 cells, a Drosophila cell line derived from the wing disc (Larschan, 2007).
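The gene-level comparison described above amounts to asking what fraction of MSL-bound genes also falls in a second gene set (H3K36me3-positive, H3K4me2-positive, or transcribed). A minimal Python sketch of that calculation follows. The function name and the gene identifiers are hypothetical placeholders, not data or code from the study, and the actual analysis started from tiling-array calls rather than ready-made gene lists.

```python
def fraction_marked(bound_genes, marked_genes):
    """Return the fraction of bound genes that are also in marked_genes."""
    bound = set(bound_genes)
    if not bound:
        raise ValueError("no bound genes supplied")
    return len(bound & set(marked_genes)) / len(bound)


# Toy gene sets (hypothetical identifiers, for illustration only).
msl_bound = {"geneA", "geneB", "geneC", "geneD"}
h3k36me3_positive = {"geneA", "geneB", "geneC", "geneX"}
transcribed = {"geneA", "geneB", "geneY"}

print(fraction_marked(msl_bound, h3k36me3_positive))  # 0.75
print(fraction_marked(msl_bound, transcribed))        # 0.5
```

In this toy case the mark overlaps more of the bound genes than transcription status does, which is the kind of comparison behind the 93% versus 82% figures.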

Colocalization in terms of whole genes could occur without coincident binding along the gene. It was previously reported that MSL3-TAP binds over the body of transcribed genes specifically on the X chromosome with a bias toward the 3' end. To determine whether H3K36me3 on the X chromosome and MSL complex colocalize spatially within transcription units, average gene profiles were compared for H3 methylation modifications and MSL3-TAP. It was found that H3K36me3 and MSL3-TAP exhibit a similar 3' biased profile, whereas H3 lysine 4 dimethylation is associated with the 5' end of transcription units, as reported in other organisms. Furthermore, at the probe level, a strong positive correlation is observed between MSL binding and H3K36me3 association. In contrast, a weaker correlation is observed with H3K4me2 that associates with the 5' ends of genes. These results demonstrate that H3K36 trimethylation is a 3' biased mark associated generally with active transcription units and that it is a very strong predictor of MSL binding on the X chromosome (Larschan, 2007).
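The two comparisons in this paragraph, an averaged 5'-to-3' gene profile and a probe-level correlation, can be sketched as follows. This is an illustrative Python outline under simplifying assumptions: each gene is rescaled to a fixed number of bins, probes are represented as (position, signal) pairs, and the correlation is a plain Pearson coefficient. The function names, input layout, and bin count are hypothetical and are not taken from the study's analysis pipeline.

```python
from math import sqrt


def average_profile(genes, n_bins=10):
    """Average probe signal per relative gene position.

    genes: iterable of (start, end, strand, probes), where strand is '+' or
    '-' and probes is a list of (position, signal) pairs.
    Bin 0 is the 5' end and the last bin is the 3' end of each gene.
    Returns a list of per-bin means (None for bins with no probes).
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for start, end, strand, probes in genes:
        length = end - start
        if length <= 0:
            continue
        for pos, signal in probes:
            if not (start <= pos < end):
                continue  # ignore probes outside the gene body
            rel = (pos - start) / length
            if strand == "-":
                rel = 1.0 - rel  # flip so that 5' is always at 0
            idx = min(int(rel * n_bins), n_bins - 1)
            sums[idx] += signal
            counts[idx] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]


def pearson(xs, ys):
    """Pearson correlation between two equal-length signal vectors."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length vectors with >= 2 values")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)
```

A 3' biased factor would show rising values toward the last bins of `average_profile`, and two signals that track each other probe by probe would give a `pearson` value near 1.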

MSL complex attracted to chromosome 2L by a roX2 transgene binds neighboring 2L genes marked by transcription and H3K36me3: When either a roX1 or a roX2 genomic transgene is inserted on an autosome, it attracts MSL complex to its site of insertion, with occasional signs of additional binding to neighboring regions along the autosome. Ectopic binding along the autosome is greatly increased when the X chromosome in the same nucleus is deleted for both roX1 and roX2. Such binding generally extends >1 Mb bidirectionally from the site of the roX transgene insertion, as measured by immunofluorescence for the MSL proteins. One interpretation is that nascent roX RNAs compete for attraction of the MSL proteins for assembly at their site of synthesis and that, after local assembly, MSL complex becomes competent to search for targets in its new chromosome environment. To determine whether ectopic binding on a normally untargeted chromosome would provide clues to the specificity of MSL binding, ChIP-on-chip analysis was performed on MSL3-TAP male larvae mutant for both roX1 and roX2 on the X chromosome and containing a roX2 transgene inserted at position 26D8-9 (near the CG9537 gene) on chromosome 2L. When assayed by immunostaining of polytene chromosomes, such males consistently show MSL binding in interbands along chromosome 2L, surrounding the site of the transgene insertion. At the level of genomic tiling arrays, ChIP results map this binding at high resolution. As a control, an additional array was used that contains the 3R chromosome and the entire X. It was found that the domain of MSL binding extends greater than 2 Mb in each direction from the insertion site on 2L, while binding to 3R was undetected. Importantly, the targets of binding are transcribed 2L genes, with the averaged binding profile showing enrichment over the bodies of genes, with a bias toward 3' ends. Each of these characteristics is typical of target genes on the X chromosome in wild-type larvae, cells, and embryos. Furthermore, when the 2L pattern of ectopic MSL binding in larvae was compared to the wild-type distribution of H3K36 trimethylation in tissue culture cells, a strong correlation was found between MSL binding and K36me3 within 1 Mb of the site of the roX transgenic insertion. Interestingly, although MSL-bound genes are consistently marked with H3K36me3, at greater than 1 Mb distances from the transgene insertion site, MSL complex increasingly skips some H3K36me3-bound genes while binding others. Overall, it was found that MSL targets selected on 2L were transcribed genes enriched for H3K36 trimethylation and that MSL binding showed a 3′ bias analogous to that normally found on X chromosome targets. These results raise the strong possibility that, once targeted to a chromosomal domain by a high-affinity site, MSL complex recognizes general marks for transcription such as H3K36me3 or other 3′-associated features rather than an X-specific sequence element at each individual target (Larschan, 2007).

Set2 is required for H3K36 trimethylation and for viability in both males and females in Drosophila: To investigate whether H3K36me3 plays a functional role in MSL complex targeting, a genetic approach was taken to inactivate the methyltransferase responsible for H3K36me3 in Drosophila. In S. cerevisiae, the Set2 histone methyltransferase is responsible for di- and trimethylation of H3K36. The CG1716-encoded protein has been identified as the likely functional homolog of ySet2 in Drosophila based on the presence of SRI and SET domains. Two initial tests were pursued to examine CG1716 function, the first in yeast and the second in Drosophila tissue culture cells. To test the function of CG1716 in yeast, an inducible CG1716 expression vector was transformed into set2Δ mutant S. cerevisiae that lack detectable H3K36me3. When CG1716 was induced by growth in media containing galactose, H3K36me3 (and some H3K36me2) was restored, demonstrating that a CG1716 cDNA functionally complements the yeast set2Δ. Also, the CG1716-encoded protein can interact with the RNA Pol II CTD as observed for S. cerevisiae Set2, further confirming the identity of CG1716 as the functional homolog of the S. cerevisiae SET2 gene. To test the function of CG1716 in Drosophila tissue culture cells, RNAi was used to target CG1716. A strong reduction of CG1716 mRNA was found to correlate with a significant loss of H3K36me3 by Western blot, immunostaining, and ChIP analysis. H3K4me2, a distinct chromatin mark for transcribed genes, was largely unaffected. ChIP analysis allowed quantification of a 3- to 5-fold reduction in H3K36me3 and only very small changes in H3K4me2. Based on these results, a Drosophila mutant was isolated that disrupts the CG1716 gene, henceforth referred to as the Set2 gene (Larschan, 2007).

Imprecise excision of a P element upstream of the Set2 gene was induced to create a series of Set2 deletion strains, and Set2¹ was selected for further analysis. Set2¹ eliminates most of the coding region including the catalytic SET domain without extending bidirectionally into the neighboring CG1998 gene. Since the Set2 gene is located on the X chromosome, hemizygous males were initially isolated, and they were found to die as late third-instar larvae. To demonstrate that this lethality was due to loss of Set2, and not to any additional defects that might have been induced during P element excision, a transgene was constructed encompassing only the genomic region of Set2; it was able to fully rescue the Set2¹ mutants. Using the rescued males as fathers, homozygous mutant females were subsequently examined, and the Set2¹ mutation was found to cause late larval lethality in both sexes. To further analyze the viability of Set2 mutants at the cellular level, homozygous mutant Set2 eyes were created in the context of heterozygous mutant adult females, using the GMR-hid system. set2 mutant eyes were diminished in size and rough compared to wild-type eyes, which is a qualitative assay suggesting that Set2 is important for normal cell proliferation (Larschan, 2007).

To determine whether or not H3K36me3 was affected in the set2 mutant, polytene chromosome squashes of mutant larvae were immunostained. H3K36me3 was significantly depleted in the Set2¹ mutant when compared to wild-type. As a control for the specificity of this defect, the same nuclei were immunostained for the interband protein Z4, which showed similar staining in wild-type and mutant. Set2¹ mutant larvae were further analyzed by ChIP to quantify the H3K36me3 levels in wild-type and Set2¹ mutants. H3K36me3 in the Set2¹ mutant was found to be dramatically decreased at the transcribed genes tested, to levels comparable to an untranscribed gene (CG15570). Changes in H3K4me2 varied from slight to none. Thus, Set2 is required for viability and methylation of H3K36 in Drosophila (Larschan, 2007).

Set2 contributes to optimal MSL complex targeting at transcribed genes, but not at high-affinity sites: To examine whether MSL complex targeting requires H3K36me3, polytene chromosomes of Set2¹ mutant larvae were immunostained with antibodies directed against MSL complex, but no difference in MSL pattern or intensity was detected at this level of resolution. Upon initial consideration, this result would appear to rule out a requirement for H3K36me3 in MSL targeting. However, when attempts were made to validate this observation with ChIP assays conducted with two independent fly stocks and ChIP protocols (both anti-MSL2 and MSL3-TAP IPs), it was found that wild-type and Set2¹ mutant larvae showed significant differences at many specific gene targets. Nine genes with high, medium, or low levels of MSL complex binding were assayed for recruitment of MSL2 and MSL3-TAP in wild-type and Set2¹ mutant third-instar larvae by ChIP analysis. Highly reproducible 2- to 10-fold decreases were observed in MSL2 and MSL3-TAP association at all nine genes assayed. In contrast, MSL complex association with previously reported 'high-affinity sites', such as roX1, roX2, and 18D11, was largely unaffected in the Set2¹ mutant (Larschan, 2007).

Such a result might be attributed to indirect effects in Set2¹ mutant larvae as opposed to specific defects in MSL targeting. To address this, roX RNA and msl2 mRNA levels were measured, and it was found that they were not affected significantly in the Set2¹ mutant, suggesting that H3K36me3 does not affect MSL complex recruitment indirectly by affecting expression of MSL components. Western and polytene staining analysis of Msl1 and Msl2 also indicates that protein levels are largely unchanged. It was also found that ChIP signals for H3K4me2 and RNA polymerase II were not significantly affected in set2 mutants, further supporting a direct role for H3K36me3 in stabilization of MSL complex at target genes (Larschan, 2007).

To address the functional role of H3K36me3 in transcription of genes bound by MSL complex, the transcript levels of MSL complex target genes were compared in wild-type and Set2¹ mutant larvae. Transcription of MSL target genes is not strongly affected in Set2¹ mutant larvae, although genes that exhibit the strongest loss of MSL complex binding (CG13316, CG12690, CG32555, and CG32575) exhibit decreases in transcript level. Dosage compensation involves a 2-fold upregulation of transcription, limiting the expected transcriptional changes to a 50% decrease in transcript. Furthermore, when H4K16 acetylation at these genes was examined, significant residual levels were found (10-fold over autosomal controls or untranscribed genes), even when very small amounts of MSL complex remain. Thus, residual MSL complex function may be largely sufficient for transcriptional upregulation in the Set2¹ mutant, yet MSL complex targeting is significantly reduced (Larschan, 2007).
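The 50% ceiling mentioned above follows from simple arithmetic: if full compensation doubles the basal transcript level, complete loss of compensation returns a gene to basal, which is half of the compensated level. The short Python sketch below makes this explicit. The linear scaling with residual MSL function is an assumption introduced here purely for illustration (the source does not specify any such dose-response), and the function name is hypothetical.

```python
def relative_transcript(residual_fraction, fold=2.0):
    """Transcript level relative to the fully compensated level.

    residual_fraction: share of MSL function remaining (0.0 to 1.0).
    fold: upregulation at full compensation (2-fold in dosage compensation).
    ASSUMPTION: upregulation scales linearly with residual MSL function;
    this is an illustrative simplification, not a measured relationship.
    """
    if not 0.0 <= residual_fraction <= 1.0:
        raise ValueError("residual_fraction must be between 0 and 1")
    basal = 1.0
    level = basal * (1.0 + (fold - 1.0) * residual_fraction)
    return level / (basal * fold)


print(relative_transcript(1.0))  # 1.0: full compensation, no decrease
print(relative_transcript(0.0))  # 0.5: no compensation, maximal 50% decrease
```

Under this simplification any residual MSL activity keeps the decrease smaller than 50%, which is in line with the modest transcript changes reported for the mutant.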

Together, these results suggest that a subset of MSL binding sites is particularly sensitive to H3K36me3 levels, while others, including three previously defined high-affinity sites, are not. Since MSL binding is diminished significantly but not ablated in the Set2¹ mutant, these results support a model in which recognition of H3K36me3 is one contributing factor to MSL complex targeting that functions with additional features of transcribed genes (Larschan, 2007).

An important caveat to the conclusion that H3K36me3 functions together with other recognition features is that the heterozygous mothers of hemizygous Set2¹ mutants carry a functional Set2 gene and thus could provide a maternal supply of wild-type Set2 mRNA or protein to the mutant embryos. This maternal contribution of H3K36me3 could be sufficient to initially establish MSL binding, which might be maintained through development, independent of the initial recognition mark. Thus, if the maternal contribution of H3K36me3 could be eliminated, it was hypothesized that an even more significant defect would be observed in MSL complex recruitment. To address this possibility genetically, a stock designed to create homozygous set2 mutant germline clones was constructed using FLP-FRT-mediated recombination in an ovo^D dominant female sterile mutant. After recombination, the set2 mutant germ cells would no longer carry ovo^D and thus should produce oocytes that would lack any maternal Set2 mRNA or protein. Despite recombination to remove ovo^D from germ cells, no functional oocytes were produced, demonstrating that Set2 is essential for oogenesis. Therefore, the maternal contribution of Set2 remains in these studies; its elimination might reveal an even more significant role for H3K36me3 in MSL recruitment than has been reported (Larschan, 2007).

Recombinant MSL3 binds preferentially to nucleosomes trimethylated at H3K36: Eaf3, the yeast member of the conserved MSL3/MRG family of proteins, has been implicated in a physical and functional interaction of Rpd3(S) complexes with H3K36me3, raising the attractive hypothesis that MSL3 plays an analogous function in MSL complex. Furthermore, the distinction between high-affinity MSL binding sites such as roX1, roX2, and 18D11 and the majority of MSL targets is that high-affinity sites are MSL3 independent. Therefore, sensitivity to loss of H3K36me3 might be a specific characteristic of MSL3-dependent targets. To test the idea that MSL3 contributes to specific recognition of H3K36me3-modified nucleosomes, gel shift analysis was performed with recombinant MSL3 protein produced in baculovirus using nucleosomes assembled in vitro. Using an EMSA system where specifically modified recombinant nucleosomes were assembled, it was found that purified MSL3 protein showed increased affinity to nucleosomes pretreated with active Set2, and thus marked with H3K36 methylation, as opposed to nucleosomes that were unmodified at H3K36. This preferential binding was only detected in nucleosomes bearing linker DNA, suggesting that affinity for free DNA may be contributing to the binding of MSL3 to the nucleosomes methylated at H3K36. Titrations were performed to measure the relative affinity of MSL3 association with methylated compared to unmethylated nucleosomes. The increased affinity of MSL3 for methylated nucleosomes is best observed at the 4.4 nM concentration. These results provide additional evidence supporting a model in which H3K36me3 is a 3' chromatin mark required for the robust, wild-type MSL binding pattern on the X chromosome (Larschan, 2007).

This study has found that ectopic spreading of MSL complex to the 3' ends of transcribed genes on autosomes indicates that a sequence-independent mechanism can define MSL complex target genes. Furthermore, trimethylation of H3K36 is required for optimal MSL complex targeting to transcribed genes on the male X chromosome subsequent to initial recognition of the X. In the absence of H3K36me3, MSL complex can associate with high-affinity sites on the X chromosome but exhibits reduced binding to target genes. Since MSL binding is reduced but is not eliminated, a model is favored in which association with H3K36me3 is a contributing factor that functions with recognition of one or more additional 3' features of transcribed genes such as nascent mRNAs or RNA Pol II CTD phosphorylation (Larschan, 2007).

In addition to a function for Set2 in MSL complex targeting, this study demonstrates that Set2 is essential for viability of both sexes in Drosophila. Conservation of the Set2 H3K36 methyltransferase function from S. cerevisiae to Drosophila was observed, as predicted by sequence conservation. A variety of roles have been reported for Set2 in several organisms. In Neurospora, S. pombe, and NIH 3T3 cells, Set2 is required for optimal growth rate. The S. cerevisiae set2Δ mutant suppresses the loss of positive elongation factors. In Drosophila, mutants lacking zygotic Set2 function fail to proceed through the developmental transitions from late larval to adult stages. The cause(s) of inviability in Drosophila set2 mutants remains to be determined, but eyes composed entirely of homozygous set2 mutant tissue were small and rough, indicating defects in cell proliferation (Larschan, 2007).

In vitro studies using recombinant MSL3 produced in baculovirus revealed preferential interaction with nucleosomes that were trimethylated at H3K36, suggesting that a direct interaction may occur between MSL complex and H3K36me3 chromatin on the X chromosome. In S. cerevisiae, an MSL3 homolog, Eaf3, mediates an interaction between the Rpd3(S) complex and H3K36me3 at active genes. If conserved, this function in Drosophila would presumably be performed by another MSL3 family member, MRG15. In S. cerevisiae, Rpd3(S) is thought to deacetylate histones in the wake of RNA polymerase II to prevent uncontrolled activation and transcription initiation from cryptic start sites within genes. This raises the possibility that, on the X chromosome, MSL complex might compete for binding to H3K36me3 with the repressive deacetylation function of Rpd3(S). Alternatively, H3K36me3 may simply be a mark utilized by MSL complex to regulate target genes by a mechanism independent of Rpd3(S) (Larschan, 2007).

H3K36me3 marks transcribed genes independent of transcript level but is a weak modulator of endogenous transcript and RNA polymerase II levels. In S. cerevisiae, where its role is best understood, Set2 functions to suppress formation of aberrant internal transcripts by facilitating histone deacetylation, yet has only small effects on endogenous transcript levels. In Drosophila, small but reproducible changes were detected in transcript levels at MSL complex target genes in set2 mutant larvae. Minimal changes were also observed in RNA Pol II levels, as previously reported for the set2Δ mutant in S. cerevisiae. In addition, changes in transcription level due to loss of dosage compensation are small, with a maximal 50% decrease predicted. Thus, the combined loss of the Set2 protein and reduction in MSL complex recruitment did not cause dramatic changes in transcript level. Furthermore, levels of H4Ac16 were decreased but not eliminated at target genes, consistent with residual MSL function, which can explain why more dramatic changes in transcription of MSL complex target genes were not observed (Larschan, 2007).
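The "maximal 50% decrease" follows from simple arithmetic: if dosage compensation raises transcription from the single male X roughly two-fold, complete loss of compensation halves the transcript level, and any residual MSL function gives a smaller drop. The short Python sketch below works through this. The function name, the `residual_function` parameter, and the assumption that transcript level scales linearly with residual MSL function are illustrative choices, not measurements or methods from Larschan (2007).

```python
def fractional_decrease(residual_function: float,
                        fold_upregulation: float = 2.0) -> float:
    """Fractional drop in transcript level relative to a fully compensated male X.

    residual_function: hypothetical fraction of MSL function remaining (0 to 1).
    fold_upregulation: assumed fold increase under full dosage compensation.
    Assumes (for illustration only) that output scales linearly with
    residual MSL function.
    """
    if not 0.0 <= residual_function <= 1.0:
        raise ValueError("residual_function must be between 0 and 1")
    if fold_upregulation < 1.0:
        raise ValueError("fold_upregulation must be at least 1")
    compensated = fold_upregulation
    observed = 1.0 + residual_function * (fold_upregulation - 1.0)
    return (compensated - observed) / compensated


if __name__ == "__main__":
    # Hypothetical residual-function values chosen for illustration.
    for residual in (0.0, 0.5, 1.0):
        print(residual, fractional_decrease(residual))
```

With no residual function the predicted decrease is the full 50%; with partial MSL function remaining, as suggested by the reduced but not eliminated H4Ac16, the expected decrease is correspondingly smaller.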

A defined mechanism for MSL complex targeting to hundreds of sites along the male X chromosome has remained elusive. Previous reports have posited two highly related models for MSL complex recruitment: a 'spreading' model and an 'affinities' model. Both models are based on the idea that specific MSL interaction occurs at high-affinity sites that mark the X chromosome. These sites have been mapped on polytene chromosomes, but most are not yet defined at the molecular level. roX genes and other high-affinity sites are thought to concentrate MSL complex within an X chromosome domain. In the spreading model, MSL complex creates the full MSL binding pattern by searching the X chromosome for general characteristics of active genes without necessarily requiring a specific DNA sequence at each gene. This could occur either by scanning along the chromosome in a linear manner or by releasing and rebinding chromosomal regions in close physical proximity. It has been demonstrated that roX RNAs can move in trans from one DNA molecule to another, so linear scanning is possible but not obligatory. The affinities model proposes that there is a continuum of affinity sites for MSL complex, ranging from high to low. Only when high-affinity sites are locally concentrated can low-affinity sites be recognized, similar to the spreading model. The major difference is that even low-affinity sites are predicted to contain sequence elements that direct MSL binding. It is thought that the results documenting the pattern of ectopic MSL binding on chromosome 2L surrounding a roX transgene make the existence of sequence elements at every MSL binding site on the X chromosome unlikely. That the 2L pattern was analogous to that normally found on the X chromosome, targeting transcribed genes marked by H3K36me3 and binding with a 3' bias, is strong evidence that MSL complex recognizes target genes marked by transcription. This does not exclude the possibility that transcribed genes carry common sequence elements but makes it unlikely that such sequence elements differ between autosomal genes and the majority of MSL target genes on the X chromosome (Larschan, 2007).
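The point shared by both models, that low-affinity sites are occupied only where MSL complex is locally concentrated, can be illustrated with a simple single-site binding isotherm. The Python sketch below is a toy calculation only: the dissociation constants, the concentrations, and the use of an equilibrium isotherm are all assumptions made for illustration and are not taken from Larschan (2007).

```python
def occupancy(concentration: float, kd: float) -> float:
    """Fraction of a site bound under a simple single-site equilibrium model."""
    if concentration < 0 or kd <= 0:
        raise ValueError("concentration must be >= 0 and kd must be > 0")
    return concentration / (concentration + kd)


# Hypothetical dissociation constants in arbitrary units (not measured values).
KD_HIGH_AFFINITY = 1.0     # e.g. a roX-like entry site
KD_LOW_AFFINITY = 100.0    # e.g. a typical transcribed target gene

# Hypothetical complex concentrations in the same arbitrary units.
BULK_CONCENTRATION = 1.0    # far from any high-affinity site
LOCAL_CONCENTRATION = 50.0  # within a domain enriched by high-affinity sites

if __name__ == "__main__":
    for label, conc in (("bulk", BULK_CONCENTRATION),
                        ("local", LOCAL_CONCENTRATION)):
        print(label,
              "high-affinity:", round(occupancy(conc, KD_HIGH_AFFINITY), 3),
              "low-affinity:", round(occupancy(conc, KD_LOW_AFFINITY), 3))
```

Under these assumed numbers the high-affinity site is substantially occupied even at the bulk concentration, whereas the low-affinity site becomes appreciably occupied only at the elevated local concentration. The sketch is agnostic about whether the low-affinity interaction is sequence-based (affinities model) or based on marks of transcription such as H3K36me3 (spreading model).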

In summary, the data are consistent with a model in which MSL complex first recognizes nascent roX transcripts and a series of high-affinity sequences along the male X chromosome and then scans the X for target genes that exhibit H3K36 trimethylation and other marks of active transcription. Recognition may involve the MSL3 chromodomain and additional factors. Trimethylation of H3K36 marks the middle and 3' ends of transcription units, independent of absolute transcript levels in Drosophila, consistent with S. cerevisiae and mammalian systems. Thus, MSL complex recognition of H3K36me3 provides an important mechanism for identification of transcribed genes and avoidance of silenced regions (Larschan, 2007).

The Interactive Fly resides on the
Society for Developmental Biology's Web server.