Dosage compensation in Drosophila is the mechanism by which X-linked gene expression is made equal in males and females. Proper regulation of this process is critical to the survival of both sexes. Males must turn the male-specific lethal (msl)-mediated pathway of dosage compensation on and females must keep it off. The msl2 gene is the primary target of negative regulation in females. Preventing production of MSL2 protein is sufficient to prevent dosage compensation; however, ectopic expression of MSL2 protein in females is not sufficient to induce an insurmountable level of dosage compensation, suggesting that an additional component is limiting in females. A candidate for this limiting factor is MSL1, because the amount of MSL1 protein in females is reduced compared to males. This study identified two levels of negative regulation of msl1 in females. The predominant regulation is at the level of protein stability, while a second regulatory mechanism functions at the level of protein synthesis. Overcoming these control mechanisms by overexpressing both MSL1 and MSL2 in females results in 100% female-specific lethality (Chang, 1998).
The dosage compensation complex (DCC) of Drosophila melanogaster is capable of distinguishing the single male X from the other chromosomes in the nucleus. It selectively interacts in a discontinuous pattern with much of the X chromosome. How the DCC identifies and binds the X, including binding to the many genes that require dosage compensation, is currently unknown. To identify bound genes and attempt to isolate the targeting cues, male-specific lethal 1 (MSL1) protein binding was visualized along the X chromosome by combining chromatin immunoprecipitation with high-resolution microarrays. More than 700 binding regions for the DCC were observed, encompassing more than half the genes found on the X chromosome. In addition, several rare autosomal binding sites were identified. Essential genes are preferred targets, and genes binding high levels of DCC appear to experience the most compensation (i.e., greatest increase in expression). DCC binding clearly favors genes over intergenic regions, and binds most strongly to the 3' end of transcription units. Within the targeted genes, the DCC exhibits a strong preference for exons and coding sequences. These results demonstrate gene-specific binding of the DCC, and identify several sequence elements that may partly direct its targeting (Gilfillan, 2006).
It has long been known that not all genes on the X chromosome are subject to dosage compensation, but those known examples are apparently exceptional cases, including loci also present on the Y chromosome, female-specific genes, and larval genes proposed to be members of redundant gene families. Similar estimates of the number of bound genes were derived from the cDNA and tiling arrays, despite normalization to different control DNAs (mock IP and input DNA, respectively). In the 2-h time window of embryonic development studied in this analysis, it was found that just over half of the annotated genes on the X chromosome were bound by the DCC. This may represent a slight overestimate of the genes actually binding the DCC, as the resolution of the ChIP analysis is defined by the average length of the input chromatin. Thus, 'spill-over' of signal from genuine binding sites, due to the resolution of 700 bp, may account for a number of genes considered targets for the DCC. The recent study by Kuroda and colleagues (Hamada, 2005) provided a substantial list of genes subject to dosage compensation by the DCC. However, difficulties in measuring twofold changes in gene expression by microarray analysis mean that this gene list is almost certainly an underestimate of the true number of compensated genes. Although the current target list of DCC-binding genes is longer, nonetheless, a large number of genes were found that bind polymerase and yet do not bind the DCC; thus the number of compensated genes on the X may, in fact, be lower than previously assumed (Gilfillan, 2006).
Several autosomal binding sites identified using the cDNA array are also presented. Notably, the majority of these do not co-map with sites of autosomal DCC binding observed on polytene chromosomes. Several reasons may explain these differences, including false-positive hits, inaccuracies of mapped polytene positions, the different developmental stages and tissues concerned, and the variable nature of the autosomal sites seen on polytene chromosomes. It is also worth noting that the only strong autosomal MSL1-binding site found in the tiling array reveals binding to an intergenic sequence. The X-chromosomal-binding sites for the DCC are very specific for genic sequences; therefore, if genuine autosomal sites for the DCC represent a different, perhaps nonfunctional, binding to intergenic sequences, they would not be recovered by the use of a cDNA array. Nonetheless, the autosomal sites are of interest because they may provide clues to the DNA sequences attracting the DCC (Gilfillan, 2006).
The observed binding to many single genes and the obvious peaks within bound regions are incompatible with a model for DCC binding based on coating of large chromosomal domains, and suggest instead a gene-specific targeting. Furthermore, the observed specificity for genes over intergenic regions implies that the DCC binds directly to its target genes, rather than applying control of a domain analogous to that of the Locus Control Region regulating β-globin gene expression. The results instead favor a model in which the DCC binds directly to the genes that are targets for dosage compensation. Further evidence was found to support this, based on analysis of bound sequences, suggesting that DCC targeting is at least in part directed by DNA sequence (Gilfillan, 2006).
Motif-finding algorithms commonly used to define transcription factor-binding sites were unable to isolate a targeting sequence that could direct DCC binding. Their failure suggests that DCC binding may be directed by a more complex combination of degenerate sequence motifs. A recent analysis of high-affinity sequences defined several paired motifs found enriched in MSL-binding sequences (Dahlsveen, 2006), but these were also insufficient to predict further DCC-binding sites. To examine more complex word combinations, partial least squares regression was used, with which several sequences were identified that to some extent explain the observed DCC binding. MSL1-binding sequences from a section of the X chromosome could be used to predict MSL1 binding on further stretches of the X chromosome. The diversity of motifs required to describe MSL1 binding suggests that combinations of short sequence motifs, dispersed through target sequence, are responsible for attracting the DCC. The identified motifs notably contain GA and CA dinucleotides. Sequences containing the GAGA motif appear to have an important role in attracting the DCC to the roX2 high-affinity site, and have also been found in additional high-affinity sites. Furthermore, CA and GA dinucleotide repeats are enriched on the X chromosomes of Drosophila species exhibiting dosage compensation. Whether sequence motifs, in complex combinations, are sufficient to explain all DCC targeting is currently unclear. However, the results strongly suggest that such combinations of sequence motifs have an essential role in DCC targeting, not limited to the high-affinity sites (Gilfillan, 2006).
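As a rough illustration of the regression idea described above, the following sketch counts short sequence words in toy sequences and fits a single-component partial least squares (PLS1) model to hypothetical binding scores. The sequences, scores, word list, and function names are all invented for illustration; the published analysis used its own data, word sets, and model settings, none of which are reproduced here.

```python
# Single-component PLS1 sketch on dinucleotide word counts.
# All sequences and binding scores below are made-up toy values.

WORDS = ["GA", "CA", "TT"]  # hypothetical word list, not the study's motif set


def count_words(seq, words=WORDS):
    """Count overlapping occurrences of each word in seq."""
    return [
        sum(1 for i in range(len(seq) - len(w) + 1) if seq[i:i + len(w)] == w)
        for w in words
    ]


def fit_pls1(X, y):
    """Fit one PLS component: w is proportional to Xc^T yc, scores t = Xc w,
    and the centered response is regressed on t to give the coefficient q."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = sum(v * v for v in w) ** 0.5
    w = [v / norm for v in w]
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    q = sum(t[i] * yc[i] for i in range(n)) / sum(v * v for v in t)
    return {"x_mean": x_mean, "y_mean": y_mean, "w": w, "q": q}


def predict(model, seq):
    """Predict a binding score for a new sequence from its word counts."""
    x = count_words(seq)
    score = sum((x[j] - model["x_mean"][j]) * model["w"][j] for j in range(len(x)))
    return model["y_mean"] + model["q"] * score


# Toy training set: sequences from one "stretch" with hypothetical binding scores.
train_seqs = ["GAGAGAGACA", "GACAGAGAGA", "TTTTATTTTA", "ATTTTTTTAT", "GAGATTTTCA"]
train_scores = [2.0, 1.8, 0.1, 0.0, 1.0]

model = fit_pls1([count_words(s) for s in train_seqs], train_scores)
```

In this toy setting the GA count receives a positive weight and the TT count a negative one, so a GA-rich test sequence is predicted to bind more strongly than a T-rich one. A realistic analysis would use many more words, several components, and held-out regions for validation.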
The finding that targeting signals encoded in the DNA sequence attract the DCC implies that binding would be identical in all cell types, unless access of the DCC to target regions is regulated. However, analysis of different cell types in an accompanying paper by Alekseyenko (2006) found that a minority of genes display differential DCC binding. Clearly, the simplest targeting theory, that the DCC recognizes a set of DNA motifs early in development and establishes an inflexible binding pattern, is incorrect. In this context, the genes found in embryos to be binding MSL1 and not polymerase may be examples of genes already expressed and since silenced, or genes awaiting activation. In Drosophila larvae, conflicting reports observed either a sustained pattern of MSL1 and MSL3 binding to polytene chromosomes (Kotlikova, 2006), or subtle changes throughout larval development (Sass, 2003). The steady expression levels observed throughout development for many embryonic DCC-target genes in this study suggest that those genes will be expressed constitutively. Accordingly, it is suggested that sustained DCC binding is the rule, and developmental changes the exception. Refinement of targeting motifs may allow a more accurate definition of the rules governing DCC targeting. Nonetheless, the current ability to predict DCC binding based on DNA sequence is limited, and it is therefore possible that other factors such as chromatin modifications or transcription have a role in targeting the DCC. For example, although transcription does not appear sufficient to attract the DCC, it may be a prerequisite for binding. Conceivably, chromatin changes induced by a 'pioneer polymerase' could enable binding of the DCC, possibly explaining a limited amount of developmental regulation of DCC binding (Gilfillan, 2006).
The striking specificity for DCC binding to exons and coding sequences is curious, since the sequences responsible for targeting the DCC must simultaneously perform the additional function of encoding functional protein, with accompanying constraints on sequence evolution. X-chromosomal genes have been shown to have a higher codon bias than autosomal genes. DCC target genes have a higher expression than nontargets. Highly expressed genes typically have high codon bias. It must therefore be considered that the motifs identified by partial least squares analysis may not direct DCC targeting, but instead may be the consequence of preferred codon usage in highly expressed, compensated genes. Some caution is also required in interpreting the observed specificity for exons, since small introns are a feature of Drosophila genes, and the resolution of the ChIP technique cannot exclude binding to many such features. However, many intron-less genes are targets of the DCC, confirming assertions that coding sequences can attract the DCC (Gilfillan, 2006).
A requirement for the conclusion that the DCC binds to its site of action would be that bound genes demonstrate dosage compensation. Despite being limited to comparing the current binding data from embryos with a published study from tissue culture cells (Hamada, 2005), the correlation between the data sets would suggest that this is, indeed, the case. Further correlation of MSL1 binding to published gene expression data demonstrates that many gene targets for the DCC have a sustained expression throughout development, and as such may perform 'housekeeping' functions, as previously suggested (Sass, 2003). In keeping with this, it was found that more MSL1 is bound to essential genes. Based on these observations, it is proposed that only those genes for which dosage is critical have evolved the ability to recruit the DCC. In this scenario, many genes not binding the DCC will exhibit lower expression in males than females. Furthermore, the finding that MSL levels positively correlate with the level of dosage compensation is important. This implies that many genes may not absolutely require an exact balancing of dosage between males and females, and partial compensation is sufficient to balance fitness between the sexes. An extension of this theory is that only those genes for which dosage is critical may have evolved the capacity to attract a large amount of DCC. Parallels to mammalian dosage compensation can be drawn, where leaky inactivation in the mammalian system may be mirrored by incomplete activation in Drosophila (Gilfillan, 2006).
The observation that there is no more polymerase on the X chromosome than on autosomal sequences is consistent with previous suggestions that the DCC does not increase the amount of polymerase loading or promoter clearance, but rather the speed at which a transcript is completed (i.e., transcription elongation). The detection of negligible amounts of DCC on promoters further supports this conclusion. The observed 3' bias of DCC binding within many genes seen in this study also favors the idea that the DCC operates by assisting transcription elongation. Increased passage of polymerase through X-chromosomal genes is consistent with the elevated levels of H3.3 found on the X, purportedly due to replication-independent replacement of canonical H3 by H3.3 during transcription. The observation that tRNA genes are bound by the DCC also suggests that the mechanism of dosage compensation may be applicable to both pol II and pol III. The acetylation of H4K16 by the DCC may serve to increase the rate of polymerase progression through chromatin, for example, by reducing polymerase pausing. Pausing has been noted for all RNA polymerases, and is exacerbated on chromatin templates. The acetylation of H4K16 in the middle and 3' end of two X-chromosomal genes suggests that H4K16 acetylation may follow a similar pattern to the DCC within genes themselves. This study has shown that the distribution of H4K16 acetylation is very similar to that of DCC binding at intermediate, restriction fragment resolution. High-resolution mapping of H4K16 acetylation on a genome-wide scale is therefore a priority (Gilfillan, 2006).
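One simple way to quantify a 3' bias of the kind described above is to compare the mean binding signal over the 3' third of a gene with that over the 5' third, taking strand orientation into account. The sketch below is a minimal illustration under that assumption; the function names, the thirds-based definition, and the probe values are hypothetical and are not taken from the study.

```python
# Toy 3'-bias score for one gene from (position, signal) probe measurements.
# The thirds-based definition and all numbers are illustrative assumptions.

def relative_position(pos, start, end, strand):
    """Map a genomic position to 0 (5' end) .. 1 (3' end) of the transcript."""
    frac = (pos - start) / (end - start)
    return frac if strand == "+" else 1.0 - frac


def three_prime_bias(probes, start, end, strand):
    """Mean signal in the 3' third minus mean signal in the 5' third.

    Returns None when either third has no probes.
    """
    five, three = [], []
    for pos, signal in probes:
        if not (start <= pos <= end):
            continue  # ignore probes outside the gene
        rel = relative_position(pos, start, end, strand)
        if rel <= 1 / 3:
            five.append(signal)
        elif rel >= 2 / 3:
            three.append(signal)
    if not five or not three:
        return None
    return sum(three) / len(three) - sum(five) / len(five)


# Hypothetical probes along a gene spanning positions 1000..4000.
probes = [(1200, 0.2), (2500, 1.0), (3800, 2.0)]
plus_bias = three_prime_bias(probes, 1000, 4000, "+")
minus_bias = three_prime_bias(probes, 1000, 4000, "-")
```

For the same probe values, the score is positive when the gene lies on the plus strand and negative on the minus strand, which is why strand must be considered before profiles from many genes are averaged.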
The DNA sequence elements capable of predicting DCC binding consist of many combinations of different motifs. To recognize such a variety of different sequences, the DCC must allow promiscuous protein-DNA sequence recognition. Binding sites would, in this scenario, be determined by the concentration of many recognition elements in a particular DNA sequence. These observations are compatible with the 'affinities' model for targeting of the DCC (see Dahlsveen, 2006). However, affinities alone fail to explain the recent observation that the MSL2 protein exhibits exceptionally stable binding to the male X chromosome, as sites of low affinity would be expected to demonstrate highly dynamic binding to the MSL complex (Gilfillan, 2006).
The motifs identified also do not explain the 3' bias. While the MSL1-binding profile suggests a 3' accumulation of the DCC, it cannot be excluded that instead a 5' depletion of DCC is observed similar to histone depletion around promoters. Interestingly, similar protein gradients have been observed of yeast cohesins, which form a ring around the DNA helix. It has been suggested that these cohesin rings can slide along the DNA, and may be 'pushed' by transcribing polymerase to the 3' ends of genes, with their resulting accumulation at sites of convergent transcription. However, no accumulation of DCC is seen between convergently transcribed genes as reported for the cohesins (Gilfillan, 2006).
It is speculated that the DNA elements of highest affinity responsible for initially targeting the complex to the X (recognition elements) may be sites for loading of DCC onto the X chromosome. The DCC may form a topological linkage around DNA at these points, similar to that proposed for the cohesins. From these loading points, the DCC may spread to sites of lower affinity, its contact with DNA stabilized by a ring-like structure, allowing the promiscuous yet stable binding (Gilfillan, 2006).
In Drosophila, dosage compensation is achieved by a twofold up-regulation of the male X-linked genes and requires the association of the male-specific lethal complex (MSL) on the X chromosome. How the MSL complex is targeted to X-linked genes and whether its recruitment at a local level is necessary and sufficient to ensure dosage compensation remain poorly understood. This study documents the MSL-1-binding profile along the male X chromosome in embryos and male salivary glands isolated from third instar larvae using chromatin immunoprecipitation (ChIP) coupled with DNA microarray (ChIP-chip). This analysis has revealed that the majority of the MSL-1 targets are primarily expressed during early embryogenesis and many target genes possess DNA replication element factor (DREF)-binding sites in their promoters. In addition, MSL-1 distribution remains stable across development and binding of MSL-1 on X-chromosomal genes does not correlate with transcription in male salivary glands. These results show that transcription per se on the X chromosome cannot be the sole signal for MSL-1 recruitment. Furthermore, genome-wide analysis of the dosage-compensated status of X-linked genes in males and females shows that most of the X chromosome remains compensated without direct MSL-1 binding near the gene. These results, therefore, provide a comprehensive overview of MSL-1 binding and dosage-compensated status of X-linked genes and suggest a more global effect of MSL complex on X-chromosome regulation (Legube, 2006).
Using the ChIP-chip strategy, the MSL-1 distribution was examined in order to investigate how the MSL complex achieves specific targeting on the X chromosome. This analysis provides a first comprehensive list of MSL-1-binding sites along the X chromosome in early embryos and male salivary glands. Furthermore, MSL-1 is shown to bind individual gene loci rather than broad chromosomal domains. More specifically, whether or not binding of MSL-1 is directed by transcription activation was addressed, since it has been shown that driving transcription from a transgene inserted on the X chromosome is able to induce local DCC recruitment. The results show that even though the MSL complex is predominantly bound on transcriptionally active genes, transcription of endogenous X-linked genes per se is not sufficient to attract the complex in salivary glands. This conclusion is based on the following observations. The first evidence is provided by the fact that the MSL-1 distribution is highly stable between two tissues as different as whole 4–6-h embryos and third instar salivary glands, indicating that the MSL distribution is unlikely to reflect expression profiles. Indeed, 40% of the target genes in 4–6-h embryos were still bound by MSL-1 in larvae, which represent most (82.7%) of the MSL-1 target genes in salivary glands. Furthermore, by comparing expression and MSL-1-binding profiles in male salivary glands, it was observed that recruitment of the MSL complex is not a general property of active X-linked genes, since only 15% of transcribed genes present on the array are directly bound by MSL-1. It remains possible that a number of target sites in the intergenic or intronic regions were missed since the analysis was performed on cDNA arrays. However, the comparison of the MSL-1-binding pattern with transcription-associated factors such as Spt-5, Spt-6, and S5-P PolII on intact polytene chromosomes also did not reveal a strong overlap between these proteins and MSL-1. These results are also consistent with recent observations that these transcription-associated factors localize both in bands enriched in MSL-1 and in bands depleted of MSL-1 (Legube, 2006).
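The two percentages quoted above (40% and 82.7%) describe the same set of shared genes measured against two different denominators: the embryonic target list and the salivary gland target list. The following sketch shows that arithmetic on made-up gene sets; the gene names and set sizes are hypothetical and chosen only to make the calculation easy to follow.

```python
# Overlap between two target-gene lists, expressed against each list in turn.
# Gene identifiers and set sizes are invented for illustration.

def overlap_fractions(embryo_targets, gland_targets):
    """Return (shared / embryo targets, shared / gland targets)."""
    shared = embryo_targets & gland_targets
    return len(shared) / len(embryo_targets), len(shared) / len(gland_targets)


embryo = {f"gene{i}" for i in range(1, 11)}            # 10 hypothetical embryo targets
gland = {"gene1", "gene2", "gene3", "gene4", "geneX"}  # 5 hypothetical gland targets

frac_of_embryo, frac_of_gland = overlap_fractions(embryo, gland)
```

Here four shared genes make up 40% of the larger embryonic list but 80% of the smaller gland list, which mirrors how a modest fraction of embryonic targets can still account for most salivary gland targets.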
How, then, is the MSL complex targeted on the X chromosome? It is possible that transcription on the X acts as a signal for MSL-1 targeting during early stages of development. Once bound, the complex would then either be stabilized on the target genes or decay, independently of further changes in expression (accounting for the absence of major developmental change in the DCC distribution). In favor of this model, it was observed that MSL-1 binding correlates more with transcription in early embryos than in larvae, and that most of the MSL-1 target genes in all the stages tested show high expression levels during early embryogenesis (Legube, 2006).
Alternatively, MSL-1 could be targeted by specific transcription factors to a small subset of genes. Indeed it was observed that MSL-1 target genes are enriched in DREF-binding sites. DREF is a potential regulator of genes involved in cell cycle and growth regulation. DREF associates in vivo with the core promoter transcription complex TRF2 (TBP-related factor 2) and it has been proposed that it may target TRF2 to a subset of core promoters. In a similar way DREF may promote gene selectivity for the DCC and act as a DNA-targeting component for the DCC. Consistent with this hypothesis, it was found that the X chromosome is enriched in comparison to autosomes in genes that possess several DRE or DRE-related sites in their 2-kb upstream sequence. Another interesting candidate would be the GAGA factor (GAF), which has been shown to colocalize to some extent with the MSL complex on polytene chromosomes, and which is required for the proper localization of MSL on the X chromosome. It is noteworthy that the transgenes that were previously shown to attract the MSL complex when transcribed also possessed a few GAF-binding sites. A careful analysis of MSL-1 distribution in 0–14-h embryos revealed that MSL-1 target genes are slightly more enriched in GAF-binding sites than nontarget genes (Legube, 2006).
Targeting of MSL proteins on selected X-linked genes could then be achieved by a combinatorial effect of several independent motifs. This may explain why previous attempts to find specific X-chromosomal consensus sequences were not successful. Therefore, ChIP-chip approaches such as that used in this study, which allow the discovery of physiological targets in an unbiased and global manner, will be important to unravel the complexity hidden within this system (Legube, 2006).
In addition to providing insights into the MSL-1 targeting mechanism, the current data also raise important questions about the role of the MSL complex in transcription activation and in the dosage compensation mechanism. First of all, according to the data, >66% of the transcribed X-linked genes are compensated, whereas only 15% are bound by MSL-1. Moreover, the array and qPCR analyses suggest that local MSL-1 binding is neither sufficient nor necessary to ensure dosage compensation in salivary glands (Legube, 2006).
One possibility is that the genes not bound by the MSL complex, but still compensated, could be up-regulated by an MSL-independent mechanism. However, this possibility is considered unlikely, as such an MSL-independent mechanism would have been already discovered by genetic screens (since it would then concern most of the dosage-compensated genes). In addition, most of the genes on the X chromosome seem to be compensated in an MSL-2-dependent manner. Therefore, the idea is favored that the MSL complex could have long-range effects on the transcription of X-linked genes. It has been already reported that the Sgs4 and BR-C genes are dosage-compensated in third instar larvae in an MSL-dependent manner. Neither of them is bound by MSL-1 according to the current data, and moreover Sgs4 is localized in a band depleted in MSLs as assessed by immunofluorescence on polytene chromosomes, supporting the idea that MSL acts in a long-range manner (Legube, 2006).
How the MSL complex could operate to fine-tune the transcription of genes, distant from the direct MSL-binding sites, still remains a mystery. One exciting possibility may be that the MSL complex is actually required to recruit parts of the X chromosome into a nuclear domain with unique transcriptional/post-transcriptional properties. It has become clear that spatial positioning within the nucleus also plays a central role in the control of gene expression, allowing the coregulation of subsets of genes. MSL binding on discrete loci could induce the localization of a broad X-chromosomal domain, containing several genes, to a nuclear compartment that possesses specific transcriptional properties. This may ensure the dosage compensation of many X-linked genes, without a need of direct MSL binding on these genes. Interestingly, purification of the MSL complex has revealed coassociation of several nucleoporins in embryos and Schneider cells. One may envisage that concerted action of MSL with nuclear pore components may help to define such domains needed for cis-regulation of many genes (Legube, 2006).
An interesting feature of the dosage-compensated X chromosome is the colocalization of MSL proteins with specific histone H4 Lys 16 acetylation (H4-K16Ac). This histone modification has been proposed many times to be related to transcription activation and appears to have a unique, although still poorly understood, role in transcriptional regulation. In Drosophila, H4-K16 acetylation overlaps with MSL binding on the male X chromosome on polytene chromosomes, and artificially driving H4-K16 acetylation by MOF is able to increase transcription. However, the situation for endogenous X-linked genes appears to be more complex, at least in salivary glands, since a number of genes bound by MSL-1 were found that did not seem to be transcribed. Interestingly, and in agreement with the observation in Drosophila, genome-wide analysis in Saccharomyces cerevisiae showed that the distribution of H4-K16 acetylation along the chromosomes does not correlate with transcriptional activity. Therefore, the function of H4-K16 acetylation in the dosage compensation process and more generally in transcriptional regulation remains unclear (Legube, 2006).
Another important issue to consider is that Drosophila salivary glands, although differentiated, undergo endoreplication. Global analyses revealed that there is a correlation between replication timing and transcription activity. However, this correlation is not absolute. Furthermore, similar replication timing profiles were obtained in two human cell types as different as fibroblasts and lymphoblasts, expected to show quite different gene expression profiles. Since H4 hyperacetylation has been associated with active replication origins, it is tempting to speculate that MSL/H4-K16Ac distributions may correlate with replication timing. Interestingly, the DREF transcription factor, which binds the DRE sequence, identified in MSL-1 target genes, plays a role in endoreplication in salivary glands. Future global analyses of binding profiles of all MSL components together with the comparison of profiles of other transcription/replication factors and histone modifications will certainly help in the understanding of H4-K16 acetylation and dosage compensation in Drosophila (Legube, 2006).
The dosage compensation complex (DCC) in Drosophila is responsible for up-regulating transcription from the single male X chromosome to equal the transcription from the two X chromosomes in females. Visualization of the DCC, a large ribonucleoprotein complex, on male larval polytene chromosomes reveals that the complex binds selectively to many interbands on the X chromosome. The targeting of the DCC is thought to be in part determined by DNA sequences that are enriched on the X. So far, lack of knowledge about DCC binding sites has prevented the identification of sequence determinants. Only three binding sites have been identified to date, but analysis of their DNA sequence did not allow the prediction of further binding sites. Chromatin immunoprecipitation was used to identify a number of new DCC binding fragments, which were characterized in vivo by visualizing DCC binding to autosomal insertions of these fragments; they were found to possess a wide range of potential to recruit the DCC. By varying the in vivo concentration of the DCC, evidence is provided that this range of recruitment potential is due to differences in affinity of the complex to these sites. DCC binding to ectopic high-affinity sites can allow nearby low-affinity sites to recruit the complex. Using the sequences of the newly identified and previously characterized binding fragments, a number of short sequence motifs, which in combination may contribute to DCC recruitment, were identified. These findings suggest that the DCC is recruited to the X via a number of binding sites of decreasing affinities, and that the presence of high- and moderate-affinity sites on the X may ensure that lower-affinity sites are occupied in a context-dependent manner. The bioinformatics analysis suggests that DCC binding sites may be composed of variable combinations of degenerate motifs (Dahlsveen, 2006).
Using a ChIP strategy and antiserum to MSL1, several new DCC binding fragments have been identified and it has been demonstrated that they possess a wide range of potential to recruit the DCC. Because the majority of the isolated candidate fragments co-map with endogenous DCC binding sites at the resolution afforded by staining of polytene chromosomes, it is thought that the ChIP selection procedure was appropriate. By tuning DCC levels in vivo, it was concluded that the difference in recruitment ability is due to different affinity of the DCC for these fragments. At limiting concentrations of complex, only the sites of highest affinity are occupied. Conversely, at non-physiologically high concentrations of DCC, even 'cryptic' binding sites on autosomes are recognized by the complex. This suggests, in concordance with previous observations, that selective interaction of the DCC with the X chromosome is a function of tightly controlled levels of complex components that are adjusted to assure interaction with binding sites of varying affinity clustered on the X, but insufficient to occupy cryptic sequences on autosomes. The data are also in broad agreement with recent observations that have shown that numerous sites on the X chromosomes contain DCC binding determinants. These determinants are not all equal, but represent a diverse set of DCC targets that differ by a wide range of affinities for the complex, as expected from a sequence determinant that became gradually enriched on the X chromosome during evolution (Dahlsveen, 2006).
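The concentration dependence described above can be pictured with a very simple equilibrium model in which the fractional occupancy of a site is conc / (conc + Kd). This is only a sketch under that assumption: the single-site binding isotherm, the dissociation constants, the 0.5 occupancy threshold, and the site labels are all hypothetical and were not measured or proposed in the study.

```python
# Toy "affinities" model: which sites are occupied at a given complex concentration?
# Kd values are in arbitrary units and chosen purely for illustration.

def occupancy(conc, kd):
    """Fractional occupancy of one site under simple equilibrium binding."""
    return conc / (conc + kd)


SITES = {
    "high-affinity X site": 1.0,
    "moderate-affinity X site": 10.0,
    "low-affinity X site": 100.0,
    "cryptic autosomal site": 1000.0,
}


def occupied_sites(conc, sites=SITES, threshold=0.5):
    """List sites whose occupancy reaches the (arbitrary) detection threshold."""
    return [name for name, kd in sites.items() if occupancy(conc, kd) >= threshold]


limiting = occupied_sites(2.0)        # limiting complex: only the strongest site
intermediate = occupied_sites(20.0)   # more complex: moderate-affinity site added
excess = occupied_sites(2000.0)       # non-physiological excess: cryptic site too
```

In this picture no site is qualitatively special: lowering the concentration strips binding from weaker sites first, and raising it far enough brings even the cryptic autosomal site above threshold.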
The use of the term 'chromatin entry sites' for the subset of DCC binding sites that are still occupied by partial complexes in the absence of MSL3, implied that these sites were somehow qualitatively and perhaps functionally distinct from the remaining sites that only attract the intact complex. Although it is possible that not all DCC binding sites are functionally equivalent, the characterization of several new examples of both types of DCC binding sites rather supports the “affinities model”. According to this model, “chromatin entry sites” are not qualitatively different from other sites, but only represent those sites with the highest affinity for the complex. A prediction from this model that is further substantiated by the results is that non-functional complexes that lack MSL3 or the acetyltransferase activity of MOF have lower affinity for target sites. Only those determinants with highest affinity for the DCC are able to recruit partial complexes in the absence of MSL3. Sites with slightly lower affinity are still able to recruit the complex in the mof1 mutant. Because the interaction of the DCC with the X chromosome is thought to be largely mediated by MSL1 and MSL2, it remains to be explored whether MSL3 and the acetylase activity of MOF affect the active concentration of MSL1 and MSL2 or lead instead to the adoption of a high-affinity conformation of the complex. Conversely, it remains to be seen if over-expression of MSL1 and MSL2 in the msl-31 and mof1 mutants would allow partial complexes to bind additional sites. In this respect it is intriguing that the mutation of both roX RNAs, which is presumed to lead to incomplete and non-functional complexes, can be partially rescued by the over-expression of MSL1 and MSL2 (Dahlsveen, 2006).
During analysis of DCC recruitment to high-affinity sites inserted into autosomes of wild-type males, an additional band of DCC binding was observed close to the insertion site in three independent cases (one insert each of DBF9, DBF5, and DBF7). Such minimal and rare 'spreading' has previously been observed for ectopic insertions of the 18D high-affinity site and from roX transgenes in the wild-type male background. This study now reveals that these additional DCC binding sites are not a result of random spreading, but are most likely due to interaction of the DCC with one of the low-affinity sites on autosomes, which happened to reside close to the insertion site. These sites are usually only observed when the DCC concentrations are globally increased by over-expression of MSL1 and MSL2. Accordingly, it is suggested that the autosomal insertion of a high-affinity DCC binding site leads to a local rise in complex concentration, which allows these low-affinity sites to be recognized by the DCC even in wild-type males. However, additional requirements must clearly be met to allow low-affinity sites to profit from local increases in complex concentration, since not all ectopic high-affinity sites support the phenomenon. Permissive conditions may include active transcription or the presence of specific epigenetic marks (Dahlsveen, 2006).
It is envisioned that the clustering of DCC binding determinants of high and intermediate affinity on the X chromosome (combined with the transcription of the roX RNAs) elevates the concentration of the DCC within the X chromosomal territory and ensures the occupancy of lower-affinity sites in a context-dependent manner. This may explain the observation that autosomally derived transgenes often acquire dosage compensation. The transgenes may contain cryptic DCC binding determinants and may thus acquire binding if placed in the context of the X chromosomal territory. Conversely, an X chromosomal fragment that harbors only low-affinity sites may not be recognized if translocated to an autosomal context, and the fragment DBF3 may be an example of such a scenario. The presence of a large number of low-affinity sites may also contribute significantly to restricting the binding of the DCC to the X chromosome (Dahlsveen, 2006).
The term 'spreading' has been used to describe the appearance of additional bands of DCC binding around autosomal insertions of roX cDNAs or fragments derived thereof. However, extensive, long-range spreading from roX transgenes, which leads to the appearance of many ectopic DCC bands at greater distances from the insertion sites, occurs only under unusual conditions and depends on the transcription of the roX RNA rather than the DCC binding sites on DNA. Long-range spreading of the complex also does not occur into autosomal chromatin translocated to the X chromosome. It is suggested that large translocations maintain their original chromosomal context (DCC enriched or not), and therefore no redistribution of DCC over the new chromosomal junction is observable at the resolution of the polytene chromosomes. Importantly, this study does not address the higher-resolution distribution of the DCC within a chromosomal band. It is possible that such a band contains many individual binding sites, also of varying affinity. At this resolution, the term “spreading” may characterize the local diffusion of the DCC from high- to low-affinity sites. This study does not exclude this type of spreading, or indeed any other kind of complex distribution within a chromosomal band. High-resolution ChIP analyses will be necessary to resolve the detailed nature of DCC distribution (Dahlsveen, 2006).
Previously, only three high-affinity binding sites for DCC were known. This study identified nine more fragments, which encouraged investigation of common features within a larger pool. Interestingly, it was found that all new DBFs map to gene-rich regions and either overlap with or lie close to essential genes. Three high-affinity fragments (DBF12, DBF9, and DBF6) reside entirely within genes. It is possible that specific recruitment sites, such as those inferred to reside within these DBFs, have been enriched in and around genes that require dosage compensation during evolution, and consequently, high-affinity sites may represent loci that are particularly dosage sensitive. Previous experiments indicated that the DCC tends to bind to the coding regions of genes, and it has been suggested that this is linked to transcriptional activity. Although recent observations suggest that transcriptional activity alone is not sufficient to attract DCC binding, it is possible that transcription influences DCC recruitment to specific sites. For example, high-affinity sites, which show consistent and strong recruitment of the DCC at many chromosomal positions, may not be influenced by transcription. However, sites with lower affinity and variable recruitment ability may profit from transcriptional activity. Developmental differences in transcriptional activity may therefore also explain the lack of DCC recruitment in salivary glands to fragments isolated by ChIP from embryos (Dahlsveen, 2006).
Attempts were made to identify common sequence elements within previously characterized and new high-affinity DCC binding fragments, and a number of short sequence elements whose clustering in combinations could contribute to DCC recruitment were uncovered. Clearly, the importance of these elements remains to be tested experimentally. Previous analysis of the roX DCC binding sites identified a 110 bp sequence containing several blocks of conservation between roX1 and roX2. DCC binding was affected by mutation in several of the conserved blocks, indicating that DCC binding sites may be made up of combinations of shorter elements. Such combinations have been sought by defining pairs of elements found within a 200 bp window in the high-affinity DCC binding fragments. X-enriched pairs often occur in multiple copies in the high-affinity fragments and at higher frequencies compared to the lower-affinity fragments DBF9-A, DBF1, DBF11, DBF13, and DBF3. Nonetheless, there is no obvious correlation between the location of individual pairs on the X and any specific features such as predicted genes. It is hypothesized that the elements that define these pairs (and other such elements that may have escaped attention) correspond to building blocks of DCC binding sites. Accordingly, a DCC binding site of given affinity for the complex would not be determined by a unique DNA sequence, but by clustering of variable combinations of short, degenerate sequence motifs. Individual low-affinity binding sites may not be unique to the X, but their clustering on the X may contribute to high-affinity binding. There are already indications that the DCC binds to several sites in close proximity. The two parts of DBF9, DBF9-A and DBF9-B, are both able to recruit the DCC, albeit with different affinity. The analysis of the 18D high-affinity fragment also suggested that multiple elements over 8.8 kb contribute to the binding of the complex (Dahlsveen, 2006).
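The search for pairs of short elements within a 200 bp window can be sketched in a few lines. This is a minimal illustration of the general idea, not the algorithm used in the study: the function names, the exact-match criterion, the way the window is measured, and the toy sequence and elements are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations

def find_occurrences(seq, element):
    """Return the start positions of every (possibly overlapping) exact match."""
    positions = []
    start = seq.find(element)
    while start != -1:
        positions.append(start)
        start = seq.find(element, start + 1)
    return positions

def count_pairs_in_window(seq, elements, window=200):
    """Count co-occurrences of two distinct elements that both fit inside `window` bp."""
    hits = {e: find_occurrences(seq, e) for e in elements}
    counts = Counter()
    for a, b in combinations(sorted(elements), 2):
        for pa in hits[a]:
            for pb in hits[b]:
                span_start = min(pa, pb)
                span_end = max(pa + len(a), pb + len(b))
                if span_end - span_start <= window:
                    counts[(a, b)] += 1
    return counts

# Hypothetical toy fragment: one GAGAGA/CTCTCT pair close together,
# plus a second GAGAGA copy too far away to fall in the same window.
toy_fragment = "GAGAGA" + "T" * 50 + "CTCTCT" + "T" * 300 + "GAGAGA"
pair_counts = count_pairs_in_window(toy_fragment, ["GAGAGA", "CTCTCT"])
print(pair_counts)
```

Comparing such counts between X-derived and autosomal sequences, or between high- and lower-affinity fragments, would be one way to ask whether a pair is enriched; real degenerate elements would need pattern matching rather than exact string search.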
The pairs were ordered according to sequence similarity. Interestingly, a large family of elements contain GAGA-related motifs. Mutation of GAGA or CTCT motifs in the 110 bp roX1/roX2 consensus severely affected DCC recruitment to that sequence, indicating that GAGA motifs are involved in DCC binding. The fact that these elements are found enriched in several independently identified high-affinity fragments demonstrates the appropriateness of the algorithms that were used. Besides elements with a clear relationship to GAGA motifs, it was also noticed that several other element families were defined by sequence similarity. In order to visualize the element families, the related words may be aligned such that sequence logos representing degenerate motifs can be derived using the WebLogo software. It is possible that some of these degenerate motifs may contribute to DCC binding sites. Evaluation of the contributions of these novel motifs to the targeting of the complex will require increased resolution analysis and systematic evaluation of candidate sequences in the in vivo recruitment assay (Dahlsveen, 2006).
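A sequence logo is built from per-column base frequencies of aligned words, with column height given by information content. The sketch below computes those two quantities for a set of equal-length words. It does not call WebLogo and omits the small-sample correction that logo software may apply; the function names and the example words are hypothetical.

```python
import math
from collections import Counter

def column_frequencies(words):
    """Return one {base: frequency} dict per alignment column for equal-length DNA words."""
    length = len(words[0])
    if any(len(w) != length for w in words):
        raise ValueError("all aligned words must have the same length")
    freqs = []
    for i in range(length):
        counts = Counter(w[i] for w in words)
        total = sum(counts.values())
        freqs.append({base: counts.get(base, 0) / total for base in "ACGT"})
    return freqs

def information_content(freq):
    """Information content of one column in bits (maximum 2 for DNA, no correction)."""
    entropy = -sum(p * math.log2(p) for p in freq.values() if p > 0)
    return 2.0 - entropy

# Hypothetical family of related GAGA-like words.
aligned_words = ["GAGA", "GAGA", "GAGG", "GACA"]
freqs = column_frequencies(aligned_words)
for position, freq in enumerate(freqs, start=1):
    print(position, freq, round(information_content(freq), 3))
```

Fully conserved columns reach 2 bits, while degenerate positions score lower, which is how a logo conveys the degeneracy of a motif at a glance.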
This study suggests that high-affinity DCC binding sites are composed of variable combinations of clustered, degenerate sequence motifs. The degeneracy of the sequence motifs indicates that many individual elements may have low affinity. Therefore, the interaction of the DCC with each individual site should be in dynamic equilibrium. However, it was recently observed by photobleaching techniques that the DCC components most likely involved in chromatin binding, MSL2 and MSL1, interact with the X chromosomal territory in cultured cells in an unusually stable manner, which is not compatible with binding equilibria involving off-rates that commonly characterize protein–DNA interactions. Several hypotheses can be formulated, whose evaluation may lead to resolution of this apparent contradiction. First, formation of higher-order structures involving many DCC components engaged in numerous simultaneous DNA interactions may lead to a trapping of the DCC within the X chromosome territory. Second, an initial sequence-directed targeting event may be followed by a stabilization of the interaction through positive reinforcement involving additional principles, such as epigenetic marks or a topological linkage. Finally, it is considered that the arrangement of the interphase genome in polytene chromosomes may differ in a relevant aspect from the more compact chromosomal territories of diploid cultured cells. Ultimately, the identification of the DNA-binding domains of DCC components and analysis of their mode of DNA interaction will be required to solve the targeting issue (Dahlsveen, 2006).
Msl-2 is required for the male-specific assembly of a dosage compensation regulatory complex on the X chromosome of Drosophila. Msl-2 binds in a reproducible, partial pattern to the male X chromosome in the absence of Mle or Msl-3, or when ectopically expressed at a low level in females. Moreover, the pattern of Msl-2 binding corresponds precisely in each case to that of MSL-1, suggesting that the two proteins function together to associate with the X chromosome. When MSL-1 function is compromised, Msl-2 is only localized to sites where the mutant form of MSL-1 is bound. Thus, a partial Msl-2 pattern is observed in msl-1 mutants. Likewise, when Msl-2 is present in limiting quantities, MSL-1 is precisely restricted to the sites of Msl-2 localization. Ectopic Msl-2 expression in females results in a dominant female phenotype. EMS-induced loss of function msl-1 and msl-2 alleles were isolated in a screen for suppressors of the toxic effects of Msl-2 expression in females. One such mutant lacks the RING finger motif. Site-directed mutagenesis was also used to determine the importance of the Msl-2 RING finger domain and second cysteine-rich motif. The mutations, including those in conserved zinc coordinating cysteines, confirm that the RING finger is essential for Msl-2 function, while suggesting a less stringent requirement for an intact second motif, the metallothionein-like cysteine cluster known as a PHD finger (Lyman, 1997).
Drosophila Msl proteins are thought to act within a complex to elevate transcription from the male X chromosome. Msl1, Msl2 and Msl3 proteins are associated in immunoprecipitations, chromatographic steps, and in the yeast two-hybrid system, but the Mle protein is not tightly complexed in these assays. Analysis focused on the Msl2-Msl1 interaction, which is postulated to play a critical role in Msl complex association with the X chromosome. Using a modified two-hybrid assay, missense mutations were isolated in Msl2 that disrupt its interaction with Msl1. In a Drosophila virilis Msl2 homolog, 11 out of 12 mutated residues that cluster around the first zinc-binding site of the RING finger domain have been found to be conserved. All but one of the residues (found to be important for Msl1 binding by reverse two-hybrid screening) are conserved in the D.virilis protein; the single exception, S53, is adjacent to a conserved proline (P54) and was itself mutated to proline in the screen. Four cysteine or histidine residues outside the RING domain are also conserved. Although these are not positioned to form a canonical zinc finger structure, mutation of one of the residues (C107) to arginine disrupts the interaction of Msl2 with Msl1. A second cysteine-rich region (amino acids 521-562), which is loosely related to the PHD motif and to metallothioneins, is conserved at all cysteines and histidines, including three positions not present in the published metallothionein alignment. The conservation of these residues is compatible with a previous report that mutation of C540 and C542 to alanine diminishes, but does not abolish msl2+ function in vivo. The acidic nature of an adjacent region (amino acids 563-592) is also conserved, while a more distal proline-rich region (amino acids 681-701) is incompletely conserved. A remarkable feature of the alignment is the presence of three gap regions having little or no homology. 
A short gap (amino acids 116-128) separates the RING finger and N-terminus from the rest of the protein. The second gap (amino acids 281-520) corresponds to the middle third of the Msl2 protein. This region contains most of the polymorphisms and length variations present within published D.melanogaster msl2 sequences; in addition, neither the repeats nor the acidic character of this region of the D.melanogaster protein are conserved in the D.virilis homolog. Finally, a gap following the second cysteine-rich domain (amino acids 593-614) is conserved in length, but is not conserved in sequence. These data suggest that the functions of the RING finger and other features of MSL2 have been conserved and that these domains may be positioned appropriately in the protein by more or less randomly evolving spacers. Two pre-existing D. melanogaster msl2 alleles, which fail to support male viability in vivo, have lesions in the same region of the RING finger. These were tested in the two-hybrid system and are also defective in interaction with Msl1. Mutation of the second zinc-binding site has little effect on Msl1 binding, suggesting that this portion of the RING finger may have a distinct function. These data support a model in which MSL2-MSL1 interaction nucleates assembly of an MSL complex, with which Mle is weakly or transiently associated (Copps, 1998).
In male Drosophila, histone H4 acetylated at Lys16 is enriched on the X chromosome, and most X-linked genes are transcribed at a higher rate than in females (thus achieving dosage compensation). Five proteins, collectively called the MSLs, are required for dosage compensation and male viability. Here it has been shown that one of these proteins, MSL1, interacts with three others, MSL2, MSL3 and MOF. The latter is a putative histone acetyl transferase. Overexpression of either the N- or C-terminal domain of MSL1 has dominant-negative effects, i.e. causes male-specific lethality. The lethality due to expression of the N-terminal domain is reduced if msl2 is co-overexpressed. MSL2 co-purifies over a FLAG affinity column with the tagged region of MSL1, and both MSL3 and MOF co-purify with the FLAG-tagged MSL1 C-terminal domain. Furthermore, the MSL1 C-terminal domain binds specifically to a GST-MOF fusion protein and co-immunoprecipitates with HA-tagged MSL3. The MSL1 C-terminal domain shows similarity to a region of mouse CBP, a transcription co-activator. It is concluded that a main role of MSL1 is to serve as the backbone for assembly of the MSL complex (Scott, 2000).
In general, the amino acid sequences of the MSLs suggest regions or domains within the proteins that could be important for function in vivo. Indeed, this has been confirmed by mapping loss-of-function mutations to the domain, such as the helicase domain of MLE, the putative acetylase domain of MOF and the RING finger region of MSL2. The amino acid sequence of MSL1 is the least informative, containing no recognizable domains, although regions rich in acidic amino acids and possible PEST sequences have been identified. To identify regions within MSL1 that are important for function in vivo, it was determined which regions have dominant-negative effects when overexpressed. Two regions of MSL1, one near the N-terminus and the other at the C-terminus, are likely to be important for assembly of the MSL complex in vivo, because overexpression of either region causes male-specific lethality. Genetic evidence, decreased male viability of msl2 heterozygotes and increased male viability by co-overexpression of MSL2, suggests that the region of MSL1 at the N-terminus interacts with MSL2. This has been confirmed by co-purification of MSL2 with FLAG-tagged versions of MSL1 over FLAG affinity columns. Similarly, the C-terminal region of MSL1 interacts with both MOF and MSL3. Furthermore, expression of the C-terminal domain results in significant loss of MOF from the male X chromosome (Scott, 2000).
The N-terminal FN region of MSL1 that binds to MSL2 was chosen originally for expression in flies because it was predicted that almost half of FN (amino acids 96-172) would form a two-stranded, alpha-helical, coiled-coil structure. Coiled-coil structures are composed of a heptad repeat (abcdefg)n where hydrophobic residues occupy positions a and d on the same side of the alpha-helix. The coiled-coil motif of GCN4 mediates dimerization. If a similar structure mediates the formation of the MSL1-MSL2 heterodimer, then part of the region of MSL2 that interacts with MSL1 should form a coiled-coil structure. The RING finger domain region of MSL2 interacts with MSL1. It is predicted that the region immediately preceding the RING finger could form a coiled-coil structure. It is particularly significant that several of the mutations that disrupt the interaction with MSL1 in yeast introduce amino acid changes that either significantly disrupt the alpha-helix (leucine to proline) or introduce a charged amino acid into the predicted hydrophobic face of the alpha-helix. The RING domain is found in a number of proteins, including the V(D)J recombination-activating protein RAG1. The crystal structure of the RAG1 dimerization domain, which includes the RING finger, reveals that dimerization is stabilized by interaction between alpha-helices that form a hydrophobic core. The RING finger is thought to form the structural scaffold upon which the dimer interface is formed. It is tempting to speculate, by analogy with RAG1, that the association of MSL1 and MSL2 involves the interaction of amphipathic alpha-helices that depend on the RING finger domain. This could best be addressed by determining the crystal structure of the MSL1-MSL2 complex (Scott, 2000).
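The heptad rule, hydrophobic residues at positions a and d of (abcdefg)n, lends itself to a simple check. The sketch below tries all seven possible registers of a peptide and reports the one with the highest fraction of hydrophobic residues at a and d. It is a toy heuristic, not the prediction method used in the study; the hydrophobic residue set, the function names, and the idealized peptide are assumptions for illustration.

```python
# Assumed set of hydrophobic residues (one-letter codes); other definitions exist.
HYDROPHOBIC = set("AILMFVWY")

def ad_hydrophobic_fraction(seq, offset):
    """Fraction of hydrophobic residues at heptad positions a and d.

    `offset` is the index of the first residue assigned to position a.
    """
    ad_residues = [res for i, res in enumerate(seq) if (i - offset) % 7 in (0, 3)]
    if not ad_residues:
        return 0.0
    return sum(res in HYDROPHOBIC for res in ad_residues) / len(ad_residues)

def best_register(seq):
    """Return (offset, score) for the heptad register with the highest a/d score."""
    scores = [(ad_hydrophobic_fraction(seq, off), off) for off in range(7)]
    score, offset = max(scores, key=lambda pair: pair[0])
    return offset, score

# Hypothetical idealized peptide: three heptads with leucine at a and d.
toy_peptide = "LKKLEEE" * 3
print(best_register(toy_peptide))
```

In this framing, a leucine-to-proline change or a charged residue introduced at an a or d position would lower the score of the best register, mirroring the kind of mutation described as disrupting the predicted hydrophobic face.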
In vitro translated MSL1 C-terminal domain co-immunoprecipitates with in vitro translated HA-MSL3 but not HA-MOF. Thus, C interacts directly with MSL3 but the interaction with MOF requires either another factor present in fly extracts or post-translational modification of MSL1 or MOF. While the possibility of a nucleic acid component of the FC-MOF complex cannot be ruled out, the possibility of post-translational modification of MSL1 or MOF is favored since a silver stain of FLAG affinity-purified FC-MOF complex separated by SDS-PAGE shows only two main bands corresponding to the sizes expected for FC and MOF. The C-terminal domain of MSL1 is rich in serine and threonine residues, and contains several potential phosphorylation sites and a predicted PEST sequence. PEST sequences have been suggested to contribute to the instability of the MSL1 protein. However, the role of these sequences in MSL1 has not been determined. Indeed, an alternative function for the PEST sequences is suggested by the observations that the PEST domains of PU.1 and IκB are required for their respective interactions with Pip and c-Rel. In both cases, phosphorylation of a serine residue within the PEST sequence is required for the respective protein-protein interactions. The recent finding that a serine/threonine kinase is associated preferentially with the male X chromosome raises the possibility that MSL1 or another MSL is phosphorylated by this enzyme (Scott, 2000).
In the sequential model for assembly of the MSL complex, the first step involves the binding of the MSL1-MSL2 complex to several 'high affinity' sites on the male X chromosome. Since the localization of both MOF and MSL3 to the X chromosome requires mle+ function, this suggests that the association of MOF and MSL3 with the MSL1-MSL2 complex is MLE dependent. MLE could either bind directly to MOF and/or MSL3, or somehow stabilize the MSL complex together with roX RNA. In support of the latter model, MOF and MSL3 bind directly to the C-terminal domain of MSL1. Furthermore, MLE did not co-purify with an FC-MOF-MSL3 complex over an affinity column. However, the affinity chromatography experiments were designed to maximize the likelihood of detecting protein-protein association and are not quantitative. It is possible that MOF and MSL3 may have a higher affinity for the C-terminal domain of MSL1 than full-length MSL1. Thus, one possible mechanism is that in vivo the C-terminal domain of MSL1 is not freely available to bind to MOF and/or MSL3, and that the binding of MLE to the MSL1-MSL2 complex causes a conformational change in MSL1, such that the C-terminal domain becomes more accessible (Scott, 2000).
Previous searches of the protein sequence database with the complete MSL1 sequence have failed to identify any significant similarities. However, when a search is carried out with just the C-terminal domain sequence, some similarity is found to a 254 amino acid region of mouse CBP. Although the similarity is not high, given that the similarity extends across almost the entire C-terminal domain of MSL1, and that both CBP and the MSL1 C-terminal domain bind to histone acetyl transferases (or putative histone acetyl transferases), it is thought that this homology may be significant. If this similarity reflects a conserved function, then it would be predicted that the MSL1-similar region of CBP, which has no known function, would associate with either an MOF-like histone acetyl transferase or an MSL3-like protein in mammalian cells (Scott, 2000).
It is not known how the MSL complex binds to the male X chromosome. None of the MSLs contain a recognizable DNA-binding motif. The F84 version of MSL1, lacking the first 84 amino acids, binds to MSL2, MSL3 and MOF but does not bind preferentially to the male X chromosome. This suggests that the male lethality that results from overexpression of F84 is due to this protein being able to bind to three MSLs, but not being able to bind to the X chromosome because the first 84 amino acids of MSL1 are required for recognition of the X chromosome. Alternatively, the lack of binding of F84 to the male X chromosome could be because the beginning of MSL1 is required for assembly of the MSL complex in vivo. However, if so, then it would be expected that F84 would have bound to the 'high affinity' sites since F84 does bind to MSL2. Assuming that MSL1 and MSL2 are the only components of the high affinity complex, it would then appear more likely that the first 84 amino acids of MSL1 are required for X chromosome binding rather than complex formation. However, there are several lines of evidence that suggest that the roX RNAs are part of the MSL complex, which raises the possibility that one or both of the roX RNAs could be part of the high affinity complex. Thus it will be of interest to determine if the MSL complex containing the F84 protein binds to roX RNA with a lower affinity than the complex containing full-length MSL1 (Scott, 2000).
Dosage compensation in flies involves doubling the transcription of genes on the single male X chromosome to match the combined expression level of the two female X chromosomes. Crucial for this activation is the acetylation of histone H4 by the histone acetyltransferase (HAT) MOF. In male cells, MOF resides in a complex (dosage compensation complex, DCC) with MSL proteins and noncoding roX RNA. Previous studies suggested that MOF's localization to the X chromosome was largely RNA-mediated. This study shows that contact of the MOF chromo-related domain with roX RNA plays only a minor role in correct targeting to the X chromosome in vivo. Instead, a strong, direct interaction between a conserved MSL1 domain and a zinc finger within MOF's HAT domain is crucial. The functional consequences of this interaction were studied in vitro. Simultaneous contact of MOF with MSL1 and MSL3 led to its recruitment to chromatin, a dramatic stimulation of HAT activity and to improved substrate specificity. Activation of MOF's HAT activity upon integration into the DCC may serve to restrict the critical histone modification to the male X chromosome (Morales, 2004: full text of article).
Activation of the male X chromosome in Drosophila requires acetylation of H4K16 by MOF. In vitro, untargeted acetylation of H4K16 is sufficient to activate any chromatin template. Targeting MOF to a promoter in yeast via fusion to a heterologous DNA-binding domain also leads to derepression of transcription. Given the potential of MOF to activate transcription, fine-tuning the expression of X-linked genes crucially relies on restricting MOF activity to the X chromosome (Morales, 2004).
This study highlights the protein-protein interactions that dictate the incorporation of MOF into the DCC. More importantly, it illustrates a novel principle of conditional activation of the enzyme. Because MSL proteins are limiting in male cells and their association with the X chromosome is stable, essentially all MSL complexes are chromosomal. Activation of MOF requires interaction with MSL1, which initiates the DCC assembly, and MSL3, which is thought to associate with the complex after MSL1 and MSL2, suggesting that MOF may sense completed complex assembly. Integration into the complex unleashes MOF activity thereby restricting acetylation of H4K16 to the X chromosome. Rendering a regulatory acetylase activity dependent on the appropriate molecular context may be a more widespread principle. Recombinant Tip60 acetylase, like MOF a MYST family member, is unable to acetylate its physiological nucleosome substrate unless incorporated into a native complex. The HAT activity of the MYST member Sas2 absolutely requires Sas4 and is stimulated by Sas5 (Morales, 2004 and references therein).
Faithful association of MOF with the X chromosomal territory has been shown to be lost upon RNase treatment of nuclei in permeabilized cells and the chromo-related domain of MOF has been shown to be important for interaction with roX RNA in vivo, suggesting that targeting of MOF relies heavily on RNA interactions (Akhtar, 2000). The current results suggest that chromo-related domain-RNA interactions contribute to targeting but are not the primary targeting determinants for MOF. The RNase treatment not only results in displacement of MOF but also of MLE and MSL3 and might also affect other, yet unknown factors in the complex. RNA degradation thus leads to the simultaneous disruption of many protein-RNA interactions, which collectively may be required for complex integrity. Although roX RNA improves the assembly of the DCC and its distribution over the X chromosome under conditions of limiting MSL proteins in wild-type flies, the deficiency due to the absence of both roX RNAs can be partially overcome by overexpressing MSL1 and MSL2 in flies. This finding is consistent with the observation that protein-protein interactions are essential for targeting MOF to the X chromosome. How the incorporation of roX RNA into the DCC modulates the protein interactions discussed in this study and the dynamics of chromatin association remains to be explored (Morales, 2004).
Extending previous observations, this study emphasizes the central role of MSL1 in the DCC complex formation and chromatin recruitment. In addition to its well-documented association with MSL2, MSL1 directly interacts with MOF and MSL3 via two distinct surfaces. Interestingly, two phylogenetically conserved regions of MSL1 have recently been identified (Marin, 2003). The first one corresponds to an N-terminal coiled-coil domain involved in the interaction of MSL1 with the ring-finger domain of MSL2. The second one, called PEHE domain, overlaps with the fragment E (covering aa 766-939 and containing the conserved PEHE domain). It is therefore likely that a MOF interaction surface is present within the N-terminal part of this sequence conservation. The conservation of the PEHE region and the existence of MOF and MSL3 homologues in yeast and humans underline the functional importance of the observed interactions (Morales, 2004).
MOF interacts with MSL1 via the zinc-finger domain, a hallmark of MYST-type HAT domains. While this interaction is necessary for targeting MOF to the X chromosome, it is not known whether it is sufficient. Molecular modelling suggests that the zinc finger is an integral part of the HAT domain and hence additional surfaces may be involved in the contact. Interestingly, the very same mutations that abolish this interaction with MSL1 also led to reduced acetylation of histones (Akhtar, 2001). Since the zinc finger is not close to the substrate-binding pocket, it is considered that modulation of the zinc-finger structure, either through mutation or MSL1-MSL3 interaction, may have an allosteric negative or positive effect, respectively, on the ability of the catalytic site to interact with the histone tail substrate productively. Although the interactions of MSL3 and MOF are weak by comparison, MSL3 regulates MOF activity quantitatively and qualitatively. MSL3 and MOF interact with adjacent regions in the C-terminus of MSL1, which may promote their direct interaction. Alternatively, MSL3 may modulate MOF activity indirectly, through changes in MSL1 conformation (Morales, 2004).
In order to explore whether the activation of MOF's HAT activity by association with MSL1-MSL3 is due to enhanced binding to the chromatin substrate or an allosteric activation of catalysis, chromatin binding experiments were carried out. MSL1 interacts with chromatin and free DNA particularly well, and it helps both MSL3 and MOF to associate with chromatin. This observation lends additional support to the earlier notion of a central 'platform' function of MSL1. MSL1 interacts with MSL2, MSL3, MOF and chromatin and thus is ideally suited to function as a nucleation factor for the DCC complex assembly on chromatin. Interestingly, MSL3 also assists MOF's chromatin association, in keeping with the functional interactions between the two proteins observed in vitro and in vivo (Buscaino, 2003). Since the magnitude of the stimulation of chromatin binding is still an order of magnitude less than the observed stimulation of HAT activity, it is quite possible that allosteric effects of MSL protein association on the catalytic center of MOF contribute to activation of the HAT upon incorporation into the complex (Morales, 2004).
Interestingly, interaction of MOF with MSL1 and MSL3 also leads to a change in substrate specificity. In the absence of MSL3, MOF activity is mainly directed towards MSL1, even though the nucleosomal substrate is present. In reactions containing only MSL3 and MOF, acetylation of MSL3 can also be detected (Buscaino, 2003). In the presence of both MSL1 and MSL3, MOF does not acetylate either protein significantly, but histone H4 is the exclusive substrate. Whether acetylation of MSL1 occurs in the context of DCC assembly or its distribution over the X chromosome in vivo remains to be seen. The sensitivity of the metabolic labelling strategy did not suffice to detect acetylation of endogenous MSL1 in cells. The interaction of the DCC subunits is envisioned to be dynamic during the initial assembly of the complex, its propagation over the X chromosome and its perpetuation through replication and mitosis. Conceivably, MSL1 acetylation may occur transiently at one stage, may signal a particular functional status or be involved in feedback loops fine-tuning the two-fold enhancement of transcription from the male X chromosome (Morales, 2004).
The male-specific-lethal (MSL) proteins in Drosophila melanogaster serve to adjust gene expression levels in male flies containing a single X chromosome to equal those in females with a double dose of X-linked genes. Together with noncoding roX RNA, MSL proteins form the 'dosage compensation complex' (DCC), which interacts selectively with the X chromosome to restrict the transcription-activating histone H4 acetyltransferase MOF (Males-absent-on-the-first) to that chromosome. MSL3 is essential for the activation of MOF's nucleosomal histone acetyltransferase activity within an MSL1-MOF complex. By characterizing the MSL3 domain structure and its associated functions, it has been found that the nucleic acid binding determinants reside in the N terminus of MSL3, well separable from the C-terminal MRG signatures that form an integrated domain required for MSL1 interaction. Interaction with MSL1 mediates the activation of MOF in vitro and the targeting of MSL3 to the X-chromosomal territory in vivo. An N-terminal truncation that lacks the chromo-related domain and all nucleic acid binding activity is able to trigger de novo assembly of the DCC and establish an acetylated X-chromosome territory (Morales, 2005).
The MSL1 interaction surface maps to the C-terminal half of MSL3. This part of MSL3 is characterized by similarities to the MRG domain that subsumes MRG15, MSL3, and related proteins in multiple species into the so-called MRG family. The msl3 gene is related to the Drosophila mrg15 gene, suggesting an early gene duplication event. Accordingly, MRG sequences in MSL3 are highly conserved between D. melanogaster and Drosophila virilis. The MRG domain consists of three blocks of strong sequence similarity separated by short amino acid stretches of lesser conservation. Interestingly, these 'linker' regions harbor rather long insertions in MSL3 of flies and humans. The C terminus of MSL3 may thus be organized by folding of MRG signature sequences, which are disconnected in the primary sequence, into a compact unit from which the MSL3-specific structures 'loop out.' Consistent with this idea, it was found that every deletion in the C terminus of MSL3 compromises interaction with MSL1. Most of these deletions affect at least one of the blocks of MRG sequence similarity, most likely leading to global misfolding. However, one deletion that abolishes MSL1 binding (Delta328-433) selectively removed MSL3-specific sequences between two MRG blocks. There is considerable conservation of these sequences in the Drosophila species for which sequence information has recently become available, suggesting a conserved function, but whether this sequence contains a dedicated MSL1 interface remains to be explored. In any case, this analysis suggests that the MRG sequence similarity reflects a functional domain. The MRG-MSL1 contact is essential for targeting MSL3 to the X-chromosomal territory, confirming the functional importance of the interactions defined in vitro. It is suggested that MRG modules in other MRG family members may also constitute protein-protein interaction units (Morales, 2005).
In vitro analysis showed that MSL3 interacts better with single-stranded nucleic acids than with dsDNA. The significance of ssDNA interaction, if any, is unclear at the moment. In contrast, there is evidence that MSL3 interacts with roX RNA in vivo and in vitro, but the domain involved in RNA binding had not been defined. Biochemical analysis demonstrates that the nucleic acid binding structures reside in the N-terminal half of MSL3, which also contains the CRD. Previously, it has been suggested that RNA interaction of MSL3 is affected by its acetylation at lysine 116, close to the CRD. In the current studies, a fragment comprising the first 140 amino acids (and hence the CRD as well as K116) was not sufficient for nucleic acid binding, but sequences up to amino acid 259 contributed significantly. To what extent the CRD of MSL3 contributes to RNA binding needs to be established. The CRDs of MSL3 and MOF appear more related to each other than to canonical chromodomains. They lack the alpha-helix supporting the β-sheet bundle and aromatic residues that may be involved in recognition of methylated histone N termini. The CRD of MOF also appears not to be sufficient for RNA binding. A further interesting similarity between MOF and MSL3 is that nucleic acid interactions are not the primary targeting determinant for either MOF (Morales, 2004) or MSL3. Although impairment of the CRDs leads to somewhat increased binding of the corresponding GFP fusion protein to autosomes, their concentration on the X-chromosomal territory is still obvious. However, the CRDs and noncoding RNA may have functions that are not assayed for in simple recruitment experiments. It is also possible that the CRDs of MOF and MSL3 provide partially redundant functions for DCC assembly. In contrast, mutations in MOF or MSL3 that abrogate their interaction with the C terminus of MSL1 prevent faithful recruitment to the X chromosome.
Obviously, the recruitment assay employed may just reveal the strongest binary interaction that MSL3 or MOF are involved in. However, the fact that overexpression of an MSL3 lacking all nucleic acid binding capacity was able to complement an MSL3 deficiency and to trigger the accumulation of MOF and H4K16 acetylation on the X-chromosomal territory emphasizes the importance of the MSL protein interactions for the assembly of a functional DCC (Morales, 2005).
MSL complexes can be formed in vitro in the absence of RNA. A deficiency of roX RNA in vivo can be partially overcome by overexpression of the 'platform' proteins MSL1 and MSL2. It is possible that transient overexpression of MSL3 overcomes the RNA requirement and that under normal conditions of limiting MSL protein concentrations RNA is required for faithful DCC assembly (Morales, 2005).
The remarkable stimulation of MOF's HAT activity upon association of MSL3 with an MSL1-MOF complex was not due to enhanced binding of MSL3 to nucleic acids but rather required contact of MSL3 with the MSL1 scaffold. MOF and MSL3 are brought into proximity by interaction with adjacent structures in the C terminus of MSL1 (Morales, 2004). It is possible that the MSL1 scaffold stabilizes an otherwise transient and therefore nonproductive direct contact between MSL3 and MOF (Morales, 2004). The existence of such a contact has been inferred from the fact that MSL3 can be acetylated by MOF. However, when it comes to acetylation, MSL1 is a much better substrate for MOF than MSL3 (Morales, 2004). The new data reinforce a previous model of an acetylation 'checkpoint' built into DCC assembly. Accordingly, the regulatory potential of H4K16 acetylation would only be fully realized upon binding of MOF with MSL1 and the completion of the complex by association of MSL3 (Morales, 2004). Such a checkpoint would render full activation of MOF dependent on proper DCC assembly and hence 'maleness' and serve to restrict the critical epigenetic mark to the X chromosome (Morales, 2005).
A version of MSL1 missing the first 84 amino acids with a FLAG tag at the amino end does not bind to the male X chromosome (Scott, 2000). Full-length MSL1 with an amino-terminal FLAG tag does bind to the male X chromosome, although binding is not strong. This indicates that the first 84 amino acids are important for X chromosome binding and that adding a FLAG tag at the amino end may interfere with binding. To determine if the amino-terminal domain is sufficient for X chromosome binding, transgenic Drosophila lines were made that expressed the domain with an HA epitope tag at the carboxyl end (MSL1NHA). The domain includes the conserved N-terminal basic region, the predicted coiled coil, and an acidic region (aa 179 to 186). It was predicted that this domain would be able to bind to the male X chromosome but only to the ~30 high-affinity sites. This is because the amino-terminal domain does not interact with MOF and MSL3, both of which are needed for the MSL1/MSL2 complex to bind to sites on the X chromosome other than the high-affinity sites (Gu, 1998; Palmer, 1994). However, it was found that the HA-tagged amino-terminal domain of MSL1 (MSL1NHA) binds to hundreds of sites on the male X chromosome. Identical results were obtained whether MSL1NHA expression was controlled by the strongly heat-inducible hsp70 promoter or the constitutive armadillo promoter. Further, it was found that with the hsp70 construct, basal-level expression at 25°C was sufficient to detect X chromosome binding of MSL1NHA. Heat shock treatment to overexpress MSL1NHA did not lead to a significant increase in binding to the autosomes, nor did it disrupt X chromosome binding by other components of the MSL complex. Since heat treatment was not necessary to detect X chromosome binding of MSL1NHA, all additional experiments in this study were performed with larvae raised at 25°C without heat shock.
Surprisingly, daily heat-shock treatment of the progeny of an MSL1NHA line had little effect on male viability (85 male and 119 female progeny obtained), indicating that binding of MSL1NHA to the X chromosome did not significantly disrupt MSL complex activity. In contrast, overexpression of a truncated version of MSL1 missing the first 84 amino acids that does not bind to the X chromosome was lethal to males (Scott, 2000). Δ84HA, which is identical to MSL1NHA but lacks the first 84 amino acids, does not bind to the male X chromosome. The lack of binding could be because the Δ84HA protein lacks a nuclear localization sequence. However, staining of whole salivary glands showed that Δ84HA is localized to the nucleus. Thus, the first 84 amino acids of MSL1 appear to play an essential role in X chromosome binding (Li, 2005).
The observed binding of MSL1NHA to hundreds of sites on the male X chromosome could be because the domain recognizes all the sites or because it associates with the MSL complex bound to the X chromosome. To distinguish between these two possibilities, the appropriate crosses were made to generate larvae that carried the MSL1NHA transgene but lacked endogenous MSL1. In the absence of MSL1, none of the components of the MSL complex bound to the X chromosome. Since it can be difficult to obtain good-quality polytene chromosomes from dying msl1 mutant males, salivary glands were isolated from female larvae that constitutively expressed MSL2. In the absence of msl1, MSL1NHA bound to about 30 sites, which corresponded to the previously mapped high-affinity sites (Lyman, 1997). MSL2 colocalized with MSL1NHA to the high-affinity sites. MLE also colocalized to the high-affinity sites with MSL1NHA; however, MOF did not. The latter result was expected, since MOF binds to the carboxyl-terminal domain of MSL1 (Scott, 2000) and functional MOF is required for MSL complex binding to sites other than the high-affinity sites. In control sibling female larvae that were heterozygous for the msl1L60 null mutation, MSL1NHA bound to hundreds of sites on the X chromosomes. MSL2 and MOF colocalized with MSL1NHA. These results show that the amino-terminal domain of MSL1 complexed with MSL2 can specifically recognize the high-affinity sites on the X chromosome. However, in the presence of native MSL complex, MSL1NHA binds to hundreds of sites, presumably via association with the complex (Li, 2005).
Since Δ84HA does not bind to the male X chromosome, three additional smaller deletion mutants were made to identify the region important for X chromosome binding. Like Δ84HA, Δ74HA did not bind to the male X chromosome. Δ50HA, however, bound very weakly to the male X chromosome in approximately 50% of the nuclei examined. In the other 50% of nuclei, no staining of the X chromosome with the anti-HA antibody could be detected above background levels. In contrast, Δ26HA bound more strongly to the X chromosome but with less intensity than MSL1NHA (Li, 2005).
Given that the binding of MSL1NHA to the X chromosome is restricted to the high-affinity sites in the absence of endogenous MSL1, it was next asked if Δ26HA could bind to the X chromosome in a msl1 null mutant background. It was found that there was no binding of Δ26HA to the X chromosomes in homozygous msl1L60 female larvae that expressed MSL2. This demonstrates that the first 26 amino acids of MSL1 are essential for binding to the high-affinity sites. This region contains several well-conserved basic and aromatic amino acid residues. To test the importance of some of these conserved amino acids in X chromosome binding, two mutant versions of MSL1NHA were made. In mut_bas1, three of the conserved basic amino acids, lysine 3, arginine 4, and lysine 6, were all replaced by alanine. In a wild-type genetic background, this mutant version of MSL1NHA bound to hundreds of sites on the male X chromosome. However, in the absence of endogenous MSL1, binding was restricted to only five of the high-affinity sites. Two of these sites mapped to the location of the roX genes, roX1 at 3F and roX2 at 10C. In the second mutation, mut_bas2, two of the conserved aromatic amino acids (phenylalanine 5 and tryptophan 7) were changed to alanine. This mutation did not appear to disrupt binding to the high-affinity sites in msl1L60 null female larvae that expressed MSL2. However, mut_bas2 bound to significantly more autosomal sites than MSL1NHA. Thus, it appears that three of the conserved basic amino acids are essential for binding to most of the high-affinity sites. In addition, two of the conserved aromatic amino acids appear to be important for distinguishing X from autosomes, that is, the specificity of binding (Li, 2005).
The binding of MSL1NHA to hundreds of sites on the male X chromosome appears to be in part due to association with the native MSL complex. The observation that Δ26HA bound to these sites but Δ74HA did not indicated that the region between amino acids 26 and 74 is important for association with the MSL complex. This region is particularly rich in the amino acids glycine, proline, asparagine, and histidine in all Drosophila MSL1 proteins. Glycine-rich domains are a common feature of many proteins including RNA binding proteins and can mediate protein-protein interaction. The glycine-rich domain of the Drosophila Sex-lethal RNA binding protein, which is the master regulator of dosage compensation, promotes self-association. Therefore, whether the MSL1 glycine-rich domain would facilitate MSL1 self-association was examined. It was found that MSL1 coimmunoprecipitates from whole-fly protein extracts with MSL1NHA and Δ26HA but not Δ84HA, Δ74HA, or Δ50HA. There was a small variation in immunoprecipitation efficiency of the HA-tagged proteins, which were also detected with the MSL1 antibody. However, this was not sufficient to account for the lack of coimmunoprecipitation of MSL1 with the more truncated versions of MSL1NHA. MSL2 was not required for MSL1 self-association, since protein extracts were prepared from adult females, which normally do not make MSL2 protein. Δ26HA did not coimmunoprecipitate with MSL3, showing the specificity of the interaction of Δ26HA with MSL1. Deletion of the first 84 amino acids did not, however, disrupt interaction with MSL2, confirming previous studies (Copps, 1998; Scott, 2000). Thus, MSL1NHA appears to interact with the native MSL complex via MSL1 self-association (Li, 2005).
It has been suggested that the predicted leucine zipper-like region of MSL1 may interact with a predicted amphipathic α-helix at the amino terminus of MSL2 to form a coiled-coil structure (Scott, 2000). Likely orthologs of MSL1 and MSL2 have been identified from invertebrate and vertebrate genome sequences. Amino acid sequence alignments of MSL1 and MSL2 orthologs showed a high degree of conservation of the predicted α-helical regions. Inspection of the alignments showed that both MSL1 and MSL2 proteins contained a highly conserved region that is largely apolar and precedes the coiled coil. For MSL1, a glutamine-rich spacer separated the apolar and coiled-coil regions. Alanine substitution mutations were made in the apolar, glutamine-rich, and leucine zipper-like regions of MSL1 to investigate the relative importance of these regions in dimerization with MSL2 (Li, 2005).
In vitro-translated [35S]methionine-labeled MSL1NHA coimmunoprecipitates with the FLAG-tagged amino-terminal domain of MSL2 (aa 1 to 193) (MSL2NFLAG) from transformed whole-fly extract. MSL1NHA does not coimmunoprecipitate with control extract prepared from untransformed wild-type flies. Immunoprecipitations were performed under stringent high-salt conditions (500 mM NaCl), and thus only specific interactions should be detected. This was confirmed by the lack of coimmunoprecipitation of the carboxyl-terminal domain of MSL1 (aa 705 to 1039) with MSL2NFLAG. A derivative of MSL1NHA with mutations in the apolar region (mut_apo) does not coimmunoprecipitate with MSL2NFLAG. In contrast, mutations in the glutamine-rich region (mut_QEQ) do not appear to disrupt the MSL1:MSL2 interaction. This cannot be due to differences in immunoprecipitation efficiency, since recovery of MSL2NFLAG was similar. Consistent with these in vitro binding results, mut_QEQ binds to hundreds of sites on the male X chromosome. Further, no binding of mut_apo to the male X chromosome could be detected. Thus, the apolar but not the glutamine-rich region of MSL1 appears to be important for interaction with MSL2 (Li, 2005).
Dimerization of coiled-coil proteins is driven by interaction between apolar side chains in the a and d positions of the α-helix. The binding is enhanced by ionic interactions between charged amino acids in the e and g positions. Consequently, alanine-substitution mutations were made in the a, d, e, and g positions in the leucine zipper-like motif that follows the glutamine-rich region. It was found that all of the mutant versions of MSL1NHA coimmunoprecipitate with MSL2NFLAG. However, there appeared to be significantly less coimmunoprecipitation of two of the mutants, mut_cc1 and mut_cc2, with MSL2NFLAG. The efficiency of immunoprecipitation of MSL2NFLAG was similar for all four coiled-coil mutant preparations. These results suggest that the mut_cc1 and mut_cc2 alanine substitution mutations have weakened the interaction between MSL1 and MSL2 (Li, 2005).
Home page: The Interactive Fly © 2006 Thomas Brody, Ph.D.
The Interactive Fly resides on the Society for Developmental Biology's Web server.