Interactive Fly, Drosophila Sex combs on midleg: Biological Overview | Evolutionary Homologs | Regulation | Developmental Biology | Effects of Mutation | References

Gene name - Sex combs on midleg

Synonyms -

Cytological map position - 85E1-10

Function - modification of chromatin structure

Keywords - Polycomb group

Symbol - Scm

FlyBase ID: FBgn0003334

Genetic map position - 3-48.5

Classification - zinc finger (C2C2 type) SAM motif protein; SPM-domain protein

Cellular location - nuclear



BIOLOGICAL OVERVIEW

Recent literature
Kang, H., McElroy, K.A., Jung, Y.L., Alekseyenko, A.A., Zee, B.M., Park, P.J. and Kuroda, M.I. (2015). Sex comb on midleg (Scm) is a functional link between PcG-repressive complexes in Drosophila. Genes Dev 29: 1136-1150. PubMed ID: 26063573
Summary:
The Polycomb group (PcG) proteins are key regulators of development in Drosophila and are strongly implicated in human health and disease. How PcG complexes form repressive chromatin domains remains unclear. Using cross-linked affinity purifications of BioTAP-Polycomb (Pc) or BioTAP-Enhancer of zeste [E(z)], this study captured all PcG-repressive complex 1 (PRC1) or PRC2 core components and Sex comb on midleg (Scm) as the only protein strongly enriched with both complexes. Although previously not linked to PRC2, the direct binding of Scm and PRC2 was confirmed using recombinant protein expression and colocalization of Scm with PRC1, PRC2, and H3K27me3 in embryos and cultured cells using ChIP-seq (chromatin immunoprecipitation [ChIP] combined with deep sequencing). Furthermore, it was found that RNAi knockdown of Scm and overexpression of the dominant-negative Scm-SAM (sterile α motif) domain both affect the binding pattern of E(z) on polytene chromosomes. Aberrant localization of the Scm-SAM domain in long contiguous regions on polytene chromosomes revealed its independent ability to spread on chromatin, consistent with its previously described ability to oligomerize in vitro. Pull-downs of BioTAP-Scm captured PRC1 and PRC2 and additional repressive complexes, including PhoRC, LINT, and CtBP. The study proposes that Scm is a key mediator connecting PRC1, PRC2, and transcriptional silencing. Combined with previous structural and genetic analyses, these results strongly suggest that Scm coordinates PcG complexes and polymerizes to produce broad domains of PcG silencing.

Frey, F., Sheahan, T., Finkl, K., Stoehr, G., Mann, M., Benda, C. and Muller, J. (2016). Molecular basis of PRC1 targeting to Polycomb response elements by PhoRC. Genes Dev 30: 1116-1127. PubMed ID: 27151979
Summary:
Polycomb group (PcG) protein complexes repress transcription by modifying target gene chromatin. In Drosophila, this repression requires association of PcG protein complexes with cis-regulatory Polycomb response elements (PREs), but the interactions permitting formation of these assemblies are poorly understood. This study shows that the Sfmbt subunit of the DNA-binding Pho-repressive complex (PhoRC) and the Scm subunit of the canonical Polycomb-repressive complex 1 (PRC1) directly bind each other through their SAM domains. The 1.9 Å crystal structure of the Scm-SAM:Sfmbt-SAM complex reveals the recognition mechanism and shows that Sfmbt-SAM lacks the polymerization capacity of the SAM domains of Scm and its PRC1 partner subunit, Ph. Functional analyses in Drosophila demonstrate that Sfmbt-SAM and Scm-SAM are essential for repression and that PhoRC DNA binding is critical to initiate PRC1 association with PREs. Together, this suggests that PRE-tethered Sfmbt-SAM nucleates PRC1 recruitment and that Scm-SAM/Ph-SAM-mediated polymerization then results in the formation of PRC1-compacted chromatin.

The Sex combs on midleg (Scm) gene (Jürgens, 1985; Breen, 1986) encodes one of the Polycomb group (PcG) repressors (McKeon, 1991; Simon, 1992). Embryos lacking both maternal and zygotic Scm product die with most segments transformed into copies of the eighth abdominal segment (Breen, 1986). This null phenotype, which is among the strongest seen in single PcG mutants, shows that the Scm product is a central component in PcG repression. Scm protein represses multiple homeotic genes during embryonic stages (Breen, 1986; McKeon, 1991; Simon, 1992). Analysis of pupal lethal Scm alleles (Wu, 1989) shows that Scm is also required postembryonically. These genetic data and the continuous developmental expression of Scm mRNA imply that the Scm product, like most other PcG products, has a long-term role in homeotic repression (Bornemann, 1996).

Although Scm has been best characterized in terms of homeotic gene control, it is also likely to be involved in other processes, as are many of the PcG proteins. Scm is a regulator of the segmentation gene engrailed (Moazed, 1992), and genetic studies suggest a role in dorsal-ventral development (Adler, 1991). The suppression of zeste 1 eye color by Scm mutations may reflect an Scm role in white gene expression. This suppression does not require unusual Scm alleles, since it occurs when there is a deficiency at the Scm locus and in apparent Scm null mutations. Although the mechanism of zeste 1 suppression is unclear, it is intriguing that a subset of PcG products, including Scm, Enhancer of zeste and Posterior sexcombs, share this interaction with zeste. Investigation of the physical interactions between Scm protein and its PcG cohorts should help define how transcription is modulated at homeotic loci and at other loci under PcG control (Bornemann, 1996 and references therein).

The Scm and Polyhomeotic (Ph) proteins share a domain, termed the SPM domain, located at their respective C termini. Using the yeast two-hybrid system and in vitro protein-binding assays, it has been shown that the SPM domain mediates direct interaction between Scm and Ph. Binding studies with isolated SPM domains from Scm and Ph show that the domain is sufficient for these protein interactions. These studies also show that the Scm-Ph and Scm-Scm domain interactions are much stronger than the Ph-Ph domain interaction, indicating that the isolated domain has intrinsic binding specificity determinants. Analysis of site-directed point mutations identifies residues that are important for SPM domain function. These binding properties, a predicted alpha-helical secondary structure, and conservation of hydrophobic residues have prompted comparisons of the SPM domain to the helix-loop-helix and leucine zipper domains used for homotypic and heterotypic protein interactions in other transcriptional regulators. Scm and Ph proteins co-localize at polytene chromosome sites in vivo (Peterson, 1997).

To begin to investigate the mechanism and specific residues used for SPM domain protein contact, the effects of site-directed mutations in either the Scm or Ph domains on in vitro binding were tested. The mutations were targeted to residues that are highly conserved in alignments of proteins with similar domains. The point mutations fall into two classes: those that target conserved residues in the extended SAM domain family and those that target residues conserved only in the high-homology SPM subgroup. Three site-directed mutations have been generated in the SPM domain of Scm. The G31S mutation alters a residue that is absolutely conserved in all 23 compiled versions in the extended domain family. This mutant was tested in the context of radiolabelled full-length Scm protein for binding to the minimal Scm and Ph domains. Both Scm-Scm and Scm-Ph interactions are greatly reduced in vitro. Consistent with the residual binding activity seen in the G31S mutant, G31S is found to mediate a reduced but still detectable interaction in the two-hybrid system (Peterson, 1997).

The L35S;L36S double mutation and the K49A mutation affect residues conserved in the high-homology subgroup but not in the extended domain family. Substantial self- and cross-binding activity is retained with these mutant proteins. The only reduction seen with these two mutants is a modest effect of the L35S;L36S double substitution on the Scm-Scm interaction. This mutant causes a several-fold loss in Scm-Scm binding but retains Scm-Ph cross-binding activity comparable to that of the wild type (Peterson, 1997).
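The substitution names used above follow the usual one-letter convention: wild-type residue, position within the domain, replacement residue, so G31S replaces glycine 31 with serine. As a minimal sketch of that notation, the Python below parses such names and applies them to a sequence. The 50-residue toy sequence is invented for illustration and is not the Scm SPM domain; `parse_substitution` and `apply_substitutions` are hypothetical helper names.

```python
import re


def parse_substitution(name):
    """Split a name such as 'G31S' into (wild_type, position, replacement)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", name)
    if match is None:
        raise ValueError(f"not a single-residue substitution: {name!r}")
    wild_type, position, replacement = match.groups()
    return wild_type, int(position), replacement


def apply_substitutions(sequence, names):
    """Apply substitutions (1-based positions), checking each wild-type residue."""
    residues = list(sequence)
    for name in names:
        wild_type, position, replacement = parse_substitution(name)
        found = residues[position - 1]
        if found != wild_type:
            raise ValueError(f"{name}: expected {wild_type} at {position}, found {found}")
        residues[position - 1] = replacement
    return "".join(residues)


# Toy sequence (an assumption, not real data), built so that
# positions 31, 35, 36 and 49 carry G, L, L and K.
toy = "A" * 30 + "G" + "AAA" + "LL" + "A" * 12 + "K" + "A"

single = apply_substitutions(toy, ["G31S"])
double = apply_substitutions(toy, "L35S;L36S".split(";"))
```

Checking the wild-type residue before substituting catches numbering mistakes, which matter here because positions are counted within the isolated domain rather than in the full-length protein.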

Five site-directed mutations were generated in the SPM domain of Polyhomeotic. All five mutations alter residues that are highly conserved in the extended domain family. These mutations were inserted into the context of the minimal GSTph1511-1576 fusion protein and then tested for binding to the radiolabelled minimal Scm domain. The W1A and G51A ph mutations cause significant reductions in binding activity to Scm. In contrast, mutations in the conserved hydrophobic residues (L34A, L42A, and I63D) have little effect on the in vitro Scm-Ph interaction (Peterson, 1997).

The two-hybrid and GST pulldown assays show that the Scm and Ph proteins can bind each other directly and that their respective SPM domains mediate qualitatively strong interactions. However, these experiments do not address whether the Scm and Ph proteins are partners at sites of action in vivo. To assess association in vivo, the Scm and Ph distributions were compared on wild-type polytene chromosomes. In addition, colocalization was tested at an engineered chromosomal site containing an isolated segment of homeotic gene regulatory DNA. Polytene chromosome immunostaining experiments have shown that Ph protein accumulates at its two best-characterized target loci, the Antennapedia (Ant-C) and bithorax (BX-C) homeotic gene complexes. In addition, Ph protein is associated with approximately 100 other sites in the genome. Ph protein immunolocalizes at the BX-C site as well as at five flanking sites on chromosome 3R. The same section of chromosome stains with antibody against Scm protein. There is strong signal at the BX-C locus, and the Scm distribution on flanking sites is identical to the Ph distribution. The Ph and Scm protein distributions in the Ant-C region are also identical. Since the antibodies used in these studies are both rabbit polyclonal antibodies, double-staining experiments to determine whether all of the approximately 100 Ph and Scm sites are identical could not be performed. However, comparison of the Scm sites on the five major chromosome arms with the Ph sites indicates that there is at least 90% overlap in the distributions of these two proteins on polytene chromosomes (Peterson, 1997).

To compare Ph and Scm association with an additional site of action in vivo, colocalization was tested at a site containing regulatory DNA isolated from a homeotic gene. The germ line transformant, 85-39, contains a 14-kb segment from the bxd regulatory region of the BX-C complex inserted near the tip of chromosome 3L at cytological location 62A. Previous work has shown that this transformed DNA segment creates a novel site of Ph protein accumulation and that expression programmed by this 14-kb DNA segment is regulated by Ph and Scm in vivo. Scm protein accumulates at the insertion site of this bxd regulatory DNA. Thus, Scm and Ph proteins are both recruited to an engineered chromosomal site containing an in vivo regulatory target. This result, together with the coincidence of the Ph and Scm proteins at many wild-type chromosomal sites, provides evidence for association of these proteins in vivo (Peterson, 1997).

Sex comb on midleg (Scm) is a functional link between PcG-repressive complexes in Drosophila


One of the most interesting properties of chromatin modification is the ability, under certain circumstances, to propagate in cis independent of sequence. This ability to 'spread' may be important for the inheritance of chromatin states initially established through interactions at nucleation sites such as PREs. SAM domain-mediated polymerization is therefore an attractive model to explain the propagation of PcG silencing. From that perspective, Ph, one of the core components of PRC1, may be responsible for the spreading of PRC1 through Ph-SAM polymerization. The compact chromatin environment formed by PRC1 spreading may improve the enzymatic activity of PRC2, and the capacity of the Pc chromodomain to interact with H3K27me3 may also contribute to the synergistic spreading of PcG silencing. However, consistent with the identification of PRC1 and PRC2 as distinct complexes that purify independently, E(z) RNAi does not significantly affect binding patterns of Pc on polytene chromosomes, and E(z) binding is likewise still detected after Pc RNAi. This study found that Scm directly interacts with the PRC2 complex and colocalizes with H3K27me3 in genome-wide analyses. Scm RNAi results in the loss of major sites of E(z) binding and the redistribution of H3K27me3 on polytene chromosomes. In addition, overexpression of the Scm-SAM domain interferes with binding of endogenous Scm to chromosomes and appears to self-polymerize for long distances on polytene chromosomes independently of PRC1 and PRC2. Taken together, it is suggested that the interaction of Scm with PRC2 and polymerization by the Scm-SAM domain may be key factors contributing to PRC2 and H3K27me3 spreading (Kang, 2015).

The strong interaction discovered between Scm and the G9a SET domain protein, an H3K9 methyltransferase, suggests a new link between H3K27 and H3K9 methylation in Drosophila. These classical histone marks, associated with silent chromatin, were once thought to be largely distinct but are now proposed to have a functional relationship in PcG silencing in mammals, most notably in X inactivation. There was also evidence for colocalization of these two marks in early ChIP analyses at the HOX gene Ubx in imaginal discs. G9a may also play a role in regulation of H3K27 methylation (Mozzetta, 2014). Alternatively, the abundance of G9a may reflect a key role for the CtBP corepressor complex in PcG function rather than for G9a itself, which is a nonessential gene. Genome-wide binding profile analyses have shown that the components of the CtBP complex such as Su(var)3-3 (LSD1) and Rpd3 (HDAC1) are mainly enriched on active genes rather than repressed genes in human cells, and L(3)mbt, one component of the LINT complex, colocalizes with insulator proteins, including CP190 and mod(mdg4), rather than with PcG proteins in Drosophila. Therefore, the possibility that the interactions of Scm with CtBP or LINT repressor complexes occur as independent complexes irrelevant to PcG silencing cannot be excluded. However, considering that PcG silencing could require dynamic interactions during development, components of these repressor complexes may not be permanently stationed in PcG silenced domains but rather participate in PcG silencing transiently. Furthermore, some of the CtBP subunits were copurified in Pc and E(z) affinity purifications, and previous studies reported that CtBP complex components can contribute to PcG silencing in Drosophila. For example, CtBP mutation causes the loss of Pc recruitment to many PREs. In addition, Rpd3 deacetylates H3K27ac, which is mutually exclusive with H3K27me3, and Su(var)3-3 demethylates H3K4me1 and H3K4me2, which are active marks linked to Trithorax (Trx) activity and H3K27ac (Kang, 2015).

How PcG complexes find PREs and spread to create repressive domains is not known on a mechanistic level. Perhaps repressor complexes such as CtBP help remove active chromatin marks to attract the PcG initially or enable cycles of spreading to maintain those domains. Future analyses will entail dissecting the direct interactions of Scm, including those with nucleosomes and their post-translational modifications. Furthermore, through iterative use of BioTAP-XL, the wealth of additional candidates in the Pc, E(z), and Scm pull-downs featured in this study will be invaluable in extending understanding of chromatin-based PcG repression (Kang, 2015).


GENE STRUCTURE

cDNA clone length - 4.1 and 3.8 Kb

Bases in 5' UTR - 434

Bases in 3' UTR - 676 and 979


PROTEIN STRUCTURE

Amino Acids - 877

Structural Domains

The predicted Scm protein is 877 amino acids long with a relative molecular mass of 94,000 Da and a pI of 9.4. A potential nuclear localization signal (RQRGRPAKR) starts at amino acid 52. Restriction mapping and sequence analysis have shown that the difference between the 4.1 and 3.8 kb cDNAs results from use of alternative polyadenylation sites. The poly(A) tail of the longer cDNA begins at position 4046 whereas the poly(A) tail of the shorter cDNA begins at position 3744. There is a consensus poly(A) addition signal located 25 bp upstream of the 3.8 kb cDNA poly(A) tail. There are only imperfect matches to the poly(A) addition signal in the region immediately upstream of the 4.1 kb cDNA poly(A) tail (Bornemann, 1996).

The Scm and Polyhomeotic (DeCamillis, 1992) proteins share a homologous region, the SPM domain. This domain is 38% identical between the two proteins over a length of 65 amino acids. In each protein the SPM domain is located at the C terminus. This domain is predicted to be largely alpha-helical. Besides these proteins, there are numerous proteins that contain a related domain with much lower overall identity (Alkema, 1997; Ponting, 1995). These more distantly related proteins include members of the Ets family of transcription factors and yeast proteins required for mating. The high-homology domain subgroup that includes the Scm and Ph versions is referred to as the SPM domain, and the extended domain family is referred to as the SAM domain (Ponting, 1995). One of the better-characterized SAM domains is present in the human TEL oncoprotein, an Ets class transcription factor, where the SAM has been referred to as a helix-loop-helix (HLH) domain. Recent studies have shown that this domain mediates self-binding and oligomerization of TEL protein and of TEL fusion protein derivatives (Peterson, 1997).
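The 38% figure is a percent identity: the share of aligned positions at which the two sequences carry the same residue. Over 65 positions, 38% corresponds to about 25 identical residues. A minimal sketch of the calculation follows, using short invented strings rather than the real Scm and Ph sequences. It assumes one common convention, in which columns containing a gap are left out of the denominator; the source does not say which convention was used, and the function name is hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of gap-free aligned columns in which both residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not columns:
        raise ValueError("no gap-free columns to compare")
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


# Invented toy alignment: 8 gap-free columns, 5 of them identical.
toy_identity = percent_identity("MKLAV-GHWT", "MRLSVQ-HYT")

# Number of identical residues implied by 38% identity over 65 positions.
identical_estimate = round(0.38 * 65)
```

Different tools divide by the alignment length, the shorter sequence, or the gap-free columns, so a percent identity is only comparable between reports that use the same denominator.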

Besides sharing homology to other proteins in the SPM domain, Scm is even more similar to another fly protein, the product of the tumor suppressor gene lethal (3) malignant brain tumor [l(3)mbt] (Wismar, 1995). The Scm and L(3)mbt proteins share zinc fingers, the SPM domain, and a third domain consisting of 100-amino-acid-long repeats. These repeats, termed mbt repeats (Wismar, 1995; Bornemann, 1996), are present in two tandem copies in Scm and three copies in L(3)mbt. The biochemical role of mbt repeats is not known (Peterson, 1997).

Polyhomeotic, Rae-28 and l(3)mbt all contain putative Cys2-Cys2 zinc fingers that define a distinct zinc finger subclass, which is marked by identical spacing between the cysteine pairs and conservation of residues that flank the cysteines. There are two such zinc fingers located near the N terminus of Scm protein and a single finger in the Ph, Rae-28 and L(3)mbt proteins. The spacing between cysteines is distinct from the Cys2-Cys2 fingers of known DNA-binding proteins such as the nuclear hormone receptors. Scm also contains a third potential zinc-binding region (Zn3), which differs from the N-terminal fingers and is not shared in Ph, Rae-28 or L(3)mbt. This third region can be arranged as a Cys2-Cys2 finger, but the presence of additional cysteine and histidine residues between the outer cysteine pairs may reflect alternative forms of a zinc-binding domain (Bornemann, 1996).

Scm protein also contains a region with a high density of alanine residues. Starting at amino acid position 748, there is a stretch of 29 residues comprised of 52% alanine. Alanine-rich regions have been associated with transcriptional repression domains in the Drosophila Engrailed, Even-skipped and Krüppel proteins (Bornemann, 1996).
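A 29-residue stretch that is 52% alanine corresponds to 15 alanines (15/29 is about 51.7%). Regions like this are commonly found by sliding a fixed-width window along the sequence and scoring its composition. The sketch below does that on an invented toy sequence, not the Scm sequence; the function name and the default window width of 29 are illustrative assumptions, not the method used in the cited work.

```python
def richest_window(sequence, residue="A", width=29):
    """Return (start, fraction) for the window richest in `residue`.

    `start` is a 1-based position; ties go to the first window found.
    """
    if len(sequence) < width:
        raise ValueError("sequence is shorter than the window")
    count = sequence[:width].count(residue)
    best_count, best_start = count, 1
    for i in range(1, len(sequence) - width + 1):
        # Slide by one: drop the residue leaving, add the residue entering.
        count -= sequence[i - 1] == residue
        count += sequence[i + width - 1] == residue
        if count > best_count:
            best_count, best_start = count, i + 1
    return best_start, best_count / width


# Invented toy sequence: 20 serines, a 29-residue block holding 15 alanines,
# then 20 more serines.
block = "AG" * 14 + "A"
toy = "S" * 20 + block + "S" * 20
start, fraction = richest_window(toy)
```

Updating the count incrementally keeps the scan linear in sequence length, which is unnecessary for an 877-residue protein but matters when scanning whole proteomes.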



date revised: 13 Sept 99

Home page: The Interactive Fly © 1995, 1996 Thomas B. Brody, Ph.D.

The Interactive Fly resides on the
Society for Developmental Biology's Web server.