Gene name - Sex combs on midleg
Cytological map position - 85E1-10
Function - modification of chromatin structure
Keywords - Polycomb group
Symbol - Scm
FlyBase ID: FBgn0003334
Genetic map position - 3-48.5
Classification - zinc finger (C2C2 type) SAM motif protein; SPM-domain protein
Cellular location - nuclear
The Sex combs on midleg (Scm) gene (Jürgens, 1985; Breen, 1986) encodes one of the Polycomb group (PcG) repressors (McKeon, 1991; Simon, 1992). Embryos lacking both maternal and zygotic Scm product die with most segments transformed into copies of the eighth abdominal segment (Breen, 1986). This null phenotype, which is among the strongest seen in single PcG mutants, shows that the Scm product is a central component of PcG repression. Scm protein represses multiple homeotic genes during embryonic stages (Breen, 1986; McKeon, 1991; Simon, 1992). Analysis of pupal lethal Scm alleles (Wu, 1989) shows that Scm is also required postembryonically. These genetic data and the continuous developmental expression of Scm mRNA imply that the Scm product, like most other PcG products, has a long-term role in homeotic repression (Bornemann, 1996).
Although Scm has been best characterized in terms of homeotic gene control, it is also likely to be involved in other processes, as are many of the PcG proteins. Scm is a regulator of the segmentation gene engrailed (Moazed, 1992), and genetic studies suggest a role in dorsal-ventral development (Adler, 1991). The suppression of zeste 1 eye color by Scm mutations may reflect an Scm role in white gene expression. This suppression does not require unusual Scm alleles, since it occurs both with a deficiency at the Scm locus and with apparent Scm null mutations. Although the mechanism of zeste 1 suppression is unclear, it is intriguing that a subset of PcG products, including Scm, Enhancer of zeste and Posterior sex combs, share the zeste interaction. Investigation of the physical interactions between Scm protein and its PcG cohorts should help define how transcription is modulated at homeotic loci and at other loci under PcG control (Bornemann, 1996 and references).
The Scm and Polyhomeotic (Ph) proteins share a domain, termed the SPM domain, located at their respective C termini. The yeast two-hybrid system and in vitro protein-binding assays show that the SPM domain mediates direct interaction between Scm and Ph. Binding studies with isolated SPM domains from Scm and Ph show that the domain is sufficient for these protein interactions. These studies also show that the Scm-Ph and Scm-Scm domain interactions are much stronger than the Ph-Ph domain interaction, indicating that the isolated domain has intrinsic binding specificity determinants. Analysis of site-directed point mutations identifies residues that are important for SPM domain function. These binding properties, the predicted alpha-helical secondary structure, and the conservation of hydrophobic residues have prompted comparisons of the SPM domain to the helix-loop-helix and leucine zipper domains used for homotypic and heterotypic protein interactions in other transcriptional regulators. Scm and Ph proteins co-localize at polytene chromosome sites in vivo (Peterson, 1997).
To begin to investigate the mechanism and specific residues used for SPM domain protein contact, the effects of site-directed mutations in either the Scm or Ph domains on in vitro binding were tested. The mutations were targeted to residues that are highly conserved in alignments of proteins with similar domains. The point mutations fall into two classes: those that target conserved residues in the extended SAM domain family and those that target residues conserved only in the high-homology SPM subgroup. Three site-directed mutations have been generated in the SPM domain of Scm. The G31S mutation alters a residue that is absolutely conserved in all 23 compiled versions in the extended domain family. This mutant was tested in the context of radiolabelled full-length Scm protein for binding to the minimal Scm and Ph domains. Both Scm-Scm and Scm-Ph interactions are greatly reduced in vitro. Consistent with the residual binding activity seen in the G31S mutant, G31S is found to mediate a reduced but still detectable interaction in the two-hybrid system (Peterson, 1997).
The L35S;L36S double mutation and the K49A mutation affect residues conserved in the high-homology subgroup but not in the extended domain family. Substantial self- and cross-binding activity is retained with these mutant proteins. The only reduction seen with these two mutants is a modest effect of the L35S;L36S double substitution upon the Scm-Scm interaction. This mutant causes a several-fold loss in Scm-Scm binding but retains Scm-Ph cross-binding activity comparable to that of the wild type (Peterson, 1997).
Five site-directed mutations were generated in the SPM domain of Polyhomeotic. All five mutations alter residues that are highly conserved in the extended domain family. These mutations were inserted into the context of the minimal GSTph1511-1576 fusion protein and then tested for binding to the radiolabelled minimal Scm domain. The W1A and G51A ph mutations cause significant reductions in binding activity to Scm. In contrast, mutations in the conserved hydrophobic residues (L34A, L42A, and I63D) have little effect on the in vitro Scm-Ph interaction (Peterson, 1997).
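The mutation labels used above follow the common substitution shorthand: wild-type residue, position, replacement residue, with a semicolon joining the parts of a double mutant. As a minimal sketch (the names `Substitution` and `parse_mutation` are hypothetical helpers, not part of the cited work), the labels can be decoded like this:

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    wild_type: str  # one-letter code of the original residue
    position: int   # residue number exactly as written in the label
    mutant: str     # one-letter code of the replacement residue


# One substitution: letter, digits, letter (for example "G31S").
_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(label: str) -> list[Substitution]:
    """Split a label such as 'L35S;L36S' into its single substitutions."""
    substitutions = []
    for part in label.split(";"):
        match = _PATTERN.match(part.strip())
        if match is None:
            raise ValueError(f"unrecognized substitution: {part!r}")
        wild_type, position, mutant = match.groups()
        substitutions.append(Substitution(wild_type, int(position), mutant))
    return substitutions


# Labels named in the text, grouped by the protein whose SPM domain was altered.
SCM_MUTATIONS = ["G31S", "L35S;L36S", "K49A"]
PH_MUTATIONS = ["W1A", "G51A", "L34A", "L42A", "I63D"]

if __name__ == "__main__":
    for label in SCM_MUTATIONS + PH_MUTATIONS:
        print(label, parse_mutation(label))
```

For instance, `parse_mutation("I63D")` describes an isoleucine at position 63 replaced by aspartate, a charged residue substituted for a hydrophobic one.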
The two-hybrid and GST pulldown assays show that the Scm and Ph proteins can bind each other directly and that their respective SPM domains mediate qualitatively strong interactions. However, these experiments do not address whether the Scm and Ph proteins are partners at sites of action in vivo. To assess association in vivo, the Scm and Ph distributions were compared on wild-type polytene chromosomes. In addition, colocalization was tested at an engineered chromosomal site containing an isolated segment of homeotic gene regulatory DNA. Polytene chromosome immunostaining experiments have shown that Ph protein accumulates at its two best-characterized target loci, the Antennapedia (Ant-C) and bithorax (BX-C) homeotic gene complexes. In addition, Ph protein is associated with approximately 100 other sites in the genome. Ph protein immunolocalizes at the BX-C site as well as at five flanking sites on chromosome 3R. The same section of chromosome stains with antibody against Scm protein. There is strong signal at the BX-C locus, and the Scm distribution on flanking sites is identical to the Ph distribution. The Ph and Scm protein distributions in the Ant-C region are also identical. Because the antibodies used in these studies are both rabbit polyclonal antibodies, double-staining experiments to determine whether all of the approximately 100 Ph and Scm sites are identical could not be performed. However, comparison of the Scm sites on the five major chromosome arms with the Ph sites indicates that there is at least 90% overlap in the distributions of these two proteins on polytene chromosomes (Peterson, 1997).
To compare Ph and Scm association with an additional site of action in vivo, colocalization was tested at a site containing regulatory DNA isolated from a homeotic gene. The germ line transformant, 85-39, contains a 14-kb segment from the bxd regulatory region of the BX-C complex inserted near the tip of chromosome 3L at cytological location 62A. Previous work has shown that this transformed DNA segment creates a novel site of Ph protein accumulation and that expression programmed by this 14-kb DNA segment is regulated by Ph and Scm in vivo. Scm protein accumulates at the insertion site of this bxd regulatory DNA. Thus, Scm and Ph proteins are both recruited to an engineered chromosomal site containing an in vivo regulatory target. This result, together with the coincidence of the Ph and Scm proteins at many wild-type chromosomal sites, provides evidence for association of these proteins in vivo (Peterson, 1997).
Bases in 5' UTR - 434
Bases in 3' UTR - 676 and 979
The predicted Scm protein is 877 amino acids long with a relative molecular mass of 94,000 Da and a pI of 9.4. A potential nuclear localization signal (RQRGRPAKR) starts at amino acid 52. Restriction mapping and sequence analysis have shown that the difference between the 4.1 and 3.8 kb cDNAs results from use of alternative polyadenylation sites. The poly(A) tail of the longer cDNA begins at position 4046 whereas the poly(A) tail of the shorter cDNA begins at position 3744. There is a consensus poly(A) addition signal located 25 bp upstream of the 3.8 kb cDNA poly(A) tail. There are only imperfect matches to the poly(A) addition signal in the region immediately upstream of the 4.1 kb cDNA poly(A) tail (Bornemann, 1996).
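The figures quoted above can be cross-checked with simple arithmetic. The sketch below uses only numbers from the text; the function names are hypothetical, and the mean residue mass is a rough plausibility check rather than a recalculation of the published value.

```python
# Values quoted in the text (Bornemann, 1996).
POLYA_START_LONG = 4046    # first position of the poly(A) tail, longer cDNA
POLYA_START_SHORT = 3744   # first position of the poly(A) tail, shorter cDNA
NLS_SEQUENCE = "RQRGRPAKR"  # potential nuclear localization signal
NLS_START = 52             # first residue of the signal (1-based)
PROTEIN_LENGTH = 877       # predicted number of amino acids
PREDICTED_MASS_DA = 94_000


def polya_offset(long_start: int, short_start: int) -> int:
    """Distance in nucleotides between the two poly(A) tail start positions."""
    return long_start - short_start


def motif_span(start: int, motif: str) -> tuple[int, int]:
    """First and last residue numbers (1-based, inclusive) covered by a motif."""
    return start, start + len(motif) - 1


def mean_residue_mass(mass_da: float, n_residues: int) -> float:
    """Average mass per residue implied by a total mass and a length."""
    return mass_da / n_residues


if __name__ == "__main__":
    print(polya_offset(POLYA_START_LONG, POLYA_START_SHORT))  # 302
    print(motif_span(NLS_START, NLS_SEQUENCE))                # (52, 60)
    print(round(mean_residue_mass(PREDICTED_MASS_DA, PROTEIN_LENGTH), 1))  # 107.2
```

The 302-nucleotide offset is consistent with the roughly 0.3 kb difference between the 4.1 and 3.8 kb cDNAs, and the nine-residue signal would span amino acids 52 through 60.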
The Scm and Polyhomeotic (DeCamillis, 1992) proteins share a homologous region, the SPM domain. This domain is 38% identical between the two proteins over a length of 65 amino acids. In each protein the SPM domain is located at the C terminus, and it is predicted to be largely alpha-helical. Besides these proteins, numerous proteins contain a related domain with much lower overall identity (Alkema, 1997; Ponting, 1995). These more distantly related proteins include members of the Ets family of transcription factors and yeast proteins required for mating. The high-homology domain subgroup that includes the Scm and Ph versions is referred to as the SPM domain, and the extended domain family is referred to as the SAM domain (Ponting, 1995). One of the better-characterized SAM domains is present in the human TEL oncoprotein, an Ets class transcription factor, where the SAM domain has been referred to as a helix-loop-helix (HLH) domain. Recent studies have shown that this domain mediates self-binding and oligomerization of TEL protein and of TEL fusion protein derivatives (Peterson, 1997).
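To make the identity figure concrete, the sketch below shows one simple way percent identity is computed for two aligned sequences of equal length, and what 38% over 65 positions implies as a residue count. The function names and the short test strings are hypothetical; they are not the actual Scm or Ph sequences, and the cited work may have used a different alignment convention.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of alignment columns in which both sequences have the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b)
    return 100.0 * matches / len(aligned_a)


def implied_identical_positions(identity_percent: float, length: int) -> int:
    """Nearest whole number of identical positions for a given identity and length."""
    return round(identity_percent * length / 100)


if __name__ == "__main__":
    # Toy peptides for illustration only (3 of 4 positions match).
    print(percent_identity("ACDE", "ACDF"))     # 75.0
    # Figures quoted in the text: 38% identity over 65 amino acids.
    print(implied_identical_positions(38, 65))  # 25
```

By this count, roughly 25 of the 65 SPM domain positions are identical between Scm and Ph.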
Besides sharing homology with other proteins in the SPM domain, Scm is even more similar to another fly protein, the product of the tumor suppressor gene lethal (3) malignant brain tumor (l(3)mbt; Wismar, 1995). The Scm and L(3)mbt proteins share zinc fingers, the SPM domain, and a third domain consisting of repeats about 100 amino acids long. These repeats, termed mbt repeats (Wismar, 1995; Bornemann, 1996), are present in two tandem copies in Scm and three copies in L(3)mbt. The biochemical role of mbt repeats is not known (Peterson, 1997).
Scm, Polyhomeotic, Rae-28 and L(3)mbt all contain putative Cys2-Cys2 zinc fingers that define a distinct zinc finger subclass, which is marked by identical spacing between the cysteine pairs and conservation of residues that flank the cysteines. There are two such zinc fingers located near the N terminus of Scm protein and a single finger in the Ph, Rae-28 and L(3)mbt proteins. The spacing between cysteines is distinct from that of the Cys2-Cys2 fingers of known DNA-binding proteins such as the nuclear hormone receptors. Scm also contains a third potential zinc-binding region (Zn3), which differs from the N-terminal fingers and is not shared by Ph, Rae-28 or L(3)mbt. This third region can be arranged as a Cys2-Cys2 finger, but the presence of additional cysteine and histidine residues between the outer cysteine pairs may reflect alternative forms of a zinc-binding domain (Bornemann, 1996).
Scm protein also contains a region with a high density of alanine residues. Starting at amino acid position 748, there is a stretch of 29 residues composed of 52% alanine. Alanine-rich regions have been associated with transcriptional repression domains in the Drosophila Engrailed, Even-skipped and Krüppel proteins (Bornemann, 1996).
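A compositional bias of this kind is typically found by scanning a sequence with a fixed-size window. The following is a minimal sketch of that idea, not the method of the cited study; the function names and the toy sequence are hypothetical, and only the position, length and percentage come from the text.

```python
def residue_fraction(segment: str, residue: str = "A") -> float:
    """Fraction of positions in `segment` occupied by `residue`."""
    if not segment:
        raise ValueError("segment must not be empty")
    return segment.count(residue) / len(segment)


def rich_windows(sequence: str, window: int, threshold: float,
                 residue: str = "A") -> list[int]:
    """1-based start positions of windows whose residue fraction meets the threshold."""
    starts = []
    for i in range(len(sequence) - window + 1):
        if residue_fraction(sequence[i:i + window], residue) >= threshold:
            starts.append(i + 1)
    return starts


# Figures quoted in the text.
STRETCH_START = 748
STRETCH_LENGTH = 29
ALANINE_FRACTION = 0.52

# Last residue of the stretch and the alanine count implied by 52% of 29.
stretch_end = STRETCH_START + STRETCH_LENGTH - 1          # 776
alanine_count = round(ALANINE_FRACTION * STRETCH_LENGTH)  # 15

if __name__ == "__main__":
    print(stretch_end, alanine_count)
    # Toy sequence for illustration only: one all-alanine window of length 4.
    print(rich_windows("GSAAAAGS", window=4, threshold=1.0))  # [3]
```

Read this way, the stretch would run from residue 748 to residue 776 and contain about 15 alanines.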
date revised: 13 Sept 99
Home page: The Interactive Fly © 1995, 1996 Thomas B. Brody, Ph.D.
The Interactive Fly resides on the Society for Developmental Biology's Web server.