modifier of mdg4


REGULATION

Targets of Activity

The 5'-untranslated region of the Drosophila gypsy retrotransposon contains an "insulator," which disrupts the interactions between distally located enhancers and proximal promoter elements. The insulator effect is dependent on the suppressor of Hairy-wing (su[Hw]) protein, which binds to reiterated sites within the 350 base pairs of the gypsy insulator, and additionally acts as a transcriptional activator of gypsy. This study shows that the 350-base pair su(Hw) binding site-containing gypsy insulator behaves as a matrix/scaffold attachment region (MAR/SAR), involved in interactions with the nuclear matrix. In vitro experiments using nuclear matrices from Drosophila, murine, and human cells demonstrate specific binding of the gypsy insulator, not observed with any other sequence within the retrotransposon. Moreover, it is shown that the gypsy insulator, like previously characterized MAR/SARs, specifically interacts with topoisomerase II and histone H1, i.e. with two essential components of the nuclear matrix. Experiments within cells in culture demonstrate differential effects of the gypsy MAR sequence on reporter genes, namely no effect under conditions of transient transfection and a repressing effect in stable transformants, as expected for a sequence involved in chromatin structure and organization (Nabirochkin, 1998).

The presence of a MAR/SAR within gypsy is not totally unexpected, since "boundary" elements are in general regions which contain not only enhancer and insulating elements, but also matrix attachment domains. The rather original feature of the gypsy sequence is that all three domains, which in general are sufficiently "dispersed" so as to allow isolation of "pure" enhancers, MAR/SAR, or insulators, are in the present case "gathered" within a single and relatively short (350 bp) sequence. This rather uncommon situation might in fact be relevant to the pressure for compactness within retroviral sequences, as it is known that retroviruses can only package a limited amount of genetic information. A consequence of compaction is that the gypsy insulator and its associated components are most probably interacting, in vivo, with elements of the nuclear matrix. Accordingly, proteins of the nuclear matrix might play a role in the insulation process, and conversely the su(Hw) protein (which is essential for insulation) might interact with proteins of the matrix. Such interactions could actually account for the data on gypsy insulation and fit with previously proposed models for the gypsy effects (Nabirochkin, 1998).

A first series of data strongly suggested that the gypsy insulator, like all previously characterized insulators, essentially prevents interactions between distal enhancer and promoter, without any direct repressing effect on the enhancer itself. This directional effect can most easily be accounted for by the "looping model" involving generation of structural domains isolated one from the other by attachment of boundary sequences (MAR/SAR) to the nuclear matrix. Alternatively, a series of data on gypsy insulation (essentially in mod(mdg4) mutants) discloses bidirectional repressing effects, which can be accounted for by a model involving heterochromatinization. The present data (showing that the gypsy insulator behaves as a MAR/SAR) are clearly in agreement with the structural looping model, but also support the heterochromatinization model. Indeed, the gypsy MAR/SAR DNA per se, in the absence of su(Hw) protein, is involved in histone H1 nucleation (as shown in this paper), and it has been demonstrated that histone H1 nucleation is associated with both DNA compaction and transcriptional silencing. Additionally, Laemmli and co-workers have found that histone H1 can be removed from MAR/SAR domains by distamycin and distamycin-like proteins (D-like proteins, such as the high mobility group proteins); this has led to the proposal that MAR/SARs can activate or repress transcription of adjacent genes depending on the nucleation/depletion of histone H1. The gypsy MAR/SAR could then be responsible for the repressing effect observed in the mod(mdg4) mutants, as well as in the present assay within heterologous cells (assuming further that appropriate D-like proteins are absent in those cells). 
Taking into account, in addition, that mutations in the mod(mdg4) or the su(Hw) genes modify position-effect variegation, it could be further hypothesized that the su(Hw)/mod(Mdg4) complex acts as the D-like proteins and modifies the nucleation processes to allow the switch from a repressing to an active state. Accordingly, a model in which the su(Hw) binding sites and the associated su(Hw)/mod(Mdg4) complex modulate the effects of the MAR/SAR DNA sequence could rather simply account for the biological effects of the gypsy insulator in both the wild type and su(Hw)/mod(mdg4) mutants. The proposed model would then reconcile the two previous models for gypsy insulation, i.e. the heterochromatinization and the looping models (Nabirochkin, 1998 and references).

Trans-splicing of mod(mdg4)

The Drosophila BTB domain containing gene mod(mdg4) produces a large number of protein isoforms combining a common N-terminal region of 402 aa with different C termini. The genomic structure of this complex locus has been deduced and it has been found that at least seven of the mod(mdg4) isoforms are encoded on both of its antiparallel DNA strands, suggesting the generation of mature mRNAs by trans-splicing. Drosophila can produce mod(mdg4) mRNAs by trans-splicing of pre-mRNAs generated from transgenes inserted at distant chromosomal positions. Evidence is presented for the occurrence of trans-splicing of mod(mdg4)-specific exons encoded by the parallel DNA strand. The mod(mdg4) locus represents a new type of complex gene structure in which genetic complexity is resolved by extensive trans-splicing, raising important implications for genome sequencing projects. Demonstration of naturally occurring trans-splicing in the model organism Drosophila opens new experimental approaches toward an analysis of the underlying mechanisms (Dorn, 2001).

During the molecular analysis of the mod(mdg4) locus, 26 different classes of transcripts were identified all containing a common 5' sequence (exons 1-4) but different 3' regions. The deduced proteins contain an N-terminal (BTB) domain. Furthermore, in most of the isoforms a conserved C-terminal C2H2-containing protein motif is found. Two new cDNA clones representing isoforms mod(mdg4)-52.2 and mod(mdg4)-54.1 were isolated by screening an embryonic cDNA library. Two additional putative isoforms, mod(mdg4)-53.6 and mod(mdg4)-54.7 have been detected by searching the genomic region of mod(mdg4) for ORFs containing the C-terminal C2H2 consensus sequence. Mod(mdg4)-58.8 represents another new isoform, which was so far identified by a 3'-truncated cDNA clone. RT-PCR experiments reveal the existence of mod(mdg4)-53.6, -54.7, and -58.8 transcripts in early embryos. By sequencing the resulting PCR products, the putative proteins were deduced. All of them contain the conserved C-terminal protein consensus sequence. Altogether 20 of 26 identified mod(mdg4) isoforms combine both the N-terminal BTB domain and the C-terminal consensus sequence -- this might be of functional significance (Dorn, 2001).

On the basis of extensive cDNA sequence data and the available sequence of the mod(mdg4) region, the exon/intron structure of mod(mdg4) has been deduced. Interestingly, seven of the transcripts are not colinearly located within the locus. Relative to the common exons 1-4, exon 5 of isoforms mod(mdg4)-53.1, -62.3, -55.6, -53.6, -54.7, -57.4, and -67.2 are encoded by the antiparallel DNA strand. Beside this finding, the exon/intron structure of mod(mdg4) suggests differential splicing within isoform-specific exons (isoforms mod(mdg4)-55.7 and -52.2; mod(mdg4)-54.6, -56.3, -54.2, and -46.3; mod(mdg4)-58.6 and -54.1). According to its genomic structure, the genetic density at the mod(mdg4) complex is unusually high. Altogether 26 independent transcripts of an average size of 2 kb are encoded in a genomic region of 28 kb (Dorn, 2001).
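The density figures quoted above can be checked with a short calculation. The following is only a rough sketch using the three numbers given in the text (26 transcripts, an average size of 2 kb, a 28-kb region); the function name `coding_density` is hypothetical, and the sum counts the common exons 1-4 once per transcript, so it overstates the amount of unique sequence.

```python
# Rough arithmetic on the figures quoted from Dorn (2001).
# Assumption: every transcript is taken at the stated average size, and the
# shared exons 1-4 are counted once per transcript (an overestimate of
# unique sequence, used here only to illustrate the density of the locus).
N_TRANSCRIPTS = 26        # independent transcripts reported for the locus
AVG_TRANSCRIPT_KB = 2.0   # average transcript size, in kb
LOCUS_KB = 28.0           # size of the genomic region, in kb


def coding_density(n_transcripts, avg_kb, locus_kb):
    """Return summed transcript length divided by genomic length."""
    return (n_transcripts * avg_kb) / locus_kb


total_kb = N_TRANSCRIPTS * AVG_TRANSCRIPT_KB
density = coding_density(N_TRANSCRIPTS, AVG_TRANSCRIPT_KB, LOCUS_KB)

print(f"summed transcript length: {total_kb:.0f} kb")
print(f"transcript kb per genomic kb: {density:.2f}")
```

On these assumptions the summed transcript length exceeds the length of the genomic region itself, which is only possible because exons are shared between transcripts and both DNA strands are used.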

To demonstrate trans-splicing, specific exons from the common exons 1-4 have been separated to different chromosomes via transgene insertions. In this assay, the specific exons mod(mdg4)-55.1 and mod(mdg4)-53.1, which are encoded by antiparallel DNA strands, were chosen. Notably, the 3'-untranslated regions of both isoforms are antisense within a region of 149 nt. This experiment should also test for putative trans-splicing of isoform mod(mdg4)-55.1, which is colinearly located with respect to the common exons 1-4. In the transgenes, both specific exons have been sequence-tagged via PCR and the resulting fragment containing genomic sequences of 2.0-kb and 1.2-kb upstream of the splice sites of mod(mdg4)-55.1 and mod(mdg4)-53.1, respectively, was cloned in both orientations into the Drosophila transformation vector pUAST. Expression of the inserted sequences is induced by the yeast transcriptional activator GAL4, expressed from an independent driver element (Dorn, 2001).

One of the important criteria for efficient mRNA trans-splicing is the presence of a 3' splice site in absence of a functional 5' splice site as represented in outrons. To meet this criterion, the production of independent transcript(s) containing one or several of the endogenous-specific mod(mdg4) exons would be expected. Searching for putative promoter elements within the mod(mdg4) complex, several TATA-box-containing elements were found. One of these is located upstream of the specific exon mod(mdg4)-55.1 and is contained within the transgene construct used for the trans-splicing assay. To prove its function in vivo, all transgenic lines have been tested for expression of the transgene in absence of any GAL4-driver element. Independent of the insertion site of the transgene and its orientation relative to the UAS sequence, trans-splicing of the tagged specific exon mod(mdg4)-55.1 could be demonstrated. For PCR primers, again the forward primer E4-F and the primer 55.1-tag1-back have been used. However, the level of expression and/or the efficiency of trans-splicing is variable in different transformants. This could be because of chromosomal position effects, depending on the insertion site of the transgene. It is concluded from these results that independent mRNAs containing the common exons 1-4, in one case, and mRNAs containing the specific exon mod(mdg4)-55.1 in the other case, are produced endogenously (Dorn, 2001).

The mod(mdg4) locus represents an unusual type of gene structure. Both DNA strands within the locus are used to encode a large number of protein isoforms. A transgenic approach clearly demonstrates that both the colinearly located specific exons [demonstrated for exon mod(mdg4)-55.1] and those encoded by the antiparallel DNA strand [shown for exon mod(mdg4)-53.1] are substrates for trans-splicing. This result also suggests that all other mod(mdg4) isoforms might be generated by trans-splicing, implying the initiation of independent pre-mRNAs at several promoter elements within the mod(mdg4) complex. Multiple TATA-box-containing elements were found throughout the locus. One of these is located upstream of the mod(mdg4)-55.1 isoform and is contained in the transgene. Expression of the transgene independent of the inducible promoter element in six independent transgenic lines indicates a putative promoter function. However, further experiments should demonstrate the existence of multiple promoters at the mod(mdg4) locus. Moreover, these results demonstrate that trans-splicing occurs within mod(mdg4), independent of the chromosomal context of the common exons 1-4 and the 3'-specific exons. This also raises the question about the special requirements for initiating trans-splicing at mod(mdg4). Further experiments should clarify whether RNA recognition or nuclear compartmentalization or both plays a role in the initiation of trans-splicing (Dorn, 2001).

The data suggest that trans-splicing is a general property of the mod(mdg4) locus. Three possible types of trans-splicing events have been envisioned. The transcript containing common exons 1-4 is produced in large quantities and contains putative interaction sites with upstream regions of pre-mRNAs containing the specific mod(mdg4) exon(s). The specific exons are transcribed as mono-exonic, di-exonic, or polyexonic mRNAs from both cDNA strands. Depending on the site of trans-splicing, three different protein isoforms (A, B, and C) are produced. The expression of the alternatively spliced mod(mdg4) isoforms could be regulated at several levels: (1) differential spatial and temporal expression of pre-mRNAs containing one (or groups of) alternatively spliced specific exons; (2) differences in selectivity and efficiency of trans-splicing to generate different quantities of mature mRNAs, and (3) variable stability of the isoform specific transcripts. As a result, >20 different mod(mdg4) protein isoforms are produced that all contain a common region of 402 aa, including the N-terminal BTB domain implicated in dimerization/oligomerization and variable C termini with the conserved C2H2 motif. The variable C termini are implicated to specify the function of individual isoforms in different processes like chromatin insulator function, programmed cell death, or modification of gene silencing (Dorn, 2001).

Two mutant alleles of the same gene, each located in one of the two homologous chromosomes, may in some instances restore the wild-type function of the gene. This is the case with certain combinations of mutant alleles in the mod(mdg4) gene. This gene encodes several different proteins, including Mod(mdg4)2.2, a component of the gypsy insulator. This protein is encoded by two separate transcription units that can be combined in a trans-splicing reaction to form the mature Mod(mdg4)2.2-encoding RNA. Molecular characterization of complementing alleles shows that they affect the two different transcription units. Flies homozygous for each allele are missing the Mod(mdg4)2.2 protein, whereas wild-type trans-heterozygotes are able to synthesize almost normal levels of the Mod(mdg4)2.2 product. This protein is functional as judged by its ability to form a functional insulator complex. The results suggest that the interallelic complementation in the mod(mdg4) gene is a consequence of trans-splicing between two different mutant transcripts. A conclusion from this observation is that the trans-splicing reaction that takes place between transcripts produced on two different mutant chromosomes ensures wild-type levels of functional protein (Mongelard, 2002).

The interallelic complementation mechanism reported here is different from those previously described. Interallelic complementation between mutations affecting the coding region of the gene has been observed when each allele is affected in only one of two separate functional domains of a multifunctional protein. Two alleles, each deficient in a different domain, may complement one another. Such a complementation mechanism requires the production of abnormal proteins by the mutant loci. This is not the case for mod(mdg4), as assessed by Western blots and in situ immunodetection. The complementation observed here has its molecular origin in the cell's ability to produce a wild-type RNA by combining information present in two mutant transcripts. A trans-splicing event is most probably involved in the production of the final Mod(mdg4)2.2 mRNA. How trans-splicing is integrated with transcription and pre-mRNA processing reactions remains to be addressed. Abundant evidence suggests that transcription and processing of the mRNA are coordinated nuclear events. For example, splicing factors are recruited to the sites of transcription by RNA polymerase II. Similarly, mRNA capping and polyadenylation seem to occur right at the transcription site. Finally, even the packaging of the mature mRNA into heterogeneous ribonucleoparticles prior to cytoplasmic export could be coupled to other pre-mRNA processes. One may therefore argue that to enter a trans-splicing reaction, two premessenger RNAs need to be in physical proximity also during their transcription, before they engage in cis-splicing and other processing events. In the case of a wild-type mod(mdg4) locus, this condition is always met, thanks to the configuration of the gene: the two transcription units are in the same locus. 
In the case of flies undergoing interallelic complementation, this proximity condition may be met because of the extensive somatic pairing that exists between homologous chromosomes in both polytene and diploid cells (Mongelard, 2002).

It remains an open question whether a physiologically significant number of trans-splicing events are possible when the transcripts involved are produced at distant nuclear locations. It has been recently shown that two transcripts, one produced by the normal mod(mdg4) gene and a second one by a transgene inserted elsewhere in the genome, may be combined, presumably by trans-splicing. This latter study used a nonquantitative RT-PCR-based assay to detect the trans-spliced mRNA. It is therefore difficult to assess whether the increased distance between sites of transcription of both pre-mRNAs diminishes the efficiency of trans-splicing. If the level of trans-splicing is low when the mutant alleles are physically far away in the genome, the levels of protein synthesized might not be sufficient to restore the wild-type function of the mod(mdg4) gene. If this is the case, interallelic complementation at the mod(mdg4) locus will exhibit properties similar to transvection, in which phenotypic complementation is sensitive to the pairing of the two complementing alleles. The combined analysis of trans-splicing and somatic pairing of homologous chromosomes may constitute a powerful tool to study the intricate succession of events involved in the transcription and processing of the RNA in higher eukaryotes (Mongelard, 2002).

The modifier of mdg4, mod(mdg4), locus in Drosophila melanogaster represents a new type of complex gene in which functional diversity is resolved by mRNA trans-splicing. A protein family of greater than 30 transcriptional regulators, which are supposed to be involved in higher-order chromatin structure, is encoded by both DNA strands of this locus. Mutations in mod(mdg4) have been identified independently in a number of genetic screens involving position-effect variegation, modulation of chromatin insulators, apoptosis, pathfinding of nerve cells, and chromosome pairing, indicating pleiotropic effects. The unusual gene structure and mRNA trans-splicing are evolutionarily conserved in the distantly related species Drosophila virilis. Chimeric mod(mdg4) transcripts encoded from nonhomologous chromosomes containing the splice donor from D. virilis and the acceptor from D. melanogaster are produced in transgenic flies. A significant amount of protein can be produced from these chimeric mRNAs. The evolutionary and functional conservation of mod(mdg4) and mRNA trans-splicing in both Drosophila species is furthermore demonstrated by the ability of D. virilis mod(mdg4) transgenes to rescue recessive lethality of mod(mdg4) mutant alleles in D. melanogaster (Gabler, 2005).

The majority of genes in higher eukaryotes represent monocistronic units where noncoding intron regions interrupt the protein-coding exon sequences. The resulting mature mRNA usually encodes a unique polypeptide. Recent advances in genome analysis of several model organisms and the molecular characterization of a large number of genes revealed that alternative pre-mRNA splicing is one of the main mechanisms generating a highly expanded proteome diversity. Thus, protein families with slightly different isoforms or even proteins with unrelated functions can be produced from single or multiple promoter elements within one gene. Regulatory integration of different transcriptional units is found in gene complexes like Hox genes, hemoglobin genes, or immunoglobulin genes. This organization reflects clustering of genes with related functions. With mod(mdg4) a new type of functional clustering has been discovered in Drosophila. This complex locus encodes greater than 30 isoforms generated by mRNA trans-splicing. Protein isoforms produced by mod(mdg4) contain a common 402-amino-acid N-terminal region encoded by the four 5'-exons but differ in their C-terminal region encoded by alternative 3'-exons. This kind of trans-splicing clearly differs from splice leader trans-splicing that predominates in Caenorhabditis and Trypanosomes where polycistronic transcripts are resolved by addition of noncoding leader sequences. Mutational dissection and differential binding of Mod(mdg4) isoforms on polytene chromosomes suggest that the variable C-terminal regions encoded by any of the alternative 3'-exons determine functional specificity. Specific Mod(mdg4) isoforms are supposed to be involved in control of heterochromatic gene silencing, regulation of homeotic genes, function of chromatin insulators, nerve cell pathfinding, induction of apoptosis, and control of meiotic processes. 
Genomic structure and transgene analysis demonstrate the specific functional organization of the complex mod(mdg4) locus. Mature mod(mdg4) transcripts are generated by a trans-splicing mechanism combining one primary transcript comprising the common four 5'-exons with another transcription unit contributing one of the alternative 3'-exons. A comparably complex gene structure was also described for a number of other genes in Drosophila, including Broad, tramtrack, GAGA-factor/Trl, and lola, all of which encode numerous protein isoforms with alternative C termini. Interestingly, in addition to mod(mdg4), mRNA trans-splicing was recently reported for the lola locus. Another unique characteristic of these genes is that they all encode BTB/POZ domain proteins, which frequently contain Cys2His2 zinc-finger motifs within the variable C-terminal region (Gabler, 2005).

The limited knowledge of the functional significance of the large number of mod(mdg4) isoforms and the unusual type of gene structure in D. melanogaster prompted an analysis of the orthologous locus from the distantly related species D. virilis. It represents an evolutionarily distant species that was separated ~40–60 million years ago from the Sophophora, which includes D. melanogaster. This period of time allowed for the selection of functionally essential genes. A number of orthologous genes have been studied in detail and their functional conservation in D. virilis was demonstrated by mutant rescue experiments. The degree of the overall conservation within coding regions is variable and can reach up to 98% similarity. The results demonstrate a strong evolutionary conservation of all Mod(mdg4) isoforms identified in D. virilis, indicating the functional significance of the multiple isoforms. Evidence has been presented for a functional differentiation of at least two isoforms, Mod(mdg4)-58.0 and Mod(mdg4)-67.2 in D. melanogaster. The high degree of sequence conservation of both isoforms in D. virilis is in good agreement with binding to corresponding sites on polytene chromosomes as shown for isoform Mod(mdg4)-58.0. Its binding to corresponding subdivisions on polytene chromosomes suggests an involvement in regulation of a subset of orthologous genes in D. melanogaster and D. virilis (Gabler, 2005).

The common N-terminal region, which is part of all isoforms and therefore supposed to contribute general functions, shows an extended identity beyond the BTB/POZ domain. This common protein region represents about two-thirds of any of the Mod(mdg4) proteins. The ubiquitously expressed protein Chip interacts with the common region of Mod(mdg4) in D. melanogaster. Chip is supposed to facilitate enhancer-promoter interactions in a large number of genes and interacts genetically and physically with several LIM- and homeodomain-containing transcription factors. These data, together with the observed pleiotropic mutant effects of most mod(mdg4) mutants, indicate a putative link between the several hundred binding sites of Mod(mdg4) on polytene chromosomes and their involvement in transcriptional regulation of a large number of genes. The strong conservation of the common protein region in both Drosophila species might be the consequence of the evolutionarily conserved interaction with Chip and other putative interacting proteins. The N-terminal BTB/POZ domain is almost identical in both species. This domain was shown to mediate homo- and/or heterodimerization. A similar degree of conservation between D. melanogaster and D. virilis was found for the BTB/POZ domain containing gene GAGA/Trl. Also in this case at least two alternatively spliced isoforms containing a common N-terminal region of 400 amino acids but variable C termini have been described. However, in contrast to mod(mdg4), no significant functional differentiation between the two GAGA isoforms has been described (Gabler, 2005).

If specific C termini of orthologous Mod(mdg4) isoforms are compared, a remarkable degree of identity within the FLYWCH domain, a Cys2His2-motif-containing protein domain, is found. This domain is supposed to be involved in protein-protein interactions. Strong conservation of most amino acid positions within this motif between orthologous isoforms implies their functional importance for isoform-specific interactions with other proteins. The unique C-terminal region of isoform Mod(mdg4)-67.2 has been demonstrated to interact with Su(Hw) to create a functional gypsy insulator element whereas the unique C terminus of isoform Mod(mdg4)-56.3/Doom interacts with the baculovirus inhibitor of apoptosis protein/IAP. The high degree of sequence identity suggests that these interactions are conserved in D. virilis. If the orthologous D. virilis isoforms Mod(mdg4)-64.2, Mod(mdg4)-60.1, and Mod(mdg4)-67.2 are compared with their counterparts in D. melanogaster, it becomes evident that additional amino acid positions flanking the FLYWCH motif are highly conserved. However, the extension and the location of the identity beyond the FLYWCH motif is isoform dependent. In case of Mod(mdg4)-67.2, an additional strongly conserved sequence motif of 22 amino acids is located at the C terminus. On the basis of pull-down experiments with a C-terminal truncated (deletion of 43 amino acids) Mod(mdg4)-67.2 protein and the observed phenotype connected with the corresponding mutant protein (Mod(mdg4)-67.2T6) the FLYWCH domain itself is not sufficient for interaction with Su(Hw), indicating the functional importance of the strongly conserved 22 C-terminal amino acids. Also, the isoforms without the FLYWCH motif are conserved as shown for Mod(mdg4)-58.0 (identity of 51% within the unique C terminus). 
Recently, an evolutionary analysis of several Dipteran orthologous mod(mdg4) loci revealed a significant conservation of most isoforms, including Mod(mdg4)-58.0, Mod(mdg4)-60.1, Mod(mdg4)-64.2, and Mod(mdg4)-67.2 (Gabler, 2005).

Two conclusions can be drawn from the evolutionary conservation of Mod(mdg4) proteins. First, the large number of isoforms is functionally important in both Drosophila species and second, the conservation of the unique C-terminal regions clearly points to a functional differentiation between single isoforms (Gabler, 2005).

In the present study it was demonstrated that, along with the evolutionary conservation of the unusual gene structure of mod(mdg4) in D. virilis, mRNA trans-splicing is also conserved in both species. Three different assays were performed to prove the existence of chimeric transcripts in vivo. The identification of chimeric mod(mdg4) isoforms in transgenic flies clearly indicates that the mechanism of mRNA trans-splicing is conserved between the distantly related Drosophila species. Quantitative RT-PCR experiments reveal that in case of isoform Mod(mdg4)-67.2 the chimeric D. virilis/D. melanogaster transcript in transgenic flies containing two copies of the second chromosomal P(w+ Dv mod(mdg4) 6.8kb NotI-XbaI) transgene represents ~12% of the corresponding endogenous D. melanogaster transcript. The Mod(mdg4)-67.2 protein can be clearly detected on polytene chromosomes of 2-P(w+ Dv mod(mdg4) 11.5kb NotI)/+; mod(mdg4)02/mod(mdg4)02 larvae but not in mod(mdg4)02 homozygous larvae. Because the specific mod(mdg4)-67.2 exons are not encoded by the D. virilis transgene, this result strongly suggests that the cytologically detected protein represents the chimeric D. virilis/D. melanogaster Mod(mdg4)-67.2 protein, which is produced in a significant amount. In fact, the presence of considerable amounts of the full-length Mod(mdg4)-67.2 protein was demonstrated in Western blot analysis. The maintenance of the binding pattern of the chimeric Mod(mdg4)-67.2 isoform compared to the D. melanogaster Mod(mdg4)-67.2 on polytene chromosomes also implicates the functional conservation of the D. virilis N-terminal region (Gabler, 2005).

Interallelic complementation is facilitated by mRNA trans-splicing if two mutations disrupting independent mod(mdg4) mRNAs are combined in trans. It is assumed that the close proximity of donor and acceptor mRNAs within the mod(mdg4) locus is a prerequisite for generation of significant amounts of wild-type Mod(mdg4)-67.2 protein. The lola locus of D. melanogaster represents a second complex gene in which mRNA trans-splicing was demonstrated. Mutations interfering with the pairing of the lola locus reduce the in vivo trans-splicing of isoform T from 44 to 1%. However, the consequences on a protein level were not examined. The transgene assay for mod(mdg4) clearly demonstrates that even underrepresented chimeric transcripts produced from mRNAs encoded by nonhomologous chromosomes can produce considerable levels of the corresponding protein. Mutant rescue experiments with two different D. virilis mod(mdg4) transgenes indicate the functional conservation of Mod(mdg4) protein isoforms. Both the P(w+ Dv mod(mdg4) 11.5kb NotI) transgene, which encodes the five proximal isoforms, and the P(w+ Dv mod(mdg4) 6.8kb NotI-XbaI) transgene, encoding exclusively common exons 1–4, facilitate rescue of recessive lethality of mod(mdg4) mutant alleles. It is supposed that the rescue ability of the short transgene depends mainly on its capacity to produce sufficient chimeric transcripts consisting of the D. virilis common exons and the endogenous D. melanogaster specific exons, which was demonstrated at least for isoform Mod(mdg4)-67.2. However, the significantly reduced rescue ability of the shorter transgene indicates that all or some isoforms have to exceed a critical threshold to restore viability completely. The P(w+ Dv mod(mdg4) 11.5kb NotI) transgene, which produces five orthologous D. virilis isoforms, significantly improves rescue ability. Position effects influencing the expression level of the transgene cannot be excluded. 
Further experiments with a series of independent insertions of the short transgene scattered throughout the genome should provide further insight into a putative correlation of genomic transgene position and efficiency of trans-splicing (Gabler, 2005).

The observed frequency of chimeric transcripts, although significantly lower as compared to the corresponding endogenous transcript, can be interpreted in two ways. First, the splice donor containing the D. virilis mod(mdg4) common exons is produced at a high level, enabling its spreading in the nucleus. Thus a significant number of donor molecules are in close proximity to mod(mdg4) acceptor mRNAs, even if they are transcribed from a nonhomologous chromosome. The much higher expression of the common exons compared to the specific isoform mod(mdg4)-67.2 in w1118 females (116-fold) is in agreement with this hypothesis. A second explanation supposes transcription of both precursor mRNAs within the same compartment of the nucleus, thereby increasing the frequency of chimeric mRNAs (Gabler, 2005).

Protein Interactions

Suppressor of Hairy-wing physically interacts with Modifier of mdg4 (Mod[mdg4]). su(Hw) protein was applied to a glutathione-Sepharose 4B column in the presence or absence of a glutathione S-transferase-Mod(mdg4) fusion protein, and the proteins retained in the column were eluted with glutathione and subjected to Western blot analysis. The su(Hw) protein is retained in the column only when previously incubated with modified Mod(mdg4) protein, indicating that the proteins physically interact (Gerasimova, 1995).

The gypsy insulator is thought to play a role in nuclear organization and the establishment of higher order chromatin domains by bringing together several individual insulator sites to form rosette-like structures in the interphase nucleus. The Su(Hw) and Mod(mdg4) proteins are components of the gypsy insulator required for its effect on enhancer-promoter interactions. Using the yeast two-hybrid system, it has been shown that the Mod(mdg4) protein can form homodimers, which can then interact with Su(Hw). The BTB domain of Mod(mdg4) is involved in homodimerization, whereas the C-terminal region of the protein is involved in interactions with the leucine zipper and adjacent regions of the Su(Hw) protein. Analyses using immunolocalization on polytene chromosomes confirm the involvement of these domains in mediating the interactions between these proteins. Studies using diploid interphase cells further suggest the contribution of these domains to the formation of rosette-like structures in the nucleus. The results provide a biochemical basis for the aggregation of multiple insulator sites and support the role of the gypsy insulator in nuclear organization (Ghosh, 2001).

The formation of loops or higher order domains of chromatin structure requires the individual insulator sites from different chromosomal locations to come together in the nucleus. This organization must be mediated by interactions among protein components of the insulator. These interactions are indeed possible and take place in vivo in the case of the gypsy insulator of Drosophila. Mapping the domains of the Su(Hw) and Mod(mdg4) proteins involved in this interaction might shed light on how insulators could be involved in the establishment of higher order chromatin organization. Disruption of the leucine zipper and regions B and C of Su(Hw) renders the gypsy insulator unable to interfere with enhancer-promoter interactions. Results presented here indicate that disruption of this region of Su(Hw) also abolishes its interaction with Mod(mdg4) and eliminates the punctate nuclear staining pattern, suggesting that interaction between the two proteins is required for establishing domains in the nucleus, and that the establishment of these domains correlates with the functionality of the insulator (Ghosh, 2001).

Mod(mdg4) has at least 21 different isoforms generated by alternative splicing. All the proteins contain a common N-terminus of 402 amino acids that includes a BTB/POZ domain, whereas the C-terminus of the protein is variable. Most of these Mod(mdg4) proteins are present in a few sites on polytene chromosomes and only the Mod(mdg4) 2.2 protein (the product of a splice variant, the 2.2 kb transcript that is the major form in the wild-type Canton S strain) appears to be a general component of the gypsy insulator. The Su(Hw) protein interacts with Mod(mdg4) 2.2 through the C-terminal domain of the Mod(mdg4) 2.2 protein. Since this domain is specific to this form of the protein and it is not present in any of the other variants, this result supports the idea that Mod(mdg4) 2.2 is the component of the gypsy insulator, whereas other mod(mdg4)-encoded proteins might have more specific roles in the cell. Deletion of the BTB domain eliminates homodimeric interactions between Mod(mdg4) 2.2 and results in weakened interactions between Su(Hw) and Mod(mdg4) 2.2. This result could be interpreted as suggesting that Su(Hw) and Mod(mdg4) 2.2 interact through the BTB domain. However, this domain by itself is not able to interact with the full-length Su(Hw) protein or with the LZ-B-C region; this is not due to incorrect folding of the protein, since the BTB domain by itself is able to fold properly and mediate interaction with full-length Mod(mdg4) or another BTB domain. These results are interpreted to suggest that the BTB domain mediates the formation of Mod(mdg4) 2.2 dimers, which in turn are required to mediate the interaction with Su(Hw) (Ghosh, 2001).

BTB domain-containing proteins frequently have zinc fingers involved in DNA binding. The Mod(mdg4) 2.2 protein is unusual in the sense that it does not possess any such DNA-binding domain at the C-terminus. However, the presence of a domain that mediates interactions with Su(Hw), which binds DNA through its zinc fingers, might serve the purpose of recruiting this protein to chromatin. The BTB domain is responsible for self-oligomerization of proteins such as GAGA, promyelocytic leukemia zinc finger protein (PLZF) and ZID in vitro. Interestingly, although the BTB domain-containing promyelocytic leukemia zinc finger protein appears to form only dimers in solution, a short four-stranded antiparallel ß-sheet between two symmetry-related dimers can be observed in the crystal. This interaction involves four different peptide chains and, therefore, can give rise to the formation of tetramers and oligomers of higher stoichiometry, suggesting that BTB-containing proteins can form large multimers. This observation is especially significant in the context of proposed models for insulator function, which require multiple insulator sites to come together in one large aggregate. It might be possible for Mod(mdg4) 2.2 to interact with several Mod(mdg4) 2.2 molecules, thus helping to bring together several Mod(mdg4) binding sites to form insulator aggregates as observed in interphase diploid cells. Alternatively, Mod(mdg4) 2.2 might interact with other BTB domain-containing proteins, which might be an integral part of the gypsy insulator complex. The BTB domain forms an extensive dimer interface that is a possible binding site for other proteins. Since the presence of the BTB domain is only partially required for binding of Su(Hw), there might possibly be other as yet unidentified partners of Mod(mdg4) that interact with the BTB domain. Alternatively, the BTB dimer interface might stabilize the interaction of Su(Hw) with the C-terminal region of Mod(mdg4) (Ghosh, 2001).

The finding of specific domains of the Su(Hw) and Mod(mdg4) proteins that mediate intermolecular interactions provides a strong biochemical foundation for the involvement of these proteins in the establishment of chromosomal loops. These loops are the basis for the proposed role of insulators in the formation of higher order chromatin domains and nuclear organization of the chromosomes during interphase. These studies also provide support for the involvement of other proteins in insulator function. The identification of these proteins will provide additional evidence to understand the mechanisms by which these important sequences control eukaryotic gene expression (Ghosh, 2001).

A family of baculovirus inhibitor-of-apoptosis (IAP) genes is present in mammals, insects, and baculoviruses, but the mechanism by which these IAPs block apoptosis is currently unknown. A protein encoded by the Drosophila mod(mdg4) gene, designated Doom, binds to the baculovirus IAPs. Doom induces rapid apoptosis in insect cells. Baculovirus IAPs and P35, an inhibitor of aspartate-specific cysteine proteases, block Doom-induced apoptosis. The carboxyl terminus encoded by the 3' exon of the doom cDNA, which distinguishes it from other mod(mdg4) cDNAs, is responsible for induction of apoptosis and engagement of the IAPs. Doom localizes to the nucleus, while the IAPs localize to the cytoplasm, but when expressed together, Doom and the IAPs all localize in the nucleus. Thus, IAPs might block apoptosis by interacting with and modifying the behavior of Doom-like proteins that reside in cellular apoptotic pathways (Harvey, 1997).

It is thought that su(Hw) protein forms discrete domains of gene activity by segregating promoters from enhancer elements through a change in chromatin organization. Functional domains of the su(Hw) protein have been characterized that mediate the silencing effect of mod(mdg4) mutations. Two of three regions of su(Hw), regions B and C, located between the leucine zipper motif and the C-terminal acidic domain, are conserved across Drosophila species and are necessary for both the unidirectional and bidirectional repression of transcription by su(Hw). These domains are implicated in an interaction with Mod(mdg4), which is thought to mediate the unidirectional repression due to insulator function. In contrast, two acidic domains, the N-terminal acidic domain and the C-terminal acidic domain, both dispensable for the unidirectional repression of enhancer elements, are critical for the bidirectional silencing of enhancer activity observed in mutants lacking functional Mod(mdg4) protein. Bidirectional repression is thought to be due to changes in large blocks of chromatin structure (Gdula, 1997).

The presence of a MAR/SAR within gypsy is not totally unexpected, since "boundary" elements are in general regions which contain not only enhancer and insulating elements, but also matrix attachment domains. The rather original feature of the gypsy sequence is that all three domains, which in general are sufficiently "dispersed" so as to allow isolation of "pure" enhancers, MAR/SAR, or insulators, are in the present case "gathered" within a single and relatively short (350 bp) sequence. This rather uncommon situation might in fact be relevant to the pressure for compactness within retroviral sequences, as it is known that retroviruses can only package a limited amount of genetic information. A consequence of compaction is that the gypsy insulator and its associated components are most probably interacting, in vivo, with elements of the nuclear matrix. Accordingly, proteins of the nuclear matrix might play a role in the insulation process, and conversely the su(Hw) protein (which is essential for insulation) might interact with proteins of the matrix. Such interactions could actually account for the data on gypsy insulation and fit with previously proposed models for the gypsy effects (Nabirochkin, 1998).

A first series of data strongly suggested that the gypsy insulator, like all previously characterized insulators, essentially prevents interactions between distal enhancer and promoter, without any direct repressing effect on the enhancer itself. This directional effect can most easily be accounted for by the "looping model" involving generation of structural domains isolated one from the other by attachment of boundary sequences (MAR/SAR) to the nuclear matrix. Alternatively, a series of data on gypsy insulation (essentially in mod(mdg4) mutants) discloses bidirectional repressing effects, which can be accounted for by a model involving heterochromatinization. The present data (showing that the gypsy insulator behaves as a MAR/SAR) are clearly in agreement with the structural looping model, but also support the heterochromatinization model. Indeed, the gypsy MAR/SAR DNA per se, in the absence of su(Hw) protein, is involved in histone H1 nucleation (as shown in this paper), and it has been demonstrated that histone H1 nucleation is associated with both DNA compaction and transcriptional silencing. Additionally, Laemmli and co-workers have found that histone H1 can be removed from MAR/SAR domains by distamycin and distamycin-like proteins (D-like proteins, such as the high mobility group proteins); this has led to the proposal that MAR/SARs can activate or repress transcription of adjacent genes depending on the nucleation/depletion of histone H1. The gypsy MAR/SAR could then be responsible for the repressing effect observed in the mod(mdg4) mutants, as well as in the present assay within heterologous cells (assuming further that appropriate D-like proteins are absent in those cells). 
Taking into account, in addition, that mutations in the mod(mdg4) or the su(Hw) genes modify position-effect variegation, it could be further hypothesized that the su(Hw)/mod(Mdg4) complex acts as the D-like proteins and modifies the nucleation processes to allow the switch from a repressing to an active state. Accordingly, a model in which the su(Hw) binding sites and the associated su(Hw)/mod(Mdg4) complex modulate the effects of the MAR/SAR DNA sequence could rather simply account for the biological effects of the gypsy insulator in both the wild type and su(Hw)/mod(mdg4) mutants. The proposed model would then reconcile the two previous models for gypsy insulation, i.e. the heterochromatinization and the looping models (Nabirochkin, 1998 and references).

Germ line transformation of white- Drosophila embryos with P-element vectors containing white expression cassettes results in flies with different eye color phenotypes due to position effects at the sites of transgene insertion. These position effects can be cured by specific DNA elements, such as the Drosophila scs and scs' and by gypsy elements, that have insulator activity in vivo. Matrix attachment regions (MARs) are DNA elements that are identified and defined by their ability to bind to DNA- and histone-depleted nuclei, which are generally termed nuclear matrices. MARs are typically AT-rich elements that contain consensus cleavage sites for topoisomerase II, and they may contain one or more loosely defined short sequence motifs, but, in general, their structures are not highly homologous. MARs are dispersed throughout eukaryotic genomes, having been found in centromeric DNA, within genes, and in intergenic regions. Especially interesting is the observation that the gypsy insulator of Drosophila has been identified as a MAR. This is a retroviral sequence that binds Suppressor of Hairy-wing and the su(Hw)-associated protein Mod(mdg4) (Nabirochkin, 1998). The matrix-binding activities of MARs have been conserved throughout eukaryotic evolution. The functions of MARs in vivo are largely unknown, but one commonly held view is that MARs anchor individual chromatin loops to a proteinaceous matrix or scaffold in both interphase nuclei and mitotic chromosomes (Namciu, 1998 and references).

The ability of human MARs to insulate white from position-effect variegation was tested. Two different human MARs, from the apolipoprotein B and alpha1-antitrypsin loci, insulate white transgene expression from position effects in Drosophila. Both elements reduce variability in transgene expression without enhancing levels of white gene expression. In contrast, expression of white transgenes containing human DNA segments without matrix-binding activity is highly variable in Drosophila transformants. These data indicate that human MARs can function as insulator elements in vivo in Drosophila (Namciu, 1998).

Insulation of enhancer-promoter communication by a gypsy transposon insert in the Drosophila cut gene: Cooperation between Suppressor of Hairy-wing and Modifier of mdg4 proteins

The Drosophila mod(mdg4) gene products counteract heterochromatin-mediated silencing of the white gene and help activate genes of the bithorax complex. They also regulate the insulator activity of the gypsy transposon when gypsy inserts between an enhancer and promoter. The Su(Hw) protein is required for gypsy-mediated insulation, and the Mod(mdg4)-67.2 protein binds to Su(Hw). The aim of this study was to determine whether Mod(mdg4)-67.2 is a coinsulator that helps Su(Hw) block enhancers or a facilitator of activation that is inhibited by Su(Hw). Evidence is provided that Mod(mdg4)-67.2 acts as a coinsulator by showing that some loss-of-function mod(mdg4) mutations decrease enhancer blocking by a gypsy insert in the cut gene. The C terminus of Mod(mdg4)-67.2 binds in vitro to a region of Su(Hw) that is required for insulation, while the N terminus mediates self-association. The N terminus of Mod(mdg4)-67.2 also interacts with the Chip protein, which facilitates activation of cut. Mod(mdg4)-67.2 truncated in the C terminus interferes in a dominant-negative fashion with insulation in cut but does not significantly affect heterochromatin-mediated silencing of white. It is inferred that multiple contacts between Su(Hw) and a Mod(mdg4)-67.2 multimer are required for insulation. It is theorized that Mod(mdg4)-67.2 usually aids gene activation but can also act as a coinsulator by helping Su(Hw) trap facilitators of activation, such as the Chip protein (Gause, 2001).

This study found that certain loss-of-function alleles of mod(mdg4) reduce insulation by the Su(Hw) protein in the cut gene. This is evidence that mod(mdg4) products are not simply targets of Su(Hw) insulator activity but contribute to the insulator activity of Su(Hw). Wild-type Mod(mdg4)-67.2, the major protein product of mod(mdg4), interacts with a region of Su(Hw) that has been shown to be required for insulation in vivo, but the truncated versions of the Mod(mdg4)-67.2 proteins produced by the viable mod(mdg4)u1 and mod(mdg4)T6 alleles do not. This is consistent with the observation that binding of Mod(mdg4) proteins to Su(Hw) binding sites on salivary gland polytene chromosomes is greatly reduced in mod(mdg4)u1 mutants. mod(mdg4)u1 and mod(mdg4)T6 reduce insulator activity more strongly than do null alleles of mod(mdg4), and this antimorphic nature of mod(mdg4)u1 may stem from the ability of the mutant protein to interact with wild-type Mod(mdg4)-67.2 protein. To explain these observations, a model is proposed in which a multimer of Mod(mdg4)-67.2 interacts with more than one Su(Hw) molecule to form the active insulator complex, and the truncated Mod(mdg4)-67.2 proteins produced by mod(mdg4)u1 and mod(mdg4)T6 destabilize this complex (Gause, 2001).

The evidence that Mod(mdg4)-67.2 is an active component of the gypsy insulator that blocks gene activation appears at first glance to contradict the evidence indicating that the mod(mdg4) gene is a member of the trithorax group (trxG) of genes that activate genes in the bithorax complex. Another trxG protein, however, also appears to have insulator activity. The GAGA factor encoded by the Trithorax-like (Trl) gene is similar to Mod(mdg4)-67.2 in that it contains a BTB/POZ motif at the N terminus, self-interacts, and supports activation of the bithorax complex. GAGA factor is also required for enhancer blocking by the insulator associated with the even-skipped promoter. This insulator activity requires GAGA binding sites just proximal to the transcription start site and is diminished by Trl mutations. Potential GAGA binding sites are found just proximal to many promoters in Drosophila, including sequences associated with insulator activity in the alpha1 tubulin gene promoter. The GAGA-dependent insulator just proximal to the eve promoter does not prevent activation of the eve promoter by upstream enhancers even though it is positioned between them. Indeed, GAGA binding sites just proximal to the engrailed gene promoter potentiate activation by an upstream enhancer. To resolve the paradoxical insulator and activator activities of the GAGA and Mod(mdg4)-67.2 BTB/POZ proteins, therefore, it must be theorized that the function of promoter-proximal insulators is to aid activation of the promoters that contain them by helping to capture and anchor distal activator or facilitator proteins near the promoter. If so, it is feasible that the Mod(mdg4)-67.2 protein has a promoter-anchoring function in the bithorax complex, but when bound to Su(Hw), it anchors activator or facilitator proteins far from the promoter, thereby preventing activation (Gause, 2001).

The centrosomal protein CP190 is a component of the gypsy chromatin insulator

Chromatin insulators, or boundary elements, affect promoter-enhancer interactions and buffer transgenes from position effects. The gypsy insulator of Drosophila is bound by a protein complex with two characterized components, the zinc finger protein Suppressor of Hairy-wing [Su(Hw)] and Mod(mdg4)2.2, which is one of the multiple spliced variants encoded by the modifier of mdg4 [mod(mdg4)] gene. A genetic screen for dominant enhancers of the mod(mdg4) phenotype identified the Centrosomal Protein 190 (CP190) as an essential constituent of the gypsy insulator. The function of the centrosome is not affected in CP190 mutants whereas gypsy insulator activity is impaired. CP190 associates physically with both Su(Hw) and Mod(mdg4)2.2 and colocalizes with both proteins on polytene chromosomes. CP190 does not interact directly with insulator sequences present in the gypsy retrotransposon but binds to a previously characterized endogenous insulator, and it is necessary for the formation of insulator bodies. The results suggest that endogenous gypsy insulators contain binding sites for CP190, which is essential for insulator function, and may or may not contain binding sites for Su(Hw) and Mod(mdg4)2.2 (Pai, 2004).

A genetic screen for dominant enhancers of mod(mdg4) has resulted in the identification of CP190 as a third component of the gypsy insulator. CP190 is present at gypsy retrotransposon insulator sites and overlaps extensively with Su(Hw) and Mod(mdg4)2.2 at presumed endogenous insulators. CP190 displays a specific distribution pattern on polytene chromosomes, showing significant overlap with Su(Hw) and Mod(mdg4)2.2 at the junctions between transcriptionally inert bands and transcriptionally active interbands. Similar localization patterns have been reported for other insulators. For example, the faswb insulator at the Notch locus and the BEAF-32 protein of the scs' insulator are also present at the boundaries between bands and interbands. Results suggest that CP190 can bind DNA on its own or can be tethered to the chromosome through interactions with Su(Hw). Mutations in the CP190 gene impair the function of the insulator present in the gypsy retrotransposon without affecting the presence of Su(Hw) and Mod(mdg4)2.2, suggesting an essential task for CP190 in the activity of this insulator. In addition, the lethality of CP190 mutants suggests a critical role for the CP190 protein in the function of gypsy endogenous insulators. This essential role may be a consequence of the requirement of CP190 for the formation of insulator bodies in the nuclei of diploid cells (Pai, 2004).

The insulator present in the gypsy retrotransposon contains only Su(Hw) binding sites, and CP190 is present in this insulator through direct interactions with Su(Hw). The gypsy insulator contains 12 Su(Hw) binding sites, and at least four are needed for insulator activity. However, clusters of three or more Su(Hw) binding sites are rare in the genome. Therefore, a critical question is whether the sites of Su(Hw) and Mod(mdg4)2.2 localization present throughout the genome truly function as insulators. The presence of CP190 at these sites and its ability to bind DNA might explain this apparent paradox. For example, the endogenous insulator present in the yellow-achaete region has only two binding sites for Su(Hw). Nevertheless, the y454 fragment containing this insulator is able to bind CP190, suggesting that this protein might act in concert with Su(Hw) to confer insulator activity. It is therefore possible that endogenous gypsy insulators are composed of binding sites for Su(Hw) and/or for CP190 and, together with Mod(mdg4)2.2, form a complex. Endogenous gypsy insulators may have few or no Su(Hw) binding sites, and they may rely on CP190 to bind DNA and tether other insulator components such as Mod(mdg4)2.2 via protein-protein interactions (Pai, 2004).

Previous studies have suggested that gypsy insulators separated at a distance in the genome may come together and form large insulator bodies in the nucleus during interphase. These aggregates represent higher order structures of chromatin and are implicated in the regulation of gene expression by compartmentalizing the genome into transcriptionally independent domains. The formation of these aggregates appears to require Mod(mdg4) function because the large aggregates are missing in mod(mdg4) mutants. The formation of gypsy insulator bodies is severely impaired also in CP190 mutants, suggesting that CP190 plays an essential role in the formation of these bodies and in the establishment of the chromatin domain organization mediated by gypsy endogenous insulators. It is possible that the BTB/POZ protein-protein interaction domains of both CP190 and Mod(mdg4)2.2 are required for and contribute to the stability of the interactions among insulator sites. In vitro-expressed CP190 lacking the BTB/POZ domain is soluble, whereas the wt protein is not, further suggesting that CP190 might exist as a complex with itself or other proteins in vivo, and the formation of this complex is likely mediated by the BTB/POZ domain. However, because CP190 is present at the gypsy insulator in the absence of Mod(mdg4)2.2 protein, the interaction between these two proteins may not be crucial for CP190 recruitment to the insulator (Pai, 2004).

Previous studies have identified CP190 as a centrosome-specific protein during mitosis that also associates with chromatin during interphase. Although many of these studies have focused on the possible role of CP190 during cell division, the current results suggest that centrosomal function and cell division are not affected in CP190 mutants. This conclusion is supported by independent studies of CP190 function during the cell cycle. The main function of CP190 might then be to regulate chromosome-related processes during interphase. Several lines of evidence suggest that this role is related to the function of the gypsy insulator: mutations in CP190 alter gypsy-induced phenotypes; CP190 colocalizes with Su(Hw) and Mod(mdg4)2.2 on polytene chromosomes and in diploid cell nuclei, and CP190 associates physically with gypsy insulator components in vitro and in vivo. However, the centrosomal localization of CP190 might also be important for its role in the gypsy insulator despite being unnecessary for cell cycle progression. The centrosome could either be a temporary storage site for CP190 during mitosis, or a site for a mitosis-specific modification that could be important for CP190 reassociation with chromosomes later in the cell cycle. The presence of CP190 in the centrosome could also be related to the regulation of the level of this protein in the cell. In fact, it has been shown that some chromatin-binding proteins are targeted to the centrosome for degradation. Alternatively, the presence of CP190 at the centrosome might be related to a possible role in the ubiquitin modification pathway. Recent findings have linked BTB/POZ domain proteins to ubiquitin E3 ligase function, some of which are known to be present at the centrosome. 
CP190 may be involved in similar types of interactions as an adaptor for ubiquitin E3 ligases and might target associated insulator proteins to the centrosome during mitosis for ubiquitination and/or degradation, which in turn may be required for properly reestablishing chromosome domain boundaries after mitosis (Pai, 2004).

The ubiquitin ligase dTopors directs the nuclear organization of a chromatin insulator

Chromatin insulators are gene regulatory elements implicated in the establishment of independent chromatin domains. The gypsy insulator of D. melanogaster confers its activity through a protein complex that consists of three known components, Su(Hw), Mod(mdg4)2.2 (a spliced variant encoded by the modifier of mdg4), and CP190. Drosophila Topoisomerase I-interacting RS protein (dTopors) interacts with the insulator protein complex and is required for gypsy insulator function. In the absence of Mod(mdg4)2.2, nuclear clustering of insulator complexes is disrupted and insulator activity is compromised. Overexpression of dTopors in the mod(mdg4)2.2 null mutant rescues insulator activity and restores the formation of nuclear insulator bodies. dTopors associates with the nuclear lamina, and mutations in lamin disrupt dTopors localization as well as nuclear organization and activity of the gypsy insulator. Thus, dTopors appears to be involved in the establishment of chromatin organization through its ability to mediate the association of insulator complexes with a fixed nuclear substrate (Capelson, 2005).

A yeast two-hybrid screen for proteins that interact with Mod(mdg4)2.2 resulted in identification of dTopors as a factor involved in the activity of the gypsy insulator. dTopors was found to interact with the three known insulator components, Su(Hw), Mod(mdg4)2.2, and CP190, and to associate with the gypsy insulator complex on chromosomes and in diploid nuclei. Additionally, dTopors appears to physically associate with the nuclear lamina. Genetically, dTopors was shown to behave as a positive factor involved in gypsy insulator activity. Consistently, reduction in levels of dTopors, observed in the background of a dTopors-spanning deletion or of an inducible dTopors RNAi construct, results in the disruption of insulator activity. The effects of elevated levels of dTopors are particularly dramatic as they restore the activity of a compromised gypsy insulator on multiple levels. The enhancer blocking function of the insulator, the binding of Su(Hw) to chromatin, and the formation of insulator bodies in cell nuclei -- all compromised in mod(mdg4)u1 mutants -- are rescued by overexpression of dTopors (Capelson, 2005).

These effects can be explained by a model in which dTopors acts as a nuclear lamina-associated factor that serves to tether the gypsy insulator complexes to a fixed substrate. In the wild-type situation, Mod(mdg4)2.2 mediates the coalescence of distant insulator sites and the subsequent establishment of chromatin compartments, whereas dTopors may be involved in further organization of insulator bodies at specific nuclear attachment points through its direct interaction with both Mod(mdg4)2.2 and Su(Hw). The absence of Mod(mdg4)2.2 leads to the breakdown of nuclear organization and the destabilization of Su(Hw)-chromatin association. Through tethering distant insulator sites to a nuclear substrate, dTopors, when present at elevated levels, may be able to compensate for the loss of a component such as Mod(mdg4)2.2. By stabilizing the nuclear organization of insulator complexes, dTopors may also promote the binding of Su(Hw) to chromatin. This explanation is further reinforced by the observed disruptive effects of a lamin mutation on the nuclear organization and the enhancer blocking activity of the gypsy insulator (Capelson, 2005).

The connection between gypsy insulator activity and nuclear insulator bodies has relied predominantly on the effects of the mutations in Mod(mdg4)2.2 and CP190 on both enhancer blocking function and insulator body integrity. The activity of dTopors provides further evidence for a functional relationship between insulators and their nuclear localization, since rescue of insulator phenotypes by dTopors is accompanied by the recovery of insulator bodies. Establishment of independent chromatin domains, which has been proposed as the main function of insulators, is thought to rely on structural partitioning of chromatin through physical interactions between distant loci or through interactions with a fixed nuclear substrate. It has been previously intimated that gypsy insulators may employ both types of structural organization to ensure the establishment of domain autonomy. This work suggests that the gypsy insulator may undergo physical clustering through the BTB domains of Mod(mdg4)2.2 and of CP190 and may utilize the attachment to the nuclear lamina via dTopors. The interaction of the insulator with a nuclear substrate is further supported by a recent report that gypsy insulator proteins associate with the nuclear matrix, of which lamin is a principal component. Tethering to a subnuclear surface has also been implicated in the activity of the chicken β-globin insulator, where β-globin insulator loci were observed to interact with the nucleolar surface, perhaps via a direct association between the insulator protein CTCF and the nucleolar component nucleophosmin (Capelson, 2005).

The E3 ubiquitin ligase activity of dTopors was not found to act directly on the known insulator proteins, yet the RING domain of dTopors appears to be essential for its positive effect on the gypsy insulator. It thus remains possible that an unknown factor involved in insulator activity is a substrate for dTopors-mediated ubiquitination. A connection between the gypsy insulator complex and the ubiquitin conjugation pathway is also suggested by the presence of BTB domains in Mod(mdg4)2.2 and CP190, since BTB domain proteins have been proposed to act as substrate adaptors for RING E3 ubiquitin ligases. It is plausible that BTB-containing insulator proteins and RING-containing dTopors are involved in ubiquitin conjugation with functional consequences for the insulator (Capelson, 2005).

The association of dTopors with a subset of insulator binding sites on polytene chromosomes implies that its presence is not required by all insulator complexes. This may be a consequence of the proposed function of dTopors as a tethering factor, such that the interaction between distant insulator loci may alleviate the need for dTopors at every binding site of the insulator complex. Alternatively, it may suggest that endogenous insulator complexes are not all functionally equivalent, and that the enzymatic properties of dTopors may be important for specific insulator complexes. The ubiquitin ligase activity of dTopors may be involved in regulation of insulator complexes, such that modification of a yet uncharacterized component by ubiquitin can lead to variation in function of endogenous insulators (Capelson, 2005).

SUMO conjugation attenuates the activity of the gypsy chromatin insulator

Chromatin insulators have been implicated in the establishment of independent gene expression domains and in the nuclear organization of chromatin. Post-translational modification of proteins by Small Ubiquitin-like Modifier (SUMO) has been reported to regulate their activity and subnuclear localization. Evidence is presented suggesting that two protein components of the gypsy chromatin insulator of Drosophila melanogaster, Mod(mdg4)2.2 and CP190, are sumoylated, and that SUMO is associated with a subset of genomic insulator sites. Disruption of the SUMO conjugation pathway improves the enhancer-blocking function of a partially active insulator, indicating that SUMO modification negatively regulates the activity of the gypsy insulator. Sumoylation does not affect the ability of CP190 and Mod(mdg4)2.2 to bind chromatin, but instead appears to regulate the nuclear organization of gypsy insulator complexes. The results suggest that long-range interactions of insulator proteins are inhibited by sumoylation and that the establishment of chromatin domains can be regulated by SUMO conjugation (Capelson, 2006).

Two protein components of the gypsy chromatin insulator, Mod(mdg4)2.2 and CP190, were found to be modified by SUMO in vitro and in vivo. dTopors was observed to interfere with their sumoylation, possibly by disrupting the contacts between the SUMO E2 enzyme Ubc9 and the substrate insulator proteins. The inhibitory effect of dTopors, although relatively subtle, was consistent across the assays used: whenever dTopors was introduced at higher levels, either by direct addition in vitro or by increased expression in vivo, sumoylation of Mod(mdg4)2.2 and CP190 was reduced. Disruption of SUMO conjugation by mutations in the genes coding for Ubc9 and SUMO exerts a positive effect on gypsy insulator activity, suggesting that the normal role of SUMO modification is to antagonize insulator function. A fraction of chromatin-bound insulator proteins appears to be associated with SUMO, yet mutations in the SUMO pathway do not appear to affect the chromatin-binding properties of CP190 or Mod(mdg4)2.2. Instead, sumoylation interferes with the formation of nuclear insulator bodies: overexpression of Ubc9 leads to breakdown of nuclear insulator structures, whereas lower levels of Ubc9 and sumoylation result in a partial recovery of the coalescence lost in the absence of Mod(mdg4)2.2 (Capelson, 2006).

These findings suggest that modification of CP190 and Mod(mdg4)2.2 by SUMO may prevent self-association and thus interfere with long-range interactions between distant insulator complexes required to form insulator bodies. Thereby, sumoylation may preclude formation of closed chromatin loops and the consequent establishment of autonomous gene expression domains (Capelson, 2006).

Multiple lines of evidence point to a role for SUMO modification in transcriptional repression. Sumoylation of histones has been characterized as a mark of repressed chromatin, whereas SUMO conjugation to certain transcriptional regulators leads to their association with histone deacetylases, which remove the active acetylation marks from histones. SUMO modification of the Polycomb group (PcG) protein SOP-2 is required for its function in stable repression of Hox genes, and another PcG repressor, Pc2, acts as a SUMO E3 ligase. Modification of gypsy insulator proteins by SUMO does not seem to associate them exclusively with transcriptional repression, as reduction of sumoylation in lwr/smt3 mutants results in the upregulation of expression from the ombP1-D1 locus, but in the downregulation of transcription at y2 and ct6. In these cases, transcriptional output appears to correlate only with the enhancer-blocking activity of the insulator. Nevertheless, it is possible that one of the roles of sumoylation involves association of selected insulator sites in the genome with transcriptional repression. Sumoylated insulator complexes may not participate in the formation of expression domains, but instead, could target silencing factors to the surrounding chromatin (Capelson, 2006).

In mammalian nuclei, the homolog of dTopors localizes to PML bodies, which are enriched in the SUMO conjugation machinery. If inhibition of sumoylation is also a property of mammalian Topors, it may play a role in preventing further sumoylation of factors that are targeted to these nuclear compartments. Similarly, ICP0 also localizes to PML bodies, where it causes desumoylation of two primary components, PML and SP100. It has been reported that Topors may function as a SUMO E3 ligase for the tumor suppressor protein p53. This apparent contradiction with the current results may have several explanations. Topors and dTopors may have diverged in function with respect to the SUMO pathway, such that Topors functions as a SUMO E3 while dTopors interferes with SUMO addition through its conserved interaction with Ubc9. Alternatively, the involvement of dTopors in the SUMO pathway may be substrate-specific, since it may bind to Ubc9 in ways that either allow or prevent interaction with a given target protein. In the context of the gypsy insulator, the interference of dTopors with sumoylation is consistent with previous observations that dTopors promotes insulator activity, whereas sumoylation appears to disrupt it (Capelson, 2006).

It has been suggested that SUMO conjugation may affect the function of the modified protein even after the SUMO tag itself has been removed, creating a cellular memory for protein regulation. This idea has arisen partly to explain the commonly observed discrepancy between the small percentage of a given protein that is modified by SUMO and the dramatic consequences of the modification for the protein's cellular function. Sumoylation may be needed for proteins to enter stable complexes or functional states, but the persistence of the SUMO modification may not be required after the initial establishment. Thus, the actual effect of sumoylation may far exceed that of the detectable sumoylated population, since the function of a much larger proportion of molecules has been altered by SUMO conjugation and subsequent deconjugation. As in other reported cases, the sumoylated forms of Mod(mdg4)2.2 and of CP190 represent a small fraction of the total pool of the insulator proteins, yet the phenotypic effects of the loss of these forms are quite striking. It is possible that SUMO attachment regulates the initial organization of chromatin domains, perhaps in earlier development or following mitosis, yet once established, the domains may be stably maintained without SUMO. Additionally, the rapid conjugation and deconjugation cycle of the SUMO tag implies that sumoylation may be used by processes that require reassembly in response to a signal. In that sense, SUMO modification seems particularly suitable for the regulation of gene expression domains, as it can result in 'remembered' yet flexible states (Capelson, 2006).



Home page: The Interactive Fly © 1997 Thomas B. Brody, Ph.D.

The Interactive Fly resides on the Society for Developmental Biology's Web server.