sans fille


A complete set of seven U1-related sequences has been cloned and characterized from Drosophila melanogaster. These sequences are located at the three cytogenetic loci 21D, 82E, and 95C. Three of these sequences have been previously studied: one U1 gene at 21D that encodes the prototype U1 sequence (U1a), one U1 gene at 82E that encodes a U1 variant with a single nucleotide substitution (U1b), and a pseudogene at 82E. The four previously uncharacterized genes comprise another U1b gene at 82E, two additional U1a genes at 95C, and a U1 gene at 95C that encodes a new variant (U1c) with a distinct single nucleotide change relative to U1a. Three blocks of 5' flanking sequence similarity are common to all six full-length genes. The U1b RNA is expressed in Drosophila Kc cells and is associated with snRNP proteins, suggesting that the U1b-containing snRNP particles are able to participate in the process of pre-mRNA splicing. The expression of the two U1 variants throughout Drosophila development has been examined relative to the prototype sequence. The U1c variant is undetectable, while the U1b variant exhibits a primarily embryonic pattern reminiscent of the expression of certain U1 variants in sea urchin, Xenopus, and mouse (Lo, 1990).

Both experimental work and surveys of the lengths of internal exons in nature have suggested that vertebrate internal exons require a minimum size of approximately 50 nucleotides for efficient inclusion in mature mRNA. This phenomenon has been ascribed to steric interference between complexes involved in recognition of the splicing signals at the two ends of short internal exons. To determine whether U1 small nuclear ribonucleoprotein (snRNP), a multicomponent splicing factor involved in the initial recognition of splice sites, contributes to the lower size limit of vertebrate internal exons, advantage was taken of the observation that U1 small nuclear RNAs (snRNAs) that bind upstream or downstream of the 5' splice site (5'SS) stimulate splicing of the upstream intron. By varying the position of U1 binding relative to the 3' splice site (3'SS), it is shown that U1-dependent splicing of the upstream intron becomes inefficient when U1 is positioned 48 nucleotides or less downstream of the 3'SS, suggesting a minimal distance between U1 and the 3'SS of approximately 50 nucleotides. This distance corresponds well to the suggested minimum size of internal exons. The results of experiments in which the 3'SS region of the reporter is duplicated suggest an optimal distance of greater than 72 nucleotides. Inclusion of a 24-nucleotide miniexon is promoted by the binding of U1 to the downstream intron but not by binding to the 5'SS (Hwang, 1997).
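
The distance thresholds reported above can be summarized in a small illustrative sketch. The function name and the category labels are hypothetical conveniences, not terms from Hwang (1997). Only the 48- and 72-nucleotide cutoffs come from the text, and distances between them are labeled "intermediate" simply because the summary does not characterize that range.

```python
def classify_u1_distance(distance_nt: int) -> str:
    """Classify the distance (in nucleotides) between the 3'SS and a
    downstream U1 binding site, using the cutoffs quoted in the text.

    The labels are illustrative assumptions, not published terminology.
    """
    if distance_nt < 0:
        raise ValueError("distance must be non-negative")
    if distance_nt <= 48:
        # Text: U1-dependent splicing becomes inefficient at 48 nt or less.
        return "inefficient"
    if distance_nt > 72:
        # Text: duplication experiments suggest an optimum above 72 nt.
        return "optimal"
    # The summary does not describe this range in detail.
    return "intermediate"


if __name__ == "__main__":
    for d in (24, 48, 50, 72, 100):
        print(d, classify_u1_distance(d))
```

Under this sketch, a 24-nucleotide miniexon falls in the "inefficient" class when U1 sits at its own 5'SS, which is in line with the observation that its inclusion depends on U1 binding further downstream in the intron.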

U1 snRNP of yeast

The formation of mRNAs in the nuclei of eukaryotic cells involves several co- and post-transcriptional processing events. These include 5' end capping, 3' end formation, usually by cleavage and polyadenylation, and frequently the removal of intervening sequences by splicing. Pre-mRNA splicing can be conceptually divided into distinct stages. The initial step is recognition of conserved intronic sequences near the 5' splice site and branchpoint region by a subset of splicing factors. This is followed by assembly of multiple additional splicing factors to form the spliceosome. Rearrangements within the spliceosome then occur, accompanying the two chemical steps of intron removal. Spliced mRNA is released for export to the cytoplasm while intronic RNA is degraded and splicing factors are recycled (Fortes, 1999).

The first defined step of splicing consists of the formation of commitment complexes in yeast and E complex in mammals. In yeast, two forms of commitment complex are experimentally separable, CC1 and CC2. It is likely, though not definitively proven, that CC1 is a precursor of CC2. Both contain U1 snRNP, which interacts with the 5' splice site. CC2 additionally contains at least two proteins, BBP and Mud2p, that bind to the branchpoint sequence and an adjacent pyrimidine-rich tract, respectively. mBBP/SF1 and U2AF65 (see Drosophila U2 small nuclear riboprotein auxiliary factor 50), the mammalian homologs of these proteins, are present in E complex (Fortes, 1999 and references therein).

These facts about early steps in spliceosome formation point to a critical role for U1 snRNP in 5' splice site definition and choice, and lead to the question of how the choice is made between two alternative 5' splice sites that can both be spliced to a common 3' splice site. Examination of alternative splicing in vertebrates suggests that factors that are not components of U1 snRNP can influence the selection of splice sites. Recent work in yeast has shown, however, that at least one U1 snRNP protein can also influence 5' splice site choice (Fortes, 1999 and references therein).

Yeast U1 snRNA is significantly larger than vertebrate U1 snRNA. The yeast-specific regions of the RNA are not absolutely essential for survival, but nevertheless they play a role in splicing. Yeast U1 snRNP, as biochemically purified, is considerably more complex than vertebrate U1 snRNP. Both contain the Sm core proteins and three U1-specific proteins: U1 70K/Snp1p, U1A/Mud1p, and U1C/yU1-C. In addition, the yeast U1 snRNP contains at least six specific proteins (Snu71p, Snu65p, Snu56p, Prp39p, Prp40p, and Nam8p) that have no currently characterized vertebrate homologs. U1 snRNP interacts with the 5' splice site via base-pairing through U1 snRNA. Recent data indicate that the yeast U1 snRNP proteins also make extensive contact with the pre-mRNA both upstream and downstream of the 5' splice site. These interactions are likely to increase the stability of U1 snRNP-5' splice site binding. In addition, at least one U1 snRNP protein-pre-mRNA interaction, involving Nam8p, is affected by the sequence of the pre-mRNA to which the protein binds. The sequence specificity of this interaction can affect 5' splice site choice (Fortes, 1999 and references therein).

Other signals on a pre-mRNA can also influence binding of U1 snRNP to a 5' splice site or other steps that affect the efficiency of intron recognition and removal. Examples include the effects of adjacent introns or 3' end formation signals, exon enhancer sequences, and, in the case of the cap-proximal intron, the cap structure. The effect of the cap structure is mediated by the nuclear cap-binding complex (CBC), a conserved heterodimeric complex composed of CBP80 and CBP20. In both yeast and mammals, CBC appears to act by increasing the efficiency of recognition of the cap-proximal 5' splice site by U1 snRNP during commitment complex/E complex assembly. Much of the initial evidence for this mechanism comes from biochemical experiments but in yeast a considerable body of genetic data indicates that CBC plays an important role in commitment complex assembly. The gene encoding yCBP20, MUD13, was identified by a mutation that causes synthetic lethality in combination with a nonlethal deletion of part of U1 snRNA. A more extensive search for genes whose mutation led to synthetic lethality in the absence of CBC led to the identification of LUC genes (lethal unless CBC is produced). The LUC collection includes genes that encode several components of the commitment complex, including both Mud2p/Luc2p and several protein components of yeast U1 snRNP. Some of these genes encode proteins conserved between yeast and vertebrates, like SmD3/Luc6p or Mud1p/Luc1p, the yeast homolog of the human U1A protein, and others encode several of the recently identified yeast-specific U1 snRNP proteins, Nam8p/Luc3p, Snu56p/Luc4p, and Snu71p/Luc5p (Fortes, 1999 and references therein).

One functionally uncharacterized gene identified in the screen was named LUC7. Luc7p is an additional component of the yeast U1 snRNP. LUC7 is an essential gene, and Luc7p is required for commitment complex formation in vitro. In the presence of a temperature-sensitive form of Luc7p, the protein composition of U1 snRNP is altered. Although the defective U1 snRNP still appears to be partially active in vivo, splicing efficiency is reduced and 5' splice site selection is altered. The change in 5' splice site recognition is similar to that seen in the absence of CBC, suggesting that CBC-U1 snRNP interaction is affected by the absence of Luc7p. The LUC7 gene was identified by a mutation that causes lethality in a yeast strain lacking the nuclear cap-binding complex (CBC). Luc7p is similar in sequence to metazoan proteins that have arginine-serine and arginine-glutamic acid repeat sequences characteristic of a family of splicing factors. Although the in vivo defect in splicing wild-type reporter introns in a luc7 mutant strain is comparatively mild, splicing of introns with nonconsensus 5' splice site or branchpoint sequences is more defective in the mutant strain than in wild-type strains. By use of reporters that have two competing 5' splice sites, a loss of efficient splicing to the cap proximal splice site is observed in luc7 cells, analogous to the defect seen in strains lacking CBC. CBC can be coprecipitated with U1 snRNP from wild-type yeast strains (but not from luc7). These data suggest that the loss of Luc7p disrupts U1 snRNP-CBC interaction, and that this interaction contributes to normal 5' splice site recognition (Fortes, 1999).

Examination of protein sequence databases has revealed the existence of metazoan relatives of Luc7p, including three in human and C. elegans and others in Arabidopsis, Drosophila, and other eukaryotes. The regions encoding the zinc finger motifs are particularly highly conserved (57% similarity conserved across the whole family). LUC7 appears to have been duplicated early in evolution, leading to the Luc7A and Luc7B subfamilies in higher eukaryotes. Interestingly, all metazoan LUC7 family members contain carboxy-terminal extensions with multiple arginine-serine or arginine-glutamate repeats, characteristic of a large number of metazoan splicing factors (Fortes, 1999).

Analysis of U1A, the A protein component of U1 snRNP

Many RNA-associated proteins contain a ribonucleoprotein (RNP) consensus octamer encompassed by a conserved 80 amino acid sequence, termed an RNA recognition motif (RRM). RRM family members contain either one (class I) or multiple (class II) copies of this motif. A class II component of the U1 small nuclear RNP (snRNP), the A protein of U1 snRNP (U1snRNP-A), contains two RRMs (RRM1 and -2), yet has only one binding domain (RRM1) that interacts specifically with stem-loop II of U1 RNA. Quantitative analysis of binding affinities of fragments of U1snRNP-A demonstrates that an 86-amino acid polypeptide is competent to bind to U1 RNA with an affinity comparable to that of the full-length protein (Kd approximately 80 nM). The carboxyl-terminal RRM2 of U1snRNP-A does not bind to U1 RNA and may recognize an unidentified heterologous RNA. It is proposed that class II proteins may function as bridges between RNA components of RNP complexes, such as the spliceosome (Lutz-Freyermuth, 1990).
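
To give a feel for what a Kd of roughly 80 nM implies, the following minimal sketch computes fractional RNA occupancy under a simple 1:1 binding model. The model, the assumption that free protein concentration approximates total protein (protein in excess over RNA), and the function name are illustrative assumptions, not the analysis used by Lutz-Freyermuth (1990).

```python
def fraction_bound(protein_nM: float, kd_nM: float = 80.0) -> float:
    """Fraction of RNA bound at equilibrium for single-site binding.

    Uses theta = [P] / ([P] + Kd), assuming free protein ~ total protein.
    The default Kd of 80 nM is the approximate value quoted in the text.
    """
    if protein_nM < 0 or kd_nM <= 0:
        raise ValueError("concentrations must be non-negative and Kd positive")
    return protein_nM / (protein_nM + kd_nM)


if __name__ == "__main__":
    for conc in (8.0, 80.0, 720.0):
        print(f"{conc:7.1f} nM protein -> {fraction_bound(conc):.2f} bound")
```

In this model half of the RNA is bound when the protein concentration equals Kd, and about a ninefold excess over Kd is needed to reach 90% occupancy.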

The RNP domain is a very common eukaryotic protein domain involved in recognition of a wide range of RNA structures and sequences. Two structures of human U1A in complex with distinct RNA substrates have revealed important aspects of RNP-RNA recognition, but have also raised intriguing questions concerning the origin of binding specificity. The beta-sheet of the domain provides an extensive RNA-binding platform for packing aromatic RNA bases and hydrophobic protein side chains. However, many interactions between functional groups (on the single-stranded nucleotides) and residues (on the beta-sheet surface) are potentially common to RNP proteins with diverse specificity, and therefore make only limited contribution to molecular discrimination. The refined structure of the U1A complex with the RNA polyadenylation inhibition element reported here clarifies the role of the RNP domain principal specificity determinants (the variable loops) in molecular recognition. The most variable region of RNP proteins, loop 3, plays a crucial role in defining the global geometry of the intermolecular interface. Electrostatic interactions with the RNA phosphodiester backbone involve protein side chains that are unique to U1A and are likely to be important for discrimination. This analysis provides a novel picture of RNA-protein recognition, much closer to the current understanding of protein-protein recognition than that of DNA-protein recognition (Allain, 1997).

Using hybrids between a U1 snRNP protein (U1A) and a U2 snRNP protein (U2B"), regions containing 29 U1A-specific amino acid residues, scattered throughout the 117 N-terminal residues of the protein, have been identified as being involved in binding to U1 RNA. The U1A-specific amino acid residues have been arbitrarily divided into seven contiguous groups. None of these groups is sufficient for U1 binding when transferred singly into the U2B" context, and none of the groups is essential for U1 binding in U1A. Several different combinations of two or more groups can, however, confer the ability to bind U1 RNA to U2B", suggesting that most or all of the U1A-specific amino acid residues contribute incrementally to the strength of the specific binding interaction. Further evidence for the importance of the U1A-specific amino acid residues, some of which lie outside the region previously shown to be sufficient for U1 RNA binding, is obtained by comparison of the sequences of human and Xenopus laevis U1A cDNAs. These are extremely similar (94.4% identical) between amino acid residues 7 and 114 but much less conserved immediately upstream and downstream from this region (Scherly, 1991).

U1 snRNP-A protein (U1A) interacts with elements in the SV40 late polyadenylation signal. This association increases polyadenylation efficiency. It is postulated that this interaction occurs to facilitate protein-protein association between components of the U1 snRNP and proteins of the polyadenylation complex. Direct binding occurs between U1A and the 160-kD subunit of cleavage-polyadenylation specificity factor (CPSF). U1A copurifies with CPSF to a point but can be separated in the highly purified fractions. These data suggest that U1A protein is not an integral component of CPSF but may be able to interact with it and affect its activity. The addition of purified, recombinant U1A to polyadenylation reactions containing CPSF, poly(A) polymerase, and a precleaved RNA substrate results in concentration-dependent increases in both the level of polyadenylation and poly(A) tail length. In agreement with the increase in polyadenylation efficiency caused by U1A, recombinant U1A stabilizes the interaction of CPSF with the AAUAAA-containing substrate RNA. These findings suggest that, in addition to its function in splicing, U1A plays a more global role in RNA processing through effects on polyadenylation (Lutz, 1996).

The human U1A protein-U1A pre-mRNA complex and the relationship between its structure and its function in inhibition of polyadenylation in vitro have been investigated. Two molecules of U1A protein bind to a conserved region in the 3' untranslated region of U1A pre-mRNA. The secondary structure of this region was determined by a combination of theoretical prediction, phylogenetic sequence alignment, enzymatic structure probing and molecular genetics. The U1A binding sites form (part of) a complex secondary structure that is significantly different from the binding site of U1A protein on U1 snRNA. Studies with mutant pre-mRNAs show that the integrity of much of this structure is required for both high affinity binding to U1A protein and specific inhibition of polyadenylation in vitro. In particular, binding of a single molecule of U1A protein to U1A pre-mRNA is not sufficient to produce efficient inhibition of polyadenylation (van Gelder, 1993).

The human U1 snRNP-specific U1A protein autoregulates its production by binding its own pre-mRNA and inhibiting polyadenylation. The mechanism of this regulation has been elucidated by in vitro studies. U1A protein prevents neither the binding of cleavage and polyadenylation specificity factor (CPSF) to its recognition sequence (AUUAAA) nor the cleavage of U1A pre-mRNA. Instead, U1A protein bound to U1A pre-mRNA inhibits both specific and nonspecific polyadenylation by mammalian, but not by yeast, poly(A) polymerase (PAP). Domains are identified in both proteins whose removal uncouples the polyadenylation activity of mammalian PAP from its inhibition via RNA-bound U1A protein. U1A protein specifically interacts with mammalian PAP in vitro. This interaction may possibly reflect a broader role of the U1A protein in polyadenylation (Gunderson, 1994).

The inactivity of the 5' long terminal repeat (LTR) poly(A) site, immediately downstream of the cap site, maximizes the production of HIV-1 transcripts. This inactivity has been found to be mediated by the interaction of the U1 snRNP with the major splice donor site (MSD). The inhibition of the HIV-1 poly(A) site by U1 snRNP relies on a series of delicately balanced RNA processing signals. These include the poly(A) site, the major splice donor site and the splice acceptor sites. The inherent efficiency of the HIV-1 poly(A) site allows maximal activity where there is no donor site (in the 3' LTR) but full inhibition by the downstream MSD (in the 5' LTR). The MSD must interact efficiently with U1 snRNP to completely inhibit the 5' LTR poly(A) site, whereas the splice acceptor sites are inefficient, allowing full-length genomic RNA production (Ashe, 1997).

The inhibition of poly(A) polymerase (PAP) by the U1 snRNP-specific U1A protein (a reaction whose function is to autoregulate U1A protein production) requires a substrate RNA to which at least two molecules of U1A protein can bind tightly, but the secondary structure of the RNA is not highly constrained. A mutational analysis reveals that the carboxy-terminal 20 amino acids of PAP are essential for its inhibition by the U1A-RNA complex. Remarkably, transfer of these amino acids to yeast PAP, which is otherwise not affected by U1A protein, is sufficient to confer U1A-mediated inhibition onto the yeast enzyme. A glutathione S-transferase fusion protein containing only these 20 PAP residues can interact in vitro with an RNA-U1A protein complex containing two U1A molecules, but not with one containing a single U1A protein. This explains the requirement for two U1A-binding sites on the autoregulatory RNA element. A mutational analysis of the U1A protein demonstrates that amino acids 103-119 are required for PAP inhibition. A monomeric synthetic peptide consisting of the conserved U1A amino acids from this region has no detectable effect on PAP activity. However, the same U1A peptide, when conjugated to BSA, inhibits vertebrate PAP. In addition to this activity, the U1A peptide-BSA conjugate specifically uncouples splicing and 3'-end formation in vitro without affecting uncoupled splicing or 3'-end cleavage efficiencies. This suggests that the carboxy-terminal region of PAP with which it interacts is involved not only in U1A autoregulation but also in the coupling of splicing and 3'-end formation (Gunderson, 1997).

Human, mouse, and Xenopus mRNAs encoding the U1 snRNP-specific U1A protein contain a conserved 47 nt region in their 3' untranslated regions (UTRs). In vitro studies show that human U1A protein binds to two sites within the conserved region that resemble, in part, the previously characterized U1A-binding site on U1 snRNA. Overexpression of human U1A protein in mouse cells results in down-regulation of endogenous mouse U1A mRNA accumulation. In vitro and in vivo experiments demonstrate that excess U1A protein specifically inhibits polyadenylation of pre-mRNAs containing the conserved 3' UTR from human U1A mRNA. Thus, U1A protein regulates the production of its own mRNA via a mechanism that involves pre-mRNA binding and inhibition of polyadenylation (Boelens, 1993).

An in vitro genetic system was developed as a rapid means for studying the specificity determinants of RNA-binding proteins. This system was used to investigate the origin of the RNA-binding specificity of the mammalian spliceosomal protein U1A. The U1A domain responsible for binding to U1 small nuclear RNA was locally mutagenized and displayed as a combinatorial library on filamentous bacteriophage. Affinity selection identified four U1A residues in the mutagenized region that are important for specific binding to U1 hairpin II. One of these residues (Leu-49) disproportionately affects the rates of binding and release and appears to play a critical role in locking the protein onto the RNA. Interestingly, a protein variant that binds more tightly than U1A emerged during the selection, showing that the affinity of U1A for U1 RNA has not been optimized during evolution (Laird-Offringa, 1995).

Nuclear transport of the U1 snRNP-specific protein U1A has been examined. U1A moves to the nucleus by an active process that is independent of interaction with U1 snRNA. Nuclear localization requires an unusually large sequence element situated between amino acids 94 and 204 of the protein. U1A transport is not unidirectional. The protein shuttles between nucleus and cytoplasm. At equilibrium, the concentration of the protein in the nucleus and cytoplasm is not, however, determined solely by transport rates, but can be perturbed by introducing RNA sequences that can specifically bind U1A in either the nuclear or cytoplasmic compartment. Thus, U1A represents a novel class of protein that shuttles between cytoplasm and nucleus and whose intracellular distribution can be altered by the number of free binding sites for the protein present in the cytoplasm or the nucleus (Kambach, 1992).

Macromolecules that are imported into the nucleus can be divided into classes according to their nuclear import signals. The best characterized class consists of proteins that carry a basic nuclear localization signal (NLS), whose transport requires the importin alpha/beta heterodimer. U snRNP import depends on both the trimethylguanosine cap of the snRNA and a signal formed when the Sm core proteins bind the RNA. Here, factor requirements for U snRNP nuclear import are studied using an in vitro system. Depletion of importin alpha, the importin subunit that binds the NLS, is found to stimulate rather than inhibit U snRNP import. This stimulation is due to a common requirement for importin beta in both U snRNP and NLS protein import. Saturation of importin beta-mediated transport with the importin beta-binding domain of importin alpha blocks the import of U snRNP both in vitro and in vivo. Immunodepletion of importin beta inhibits protein import, whether NLS-mediated or U snRNP. While the former requires re-addition of both importin alpha and importin beta, re-addition of importin beta alone to immunodepleted extracts is sufficient to restore efficient U snRNP import. Thus importin beta is required for U snRNP import, and it functions in this process without the NLS-specific importin alpha (Palacios, 1997).

Precursors of U1 snRNA are associated with nuclear proteins prior to export to the cytoplasm. The approximately 15S complexes containing pre-U1 RNA, termed pre-export U1 snRNPs, can be identified in extracts of Xenopus laevis oocyte nuclei that are synthesizing U1 RNAs from injected U1 genes. The U1 snRNP-specific A protein (U1A) is associated with nuclear pre-U1 RNA. The interaction of the U1A protein with pre-U1 RNA requires sequences in the loop II region, although this region of U1 RNA is not necessary for the association of U1A protein with mature U1 snRNPs. The U1A protein helps protect pre-U1 RNA against degradation in the nucleus (Terns, 1993).

In an enhancer screen for yeast mutants that may interact with U1 small nuclear RNA (snRNA), a gene was identified that encodes the apparent yeast homolog of the well-studied human U1A protein. Both in vitro and in vivo, the absence of the protein has a dramatic effect on the activity of U1 snRNP containing the mutant U1 snRNA used in the screen. Surprisingly, the U1A gene is inessential in a wild-type U1 RNA background, as growth rate and the splicing of endogenous pre-mRNA transcripts are normal in these strains that lack the U1A protein. Even in vitro, the absence of the protein has little effect on splicing. On the basis of these observations, it is suggested that a principal role of the U1A protein is to help fold or maintain U1 RNA in an active configuration (Liao, 1993).

The interaction of the U1-specific proteins 70k, A and C with U1 snRNP was studied by gradually depleting U1 snRNPs of the U1-specific proteins. U1 snRNP species are obtained that are selectively depleted of protein C, of protein A, of both C and A, or of all three U1-specific proteins (C, A and 70k) while retaining the common proteins B' to G. These various types of U1 snRNP particles were used to study how the accessibility of defined regions of U1 RNA to nucleases V1 and S1 depends on the U1 snRNP protein composition. U1 snRNP protein 70k interacts with stem/loop A of U1 RNA, and protein A interacts with stem/loop B of U1 RNA. The presence or absence of protein C does not affect the nuclease digestion patterns of U1 RNA. These results further suggest that the binding of protein A to the U1 snRNP particle should be independent of proteins 70k and C. Mouse cells contain two U1 RNA species, U1a and U1b, which differ in the structure of stem/loop B: U1a exhibits the same stem/loop B sequence as U1 RNA from HeLa cells. Protein A is always preferentially lost from U1b snRNPs as compared to U1a snRNPs. This indicates that one consequence of the structural difference between U1a and U1b is a lowering of the binding strength of protein A to U1b snRNP. The possible functional significance of this finding is discussed with respect to the fact that U1b RNA is preferentially expressed in embryonal cells (Bach, 1990).

U1 small nuclear ribonucleoprotein (snRNP) may function during several steps of spliceosome assembly. However, most spliceosome assembly assays fail to detect the U1 snRNP. A new native gel electrophoretic assay was used to find the yeast U1 snRNP in three pre-splicing complexes (delta, beta1, alpha2) formed in vitro. The order of complex formation is deduced to be delta --> beta1 --> alpha2 --> alpha1 --> beta2, the active spliceosome. The delta complex is formed when U1 snRNP binds to pre-mRNA in the absence of ATP. There are two forms of delta: a major one, deltaun (unstable to competitor RNA), and a minor one, deltacommit (committed to the splicing pathway). The other complexes are formed in the presence of ATP and contain the following snRNPs: beta1 (the pre-spliceosome) has both U1 and U2; alpha2 has all five, although U1 is reduced compared with the others; and alpha1 and beta2 have U2, U5, and U6. Prior work by others suggests that U1 is "handing off" the 5' splice site region to the U5 and U6 snRNPs before splicing begins. The reduced level of U1 snRNP in the alpha2 complex suggests that the handoff occurs during formation of this complex (Ruby, 1997).

Intron definition and splice site selection occur at an early stage during assembly of the spliceosome, the protein complex mediating pre-mRNA splicing. Association of U1 snRNP with the pre-mRNA is required for these early steps. The yeast U1 snRNP-specific protein Nam8p is a component of the commitment complexes, the first stable complexes assembled on pre-mRNA. In vitro and in vivo, Nam8p becomes indispensable for efficient 5' splice site recognition when this process is impaired as a result of the presence of noncanonical 5' splice sites or the absence of a cap structure. Nam8p stabilizes commitment complexes in the latter conditions. Consistent with this, Nam8p interacts with the pre-mRNA downstream of the 5' splice site, in a region of nonconserved sequence. Substitutions in this region affect splicing efficiency and alternative splice site choice in a Nam8p-dependent manner. Therefore, Nam8p is involved in a novel mechanism by which an snRNP component can affect splice site choice and regulate intron removal through its interaction with a nonconserved sequence. This supports a model where early 5' splice recognition results from a network of interactions established by the splicing machinery with various regions of the pre-mRNA (Puig, 1999).

Epitopes that depend on the three-dimensional folding of proteins have in recent years been recognized as main targets for many autoantibodies. However, a detailed resolution of conformation-dependent epitopes has not yet been achieved, despite its importance for understanding the complex interaction between an autoantigen and the immune system. In an analysis of immunodominant epitopes of the U1-70K protein, the major autoantigen recognized by human ribonucleoprotein (RNP)-positive sera, diversely mutated recombinant Drosophila 70K proteins were used as antigens in assays for human anti-RNP antibodies. Thus, the contribution of individual amino acids to antigenicity could be assayed with the overall structure of the major antigenic domain preserved, making it possible to analyze how antigenicity can be reconstituted rather than obliterated. Amino acid residue 125 is shown to be situated at a crucial position for recognition by human anti-RNP autoantibodies. Flanking residues at positions 119-126 also appear to be of utmost importance for recognition. These results are discussed in relation to structural models of RNA-binding domains. Tertiary structure modeling indicates that residues 119-126 are situated at easily accessible positions at the end of an alpha-helix in the RNA-binding region. This study identifies a major conformation-dependent epitope of the U1-70K protein and demonstrates the significance of individual amino acids in conformational epitopes. Using this model, it will be possible to analyze other immunodominant regions in which protein conformation has a strong impact (Welin Henriksson, 1999).

Attempts have been made to relate the conformational epitope to a three-dimensional position on the U1-70K protein. Using the closely related human hnRNP A1 and Drosophila Sex lethal proteins for modeling the three-dimensional structure of the U1-70K protein, one can observe that the region around residues 119-126 is situated at the end of alpha-helix 1. These residues face away from the residues in the RNA-binding beta1 and beta3 sheets, and, according to this model, the amino acid residue at position 125 should be an easily accessible B-cell epitope. Not only is the amino acid sequence important: in addition, the epitope has to be kept in place by the alpha-helix and beta-sheet. Given that valine-125 is part of a major human epitope, it is not surprising that a switch to phenylalanine induces a disarrangement of the epitope surface. Both amino acids involved are hydrophobic and contain nonpolar side chains, but phenylalanine is larger and contains a bulky aromatic side chain. Valine, by contrast, is one of the smallest amino acids and has a less complex carbon side chain. Thus, the tertiary structure derivation supports the theory of the region 119-126 as an autoantigenic, conformational epitope of the U1-70K protein. The positions of the key amino acids that constitute this major epitope also make it possible to understand why previous approaches have failed to identify fragments smaller than 56-67 amino acid residues as being antigenic. A truncation from either end would affect protein conformation and would cause the compressed helix-loop-beta-sheet structure to unfold, thus obscuring the originally prominent and crucial valine at position 125. These findings might have clinical implications because all tested sera recognized the identified epitope. This demonstrates a remarkable homogeneity of the otherwise heterogeneous autoantigenic B-cell response and might also prompt therapeutic approaches based on inducing tolerance (Welin Henriksson, 1999).

Analysis of U1, the RNA component of U1 snRNP

Evidence for at least four U1 RNA variants of the snRNP has been obtained from a U1 cDNA library made using U1 snRNA from cultured Bombyx mori BmN cells. Sequence analysis of thirty cDNA clones showed that: (1) the nucleotide changes are in the hairpin structures I, II and III; (2) the majority of the base changes in stem structures between a posterior silk gland (PSG) U1 RNA and the BmN U1 clones, as well as among the BmN U1 clones, are compensatory; (3) although the base differences between PSG U1 and BmN U1 clones, and among the BmN U1 clones, are not the same, they are located in similar positions in moderately conserved sites, frequently at the bases of loops; (4) when comparing the PSG U1 with the BmN U1 clones, twelve out of nineteen stem differences generate stronger pairing, resulting in a more stable hairpin II in the BmN U1 clones; and (5) the Sm and 70K protein binding-site sequences are highly conserved among these U1 clones. Although a comparison of sequence changes associated with U1 isoforms from different species indicates that there are no base changes in common with the B. mori U1 clones reported here, similarities in the number and location of base differences in hairpins I, II and III are observed in mouse and/or Xenopus. It is possible that U1 variants like the ones reported here play a role in alternative pre-mRNA splicing by way of different RNA-protein factor interactions (Gao, 1995).

To dissect U1 snRNA function, 14 single point mutations in the six nucleotides complementary to the 5' splice site were analyzed for their effects on growth and splicing in the fission yeast S. pombe. Three of the four alleles previously found to support growth of S. cerevisiae are lethal in S. pombe, implying a more critical role for the 5' end of U1 in fission yeast. Furthermore, a comparison of phenotypes for individual nucleotide substitutions suggests that the two yeasts use different strategies to modulate the extent of pairing between U1 and the 5' splice site. The importance of U1 function in S. pombe is further underscored by the lethality of several single point mutants not examined previously in S. cerevisiae. In total, only three alleles complement the U1 gene disruption, and these strains are temperature-sensitive for growth. Each viable mutant was tested for impaired splicing of three different S. pombe introns. Among these, only the second intron of the cdc2 gene (cdc2-I2) showed dramatic accumulation of linear precursor. Notably, cdc2-I2 is spliced inefficiently even in cells containing wild-type U1, at least in part due to the presence of a stable hairpin encompassing its 5' splice site. Although point mutations at the 5' end of U1 have no discernible effect on splicing of pre-U6, significant accumulation of unspliced RNA is observed in a metabolic depletion experiment. Taken together, these observations indicate that the repertoire of U1 activities is used to varying extents for splicing of different pre-mRNAs in fission yeast (Alvarez, 1996).

In the flagellated protozoon Euglena gracilis, characterized nuclear genes harbor atypical introns that usually are flanked by short repeats, adopt complex secondary structures in pre-mRNA, and do not obey the GT-AG rule of conventional cis-spliced introns. In the nuclear fibrillarin gene of E. gracilis, three spliceosomal-type introns have been identified that have GT-AG consensus borders. A small RNA has been isolated from E. gracilis and on the basis of primary and secondary structure comparisons, it is proposed to be a homolog of U1 small nuclear RNA, an essential component of the cis-spliceosome in higher eukaryotes. Conserved sequences at the 5' splice sites of the fibrillarin introns can potentially base pair with Euglena U1 small nuclear RNA. These observations demonstrate that spliceosomal GT-AG cis-splicing occurs in Euglena, in addition to the nonconventional cis-splicing and spliced leader trans-splicing previously recognized in this early diverging unicellular eukaryote (Breckenridge, 1999).

U1 snRNP protects pre-mRNAs from premature cleavage and polyadenylation

In eukaryotes, U1 small nuclear ribonucleoprotein (snRNP) forms spliceosomes in equal stoichiometry with U2, U4, U5 and U6 snRNPs; however, its abundance in human cells far exceeds that of the other snRNPs. This study used an antisense morpholino oligonucleotide to U1 snRNA to achieve functional U1 snRNP knockdown in HeLa cells, and identified accumulated unspliced pre-mRNAs by genomic tiling microarrays. In addition to inhibiting splicing, U1 snRNP knockdown caused premature cleavage and polyadenylation in numerous pre-mRNAs at cryptic polyadenylation signals, frequently in introns near (<5 kilobases) the start of the transcript. This did not occur when splicing was inhibited with a U2 snRNA antisense morpholino oligonucleotide or the U2-snRNP-inactivating drug spliceostatin A unless the U1 antisense morpholino oligonucleotide was also included. It was further shown that U1 snRNA-pre-mRNA base pairing was required to suppress premature cleavage and polyadenylation from nearby cryptic polyadenylation signals located in introns. These findings reveal a critical splicing-independent function for U1 snRNP in protecting the transcriptome, a function that is proposed to explain its overabundance (Kaida, 2010).

U1 snRNP bound to 5' splice sites may thus serve a dual purpose: in splicing and in suppression of premature cleavage and polyadenylation. The perimeter of U1 snRNP’s protective zone is not known, but its binding to the 5' splice site alone is unlikely to be able to protect the majority of introns, which in humans average ~3.4 kb in length. Furthermore, if suppression of actionable polyadenylation signals (PASs) were provided only by U1 snRNP bound to 5' splice sites, 5' splice-site mutations would be expected to cause premature termination rather than, for example, exon skipping; such termination would be extremely deleterious and has not been observed. Additional U1 snRNP binding sites, including cryptic 5' splice sites, may function as tethering sites for its activity in suppression of cleavage and polyadenylation in introns. Viewed from this perspective, sequences referred to as cryptic 5' splice sites may serve a non-splicing purpose: to recruit U1 snRNP to protect introns. It is also reasonable to consider that modulating U1 snRNP levels or its binding at sites that protect actionable PASs could be a mechanism for regulating gene expression, including downregulation of the mRNA or switching expression to a different mRNA produced from a prematurely terminated pre-mRNA. It is suggested that the vulnerability of an intron to premature cleavage and polyadenylation would be expected to increase with increasing intron size if U1 snRNP and cognate base-pairing sites are not available to protect it. It is proposed that the large excess of U1 snRNP over what is required for splicing in human cells serves an additional critical biological function: to suppress premature cleavage and polyadenylation in introns and protect the integrity of the transcriptome (Kaida, 2010).

Widespread recognition of 5' splice sites by noncanonical base-pairing to U1 snRNA involving bulged nucleotides

An established paradigm in pre-mRNA splicing is the recognition of the 5' splice site (5'ss) by canonical base-pairing to the 5' end of U1 small nuclear RNA (snRNA). A small subset of 5'ss base-pair to U1 in an alternate register that is shifted by 1 nucleotide. Using genetic suppression experiments in human cells, it was demonstrated that many other 5'ss are recognized via noncanonical base-pairing registers involving bulged nucleotides on either the 5'ss or U1 RNA strand, which were termed 'bulge registers.' By combining experimental evidence with transcriptome-wide free-energy calculations of 5'ss/U1 base-pairing, it is estimated that 10,248 5'ss (~5% of human 5'ss) in 6577 genes use bulge registers. Several of these 5'ss occur in genes with mutations causing genetic diseases and are often associated with alternative splicing. These results call for a redefinition of an essential element for gene expression that incorporates these registers, with important implications for the molecular classification of splicing mutations and for alternative splicing (Roca, 2012).

Splicing of >99% of pre-mRNA introns is catalyzed by the major spliceosome, a dynamic macromolecular machine composed of five small nuclear RNAs (snRNAs) and associated polypeptides, plus many other protein factors. The U1 small nuclear ribonucleoprotein particle (snRNP), comprising the U1 snRNA and 10 polypeptides, is the main component for early 5' splice site (5'ss) recognition by the major or U2-type spliceosome. The vast majority of such introns (>99%) belong to the GT-AG (or GU-AG) category, as defined by their intronic terminal dinucleotides. For more than 30 years, it has been firmly established that 5'ss are recognized by base-pairing to the 5' end of U1 snRNA in a canonical register, defined as +1G at the 5'ss (the first intronic nucleotide) base-pairing to C8 of U1 (the eighth nucleotide of U1). Thus, the 5'ss element spans the last 3 nucleotides (nt) of the exon and the first 8 nt of the intron, establishing a maximum of 11 base pairs (bp) to U1—although the contribution of the seventh and eighth nucleotides in the intron, which are much more variable, appears to depend on the species. Later in spliceosome assembly, U1 is replaced by U6 snRNA, which forms a few base pairs to the 5'ss and is likely involved in catalysis. In a handful of documented cases, U1 base-pairs at some distance from the 5'ss, and the cleavage site depends on subsequent U6 base-pairing. There is also an example of a natural human U2-type intron whose splicing appears to be U1 snRNA-independent (Roca, 2012 and references therein).
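
The canonical register described above can be illustrated with a minimal Python sketch. It assumes that nucleotides 1-11 of human U1 snRNA read 5'-AUACUUACCUG-3' (pseudouridines written as U), a sequence that is not given in the text here, and it counts Watson-Crick pairs only. The function name and the example 5'ss are illustrative and are not taken from Roca (2012).

```python
# Assumption: nucleotides 1-11 of human U1 snRNA, 5'->3', pseudouridines as U.
U1_5P_END = "AUACUUACCUG"
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
# 5'ss positions in the canonical register: -3..-1 (exon) and +1..+8 (intron).
SS_POSITIONS = [-3, -2, -1, 1, 2, 3, 4, 5, 6, 7, 8]


def canonical_register(ss):
    """Return (5'ss position, 5'ss nt, U1 position, U1 nt, paired) tuples.

    `ss` is an 11-nt 5' splice site covering positions -3 to +8.
    """
    ss = ss.upper().replace("T", "U")  # accept DNA or RNA spelling
    if len(ss) != len(SS_POSITIONS):
        raise ValueError("expected 11 nucleotides (positions -3 to +8)")
    rows = []
    for i, (pos, nt) in enumerate(zip(SS_POSITIONS, ss)):
        u1_pos = 11 - i  # antiparallel: -3 faces U1 nt 11, +1 faces nt 8
        u1_nt = U1_5P_END[u1_pos - 1]
        rows.append((pos, nt, u1_pos, u1_nt, (nt, u1_nt) in WATSON_CRICK))
    return rows


# Illustrative 5'ss that is fully complementary to the assumed U1 5' end.
rows = canonical_register("CAGGUAAGUAU")
n_pairs = sum(paired for *_, paired in rows)
print(n_pairs)
```

In this sketch the row for position +1 places G opposite C8 of U1, and a fully complementary site reaches the maximum of 11 base pairs mentioned in the text.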

Two minor categories of U2-type splice sites have been known for a long time: GC-AG 5'ss (0.9%) and very rare AT-AC 5'ss (only 15 introns in the human genome). These 5'ss conform to consensus motifs very similar to the major U2-type GT-AG 5'ss and are recognized by analogous mechanisms. It has recently been shown that restoration of base-pairing to both U1 and U6 is essential to rescue recognition of a mutant AT 5'ss that causes aberrant splicing and myotonia. U12-type introns are spliced by the minor spliceosome and are very rare as well (0.36%) (Roca, 2012 and references therein).

It has recently been shown that a small subset of GT-AG 5'ss, which are here termed atypical 5'ss, is recognized by a base-pairing register with U1 that is shifted by 1 nt (+1G base-pairs to U1 C9 instead of C8) without changing the actual exon-intron boundary or the sequence of the spliced mRNA. In budding yeast, mutational analysis led to the suggestion that the noncanonical HOP2 5'ss is recognized by a base-pairing register involving a bulged nucleotide. A bulge in one strand of an RNA (or DNA) duplex is defined as one or more nucleotides that are not opposed by any nucleotide on the other strand. This study presents extensive experimental evidence for multiple base-pairing registers between human 5'ss and U1, with bulged nucleotides on either RNA strand, and estimates that ~5% of all 5'ss (present in ~40% of human genes) use one of these noncanonical registers (Roca, 2012).
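
As a rough illustration of how the canonical and shifted registers differ, the sketch below counts Watson-Crick pairs for one 5'ss in each register. It is a simplification under stated assumptions: the U1 5' end is taken as 5'-AUACUUACCUG-3' (pseudouridines written as U), the example sequence is hypothetical, bulge registers are not modeled, and plain pair counts stand in for the free-energy calculations used in the study.

```python
# Assumption: nucleotides 1-11 of human U1 snRNA, 5'->3', pseudouridines as U.
U1_5P_END = "AUACUUACCUG"
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def count_pairs(ss, shift=0):
    """Count Watson-Crick pairs between an 11-nt 5'ss (-3..+8) and U1.

    shift=0 is the canonical register (+1G opposite C8);
    shift=1 is the shifted register (+1G opposite C9).
    5'ss positions that fall beyond U1 nucleotide 11 are left unpaired.
    """
    ss = ss.upper().replace("T", "U")  # accept DNA or RNA spelling
    if len(ss) != 11:
        raise ValueError("expected 11 nucleotides (positions -3 to +8)")
    pairs = 0
    for i, nt in enumerate(ss):
        u1_pos = 11 - i + shift  # 1-based U1 position facing this nucleotide
        if 1 <= u1_pos <= len(U1_5P_END):
            pairs += (nt, U1_5P_END[u1_pos - 1]) in WATSON_CRICK
    return pairs


hypothetical_ss = "AAGGUUAAGUA"  # made-up sequence, not from the study
canonical = count_pairs(hypothetical_ss, shift=0)
shifted = count_pairs(hypothetical_ss, shift=1)
print(canonical, shifted)
```

For this made-up site the shifted register yields more pairs than the canonical one, which is the kind of situation the atypical 5'ss represent. Modeling bulged nucleotides would require a gapped alignment and is beyond this sketch.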

Interaction of U1 snRNP with SR proteins: the role of SR proteins in splicing

Continued: see sans fille Evolutionary homologs part 2/3 | part 3/3

Home page: The Interactive Fly © 1995, 1996 Thomas B. Brody, Ph.D.

The Interactive Fly resides on the
Society for Developmental Biology's Web server.