The Interactive Fly

Zygotically transcribed genes

Splicing factors for processing pre-messenger RNA

Splicing factors


The architecture of pre-mRNAs affects mechanisms of splice-site pairing

The exon/intron architecture of genes determines whether components of the spliceosome recognize splice sites across the intron or across the exon. Using in vitro splicing assays, this study demonstrates that splice-site recognition across introns ceases when intron size is between 200 and 250 nucleotides. Beyond this threshold, splice sites are recognized across the exon. Splice-site recognition across the intron is significantly more efficient than splice-site recognition across the exon, resulting in enhanced inclusion of exons with weak splice sites. Thus, intron size can profoundly influence the likelihood that an exon is constitutively or alternatively spliced. An EST-based alternative-splicing database was used to determine whether the exon/intron architecture influences the probability of alternative splicing in the Drosophila and human genomes. Drosophila exons flanked by long introns display an up to 90-fold-higher probability of being alternatively spliced compared with exons flanked by two short introns, demonstrating that the exon/intron architecture in Drosophila is a major determinant in governing the frequency of alternative splicing. Exon skipping is also more likely to occur when exons are flanked by long introns in the human genome. Interestingly, experimental and computational analyses show that the length of the upstream intron is more influential in inducing alternative splicing than is the length of the downstream intron. It is concluded that the size and location of the flanking introns control the mechanism of splice-site recognition and influence the frequency and the type of alternative splicing that a pre-mRNA transcript undergoes (Fox-Walsh, 2005).

Pre-mRNA splicing is an essential process that accounts for many aspects of regulated gene expression. Of the ~25,000 genes encoded by the human genome, >60% are believed to produce transcripts that are alternatively spliced. Thus, alternative splicing of pre-mRNAs can lead to the production of multiple protein isoforms from a single pre-mRNA, exponentially enriching the proteomic diversity of higher eukaryotic organisms. Because regulation of this process can determine when and where a particular protein isoform is produced, changes in alternative-splicing patterns modulate many cellular activities (Fox-Walsh, 2005).

The spliceosome assembles onto the pre-mRNA in a coordinated manner by binding to sequences located at the 5' and 3' ends of introns. Spliceosome assembly is initiated by the stable associations of the U1 small nuclear ribonucleoprotein particle with the 5' splice site, branch-point-binding protein/SF1 with the branch point, and U2 snRNP auxiliary factor with the pyrimidine tract. ATP hydrolysis then leads to the stable association of U2 snRNP at the branch-point and functional splice-site pairing (Fox-Walsh, 2005).

Intron size has been correlated with rates of evolution and the regulation of genome size. The exon/intron architecture has also been shown to influence splice-site recognition. For example, increasing the size of mammalian exons results in exon skipping. However, the same enlarged exons are included when the flanking introns are small. Thus, splice-site recognition is more efficient when introns or exons are small. Because, in the human genome, the majority of exons are short and introns are long, it is expected that the vast majority of splice sites in the human genome are recognized across the exon. Lower eukaryotes have a genomic architecture that is typified by small introns and flanking exons with variable length, suggesting that splice-site recognition occurs across the intron. Consistent with this model, expansion of small introns in yeast or Drosophila causes loss of splicing, cryptic splicing, or intron retention. Taken together, these observations suggest that splice sites are recognized across an optimal nucleotide length (Fox-Walsh, 2005).

It is unknown whether splice-site recognition across the intron or across the exon results in similar efficiencies of spliceosomal assembly and/or splice-site pairing. This study demonstrates that splice-site recognition across the intron ceases when the intron reaches a length between 200 and 250 nt. Because splice-site recognition is more efficient across the intron, alternative splicing is less likely for exons flanked by short introns. This influence is supported experimentally and by computational analyses of Drosophila and human alternative-splicing databases. It is concluded that the size and location of the flanking introns control the mechanism of splice-site recognition and influence the frequency and the type of alternative pre-mRNA splicing (Fox-Walsh, 2005).

Previous studies have suggested that genes with small introns tend to be recognized across the intron, and genes with large introns are recognized across the exon. To determine the distance at which recognition of splice sites switches from cross-intron interactions to cross-exon interactions, advantage was taken of an in vitro kinetic splicing assay that was originally used to demonstrate that exonic splicing enhancers (ESEs), discrete sequences within exons that promote both constitutive and regulated splicing, activate both splice sites of an exon simultaneously (Lam, 2002). A number of pre-mRNAs were designed with intron lengths ranging from 120 to 425 nt. Within each set, the pre-mRNAs differ only in the presence or absence of a well characterized 13-nt ESE derived from the Drosophila doublesex and Drosophila fruitless pre-mRNAs. Each pre-mRNA harbors the same weak 5' and 3' splice sites that require the activities of ESEs for recognition in their natural context (Tian, 1992; Lam, 2003). Because splicing factors present in HeLa cell nuclear extracts activate the ESEs used (Lam, 2003), the presence of functional or mutant enhancer elements within each test substrate determines its splicing efficiency. If the splice sites are recognized across the exon, it is expected that the activation of the splice sites on each exon constitutes a different step during spliceosomal assembly, because the ESE located on each exon will only aid in the recognition of its weak splice site. Thus, the activities of the separate ESEs are expected to display synergistic kinetics, because the activation of each ESE accelerates an independent step during spliceosomal assembly. However, if the splice sites are recognized across the intron, the ESE located on each exon will aid in the recognition of both weak splice sites, because the recruited spliceosomal components define the entire intron within one step.
In this scenario, the activities of the separate ESEs are expected to display additive kinetics, because the activation of each ESE accelerates the same rate-limiting step during spliceosomal assembly (Fox-Walsh, 2005).
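
The distinction between additive and synergistic kinetics can be made concrete with a small sketch. It is illustrative only: the function name classify_kinetics, the example rates, and the decision rule (comparing the rate observed with both ESEs against an additive and a multiplicative prediction on a log scale) are assumptions, not the procedure used in the study.

```python
import math


def classify_kinetics(base, up_only, down_only, both):
    """Label double-ESE kinetics as 'additive' or 'synergistic'.

    All arguments are apparent splicing rates in arbitrary units (> 0):
      base      - no functional ESE
      up_only   - functional ESE on the upstream exon only
      down_only - functional ESE on the downstream exon only
      both      - functional ESEs on both exons
    """
    # Additive expectation: each ESE contributes its own rate increment.
    additive = base + (up_only - base) + (down_only - base)
    # Multiplicative expectation: the fold activations of the two ESEs multiply.
    multiplicative = base * (up_only / base) * (down_only / base)
    if min(base, up_only, down_only, both, additive) <= 0:
        raise ValueError("rates and the additive prediction must be positive")
    # Pick whichever prediction lies closer to the observation on a log scale.
    err_add = abs(math.log(both / additive))
    err_mult = abs(math.log(both / multiplicative))
    return "additive" if err_add <= err_mult else "synergistic"


# Hypothetical rates, chosen only to exercise the two branches.
short_intron = classify_kinetics(base=1.0, up_only=5.0, down_only=4.0, both=8.0)
long_intron = classify_kinetics(base=1.0, up_only=5.0, down_only=4.0, both=20.0)
```

Under this assumed rule, a substrate whose double-ESE rate matches the sum of the individual increments is called additive (consistent with one shared rate-limiting step), whereas one that matches the product of the fold activations is called synergistic (consistent with two independent steps).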

In vitro splicing assays were performed with each of the four pre-mRNA sets over a 3-h time course to determine the apparent rates of splicing. Pre-mRNAs with an intron size of 120 nt display additive kinetics. Using Drosophila nuclear extract (Kc), it was possible to demonstrate additive kinetics for substrates containing the 120-nt intron; however, it was not possible to detect sufficient splicing for the substrates containing longer introns. These results are consistent with in vitro studies demonstrating that splicing of pre-mRNAs with long introns is supported in HeLa nuclear extract but not in Kc extract. The kinetics of pre-mRNAs containing an intron 200 nt or less in length are additive. This behavior indicates that the spliceosomal components required for the recognition of both splice sites are recruited to the intron simultaneously. However, constructs with introns >200 nt demonstrate synergistic kinetics. It is concluded that the change from splice-site recognition across the intron to splice-site recognition across the exon occurs when the intronic length is between 200 and 250 nt (Fox-Walsh, 2005).

The kinetic analysis demonstrates that the upstream 5' splice site and the downstream 3' splice site are recognized simultaneously across introns <200 nt. Significantly, in the absence of ESEs, splice-site recognition across the intron is a much more efficient process than splice-site recognition across the exon. Thus, splice-site recognition across the intron may be able to rescue the inclusion of internal exons harboring weak splice sites. To test this hypothesis, a series of pre-mRNA substrates containing three exons was designed for in vitro splicing analysis in which the internal exon contains splice sites that are insufficiently recognized in the absence of ESEs. The four substrates generated differed only in their ability to be recognized across each intron by changing the length of the intron from <200 to >250 nt, thus permitting or discouraging splice-site recognition across the intron. As expected, the internal exon is predominantly excluded when flanked by two long introns. However, significant inclusion of the internal exon is observed if one of the flanking introns is short enough to support splice-site recognition across the intron. In fact, exon inclusion is ~30-fold higher with two short flanking introns than with two long flanking introns (Fox-Walsh, 2005).

To estimate the fractions of splice sites that may be recognized through cross-intron interactions, the flanking-intron lengths were recorded for every internal exon within the human and Drosophila genomes. Genome information was obtained from the Alternative Splicing Database (ASD), which contains information about the exon/intron structure and EST-verified alternative-splicing events of several thousand genes. Within the human genome, many exons are flanked by at least one short intron, creating two separate populations, separated roughly by the intron length that is proposed to represent the transition of splice-site recognition from across the intron to across the exon. As expected from previous intron-length analyses, a very different distribution is seen in the Drosophila genome, where ~85% of exons are flanked by at least one short intron. An overlay of the Drosophila and human genomes demonstrates that the minimum intron length in the human genome is at the same location that demarcates the maximum intron length of the major Drosophila exon population. This difference in genome constraint may reflect specific compositional variations between the Drosophila and human spliceosomes (Fox-Walsh, 2005).
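
The bookkeeping behind this estimate can be sketched in a few lines. The 200-nt and 250-nt cutoffs come from the kinetic results above; everything else (the function names, the "transition" label for lengths in between, and the toy exon list) is a hypothetical illustration rather than the authors' pipeline or the ASD data format.

```python
SHORT_MAX = 200  # nt; introns at or below this length showed additive kinetics
LONG_MIN = 250   # nt; upper bound of the reported transition window


def intron_class(length):
    """Classify an intron by the recognition mode its length would permit."""
    if length <= SHORT_MAX:
        return "short"       # cross-intron recognition expected
    if length >= LONG_MIN:
        return "long"        # cross-exon recognition expected
    return "transition"      # 201-249 nt: mode not resolved by the assay


def fraction_with_short_flank(exons):
    """Fraction of internal exons flanked by at least one short intron.

    exons: iterable of (upstream_intron_length, downstream_intron_length).
    """
    exons = list(exons)
    if not exons:
        return 0.0
    hits = sum(
        1 for up, down in exons
        if "short" in (intron_class(up), intron_class(down))
    )
    return hits / len(exons)


# Toy data (not from any genome): two of the four exons have a short flank.
toy_exons = [(80, 1200), (3000, 5000), (150, 90), (220, 400)]
toy_fraction = fraction_with_short_flank(toy_exons)
```

Applied to a real annotation, the same count would give figures comparable to the ~85% reported for Drosophila, provided internal exons and intron lengths are extracted the same way.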

Because splice-site recognition across the intron rescues exon inclusion, how intron length influences alternative splicing within the Drosophila and human genomes was investigated. To do so, the flanking-intron information of each exon was correlated with exon-skipping and alternative-splice-site-activation events reported in the ASD to compute the probability that an exon is involved in alternative splicing, without taking into consideration the contributions made by splice-site signal strength and splicing enhancers or silencers. Thus, the correlation simply tests whether the influence of the exon/intron architecture on alternative splicing is significant enough to be detectable amid all other splicing determinants. Computational analysis of the Drosophila genome supports a significant role for intron length in defining the likelihood of alternative splicing. A striking influence of the exon/intron architecture is observed for simple exon-skipping events. Exons flanked by very long introns are up to 90-fold more likely to be skipped than exons that are flanked by two short introns. Significantly, the most drastic increase in the probability of alternative splicing (>10-fold) was observed when the length of flanking introns increased from 225 to 525 nt. In agreement with the experimental results, a greater probability that an exon is alternatively spliced was observed when the upstream intron is long. This polarity could be the consequence of coupling pre-mRNA splicing to transcription by RNA polymerase II. Even in the category of alternative 5' or 3' splice-site activation, alternative splicing is up to 10-fold more likely for exons that are flanked by long introns. It is concluded that, in Drosophila, exon skipping is a rare event for exons flanked by short introns and that the length of the upstream intron is of greater importance than the length of the downstream intron in determining whether an exon will be involved in exon skipping (Fox-Walsh, 2005).
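
A minimal sketch of this kind of correlation is shown below. It assumes a single 250-nt cutoff to separate short from long introns, whereas the study examined a range of intron lengths; the record layout, function names, and toy numbers are hypothetical and are not taken from the ASD.

```python
from collections import defaultdict


def skipping_by_architecture(exons, cutoff=250):
    """Fraction of skipped exons per (upstream, downstream) intron class.

    exons: iterable of (upstream_len, downstream_len, skipped) where
    skipped is True if an exon-skipping event is annotated for the exon.
    """
    counts = defaultdict(lambda: [0, 0])  # class pair -> [skipped, total]
    for up, down, skipped in exons:
        key = (
            "long" if up >= cutoff else "short",
            "long" if down >= cutoff else "short",
        )
        counts[key][1] += 1
        if skipped:
            counts[key][0] += 1
    return {key: s / n for key, (s, n) in counts.items()}


def fold_over_short_short(probs):
    """Express each class probability relative to the short/short class."""
    ref = probs.get(("short", "short"))
    if not ref:
        return {}  # reference class missing or never skipped
    return {key: p / ref for key, p in probs.items()}


# Toy records: 1 of 10 short/short exons and 2 of 4 long/long exons skipped.
toy = [(100, 90, i == 0) for i in range(10)] + [(900, 800, i < 2) for i in range(4)]
toy_probs = skipping_by_architecture(toy)
toy_fold = fold_over_short_short(toy_probs)
```

Keeping the upstream and downstream classes as separate elements of the key makes it straightforward to compare long/short with short/long exons, which is the comparison underlying the reported upstream-intron polarity.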

Within the human genome, a similar correlation between the exon/intron architecture and the probability of exon skipping is observed; however, the ~5-fold maximal variance calculated is significantly lower than that observed for Drosophila. As for Drosophila, the length of the upstream intron is more important in determining the frequency of alternative splicing. In the case of alternative 5' or 3' splice-site usage, the opposite distribution of alternative splicing is seen in the human genome. The activation of alternative splice sites is less likely if the flanking introns are long. It is concluded that exon/intron architecture influences the frequency and type of alternative splicing that an exon may undergo in the Drosophila and human genomes (Fox-Walsh, 2005).

These experiments support the existence of two different mechanisms for splice-site recognition: recognition across the intron and recognition across the exon. Splice-site recognition across the intron ceases when the intron size exceeds a threshold length of ~200 nt. Importantly, splice-site recognition across the intron is more efficient and increases the inclusion of exons with weak splice sites. These results demonstrate that the distance between splice sites affects efficient spliceosomal assembly. Presumably, the pairing of cross-exon-defined splice sites requires the interaction between two sets of pre-spliceosomes across an intron of variable length. In contrast, splice-site recognition across the intron already identifies the splice sites that will be paired. It is also possible that the kinetics of splice-site pairing are slowed because longer introns associate with an increased number of hnRNP proteins. HnRNP proteins coat nascent pre-mRNAs and are thought to interfere with the splicing reaction. Therefore, larger introns may reduce splicing by decreasing the relative concentration of splicing components through competition with hnRNPs (Fox-Walsh, 2005).

Additive kinetics of splice-site activation demonstrate that splice-site recognition across the intron is achieved through the recruitment of a multicomponent complex that contains components of the splicing machinery required for 5' and 3' splice-site definition. Interestingly, the activation of a single ESE results in a significant increase in splicing activity, suggesting that ESEs influence splice-site activation of adjacent exons. As anticipated from ESE distance/activity correlations, this effect depends on intron length. Given the unique combination of splice sites and cis-acting elements, it is possible that the precise transition from splice-site recognition across the intron to splice-site recognition across the exon may vary for different substrates. The presence of strong splice sites and enhancers or silencers could modulate the cross-intron recognition by increasing or decreasing the strength of interaction between spliceosomal components and the pre-mRNA (Fox-Walsh, 2005).

The observation that increasing exon length decreases exon inclusion suggests that similar distance limitations exist for splice-site recognition across the exon. Approximately 80% of human exons are <200 bp in length, the average being 170 bp. Importantly, exon length is tightly distributed when compared with intron length. These results demonstrate that maintaining exon size in the human genome is more important to the architecture and evolution of a gene than is maintaining intron size. In contrast to the human genome, exon size varies much more than intron size in yeast. The maximum intron length of 182 nt lies well within the size limitations of splice-site recognition across the intron. Taken together, these considerations support the notion that the majority of splice sites in higher eukaryotes are recognized across the exon, whereas lower eukaryotes employ splice-site recognition across the intron (Fox-Walsh, 2005).

It is well established that several types of exon and intron elements influence splice-site choice. The most prominent include the exon/intron junction signals and splicing enhancers and silencers. The results show that the exon/intron architecture is an additional parameter that affects the efficiency of splice-site recognition and alternative pre-mRNA splicing. When compared in otherwise isogenic test substrates, splice-site recognition across the intron rescues the inclusion of a weak internal exon by >10-fold. Even though the computational analysis ignores the contributions made by variable splice sites, enhancers, and silencers, a striking increase in the probability of alternative splicing is observed for Drosophila exons whose splice sites are recognized across the exon. Thus, the exon/intron architecture in Drosophila is a major determinant in governing the probability of alternative splicing. Within the human genome, a qualitatively similar trend was observed for exon-skipping events but with a reduced magnitude. One major difference between the Drosophila and human gene architecture is intron length. Human genes are dominated by long introns (87% of introns are >250 nt), whereas short introns are much more common in Drosophila (66% are <250 nt). One possible explanation for the small intron size in Drosophila could be the pressure to maintain a constrained genome size in these fast-replicating organisms (Fox-Walsh, 2005).

Alternative splicing is extensive in both species, supporting the argument that both species benefit from expanded proteomes generated from alternative splicing. However, genome analysis suggests that there are significant differences in the weight of the mechanisms by which alternative splicing can be induced. In Drosophila, intron length is a major determinant in promoting alternative splicing patterns. In the human, additional mechanisms of controlling alternative splicing may have gained more influence on intron expansion to maintain balanced levels of alternative splicing (Fox-Walsh, 2005).

U7 snRNP is recruited to histone pre-mRNA in a FLASH-dependent manner by two separate regions of the Stem-Loop Binding Protein

Cleavage of histone pre-mRNAs at the 3' end requires Stem-Loop Binding Protein (SLBP) and U7 snRNP that consists of U7 snRNA and a unique Sm ring containing two U7-specific proteins: Lsm10 and Lsm11. Lsm11 interacts with FLASH and together they bring a subset of polyadenylation factors to U7 snRNP, including the CPSF73 endonuclease that cleaves histone pre-mRNA. SLBP binds to a conserved stem-loop structure upstream of the cleavage site and acts by promoting an interaction between the U7 snRNP and a sequence element located downstream of the cleavage site. This study shows that both human and Drosophila SLBPs stabilize U7 snRNP on histone pre-mRNA via two regions that are not directly involved in recognizing the stem-loop structure: helix B of the RNA Binding Domain and the C-terminal region that follows the RNA Binding Domain. Stabilization of U7 snRNP binding to histone pre-mRNA by SLBP requires FLASH but not the polyadenylation factors. Thus, FLASH plays two roles in 3' end processing of histone pre-mRNAs: it interacts with Lsm11 to form a docking platform for the polyadenylation factors and it co-operates with SLBP to recruit U7 snRNP to histone pre-mRNA (Skrajna, 2017).

Although 20 years have passed since the discovery of SLBP as a factor that binds the conserved stem-loop structure in histone pre-mRNA, its precise role in processing and its interactions with the rest of the processing machinery are incompletely understood. Initial studies in human nuclear extracts demonstrated that human SLBP promotes binding of U7 snRNP to histone pre-mRNA and that the in vitro requirement for SLBP could be bypassed by increasing the extent of complementarity between the 5' end of the U7 snRNA and the histone downstream element (HDE). In contrast, Drosophila SLBP is essential for cleavage of all five Drosophila histone pre-mRNAs in vitro, but whether it acts in the same manner as mammalian SLBP and/or has other functions in processing has not been unambiguously determined (Skrajna, 2017).

Determining the role of Drosophila SLBP in processing proved challenging, in part due to the rapid cleavage of histone pre-mRNAs and hence disruption of the processing complexes containing U7 snRNP during short incubations in Drosophila nuclear extracts. This study used a modified approach for the assembly and purification of Drosophila processing complexes and clearly demonstrates that Drosophila SLBP functionally resembles mammalian SLBP and acts by stabilizing the interaction between the U7 snRNP and histone pre-mRNA. By using a substrate containing biotin at the 3' end, it was shown that the U7 snRNP becomes partially destabilized on the HDE after cleavage, when SLBP is no longer part of the complex, further confirming the role of Drosophila SLBP in promoting stable association of U7 snRNP with histone pre-mRNA. Interestingly, the U7 snRNP that remains associated with the HDE following endonucleolytic cleavage contains FLASH and all subunits of the HCC. This result suggests that holo-U7 snRNP does not undergo major remodeling during the course of the processing reaction (Skrajna, 2017).

Consistent with recent studies, a detectable amount of Drosophila U7 snRNP binds to histone pre-mRNA in the absence of SLBP, possibly as a result of base-pairing between the 5' end of the U7 snRNA and the HDE. The bound U7 snRNP contains all the key subunits of the HCC, including the CPSF73 endonuclease, but remains functionally inert. This is in sharp contrast to mammalian holo-U7 snRNP, which when bound to the HDE can result in cleavage of histone pre-mRNA even in the absence of SLBP. The reason for this difference between the two systems is unknown (Skrajna, 2017).

Drosophila and human SLBP use the same regions to recruit U7 snRNP to histone pre-mRNA

The results indicate that two regions in mammalian and Drosophila SLBPs are critical for the recruitment of the U7 snRNP to histone pre-mRNA: parts of the RBD that do not directly contact the SL RNA and 15-20 amino acids of the C-terminal region located immediately downstream from the RBD. SLBP mutants altered in these regions retain the ability to bind the SL RNA but are partially or completely impaired in supporting cleavage of histone pre-mRNAs. This study shows that these mutants are also impaired in recruiting U7 snRNP to histone pre-mRNA, providing a likely molecular basis for their reduced activity in processing (Skrajna, 2017).

Within the RBD, the most critical role is played by helix B, including the highly conserved D(E)/R motif. Mutating this motif itself is sufficient to strongly reduce processing activity of human SLBP (Dominski, 2001) and the recruitment of U7 snRNP by Drosophila SLBP. An important role in recruiting U7 snRNP may also be played by other amino acids of the RBD, including evolutionarily variable residues in the loop that connects helices B and C. Despite overall conservation, human and Drosophila RBDs are not functionally interchangeable, yielding chimeric proteins that are inactive in both the U7 snRNP recruitment and cleavage of histone pre-mRNAs. It is possible that these variable amino acids are largely responsible for the observed incompatibility between the two RBDs (Skrajna, 2017).

In Drosophila SLBP, the C-terminal region consists of only 17 amino acids and is highly acidic, containing four stoichiometrically phosphorylated serines alternating with four aspartic acids. In addition to these four SD motifs, the C-terminal region contains one TD motif, and the current study indicates that the threonine residue in this motif may be phosphorylated sub-stoichiometrically. The high density of negative charge in this region is critical for the activity of Drosophila SLBP in promoting stable recruitment of the U7 snRNP to histone pre-mRNA and in supporting the cleavage reaction. Interestingly, in the absence of SL RNA the acidic C terminus of Drosophila SLBP associates with helices A and C, bringing them together and preparing for maximum strength interaction with the SL RNA (Zhang, 2014). Upon binding to the RNA target, the phosphorylated C-terminal tail is repelled from the RBD (Zhang, 2014) and may become available for the independent function in the recruitment of the U7 snRNP (Skrajna, 2017).

The C-terminal region of human SLBP lacks the repeated SD motif present in the Drosophila SLBP. Interestingly, six out of 20 residues that immediately follow the RBD in human SLBP are acidic, suggesting that the overall negative charge of this segment may also be important for activity of this protein (Zhang, 2014). However, the C-terminal regions of human and Drosophila SLBP are not functionally interchangeable, indicating that they co-evolved with other component(s) of their respective processing machineries, resembling the divergent evolution of the two RBDs (Skrajna, 2017).

SLBP tightly bound to the upstream stem-loop promotes stable recruitment of U7 snRNP to histone pre-mRNA, likely by directly or indirectly interacting with a subunit(s) of the U7 snRNP. In both Drosophila and mouse nuclear extracts, SLBP is active in recruiting U7 snRNP that lacks the HCC. Thus, SLBP is unlikely to interact with any of the polyadenylation factors of the holo-U7 snRNP. In contrast, removal of FLASH from Drosophila U7 snRNP by RNAi abolishes the ability of SLBP to recruit U7 snRNP to histone pre-mRNA, and this ability can be restored by addition of the N-terminal fragment of Drosophila FLASH. Mouse nuclear extracts contain primarily core U7 snRNP (FLASH is limiting), and addition of N-terminal fragments of human FLASH to a mammalian extract stimulates the activity of SLBP in recruiting the U7 snRNP to histone pre-mRNA (Skrajna, 2017).

The N-terminal region of human and Drosophila FLASH was initially characterized as a region that interacts with Lsm11, hence forming a docking platform for the HCC. This study identifies a new role for the N-terminal region of FLASH in processing by showing that it is also essential for the SLBP-mediated 'loading' of the U7 snRNP on histone pre-mRNA. This dual FLASH function may be important for the fidelity of 3' end processing of histone pre-mRNAs in vivo. Clearly, the FLASH-bound U7 snRNP that readily associates with the HCC, forming holo-U7 snRNP, has an advantage in binding to histone pre-mRNA over functionally incompetent core U7 snRNP, likely preventing misprocessing at downstream cryptic sites by cleavage and polyadenylation (Skrajna, 2017).

Previous studies on processing in human cells suggested that the recruitment of U7 snRNP by SLBP might be mediated by ZFP100, a zinc finger protein of 100 kDa that binds both human SLBP and Lsm11. Multiple attempts to detect ZFP100 in the mouse processing complexes by silver staining and mass spectrometry failed, although CPSF100, which has the same molecular weight, was readily identified. This argues against the involvement of ZFP100 in the SLBP-mediated recruitment of the U7 snRNP. ZFP100 is a component of histone locus bodies and may play a role in vivo in coupling transcription of histone genes with 3' end processing of the nascent histone pre-mRNAs (Skrajna, 2017).

It is unclear whether SLBP is completely unable to act on the core U7 snRNP and whether its interaction with FLASH is direct or indirect. A model is favored in which SLBP interacts with the FLASH/Lsm11 complex rather than with FLASH alone. Alternatively, FLASH, upon binding Lsm11, induces a structural shift in part of Lsm11, making it competent to interact with SLBP. The interaction may also include Lsm10. Clearly, further studies are required to identify interactions that span across the cleavage site and bring SLBP and U7 snRNP together into a tight processing complex (Skrajna, 2017).

This study has developed a method to identify regions in human and Drosophila SLBP that are essential for the recruitment of U7 snRNP to histone pre-mRNA. In both proteins, this activity is mediated by helix B and likely other amino acids of the RBD that do not directly contact the SL RNA, and by ~20 C-terminal amino acids that follow the RBD. The activity of SLBP in promoting stable recruitment of U7 snRNP to histone pre-mRNA depends on FLASH but not the HCC. Thus, FLASH has two functions in processing: first, it is essential for bringing the HCC to U7 snRNP, and second, it cooperates with SLBP in facilitating the interaction between U7 snRNP and histone pre-mRNA. The fact that Drosophila and human SLBP recruit U7 snRNP to histone pre-mRNA through the same regions, and that FLASH but not the HCC is essential for this activity of SLBP, suggests that both processing machineries utilize a conserved network of interactions spanning the cleavage site (Skrajna, 2017).

Ecd promotes U5 snRNP maturation and Prp8 stability

Pre-mRNA splicing catalyzed by the spliceosome represents a critical step in the regulation of gene expression, contributing to transcriptome and proteome diversity. The spliceosome consists of five small nuclear ribonucleoprotein particles (snRNPs), the biogenesis of which remains only partially understood. This study defines the evolutionarily conserved protein Ecdysoneless (Ecd) as a critical regulator of U5 snRNP assembly and Prp8 stability. Combining Drosophila genetics with proteomic approaches, this study demonstrates that Ecd is required for the maintenance of adult healthspan and lifespan and identifies the Sm ring protein SmD3 as a novel interaction partner of Ecd. The predominant task of Ecd is to deliver Prp8 to the emerging U5 snRNPs in the cytoplasm. Ecd deficiency, on the other hand, leads to reduced Prp8 protein levels and compromised U5 snRNP biogenesis, causing loss of splicing fidelity and transcriptome integrity. Based on these findings, it is proposed that Ecd chaperones Prp8 to the forming U5 snRNP, allowing completion of the cytoplasmic part of the U5 snRNP biogenesis pathway necessary to meet the cellular demand for functional spliceosomes (Erkelenz, 2021).

The transcriptome-wide landscape and modalities of EJC binding in adult Drosophila

The exon junction complex (EJC) assembles after splicing at specific positions upstream of exon-exon junctions in mRNAs of all higher eukaryotes, affecting major regulatory events. In mammalian cell cytoplasm, the EJC is essential for efficient RNA surveillance, while in Drosophila, the EJC is essential for localization of oskar mRNA. This study has developed a method for isolation of protein complexes and associated RNA targets (ipaRt) to explore the EJC RNA-binding landscape in a transcriptome-wide manner in adult Drosophila. The EJC was found at canonical positions, preferably on mRNAs from genes comprising multiple splice sites and long introns. Moreover, EJC occupancy is highest at junctions adjacent to strong splice sites, CG-rich hexamers, and RNA structures. Highly occupied mRNAs tend to be maternally localized and derive from genes involved in differentiation or development. These modalities, which have not been reported in mammals, specify EJC assembly on a biologically coherent set of transcripts in Drosophila (Obrdlik, 2019).

The exon junction complex (EJC) consists of a heterotetramer core composed of eIF4AIII, Mago, Y14 (Tsunagi), and Barentsz (Btz) and auxiliary factors that form the EJC periphery. The complex assembles on mRNAs during splicing, 20 to 24 nt upstream of exon-exon junctions. EJC assembly is a multi-step process that begins with CWC22-mediated deposition of the DEAD-box helicase eIF4AIII on nascent pre-mRNAs (Alexandrov, 2012; Barbosa, 2012; Steckelberg, 2015) and is followed by recruitment of Mago and Y14, forming a pre-EJC intermediate. The pre-EJC is stably bound to RNA because of the ATPase-inhibiting activity of the (non-RNA-binding) Mago-Y14 heterodimer, which 'locks' the eIF4AIII helicase in its RNA-bound state. Once formed, the pre-EJC is completed by recruitment of Btz, forming mature EJCs (Obrdlik, 2019).

The roles of the EJC in post-transcriptional control of gene expression are manifold. In the nucleus, EJC subunits have a role in splicing, mRNA export, and nuclear retention of intron-containing RNAs. In the cytoplasm, the EJC is reported to play a role in translation, nonsense-mediated decay (NMD), and RNA localization. Although most EJC functions appear conserved, in Drosophila the EJC is not crucial for NMD, but it is essential for oskar mRNA localization within the developing oocyte. To better understand the engagement of the EJC in the fly, a strategy has been developed to stabilize mRNA binding proteins (mRBPs) associated with their RNA templates within multi-protein messenger ribonucleoprotein (mRNP) assemblies, and the EJC mRNA interactome was defined in adult Drosophila melanogaster. Through the use of the crosslinking agent dithiobis(succinimidyl propionate) (DSP), the method captures stable and transient protein interactions in close proximity and allows definition of the binding sites of specific protein (holo-)complexes associated with their RNA templates (isolation of protein complexes and associated RNA targets [ipaRt]). This analysis of EJC-protected sites defined by ipaRt reveals that in Drosophila, EJC binding occurs at canonical deposition sites, with a median coordinate ~22 nt upstream of exon-exon junctions. Although in mammals EJC-mediated protection outside canonical sites was reported, this study finds that in Drosophila the degree of non-canonical EJC-mediated RNA protection is minimal. In Drosophila, RNA polymerase II transcripts protected primarily by the EJC derive from genes involved in differentiation or development, while mRNAs protected primarily by mRBPs derive from genes with homeostatic functions. This analysis suggests that the EJC's bias for transcripts in Drosophila is a consequence of several modalities in the genes' architecture, particularly splice site number and intron length. Moreover, EJC binding is enhanced by adjacent RNA secondary structures and CUG-rich hexamers located 3' to the EJC binding site. These modalities were not identified in previous studies of mammalian EJC binding, reflecting either greater specificity of this method for fully assembled EJCs or differences in EJC binding between flies and humans. This study provides a comprehensive transcriptome-wide view of EJC-RNA interactions in a whole organism and unravels RNA modalities that contribute to the unforeseen biological coherence of the bound transcripts (Obrdlik, 2019).

This study has profiled the landscape of EJC binding across the transcriptome of a whole animal, Drosophila melanogaster, and determined the parameters that influence the distribution of the complex on RNAs in the organism. Previous knowledge of EJC-RNA interactions was based on UV-crosslinking experiments in specific cell types grown as homogeneous cultures for the individual studies. Although UV crosslinking remains a method of choice for identification of protein binding sites on nucleic acids, because of the inefficient penetration of UV light into tissues and organisms, the method is most useful when applied to cells in culture. In contrast, this analysis of EJC distribution in the tissues of whole Drosophila flies was made possible by ipaRt, which uses the crosslinking agent DSP to freeze protein-protein interactions within otherwise dynamic RNP complexes, such as the EJC (Obrdlik, 2019).

This study shows that DSP-mediated covalent bond formation between the RNA helicase eIF4AIII and the Mago-Y14 heterodimer preserves EJCs in their 'locked' state on mRNAs and that efficient recovery of the bound RNAs does not require their crosslinking to eIF4AIII using UV light. The 'ipaRt' approach, like CLIP and iCLIP, enables highly stringent washing of the samples. In support of the robustness and reliability of the DSP-based assays, this study demonstrated high reproducibility not only among technical but also biological replicates of EJC ipaRt, as well as mRBP footprinting sequencing results (Obrdlik, 2019).

Furthermore, ipaRt enables the use of non-RNA-binding subunits of the EJC, such as Mago, as immunoprecipitation baits. This is highly relevant in the context of the EJC, as it has been shown that its RNA-binding subunit, the RNA helicase eIF4AIII, may have other, EJC-independent functions in the cell. ipaRt affords the option of using Mago as an EJC bait, and indeed this is a main reason for the high-quality definition of the EJC binding landscape in the fly cytoplasm that was achieved. The protection site reads obtained from EJC ipaRt map almost exclusively to canonical EJC deposition sites with a median protection ~22 nt from the upstream exon's 3' end. In contrast to mammalian EJC CLIP and RIP studies, in which eIF4AIII served as an immunoprecipitation bait, EJC ipaRt reads mapping to regions distant from canonical deposition sites are of low abundance and sequencing coverage. Although this discrepancy could reflect differences in EJC engagement in humans and Drosophila, it more likely reflects the choice of bait or the cell compartment in which the analysis was executed. Indeed, a recent study in human cells revealed that when the cytoplasmic EJC component Btz was chosen as the bait rather than eIF4AIII, the proportion of non-canonical EJC deposition sites was negligible (Obrdlik, 2019).

Finally, in ipaRt the DSP crosslinker is applied ex vivo during tissue disruption and does not require inhibition of translation in vivo. Therefore ipaRt is considered a method of choice for functional investigations of protein-RNA complexes in fully developed organisms and tissues (Obrdlik, 2019).

Through this analysis, factors were defined that contribute to or inhibit EJC assembly on mRNAs and at individual exon-exon junctions in Drosophila. From this it is deduced that the landscape of EJC binding to RNAs is sculpted through regulation of EJC assembly at two levels in the fly (Obrdlik, 2019).

At the upstream regulatory level, the degree to which EJCs are assembled on an mRNA is dictated by the complexity of the gene's architecture: mRNAs produced from genes of simple architecture are marked by fewer EJCs, while mRNAs from genes of complex architecture, comprising multiple splice sites and long introns, are EJC bound to a higher degree. Given that EJCs assemble on mRNAs concomitantly with splicing, it is not surprising that mRNAs of genes containing a greater number of introns are more likely to be EJC bound. However, the finding that the enhancing effect on EJC binding provoked by large introns is not restricted to flanking junctions but occurs at junctions mRNA-wide is unexpected. Loss-of-function experiments indicate that the EJC participates in exon definition during splicing of long intron-containing genes in Drosophila, particularly in definition of exons proximal to the long introns. The data exclude any significant bias toward EJC assembly in proximity to long-intron splice junctions. Instead they reveal a general enhancement of EJC binding at exon-exon junctions throughout transcripts of long-intron genes. Therefore, it is concluded that stable binding of EJCs within mRNAs of long-intron genes is not the result of EJC engagement in exon definition. Instead, it is proposed that the high degree of EJC binding to long-intron transcripts derives from the increased number and resting time of co-transcriptionally assembled spliceosomes on the nascent transcripts, which would increase the probability of CWC22-dependent eIF4AIII recruitment to pre-mRNAs during splicing (Obrdlik, 2019).

At the downstream regulatory level, after EJC assembly rates at transcripts are defined, deposition of EJCs along mRNA exon-exon junctions is modulated by the structural and sequence context of the splice sites. dsRNA stem structures in exon-exon junctions of Drosophila mRNAs either antagonize EJC assembly when present within canonical EJC deposition sites or enhance EJC assembly when located in the vicinity of the EJC deposition site. Absence of dsRNA within the EJC binding moiety is in agreement with reported preference of EJCs for ssRNA. It remains to be elucidated how and why EJC binding is positively affected when RNA stem structures are found in its direct proximity on the bound template (Obrdlik, 2019).

Although it is likely that the structural context of exon-exon junctions in Drosophila directly influences the degree of EJC assembly, sequence composition-derived effects on EJC binding to mRNA are a consequence of the assigned roles of these sequences during pre-mRNA splicing. This study has demonstrated that exon-exon junctions with strong 5' and strong 3' splice sites (SSs) are biased toward junctions with enhanced EJC binding. For the regulation of weak 5' and 3' SSs, which commonly occur at alternatively spliced junctions, cis-acting splicing regulatory elements (SREs) were shown to be of importance. In light of the negative impact of alternative splicing at the level of EJC mRNA binding, it is not surprising that conventional ESEs and ESSs hardly affect EJC binding at the level of individual exon-exon junctions. Whether the position-dependent bias mediated by the UUU-triplet- and CUG-triplet-containing hexamers toward inhibited or enhanced EJC binding that this study has discovered in the Drosophila dataset is due to a direct or indirect influence of these hexamers on splicing remains to be addressed. UUU-triplet-containing hexamers, which are strongly biased against EJC binding, could potentially function as yet undefined 5'ESS in Drosophila. Interestingly, CUG-triplet-containing hexamers, which are strongly biased toward enhanced EJC binding, share sequence similarity with a previously predicted CUG containing 5'ESE of short intron splice sites. It appears likely that the CUG-triplet and UUU-triplet hexamers exert their effect on EJC binding as a yet undefined class of SREs (Obrdlik, 2019).
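To make the notion of triplet-containing hexamers concrete, the following Python sketch tallies every hexamer in a sequence window that contains a given triplet such as CUG or UUU. It is a minimal illustration, not the enrichment analysis used in the study: the function name is hypothetical, DNA input is simply converted to RNA letters, and no background correction or positional binning is attempted.

```python
from collections import Counter


def hexamers_with_triplet(seq, triplet, k=6):
    """Count the k-mers (default: hexamers) in `seq` that contain `triplet`.

    Hypothetical helper for illustration; T is converted to U so that DNA
    and RNA input are treated alike.
    """
    seq = seq.upper().replace("T", "U")
    triplet = triplet.upper().replace("T", "U")
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if triplet in kmer:
            counts[kmer] += 1
    return counts


if __name__ == "__main__":
    # Toy window downstream of a junction; real input would be exon sequence.
    print(hexamers_with_triplet("AACUGAAA", "CUG"))
```

Comparing such counts between junctions with enhanced and inhibited EJC binding, separately for each position relative to the deposition site, would be one way to look for the position-dependent bias described above.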

In agreement with reports in mammals, the extent of EJC occupancy varies between mRNAs and exon-exon junctions also in Drosophila. The splice site score next to a junction correlates with increased EJC deposition in the fly, and this relationship between splicing efficiency and EJC deposition has also been proposed in mammalian studies. Analysis of published mammalian Btz iCLIP data revealed several modalities that correlate with the increased binding landscape of the EJC on mRNAs in both mammals and Drosophila, including the large number of introns, high transcript abundance, and sequence context of individual exon-exon junctions. Interestingly, the presence of long introns has a slightly negative effect and the amount of alternative splicing a slightly positive effect on EJC occupancy in mammals; the latter agrees with previous observations. Studies in cultured mammalian cells have reported that EJC-enriched junctions contain a relatively high proportion of 'non-canonical' protection sites, which were enriched for RBP consensus sequences of the SR protein family. Analysis of mammalian Btz iCLIP data confirms that presence of ESEs in upstream exons and 5'ISEs in introns correlates with enhanced EJC binding. Moreover, a group of junctions have been identified in mammals containing AGAA hexamers that are biased for enhanced EJC binding, but their effects are not especially strong near the canonical EJC deposition site. These hexamers match the AGAA-encompassing consensus sequence of the mammalian SR protein SRSF10, known to function as splicing enhancers, and have been found previously in EJC bound exon-exon junctions. Not only do the in silico results agree with these reports and support the proposed cooperative binding of EJC with SR proteins, they also partially explain the EJC's preference in mammals for alternatively spliced mRNAs (Obrdlik, 2019).

One observation deriving from this analysis of published mammalian Btz iCLIP datasets is surprising. Although junctions in Drosophila are observed to be enhanced or inhibited in EJC binding by specific base-pairing probability (bpp) profiles, thus by specific RNA folding categories, it was not possible to detect any striking difference between overall bpp profiles of exon-exon junctions with enhanced or inhibited EJC binding in mammals. Indeed, the only aspect of RNA structure shared by mammals and Drosophila is the negative effect of dsRNA when directly overlapping with the canonical EJC deposition site. In Drosophila, however, the presence of dsRNA close to canonical deposition sites enhances EJC binding, an effect that is not observed in mammalian cells (Obrdlik, 2019).

The findings regarding the differences in the RNA modalities enriched at highly occupied mammalian and Drosophila EJC sites provide insight into the expansion of functions of the EJC during eukaryotic evolution. Spliceosome-catalyzed splicing reactions are bidirectional, and efficient formation of exon-exon junctions during RNA maturation is achieved by Prp22-induced release of spliceosomes from mRNAs. The EJC is absent in organisms with low rates of RNA splicing, such as Saccharomyces cerevisiae, but present in organisms with high splicing rates, such as Schizosaccharomyces pombe. This suggests that with the increased demand for splicing accuracy in higher eukaryotes, the EJC evolved to function as an exon-exon junction 'lock' hindering spliceosome reassembly at spliced exon-exon junctions. Because EJC binding in the fly is enhanced at strong splice sites, but is not affected by splicing enhancer elements, and is not biased toward alternatively spliced mRNAs, it is proposed that the EJC preserved its primary function as such a lock in Drosophila. Two recent studies provide evidence that also in mammals bound EJCs hinder spliceosome assembly, suppressing recursive splicing (RS) of RS exons. The previously reported importance of the EJC for splicing fidelity and the current observations on the mode of EJC binding to transcripts in the fly, which reveal its independence from splicing regulatory elements, indeed support the view that the EJC's most conserved function is to ensure splicing irreversibility (Obrdlik, 2019).

The EJC further evolved to become a central component of the NMD pathway in mammals, in which more than 95% of all genes are alternatively spliced. This may explain why EJCs in mammals are enriched on alternatively spliced transcripts. In Drosophila, in which only 30% of all genes appear to be alternatively spliced, the EJC is not a component of the main NMD pathway. It is proposed that although the EJC-NMD pathway evolved before segregation of the proto- and deuterostome clades, it gained importance by complementing the faux 3'UTR-NMD pathway during the evolution of vertebrates, for which RNA surveillance and spatiotemporal control of gene expression are essential (Obrdlik, 2019).

Similarly, recruitment of the EJC and interacting proteins upon splicing to facilitate mRNA localization so far seems exclusive to Drosophila. Two Drosophila-specific features that modulate EJC binding, namely, the presence of a large intron within a gene and secondary structure near the junction, are also predictive of mRNA localization. Although the precise strength of association between these features and mRNA localization remains to be verified with larger and more quantitative datasets, previous studies with the SOLE in oskar RNA have shown that RNA structure and EJC binding are indeed crucial for oskar mRNA localization (Obrdlik, 2019).

The exon junction complex regulates the splicing of cell polarity gene dlg1 to control Wingless signaling in development

Wingless (Wg)/Wnt signaling is conserved in all metazoan animals and plays critical roles in development. The Wg/Wnt morphogen reception is essential for signal activation, whose activity is mediated through the receptor complex and a scaffold protein Dishevelled (Dsh). This study reports that the exon junction complex (EJC) activity is indispensable for Wg signaling by maintaining an appropriate level of Dsh protein for Wg ligand reception in Drosophila. Transcriptome analyses in Drosophila wing imaginal discs indicate that the EJC controls the splicing of the cell polarity gene discs large 1 (dlg1), whose encoded protein directly interacts with Dsh. Genetic and biochemical experiments demonstrate that Dlg1 protein acts independently from its role in cell polarity to protect Dsh protein from lysosomal degradation. More importantly, human orthologous Dlg protein is sufficient to promote Dvl protein stabilization and Wnt signaling activity, thus revealing a conserved regulatory mechanism of Wg/Wnt signaling by Dlg and EJC (Liu, 2016).

The EJC is known to act in several aspects of posttranscriptional regulation, including mRNA localization, translation and degradation. After transcription, the pre-mRNA associated subunit eIF4AIII is loaded to nascent transcripts about 20-24 bases upstream of each exon junction, resulting in binding of Mago nashi (Mago)/Magoh and Tsunagi (Tsu)/Y14 proteins to form the pre-EJC core complex. The pre-EJC then recruits other proteins including Barentsz (Btz) to facilitate its diverse functions. In vertebrates, the EJC is known to ensure translation efficiency as well as to activate nonsense-mediated mRNA decay (NMD). In Drosophila, however, the EJC does not contribute to NMD. It is instead required for oskar mRNA localization to the posterior pole of the oocyte. Very recently, the pre-EJC has been shown to play an important role in alternative splicing of mRNA in Drosophila. Reduced EJC expression results in two forms of aberrant splicing. One is exon skipping, which occurs in MAPK and transcripts that contain long introns or are located at heterochromatin (Ashton-Beaucage, 2010; Roignant, 2010). The other is intron retention in piwi transcripts. Furthermore, transcriptome analyses in cultured cells indicate that the role of the EJC in alternative splicing is also conserved in vertebrates (Liu, 2016).

This study has utilized the developing Drosophila wing as an in vivo model system to investigate a new mode of regulation of Wg signaling. The pre-EJC was found to positively regulate Wg signaling through its effect on facilitating Wg morphogen reception. Further studies reveal that the basolateral cell polarity gene discs large 1 (dlg1) is an in vivo target of the pre-EJC in Wg signaling. Dlg1 acts independently from its role in cell polarity to stabilize Dsh protein, thus allowing Wg protein internalization required for signaling activation. Furthermore, it was demonstrated that human Dlg2 exhibits a similar protective role on Dvl proteins to enhance Wnt signaling in cultured human cells. Taken together, this study unveils a conserved regulatory mechanism of the EJC and Dlg in Wg/Wnt signaling (Liu, 2016).

In summary, this study uncovers a specific role of the RNA binding protein complex EJC in Drosophila wing morphogenesis. Genetic and biochemical analyses demonstrate that the pre-EJC is necessary for Wg morphogen reception to activate signal transduction. The identification of the cell polarity determinant dlg1 as one of the pre-EJC targets provides a mechanistic basis for the pre-EJC regulation of Wg signaling. Dlg1 controls the stability of the scaffold protein Dsh, which is the hub of the Wg signaling cascade. Importantly, this mode of regulation of Dvl by Dlg is conserved from flies to vertebrates (Liu, 2016).

The EJC as well as other RNA binding protein complexes are thought to function in a pleiotropic manner. However, the current data together with several recent studies argue that RNA regulatory machineries can act specifically on developmental signaling for pattern formation and organogenesis. It has been increasingly recognized that the production, transport or location of mRNA is subject to precise regulation in Wg/Wnt signaling. For example, apical localization of wg RNA is essential for signal activation in epithelial cells. The specific role of RNA machineries in cell signaling is not limited to Wg/Wnt signaling. It has been reported that the RNA-binding protein Quaking specifically binds to the 3'UTR of transcription factor gli2a mRNA to modulate Hedgehog signaling in zebrafish muscle development. RNA binding proteins RBM5/6 and 10 could differentially control alternative splicing of a negative Notch regulator gene NUMB, thus antagonistically regulating the Notch signaling activity for cancer cell proliferation. Therefore, RNA regulatory machineries generally believed to be pleiotropic emerge as important regulatory means to specifically control cell signaling and related developmental processes (Liu, 2016).

The most studied function of the EJC in development is to localize oskar mRNA to the posterior pole of the oocyte for oocyte polarity establishment and germ cell formation in Drosophila. Further study suggests that the proper oskar RNA localization relies on its mRNA splicing. In light of the current study of the EJC activity on dlg1 mRNA as well as the roles of EJC on mapk and piwi splicing, it is suspected that EJC might regulate oskar mRNA splicing to mediate its mRNA localization. RNA-seq analyses identified several hundreds of candidate mRNAs whose expression may be directly or indirectly subjected to EJC regulation. Apart from defects in Wg and MAPK signaling, however, altered wing patterning associated with other developmental signaling systems was not seen in EJC defective flies, arguing that EJC may primarily regulate Wg and MAPK signaling in patterning the developing wing (Liu, 2016).

Wg/Wnt signaling plays a fundamental role in development and tissue homeostasis in both flies and vertebrates. Its activation and maintenance rely on appropriate activity of the ternary receptor complex including Fz family proteins. In Drosophila, polarized localization of Fz and Fz2 proteins is essential for activation of non-canonical and canonical Wg signaling, respectively. Dsh, which acts as a hub mediating both canonical and non-canonical Wg signaling, however, is found at both the apical cell boundary and in the basal side of the cytoplasm. Thus, the polarized activity of Dsh must require distinct regulatory mechanisms at different sub-membrane compartments. The results provide the in vivo evidence suggesting that the basolateral polarity determinant Dlg1 may play a dominant role to control the Dsh abundance/activity in canonical Wg signaling (Liu, 2016).

Altered Dvl production or activity has been linked with several forms of cancer. The stability of Dvl proteins can be controlled through regulated protein degradation both in vertebrates and in Drosophila as reported in this study. In HEK293T cells, Dapper1 induces whilst Myc-interacting zinc-finger protein 1 (MIZ1) antagonizes autophagic degradation of Dvl2 in the lysosome. It is also reported that a tumor suppressor, the CYLD deubiquitinase, inhibits the ubiquitination of Dvl. As Dlg1 prevents Dsh from degradation in Drosophila, it is important to investigate if Dlg1 participates in a posttranslational regulatory network of Dvl to integrate endocytosis and autophagy. Furthermore, upregulation of dvl2 and dlg2 expression has been found in various forms of cancer as shown in the COSMIC database. The study of the interaction between Dlg1 and Dsh may aid the development of novel approaches to prevent or treat relevant diseases (Liu, 2016).

Dlg1 acts together with L(2)gl to form a basolateral complex in polarized epithelium. Dsh is known to interact with L(2)gl. On one hand, Dsh activity is required for correct localization of L(2)gl to establish apical-basal polarity in Xenopus ectoderm and Drosophila follicular epithelium. On the other hand, L(2)gl can regulate Dsh to maintain planar organization of the embryonic epidermis in Drosophila. Despite the complex interaction between L(2)gl and Dsh, not much is known about mutual regulation between Dlg1 and Dsh. A recent report suggests that Dsh binds to Dlg1 to activate Guk Holder-dependent spindle positioning in Drosophila. The current results unveil another side of the relationship in which Dlg1 controls the turnover of Dsh to ensure developmental signal propagation. Apart from its apical localization at the cell boundary, Dsh is also found in the basal side of the cytoplasm. It is likely that the interactions among Dsh, Dlg1 and L(2)gl may be dependent on their localization, and Dsh may serve as a bridge to connect cell signaling and polarity (Liu, 2016).

Developmental signaling and cell polarity intertwine to control a diverse array of cellular events. It is well known that Wg/Wnt signaling controls cell polarity in distinct manners. Non-canonical signaling acts through cytoskeletal regulators to establish planar cell polarity. Canonical signaling may also directly affect apical-basal cell polarity. On the other hand, disruption of epithelial cell polarity has a profound impact on protein endocytosis and recycling, both of which are essential regulatory steps for signal activation and maintenance. The current results add another layer of complexity by which polarity determinants could contribute to cell signaling independent of their conventional roles in polarity establishment and maintenance. Interestingly, this mode of regulation is also observed for other signaling processes. Loss of Dlg5 impairs Sonic hedgehog-induced Gli2 accumulation at the ciliary tip in mouse fibroblast cells, an effect that may not rely on cell polarity regulation. Similarly, L(2)gl regulates Notch signaling via endocytosis, independent of its role in cell polarity. It is believed that other cell polarity determinants may similarly participate in polarity-independent processes; however, the exact mechanism of how they cooperate to modulate developmental signaling awaits further investigation (Liu, 2016).

SR proteins control a complex network of RNA-processing events

SR proteins are a well-conserved class of RNA-binding proteins that are essential for regulation of splice-site selection, and have also been implicated as key regulators during other stages of RNA metabolism. For many SR proteins, the complexity of the RNA targets and specificity of RNA-binding location are poorly understood. It is also unclear if general rules governing SR protein alternative pre-mRNA splicing (AS) regulation, uncovered for individual SR proteins on a few model genes, apply to the activity of all SR proteins on endogenous targets. Using RNA-seq, this study characterized the global AS regulation of the eight Drosophila SR protein family members. It was found that a majority of AS events are regulated by multiple SR proteins, and that all SR proteins can promote exon inclusion as well as exon skipping. Most coregulated targets exhibit cooperative regulation, but some AS events are antagonistically regulated. Additionally, it was found that SR protein levels can affect alternative promoter choices and polyadenylation site selection, as well as overall transcript levels. Cross-linking and immunoprecipitation coupled with high-throughput sequencing (iCLIP-seq) reveals that SR proteins bind a distinct and functionally diverse class of RNAs, which includes several classes of noncoding RNAs, uncovering possible novel functions of the SR protein family. Finally, it was found that SR proteins exhibit positional RNA binding around regulated AS events. Therefore, regulation of AS by the SR proteins is the result of combinatorial regulation by multiple SR protein family members on most endogenous targets, and SR proteins have a broader role in integrating multiple layers of gene expression regulation (Bradley, 2014).

Combinatorial control of Drosophila circular RNA expression by intronic repeats, hnRNPs, and SR proteins

Thousands of eukaryotic protein-coding genes are noncanonically spliced to produce circular RNAs. Bioinformatics has indicated that long introns generally flank exons that circularize in Drosophila, but the underlying mechanisms by which these circular RNAs are generated are largely unknown. This study, using extensive mutagenesis of expression plasmids and RNAi screening, revealed that circularization of the Drosophila laccase2 gene is regulated by both intronic repeats and trans-acting splicing factors. Analogous to what has been observed in humans and mice, base-pairing between highly complementary transposable elements facilitates backsplicing. Long flanking repeats (approximately 400 nucleotides [nt]) promote circularization cotranscriptionally, whereas pre-mRNAs containing minimal repeats (<40 nt) generate circular RNAs predominately after 3' end processing. Unlike the previously characterized Muscleblind (Mbl) circular RNA, which requires the Mbl protein for its biogenesis, it was found that Laccase2 circular RNA levels are not controlled by Mbl or the Laccase2 gene product but rather by multiple hnRNP (heterogeneous nuclear ribonucleoprotein) and SR (serine-arginine) proteins acting in a combinatorial manner. hnRNP and SR proteins also regulate the expression of other Drosophila circular RNAs, including Plexin A (PlexA), suggesting a common strategy for regulating backsplicing. Furthermore, the laccase2 flanking introns support efficient circularization of diverse exons in Drosophila and human cells, providing a new tool for exploring the functional consequences of circular RNA expression across eukaryotes (Kramer, 2015).

It was long assumed that eukaryotic pre-mRNAs are always canonically spliced to generate a linear mRNA that is subsequently translated to produce a protein. However, it is now becoming increasingly clear that many genes can be noncanonically spliced to produce circular RNAs with covalently linked ends. These transcripts are almost exclusively derived from exons, accumulate in the cytoplasm, and are thought to be products of alternative splicing events known as 'backsplicing.' In contrast to canonical splicing, which joins the exons in a linear order (joining exon 1 to exon 2 to exon 3, etc.), backsplicing joins a splice donor to an upstream splice acceptor (e.g., joining the 3' end of exon 2 to the 5' end of exon 2). A handful of RNAs generated in this manner were identified in the 1990s, and recent deep sequencing studies have expanded this observation to thousands of circular RNAs expressed across eukaryotes, including humans, Caenorhabditis elegans, Drosophila (Salzman, 2013; Ashwal-Fluss, 2014; Westholm, 2014), Schizosaccharomyces pombe, and plants. Perhaps surprisingly, for some genes, the abundance of the circular RNA exceeds that of the associated linear mRNA by a factor of 10, suggesting that the major function of some protein-coding genes may be to generate circular RNAs (Kramer, 2015).
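The distinction between canonical splicing and backsplicing drawn above can be illustrated with a short Python sketch. The function names and the plus-strand coordinate convention are assumptions introduced here for illustration; they do not correspond to any pipeline used in the cited studies. The first function classifies a junction by the relative order of donor and acceptor, and the second builds the sequence that spans the junction of a single-exon circle.

```python
def classify_junction(donor, acceptor):
    """Classify a splice junction on a plus-strand transcript.

    donor:    coordinate of the last exonic base before the 5' splice site
    acceptor: coordinate of the first exonic base after the 3' splice site
    A junction whose acceptor lies upstream of its donor is a backsplice.
    (Hypothetical helper; minus-strand genes would need the test reversed.)
    """
    return "linear" if acceptor > donor else "backsplice"


def backsplice_junction_seq(exon_seq, flank=10):
    """Sequence spanning the junction of a single-exon circle:
    the 3' end of the exon joined to the 5' end of the same exon."""
    return exon_seq[-flank:] + exon_seq[:flank]


if __name__ == "__main__":
    # Exon 2 spans 300..500; exon 3 starts at 800 (toy coordinates).
    print(classify_junction(500, 800))   # exon 2 donor to exon 3 acceptor
    print(classify_junction(500, 300))   # exon 2 donor to exon 2 acceptor
    print(backsplice_junction_seq("AAACCCGGGTTT", flank=3))
```

Reads that align across a sequence built by `backsplice_junction_seq` but not across any linear exon-exon junction are the kind of evidence deep sequencing studies use to call circular RNAs, although real callers apply many additional filters.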

Most exons in eukaryotic genomes have splicing signals at both ends and theoretically can circularize. However, only certain exons are observed in circular RNAs, and these backsplicing events often occur in a tissue-specific manner. This suggests that circular RNA biogenesis is tightly regulated. As splicing generally occurs cotranscriptionally, most introns, along with their upstream splice acceptors (which are needed for backsplicing), are rapidly removed. Therefore, for circular RNAs to be produced, canonical splicing likely must occur more slowly around these exons, and/or exon skipping events may be coupled to circular RNA biogenesis. In the latter, the circular RNA is derived from an exon-containing lariat, allowing a pre-mRNA to yield both a linear mRNA and a circular RNA comprised of the skipped exons (Kramer, 2015).

There is little known about the splicing factors that regulate these events. In some cases, the Muscleblind (Mbl) and Quaking proteins appear to facilitate backsplicing by bridging between two introns and causing the splice sites from the intervening exons to be brought into close proximity (Ashwal-Fluss, 2014; Conn, 2015). For example, circular RNA production from the Drosophila mbl gene is triggered when the Mbl splicing factor binds to its own introns (Ashwal-Fluss, 2014). However, in humans, mice, and C. elegans, the predominant determinants of whether a pre-mRNA is subjected to backsplicing are intronic repetitive elements, such as sequences derived from transposons. Almost 90% of human circular RNAs have complementary Alu elements in their flanking introns, and, analogous to the protein-bridging mechanism, base-pairing between complementary sequences allows the intervening splice sites to be brought close together. Interestingly, repeats <40 nucleotides (nt) can drive circular RNA production in human cells, but it is clear that more than simple thermodynamics regulates circularization. For example, base-pairing interactions can be disrupted by ADAR (adenosine deaminase acting on RNA), which converts adenosines in double-stranded regions to inosines. In addition, most mammalian pre-mRNAs contain multiple intronic repeats, allowing distinct circular (or linear) RNAs to be produced depending on which repeats base-pair to one another. Therefore, other factors likely help dictate splicing outcomes by regulating these exon circularization events (Kramer, 2015).
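As a rough illustration of how complementary intronic repeats might be detected, the Python sketch below finds the longest stretch of an upstream intron that could form perfect Watson-Crick pairs with a stretch of the downstream intron. It is deliberately simplified and rests on several assumptions: sequences are given as DNA, only perfect complementarity is scored (no G-U wobble, mismatches or bulges), and the function names are hypothetical. Actual repeat detection would normally rely on alignment or RNA folding tools.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def longest_complementary_stretch(upstream_intron, downstream_intron):
    """Longest segment of the upstream intron whose reverse complement
    occurs in the downstream intron (perfect pairing only).

    Implemented as a longest-common-substring search between the upstream
    intron and the reverse complement of the downstream intron.
    """
    a = upstream_intron.upper()
    b = reverse_complement(downstream_intron)
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1
                if cur[j] > best_len:
                    best_len, best_end = cur[j], i
        prev = cur
    return a[best_end - best_len:best_end]


if __name__ == "__main__":
    up = "TTTTGATTACATTTT"
    down = "CCCCTGTAATCCCCC"
    print(longest_complementary_stretch(up, down))
```

The length of the returned stretch could then be compared with the repeat sizes discussed in the text, keeping in mind that editing by ADAR and competing repeats within the same pre-mRNA are not modeled here.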

Despite key regulatory roles for intronic repeats in multiple eukaryotes, it has been suggested that circular RNA biogenesis in Drosophila melanogaster is not driven by base-pairing interactions (Westholm, 2014). Instead, a positive correlation between the length of the flanking introns and circular RNA abundance was identified in Drosophila (Westholm, 2014). However, the effect of modulating intron lengths on backsplicing has not yet been directly addressed. It is also completely unknown how Drosophila circular RNAs other than Mbl, including the >2500 annotated circular RNAs derived from other genomic loci, are generated or post-transcriptionally regulated. Therefore, it is still unclear whether circular RNA biogenesis strategies are conserved across eukaryotes or whether species such as Drosophila use unique mechanisms to determine which exons should be backspliced (Kramer, 2015).

Once produced, circular RNAs are stable transcripts that are naturally resistant to degradation by exonucleases. Two circular RNAs (ciRS7/CDR1as and Sry) modulate the activity of specific microRNAs (Hansen, 2013; Memczak, 2013), but most other RNA circles (in species other than Drosophila) contain few microRNA-binding sites and likely function differently. For example, it has been proposed that many circular RNAs may regulate neuronal functions, and artificial circular RNAs containing an IRES (internal ribosome entry site) can be translated. However, the lack of efficient methods for modulating circular RNA levels or ectopically expressing circular RNAs has limited the ability to define functions for these transcripts (Kramer, 2015).

This study focused on the Drosophila laccase2 gene, as it produces an abundant circular RNA in vitro and in vivo. Evidence is provided that intronic repeats collaborate with trans-acting splicing factors to regulate circularization in flies. Mechanistically, it was found that miniature introns (<150 nt) containing the splice sites and inverted repeats were sufficient to support Laccase2 circular RNA production. The intronic repeats must base-pair to one another for circularization to occur, as has been observed in other eukaryotes. Furthermore, it was found that the strength of these base-pairing interactions dictates whether backsplicing occurs co- or post-transcriptionally: Long flanking repeats appear to allow cotranscriptional processing. Screening a panel of genes, this study found that multiple hnRNP (heterogeneous nuclear ribonucleoprotein) and SR (serine–arginine) family proteins regulate Laccase2 circular RNA levels in a combinatorial manner. Comparisons with the mbl locus suggest that the circularization mechanisms are distinct, as the Laccase2 circular RNA was not regulated by the Mbl or Laccase2 gene products. Additional circular RNAs were identified that are regulated by unique combinations of hnRNP and SR proteins, suggesting that combinatorial control may be a common regulatory strategy that modulates circular RNA levels. This led to a test of whether this biogenesis mechanism is active in human cells, and it was found that the laccase2 introns can indeed robustly generate circular RNAs. It is thus now possible to efficiently generate "designer" circular RNAs in cells with minimal linear RNA production. In total, the results reveal new insights into how trans-acting factors and intronic repeats collaborate to regulate circular RNA biogenesis across eukaryotes as well as provide new tools for exploring the functions of circular RNAs (Kramer, 2015).

This study demonstrates that intronic repeats and trans-acting hnRNPs and SR proteins combinatorially regulate circularization of the Drosophila laccase2 gene. Base-pairing between transposable elements in the flanking introns facilitates circularization, and the strength of these interactions likely dictates whether backsplicing occurs co- or post-transcriptionally. This mechanism is distinct from the one that regulates Drosophila Mbl circular RNA production (Ashwal-Fluss, 2014) but is similar to that used to generate many circles in humans, mice, and C. elegans. This suggests that base-pairing between intronic repeats may be a major mechanism promoting exon circularization across eukaryotes. Moreover, this study found that the laccase2 exon is dispensable, allowing the laccase2 introns to be used to efficiently generate 'designer' circular RNAs from plasmids in diverse organisms. Altogether, the results suggest that circular RNA biogenesis strategies are conserved across eukaryotes and provide new tools for exploring the functions of circular RNAs (Kramer, 2015).

The current results on the laccase2 locus indicate that base-pairing between complementary intronic sequences efficiently promotes RNA circularization in flies. As the DNAREP1_DM repeats closely flank exon 2 of the laccase2 gene, a model is proposed in which the repeats base-pair to one another, bringing the intervening splice sites into close proximity and facilitating catalysis. The Laccase2 circular RNA then accumulates as one of the most abundant circular RNAs in Drosophila (fifth most abundant across >100 Drosophila RNA sequencing libraries). At the endogenous laccase2 gene locus, the long introns that flank this exon likely slow the overall speed of cotranscriptional splicing, thereby allowing the backsplicing reaction to effectively compete with canonical splicing. Indeed, it was found that the strength of the base-pairing interactions between the flanking introns dictates how quickly backsplicing can occur. When very stable interactions are present, it is possible that exon definition is improved, allowing the rapid and cotranscriptional generation of a circular RNA. Nevertheless, further studies are still required to clarify the exact role that long flanking introns may play in regulating circularization (Kramer, 2015).

Upon examining the introns that flank other abundant Drosophila circular RNAs, this study identified other examples in which complementary regions >60 nt in length flank circularizing exons, including CaMKI, CG11155, CG2052, Parp, and PlexA (which are among the top 25 most abundant Drosophila circular RNAs). Interestingly, the Semaphorin-2b (CG33960) circular RNA (39th most abundant circular RNA) is flanked by introns containing short (CA)n simple repeats that are complementary to each other over a <30-nt region. Upon cloning a 980-nt region of the Semaphorin-2b pre-mRNA downstream from the pMT, circular RNA production from the plasmid was observed in DL1 cells. Removal of either of the (CA)n simple repeats, however, strongly reduced circularization. This suggests that diverse inverted repeat sequences, including short simple repeats, may play a general role in facilitating circularization in Drosophila (Kramer, 2015).
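
The kind of search described above, looking for regions in the two flanking introns that are complementary to each other, can be illustrated with a short sketch. This is not the method used in the study; it is a minimal illustration that assumes perfect Watson-Crick complementarity only (no mismatches, no G-U wobble pairs) and finds the longest stretch of the upstream intron that could base-pair with the downstream intron. The function names and the toy sequences are hypothetical and are not taken from the Semaphorin-2b locus.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def longest_inverted_repeat(upstream: str, downstream: str) -> str:
    """Return the longest stretch of `upstream` that is perfectly
    complementary (in antiparallel orientation) to some stretch of
    `downstream`.

    Illustrative assumption: only perfect Watson-Crick pairing counts.
    The search is a longest-common-substring scan between the upstream
    intron and the reverse complement of the downstream intron.
    """
    up = upstream.upper()
    rc = reverse_complement(downstream)
    best_len, best_end = 0, 0
    prev = [0] * (len(rc) + 1)
    for i in range(1, len(up) + 1):
        curr = [0] * (len(rc) + 1)
        for j in range(1, len(rc) + 1):
            if up[i - 1] == rc[j - 1]:
                # Extend the match that ended at the previous positions.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return up[best_end - best_len:best_end]


if __name__ == "__main__":
    # Toy introns: a (CA)n run upstream and a (TG)n run downstream.
    toy_upstream = "TTTTCACACACATTTT"
    toy_downstream = "GGGGTGTGTGTGGGGG"
    repeat = longest_inverted_repeat(toy_upstream, toy_downstream)
    print(repeat, len(repeat))
```

A real analysis would more plausibly use a local alignment tool that tolerates mismatches and would weigh the predicted stability of the duplex rather than its length alone; the sketch only shows how "complementary over an N-nt region" can be made concrete.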

Complementary repeats, however, are not observed at all Drosophila loci that generate circular RNAs. Furthermore, many exons that do not circularize are flanked by complementary repeats, so there must be other mechanisms that regulate circularization. This has been most notably demonstrated at the Drosophila mbl locus, which requires the Mbl splicing factor for its circularization. When Mbl protein is in excess, an intricate feedback mechanism is induced: The Mbl protein decreases the production of its own mRNA by binding its pre-mRNA. This blocks canonical splicing and promotes the biogenesis of the Mbl circular RNA, which further functions as a sponge that binds and sequesters the excess Mbl protein. However, this Mbl-driven mechanism appears to be specific for the mbl locus, as this study found that knockdown of the Mbl linear mRNA had no effect on Laccase2, PlexA, or a panel of other circular RNAs. Knockdown of the Laccase2 linear mRNA likewise did not affect Laccase2 circular RNA levels, indicating that the laccase2 locus is not subjected to a similar direct cis-acting feedback mechanism. Instead, it was found that other splicing factors, including hnRNPs and SR proteins, regulate Laccase2 RNA levels (Kramer, 2015).

At the laccase2 locus, it is proposed that hnRNPs (e.g., Hrb27C and Hrb87F) and SR proteins (e.g., SF2 [SRSF1], SRp54 [SRSF11], and B52 [SRSF6]) add an additional layer of control on top of the DNAREP1_DM intronic repeats. Base-pairing between the intronic repeats promotes circularization, but protein binding likely helps ensure that the appropriate ratio of linear to circular Laccase2 RNA is produced. Depletion of any one of these splicing factors alters Laccase2 circle levels, and additive effects were observed when multiple factors were depleted. This suggests combinatorial control, with each protein playing a nonredundant role. Furthermore, Laccase2 circular RNA production does not appear to be linked to exon skipping, and thus these proteins may specifically modulate spliceosome assembly, the speed of splicing, and/or the stability of the mature circular RNA. Notably, it does not seem that Hrb27C, SF2, SRp54, or B52 affects Laccase2 circular RNA stability, as depletion of these factors did not cause the expression of a plasmid-derived Laccase2 circular RNA to increase. It is thus instead proposed that these hnRNPs and SR proteins regulate Laccase2 circular RNA biogenesis (e.g., by binding to the flanking introns or exons), but further studies are required to understand exactly how the intronic repeats and trans-acting factors collaboratively dictate the splicing outcome. Nevertheless, the same SR proteins that regulate the laccase2 locus also regulate the PlexA circular RNA but not the Mbl circular RNA. Since the laccase2 and PlexA exons are both flanked by inverted repeats, it is hypothesized that intronic repeats may generally provide the opportunity for circularization to occur. This is then further regulated by trans-acting factors that combinatorially fine-tune the amount of each circular RNA that the cell ultimately produces (Kramer, 2015).

Catalogs of circular RNAs expressed in various species and cell types have been reported, but the functions for nearly all of these transcripts, including Laccase2, are currently unknown. This is due in part to the current lack of methods for efficiently generating circular RNAs in cells. For example, the circular RNA expression plasmids that have been described all generally produce circular transcripts at a low efficiency (often 20% or less). These plasmids instead generate abundant amounts of linear RNA, which limits their utility for defining circular RNA functions. Using the Drosophila laccase2 and human ZKSCAN1 introns, this study largely overcame this hurdle and generated circular RNAs (ranging in size from 300 to 1500 nt) at a high efficiency in human and fly cells. These transcripts accumulate in the cytoplasm, are resistant to RNase R treatment, and are likely translated when an IRES is present. Furthermore, easy-to-use restriction sites are present in the plasmids, allowing any desired sequence to be queried. Beyond allowing ectopic expression of circular RNAs, these plasmids can be designed to sponge microRNAs or proteins as well as identify novel IRES sequences (Kramer, 2015).

In summary, the current findings provide key insights into how trans-acting factors and intronic repeats regulate circular RNA biogenesis as well as provide new tools for exploring the functions of circular RNAs across eukaryotes. From humans to flies, repetitive elements in introns can act to facilitate backsplicing, but it is still largely unclear why circular RNAs accumulate only in certain tissues. It is hypothesized that base-pairing between repeats is only one part of the "splicing code", and it is ultimately a combination of cis-acting elements and trans-acting splicing factors, including hnRNPs and SR proteins, that dictates whether canonical splicing or backsplicing occurs. Nevertheless, this study has defined a minimal set of elements that is sufficient for promoting efficient exon circularization, which should facilitate the prediction of circular RNAs as well as enable the functions of many circular RNAs to be revealed. Considering that a surprisingly large number of protein-coding genes generates circular RNAs, these previously overlooked transcripts likely represent key ways that gene functions are expanded and modulated (Kramer, 2015).

m6A potentiates Sxl alternative pre-mRNA splicing for robust Drosophila sex determination

N6-methyladenosine (m6A) is the most common internal modification of eukaryotic messenger RNA (mRNA) and is decoded by YTH domain proteins. The Drosophila mRNA m6A methylosome consists of Ime4 and KAR4 (Inducer of meiosis 4 and Karyogamy protein 4) as well as Female-lethal (2)d (Fl(2)d) and Virilizer (Vir). In Drosophila, fl(2)d and vir are required for sex-dependent regulation of alternative splicing of the sex determination factor Sex lethal (Sxl). However, the functions of m6A in introns in the regulation of alternative splicing remain uncertain. This study shows that m6A is absent in the mRNA of Drosophila lacking Ime4. In contrast to mouse and plant knockout models, Drosophila Ime4-null mutants remain viable, though flightless, and show a sex bias towards maleness. This is because m6A is required for female-specific alternative splicing of Sxl, which determines female physiognomy, but also translationally represses male-specific lethal 2 (msl-2) to prevent dosage compensation in females. The m6A reader protein YT521-B decodes m6A in the sex-specifically spliced intron of Sxl, as its absence phenocopies Ime4 mutants. Loss of m6A also affects alternative splicing of additional genes, predominantly in the 5' untranslated region, and has global effects on the expression of metabolic genes. The requirement of m6A and its reader YT521-B for female-specific Sxl alternative splicing reveals that this hitherto enigmatic mRNA modification constitutes an ancient and specific mechanism to adjust levels of gene expression (Haussmann, 2016).

In mature mRNA the m6A modification is most prevalently found around the stop codon as well as in 5' untranslated regions (UTRs) and in long exons in mammals, plants and yeast. Since methylosome components predominantly localize to the nucleus, it has been speculated that m6A localized in pre-mRNA introns could have a role in alternative splicing regulation in addition to such a role when present in long exons. This prompted the authors to investigate whether m6A is required for Sxl alternative splicing, which determines female sex and prevents dosage compensation in females. A null allele of the Drosophila METTL3 methyltransferase homologue Ime4 was generated by imprecise excision of a P element inserted in the promoter region. The excision allele Δ22-3 deletes most of the protein-coding region, including the catalytic domain, and is thus referred to as Ime4null. These flies are viable and fertile but flightless, and this phenotype can be rescued by a genomic construct restoring Ime4. Ime4 shows increased expression in the brain and, as in mammals and plants, localizes to the nucleus (Haussmann, 2016).

Following RNase T1 digestion and 32P end-labelling of RNA fragments, this study detected m6A after guanosine (G) in poly(A) mRNA of adult flies at relatively low levels compared to other eukaryotes (m6A/A ratio: 0.06%), but at higher levels in unfertilized eggs (0.18%). After enrichment with an anti-m6A antibody, m6A is readily detected in poly(A) mRNA, but absent from Ime4null flies (Haussmann, 2016).

As found in other systems, and consistent with a potential role in translational regulation, m6A was detected in polysomal mRNA (0.1%), but not in the poly(A)-depleted rRNA fraction. This also confirmed that any m6A modification in rRNA is not after G in Drosophila (Haussmann, 2016).

Consistent with the hypothesis that m6A plays a role in sex determination and dosage compensation, the number of Ime4null females was reduced to 60% compared to the number of males, whereas in the control strain female viability was 89%. The key regulator of sex determination in Drosophila is the RNA-binding protein Sxl, which is specifically expressed in females. Sxl positively auto-regulates expression of itself and its target transformer (tra) through alternative splicing to direct female differentiation. In addition, Sxl suppresses translation of msl-2 to prevent upregulation of transcription on the X chromosome for dosage compensation; full suppression also requires maternal factors. Accordingly, female viability was reduced to 13% by removal of maternal m6A together with zygotic heterozygosity for Sxl and Ime4. Female viability of this genotype is completely rescued by a genomic construct or by preventing ectopic activation of dosage compensation by removal of msl-2. Hence, females are non-viable owing to insufficient suppression of msl-2 expression, resulting in upregulation of gene expression on the X chromosome from reduced Sxl levels. In the absence of msl-2, disruption of Sxl alternative splicing resulted in females with sexual transformations displaying male-specific features such as sex combs, which were mosaic to various degrees, indicating that Sxl threshold levels are affected early during establishment of sexual identities of cells and/or their lineages. In the presence of maternal Ime4, Sxl and Ime4 do not genetically interact. In addition, Sxl is required for germline differentiation in females and its absence results in tumorous ovaries. Consistent with this, tumorous ovaries were detected in Sxl7B0/+;Ime4null/+ daughters from Ime4null females (22%), but not in homozygous Ime4null or heterozygous Sxl7B0 females (Haussmann, 2016).

Furthermore, levels of the Sxl female-specific splice form were reduced to approximately 50%, consistent with a role for m6A in Sxl alternative splicing. As a result, female-specific splice forms of tra and msl-2 were also significantly reduced in adult females (Haussmann, 2016).

To obtain more comprehensive insights into Sxl alternative splicing defects in Ime4null females, splice junction reads were examined from RNA-seq. Besides the significant increase in inclusion of the male-specific Sxl exon in Ime4null females, cryptic splice sites and increased numbers of intronic reads were detected in the regulated intron. Consistent with reverse transcription polymerase chain reaction (RT-PCR) analysis of tra, the reduction of female splicing in the RNA sequencing is modest, and as a consequence, alternative splicing differences of Tra targets dsx and fru were not detected in whole flies, suggesting that cell-type-specific fine-tuning is required to generate splicing robustness rather than being an obligatory regulator. In agreement with dosage-compensation defects as a main consequence of Sxl dysregulation in Ime4null mutants, X-linked, but not autosomal, genes are significantly upregulated in Ime4null females compared to controls (Haussmann, 2016).

Furthermore, Sxl mRNA is enriched in pull-downs with an m6A antibody compared to m6A-deficient yeast mRNA added for quantification. This enrichment is comparable to what was observed for m6A-pull-down from yeast mRNA (Haussmann, 2016).

To map m6A sites in the intron of Sxl, an in vitro m6A methylation assay was employed using Drosophila nuclear extracts and labelled substrate RNA. m6A methylation activity was detected in the vicinity of alternatively spliced exons. Further fine-mapping localized m6A in RNAs C and E to the proximity of Sxl-binding sites. Likewise, the female-lethal single amino acid substitution alleles fl(2)d1 and vir2F interfere with Sxl recruitment, resulting in impaired Sxl auto-regulation and inclusion of the male-specific exon. Female lethality of these alleles can be rescued by Ime4null heterozygosity, further demonstrating the involvement of the m6A methylosome in Sxl alternative splicing (Haussmann, 2016).

Next, alternative splicing changes were examined in Ime4null females compared to the wild-type control strain. A statistically significant reduction in female-specific alternative splicing of Sxl was observed. In addition, 243 alternative splicing events in 163 genes were significantly different in Ime4null females, equivalent to around 2% of alternatively spliced genes in Drosophila. Six genes for which the alternative splicing products could be distinguished on agarose gels were confirmed by RT-PCR. Notably, lack of Ime4 did not affect global alternative splicing and no specific type of alternative splicing event was preferentially affected. However, alternative first exon (18% versus 33%) and mutually exclusive exon (2% versus 15%) events were reduced in Ime4null compared to a global breakdown of alternative splicing in wild-type Drosophila, mostly to the extent of retained introns (16% versus 6%), alternative donor (16% versus 9%) and unclassified events (14% versus 6%). Notably, the majority of affected alternative splicing events in Ime4null were located in the 5' UTR, and these genes had a significantly higher number of AUG start codons in their 5' UTR compared to the 5' UTRs of all genes. Such a feature has been shown to be relevant to translational control under stress conditions (Haussmann, 2016).

The majority of the 163 differentially alternatively spliced genes in Ime4null females are broadly expressed (59%), while most of the remainder are expressed in the nervous system (33%), consistent with higher expression of Ime4 in this tissue. Accordingly, Gene Ontology analysis revealed a highly significant enrichment for genes involved in synaptic transmission (Haussmann, 2016).

Since the absence of m6A affects alternative splicing, m6A marks are probably deposited co-transcriptionally before splicing. Co-staining of polytene chromosomes with antibodies against haemagglutinin (HA)-tagged Ime4 and RNA Pol II revealed broad co-localization of Ime4 with sites of transcription, but not with condensed chromatin visualized with antibodies against histone H4. Furthermore, localization of Ime4 to sites of transcription is RNA-dependent, as staining for Ime4, but not for RNA Pol II, was reduced in an RNase-dependent manner (Haussmann, 2016).

Although m6A levels after G are low in Drosophila compared to other eukaryotes, broad co-localization of Ime4 to sites of transcription suggests profound effects on the gene expression landscape. Indeed, differential gene expression analysis revealed 408 differentially expressed genes where 234 genes were significantly upregulated and 174 significantly downregulated in neuron-enriched head/thorax of adult Ime4null females. Cataloguing these genes according to function reveals prominent effects on gene networks involved in metabolism, including reduced expression of 17 genes involved in oxidative phosphorylation. Notably, overexpression of the m6A mRNA demethylase FTO in mice leads to an imbalance in energy metabolism resulting in obesity (Haussmann, 2016).

Next, whether either of the two substantially divergent YTH proteins, YT521-B and CG6422, decodes m6A marks in Sxl mRNA was tested. When transiently transfected into male S2 cells, YT521-B localizes to the nucleus, whereas CG6422 is cytoplasmic. Nuclear YT521-B can switch Sxl alternative splicing to the female mode and also binds to the Sxl intron in S2 cells. In vitro binding assays with the YTH domain of YT521-B demonstrate increased binding of m6A-containing RNA. In vivo, YT521-B also localizes to the sites of transcription (Haussmann, 2016).

To further examine the role of YT521-B in decoding m6A, the Drosophila strain YT521-BMI02006, in which a transposon in the first intron disrupts YT521-B, was analyzed. This allele is also viable, and phenocopies the flightless phenotype and the female Sxl splicing defect of Ime4null flies. Likewise, removal of maternal YT521-B together with zygotic heterozygosity for Sxl and YT521-B reduces female viability and results in sexual transformations such as male abdominal pigmentation. In addition, overexpression of YT521-B results in male lethality, which can be rescued by removal of Ime4, further reiterating the role of m6A in Sxl alternative splicing. Since YT521-B phenocopies Ime4 for Sxl splicing regulation, it is the main nuclear factor for decoding m6A present in the proximity of the Sxl-binding sites. YT521-B bound to m6A assists Sxl in repressing inclusion of the male-specific exon, thus providing robustness to this vital gene regulatory switch (Haussmann, 2016).

Nuclear localization of m6A methylosome components suggested a role for this 'fifth' nucleotide in alternative splicing regulation. The discovery of the requirement of m6A and its reader YT521-B for female-specific Sxl alternative splicing has important implications for understanding the fundamental biological function of this enigmatic mRNA modification. Its key role in providing robustness to Sxl alternative splicing to prevent ectopic dosage compensation and female lethality, together with localization of the core methylosome component Ime4 to sites of transcription, indicates that the m6A modification is part of an ancient, yet unexplored mechanism to adjust gene expression. Hence, the recently reported role of m6A methylosome components in human dosage compensation further supports such a role and suggests that m6A-mediated adjustment of gene expression might be a key step to allow for the development of the diverse sex determination mechanisms found in nature (Haussmann, 2016).

Extensive cross-regulation of post-transcriptional regulatory networks in Drosophila

In eukaryotic cells, RNAs exist as ribonucleoprotein particles (RNPs). Despite the importance of these complexes in many biological processes including splicing, polyadenylation, stability, transportation, localization, and translation, their compositions are largely unknown. Twenty distinct RNA binding proteins (RBPs) were immunopurified from cultured Drosophila melanogaster cells under native conditions, and both the RNA and protein compositions of these RNP complexes were determined. "High occupancy target" (HOT) RNAs were identified that interact with the majority of the RBPs surveyed. HOT RNAs encode components of the nonsense-mediated decay and splicing machinery as well as RNA binding and translation initiation proteins. The RNP complexes contain proteins and mRNAs involved in RNA binding and post-transcriptional regulation. Genes with the capacity to produce hundreds of mRNA isoforms, termed ultra-complex genes, interact extensively with heterogeneous nuclear ribonucleoproteins (hnRNPs). These data are consistent with a model in which subsets of RNPs include mRNA and protein products from the same gene, indicating the widespread existence of auto-regulatory RNPs. From the simultaneous acquisition and integrative analysis of protein and RNA constituents of RNPs, this study identified extensive cross-regulatory and hierarchical interactions in post-transcriptional control (Stoiber, 2015).

Drosophila Nmnat functions as a switch to enhance neuroprotection under stress

Nicotinamide mononucleotide adenylyltransferase (NMNAT) is a conserved enzyme in the NAD synthetic pathway. It has also been identified as an effective and versatile neuroprotective factor. However, it remains unclear how healthy neurons regulate the dual functions of NMNAT and achieve self-protection under stress. This study shows that Drosophila Nmnat (DmNmnat) is alternatively spliced into two mRNA variants, RA and RB, which translate to protein isoforms with divergent neuroprotective capacities against spinocerebellar ataxia 1-induced neurodegeneration. Isoform PA/PC translated from RA is nuclear-localized with minimal neuroprotective ability, and isoform PB/PD translated from RB is cytoplasmic and has robust neuroprotective capacity. Under stress, RB is preferably spliced in neurons to produce the neuroprotective PB/PD isoforms. These results indicate that alternative splicing functions as a switch that regulates the expression of functionally distinct DmNmnat variants. Neurons respond to stress by driving the splicing switch to produce the neuroprotective variant and therefore achieve self-protection (Ruan, 2015).

Alternative splicing within and between Drosophila species, sexes, tissues, and developmental stages

Alternative pre-mRNA splicing ("AS") greatly expands proteome diversity. The transcriptomes from several tissues and developmental stages were studied in males and females from four species across the Drosophila genus. 20-37% of multi-exon genes were found to be alternatively spliced. While males generally express a larger number of genes, AS is more prevalent in females, suggesting that the sexes adopt different expression strategies for their specialized function. The proportion of expressed genes that are alternatively spliced is highest in the very early embryo, before the onset of zygotic transcription. This indicates that females deposit a diversity of isoforms into the egg, consistent with abundant AS found in ovary. Cluster analysis by gene expression levels shows mostly stage-specific clustering in embryonic samples, and tissue-specific clustering in adult tissues. Clustering embryonic stages and adult tissues based on AS profiles results in stronger species-specific clustering, suggesting that diversification of splicing contributes to lineage-specific evolution in Drosophila. Most sex-biased AS found in flies is due to AS in gonads, with little sex-specific splicing in somatic tissues (Gibilisco, 2016).

Protein composition of catalytically active U7-dependent processing complexes assembled on histone pre-mRNA containing biotin and a photo-cleavable linker

3' end cleavage of metazoan replication-dependent histone pre-mRNAs requires the multi-subunit holo-U7 snRNP and the stem-loop binding protein (SLBP). The exact composition of the U7 snRNP and details of SLBP function in processing remain unclear. To identify components of the U7 snRNP in an unbiased manner, a novel approach was developed for purifying processing complexes from Drosophila and mouse nuclear extracts. In this method, catalytically active processing complexes are assembled in vitro on a cleavage-resistant histone pre-mRNA containing biotin and a photo-sensitive linker, and eluted from streptavidin beads by UV irradiation for direct analysis by mass spectrometry. In the purified processing complexes, Drosophila and mouse U7 snRNP have a remarkably similar composition, always being associated with CPSF73, CPSF100, symplekin and CstF64. Many other proteins previously implicated in the U7-dependent processing are not present. Drosophila U7 snRNP bound to histone pre-mRNA in the absence of SLBP contains the same subset of polyadenylation factors but is catalytically inactive and addition of recombinant SLBP is sufficient to trigger cleavage. This result suggests that Drosophila SLBP promotes a structural rearrangement of the processing complex, resulting in juxtaposition of the CPSF73 endonuclease with the cleavage site in the pre-mRNA substrate (Skrajna, 2018).

In metazoans, 3' end processing of replication-dependent histone pre-mRNAs occurs through a single endonucleolytic cleavage, generating mature histone mRNAs that lack a poly(A) tail. This specialized 3' end processing reaction depends on the U7 snRNP, the core of which consists of a ~60-nt U7 snRNA and a unique heptameric Sm ring. In the ring, the spliceosomal subunits SmD1 and SmD2 are replaced by the related Lsm10 and Lsm11 proteins, whereas the remaining subunits (SmB, SmD3, SmE, SmF and SmG) are shared with the spliceosomal snRNPs (Skrajna, 2018).

Lsm11 contains an extended N-terminal region that interacts with the N-terminal region of the 220 kDa protein FLASH. Together, they recruit a specific subset of the proteins that participate in 3' end processing of canonical pre-mRNAs by cleavage and polyadenylation, resulting in formation of the holo-U7 snRNP (Skrajna, 2014). This subset of polyadenylation factors is referred to as the histone pre-mRNA cleavage complex (HCC) and in mammalian nuclear extracts includes symplekin, all subunits of CPSF (CPSF160, WDR33, CPSF100, CPSF73, Fip1 and CPSF30) and CstF64 as the only CstF subunit. The remaining components of the cleavage and polyadenylation machinery, including CstF50 and CstF77, the two CF Im subunits of 68 and 25 kDa, and the two subunits of CF IIm (Clp1 and Pcf11) were consistently absent in the HCC. A similar subset of polyadenylation factors is associated with the Drosophila holo-U7 snRNP (Skrajna, 2018 and references therein).

The substrate specificity in the processing reaction is provided by the U7 snRNA, which through its 5' terminal region base pairs with the histone downstream element (HDE), a sequence in histone pre-mRNA located downstream of the cleavage site. This interaction is assisted by the stem-loop binding protein (SLBP), which binds the highly conserved stem-loop structure located upstream of the cleavage site (Wang, 1996; Martin, 1997; Tan, 2013) and stabilizes the complex of U7 snRNP with histone pre-mRNA (Dominski, 1999), likely by contacting FLASH and Lsm11 (Skrajna, 2017). In mammalian nuclear extracts, histone pre-mRNAs that form a strong duplex with the U7 snRNA are cleaved efficiently in the absence of SLBP. In contrast, Drosophila nuclear extracts lacking SLBP are inactive in cleaving histone pre-mRNAs, suggesting that Drosophila SLBP plays an essential role in processing in addition to stabilizing binding of the U7 snRNP to histone pre-mRNA (Skrajna, 2018 and references therein).

Within the HCC, CPSF73 is the endonuclease, acting in a close partnership with its catalytically inactive homolog, CPSF100, and the heat-labile scaffolding protein symplekin. RNAi-mediated depletion of these three HCC subunits in Drosophila cultured cells results in generation of polyadenylated histone mRNAs, an indication of their essential role in the U7-dependent processing. Depletion of the remaining components of the HCC had no effect on the 3' end of histone mRNAs and their function in the U7 snRNP, if any, is less clear. Previous in vivo studies implicated multiple other proteins, in addition to SLBP and components of the U7 snRNP, in generation of correctly processed histone pre-mRNAs. These proteins include ZFP100, CDC73/parafibromin, NELF E, Ars2, CDK9, CF Im68 and RNA-binding protein FUS/TLS (Fused in Sarcoma/Translocated in Sarcoma). ZFP100, CF Im68 and FUS were shown to interact with Lsm11, whereas Ars2 was shown to interact with FLASH, raising the possibility that they may be essential components of the cleavage machinery (Skrajna, 2018).

To determine which factors are required for the cleavage reaction, a novel method for purification of in vitro assembled Drosophila and mouse processing complexes was developed. In this method, histone pre-mRNAs containing biotin and a photo-cleavable linker in either cis or trans are incubated with a nuclear extract and the assembled processing complexes are immobilized on streptavidin beads, washed and released into solution by irradiation with long wave UV. This approach yielded remarkably pure processing complexes that were suitable for direct and unbiased analysis by mass spectrometry, providing a complete view of the holo-U7 snRNP and other proteins that associate with histone pre-mRNA for 3' end processing (Skrajna, 2018).

In this method, processing complexes were assembled in a nuclear extract on a synthetic histone pre-mRNA containing biotin and a photo-cleavable linker at the 5' end. The major cleavage site and the two neighboring nucleotides on each side were modified with a 2'O-methyl group, hence preventing endonucleolytic cleavage of the pre-mRNA and increasing the efficiency of capturing intact processing complexes. Following immobilization on streptavidin beads, the pre-mRNA and the bound proteins were washed and released to solution by irradiation with long wave UV. This UV-elution step, by eliminating all background proteins non-specifically bound to streptavidin beads, resulted in isolation of remarkably pure processing complexes that were suitable for direct analysis by mass spectrometry. This is the first successful use of the photo-cleavable linker and the UV-elution step for purification of an in vitro assembled RNA/protein complex. Parallel experiments with pre-mRNA substrates lacking 2'O-methyl nucleotides at the cleavage site demonstrated that the immobilized processing complexes retain catalytic activity. Thus, the mass spectrometry analysis of the UV-eluted material is likely to provide a global and unbiased view of all essential proteins that associate with histone pre-mRNA for 3' end processing (Skrajna, 2018).

Since chemical synthesis of RNAs containing covalently attached biotin and the photo-cleavable linker (cis configuration) is both expensive and limited to sequences not exceeding 60-70 nt, longer histone pre-mRNAs generated by T7 transcription were tested. Biotin and the photo-cleavable linker can be attached to the 3' end of these pre-mRNAs in trans via a short complementary oligonucleotide. This modification makes the UV-elution method more cost effective and potentially applicable for purification of RNA-protein complexes that require longer RNA binding targets, including spliceosomes and complexes involved in cleavage and polyadenylation (Skrajna, 2018).

In the UV-eluted mouse and Drosophila processing complexes, mass spectrometry identified SLBP and all known subunits of the U7-specific Sm ring, including Lsm10 and Lsm11. Readily detectable in mouse and Drosophila processing complexes were also FLASH and subunits of the HCC. The HCC is remarkably similar in composition between the two species, with symplekin, CPSF100, CPSF73 and CstF64 being most abundant and present in close to stoichiometric amounts, as determined by both silver staining and emPAI value analysis. The remaining CPSF subunits (CPSF160, WDR33, Fip1 and CPSF30) are present in lower amounts, suggesting that they are substoichiometric, being stably associated only with a fraction of the U7 snRNP (Skrajna, 2018).
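The emPAI comparison mentioned above can be illustrated with a short sketch. It assumes the standard definition of the exponentially modified protein abundance index, emPAI = 10^(observed peptides / observable peptides) - 1, and normalizes the values to a molar percentage. The peptide counts in the usage note are hypothetical placeholders, not values from the study, and the function names are illustrative only.

```python
def empai(observed_peptides, observable_peptides):
    """Return emPAI = 10**(N_observed / N_observable) - 1."""
    if observable_peptides <= 0:
        raise ValueError("observable_peptides must be positive")
    return 10 ** (observed_peptides / observable_peptides) - 1


def molar_percent(empai_by_protein):
    """Normalize emPAI values so that they sum to 100 (approximate mol %)."""
    total = sum(empai_by_protein.values())
    if total <= 0:
        raise ValueError("at least one emPAI value must be positive")
    return {name: 100 * value / total for name, value in empai_by_protein.items()}
```

With hypothetical counts, `molar_percent({"symplekin": empai(12, 20), "CPSF73": empai(11, 20), "CPSF160": empai(3, 25)})` would rank the first two proteins as similarly abundant and the third as much less abundant, which is the kind of pattern described as near-stoichiometric versus substoichiometric.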

In both mouse and Drosophila experiments, SLBP and the components of the U7 snRNP were the only proteins that consistently failed to bind histone pre-mRNAs in the presence of processing competitors: SL RNA and αU7 oligonucleotide. Other proteins were detected both in the samples containing processing complexes and in the matching negative controls, where formation of processing complexes was blocked. Among them, the most prevalent were non-specific RNA binding proteins, including hnRNP Q in mouse nuclear extracts, and IGF2BP1 in Drosophila nuclear extracts. All these proteins likely bind to sites in histone pre-mRNAs unoccupied by SLBP and U7 snRNP, and play no essential role in processing (Skrajna, 2018).

CstF50 and CstF77 were not detected in the UV-eluted mouse processing complexes and were present only in some Drosophila complexes, always with low scores, consistent with a previous conclusion that of the three CstF subunits only CstF64 stably associates with the U7 snRNP. No peptides were detected for CF Im (68 and 25 kDa) and CF IIm (Clp1 and Pcf11) in any of the mouse experiments, suggesting that these factors are also uniquely involved in cleavage and polyadenylation. Mass spectrometry identified the orthologues of the 68 and 25 kDa subunits in some Drosophila experiments, but they were clearly contaminants, persisting in the presence of the SL RNA and αU7 oligonucleotide. CF Im68 was previously reported to interact with Lsm11 and to co-purify with U7 snRNP. Based on this analysis, this subunit is unlikely to interact with Lsm11 in the processing complex (Skrajna, 2018).

Catalytically active mouse processing complexes also lacked ZFP100 (ZN473), a zinc finger protein that co-localizes with Lsm11 and stimulates expression of a reporter gene containing U7-dependent processing signals. ZFP100 was initially identified by the yeast two-hybrid system as a protein interacting with SLBP bound to the SL RNA and suggested to function as a bridging factor in the SLBP-mediated recruitment of the U7 snRNP to histone pre-mRNA. However, the absence of ZFP100 in the UV-eluted mouse processing complexes containing both SLBP and U7 snRNP strongly argues against this function. ZFP100 may instead participate in a different aspect of histone gene expression in vivo, perhaps acting as a coupling factor that integrates transcription of histone genes with 3' end processing of the nascent histone pre-mRNAs (Skrajna, 2018).

A similar role in vivo may be played by the multi-functional protein FUS and other proteins previously linked to 3' end processing of histone pre-mRNAs in mammalian cells, including Ars2, CDC73/parafibromin, NELF E and CDK9. These factors were never specifically detected in the UV-eluted mouse processing complexes, suggesting that they have no direct role in processing in vitro. Their downregulation by RNAi results in production of a small amount of polyadenylated histone mRNAs, which may be due to a defect in coupling of histone gene transcription with processing and/or cell-cycle progression (Skrajna, 2018).

Although this study identified several polyadenylation subunits in a stable association with the U7 snRNP, the experiments do not directly address which of them are essential for processing of histone pre-mRNAs. In Drosophila cultured cells, RNAi-mediated depletion of each of only three U7-associated polyadenylation subunits, symplekin, CPSF100 and CPSF73, consistently resulted in accumulation of histone mRNAs terminated with a poly(A) tail, an indication of a defect in the U7-dependent processing mechanism. Depletion of the remaining HCC subunits had no effect, suggesting that their association with the U7 snRNP is not essential for 3' end processing of histone pre-mRNAs. Symplekin, CPSF100 and CPSF73 are present in Drosophila cells as a stable sub-complex and likely act together as an autonomous cleavage module recruited for processing to either histone or canonical pre-mRNAs by specialized RNA recognition sub-complexes. For canonical pre-mRNAs, this role is played by the remaining CPSF subunits, CPSF160, WDR33, Fip1 and CPSF30, recently shown to co-operate in recognizing the AAUAAA signal during the polyadenylation step. In 3' end processing of histone pre-mRNAs, the recruitment of the cleavage sub-complex is mediated by the U7 snRNA, which recognizes the substrate by the base pairing interaction, further arguing that CPSF160, WDR33, Fip1 and CPSF30 are likely non-essential bystanders in the U7 snRNP (Skrajna, 2018).

A less clear role in 3' end processing of histone pre-mRNAs is played by CstF64, which in spite of being relatively abundant in Drosophila U7 snRNP can be depleted from Drosophila cells without causing a detectable misprocessing of histone pre-mRNAs. A defect in the U7-dependent processing was however observed in human cells partially depleted of CstF64, suggesting that in mammalian cells this subunit may play a more critical role, perhaps helping to stabilize the three essential subunits of the HCC on the FLASH/Lsm11 complex. Clearly, determining which subunits are essential for cleavage will require reconstitution of a catalytically active processing complex from recombinant components (Skrajna, 2018).

This study brings a new perspective on the essential role of Drosophila SLBP in processing. It was recently demonstrated that Drosophila SLBP, like its mammalian counterpart, enhances the recruitment of U7 snRNP to histone pre-mRNA. A small amount of U7 snRNP binds to histone pre-mRNA in the absence of Drosophila SLBP, but the bound U7 snRNP, in spite of containing all major HCC subunits, is catalytically inactive. This study now shows that processing complexes assembled in the absence of SLBP can be activated for cleavage by simply adding recombinant WT SLBP, providing evidence that SLBP is the only missing factor in the assembled complexes. A mutant Drosophila SLBP that is deficient in recruiting U7 snRNP to histone pre-mRNA is also unable to activate the assembled complex for cleavage. Based on these results, it is proposed that the interaction of Drosophila SLBP with the U7 snRNP promotes an essential structural rearrangement of the entire processing complex that juxtaposes the catalytic site of CPSF73 with the pre-mRNA (see A hypothetical model explaining essential role of Drosophila SLBP in processing). It is possible that higher metazoans developed an additional positioning mechanism for the CPSF73 endonuclease, resulting in efficient cleavage in the absence of SLBP (Skrajna, 2018).

Sex-specific transcript diversity in the fly head is established during pupal stages and adulthood and is largely independent of the mating process and the germline

Alternative splicing (AS), the process which generates multiple RNA and protein isoforms from a single pre-mRNA, greatly contributes to transcript diversity and compensates for the fact that the gene number does not scale with organismal complexity. A number of genomic approaches have established that the extent of AS is much higher than previously expected, raising questions about its spatio-temporal regulation and function. The present study addresses AS in the context of sex-specific neuronal development in the model Drosophila melanogaster. At least 47 genes display sex-specific AS in the adult fly head. Unlike targets of the classical Sex lethal-dependent sex determination cascade, sex-specific isoforms of the vast majority of these genes are not present during larval development but start accumulating during metamorphosis or later, indicating the existence of novel mechanisms in the induction of sex-specific AS. It was also established that sex-specific AS in the adult fly head is largely independent of the germline or the mating process. Finally, the role of sex-specific AS of the sulfotransferase Tango13 pre-mRNA was investigated, and the first evidence is provided that differential expression of certain isoforms of this protein significantly affects courtship and mating behavior in male flies (Mohr, 2017).

The Y chromosome modulates splicing and sex-biased intron retention rates in Drosophila

The Drosophila Y chromosome is a 40 Mb segment of mostly repetitive DNA; it harbors a handful of protein-coding genes and a disproportionate amount of satellite repeats, transposable elements, and multicopy DNA arrays. Intron retention (IR) is a type of alternative splicing (AS) event by which one or more introns remain within the mature transcript. IR recently emerged as a deliberate cellular mechanism to modulate gene expression levels and has been implicated in multiple biological processes. However, the extent of sex differences in IR and the contribution of the Y chromosome to the modulation of alternative splicing and intron retention rates have not been addressed. This study showed pervasive IR in the fruit fly Drosophila melanogaster with thousands of novel IR events, hundreds of which displayed extensive sex-bias. The data also revealed an unsuspected role for the Y chromosome in the modulation of alternative splicing and intron retention. The majority of sex-biased IR events introduced premature termination codons and the magnitude of sex-bias was associated with gene expression differences between the sexes. Surprisingly, an extra Y chromosome in males (X^YY genotype) or the presence of a Y chromosome in females (X^XY genotype) significantly modulated IR and recapitulated natural differences in IR between the sexes. These results highlight the significance of sex-biased IR in tuning sex differences and the role of the Y chromosome as a source of variable IR rates between the sexes. Modulation of splicing and of intron retention rates across the genome is a new and unexpected outcome of the Drosophila Y chromosome (Wang, 2018).
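To make the notion of an intron retention rate concrete, the following minimal sketch uses one commonly used definition: the share of transcripts that retain the intron, estimated as intronic read depth divided by the sum of intronic depth and spliced-junction reads. This is an assumption for illustration; the summary does not state which estimator the study used, and the function name and inputs are hypothetical.

```python
def intron_retention_ratio(intron_depth, spliced_junction_reads):
    """Estimate the fraction of transcripts retaining an intron.

    intron_depth: typical read depth across the intron body (assumed input)
    spliced_junction_reads: reads spanning the exon-exon junction (assumed input)
    """
    if intron_depth < 0 or spliced_junction_reads < 0:
        raise ValueError("read counts must be non-negative")
    total = intron_depth + spliced_junction_reads
    if total == 0:
        raise ValueError("no reads support either isoform")
    return intron_depth / total


def sex_bias(ratio_male, ratio_female):
    """Signed difference in retention ratio (positive means male-biased)."""
    return ratio_male - ratio_female
```

Under this definition, an intron with a depth of 10 and 30 junction-spanning reads has a retention ratio of 0.25, and comparing ratios between male and female samples gives a simple measure of sex bias.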

The kinetics of pre-mRNA splicing in the Drosophila genome and the influence of gene architecture

Production of most eukaryotic mRNAs requires splicing of introns from pre-mRNA. The splicing reaction requires definition of splice sites, which are initially recognized in either intron-spanning ('intron definition') or exon-spanning ('exon definition') pairs. To understand how exon and intron length and splice site recognition mode impact splicing, splicing rates were measured genome-wide in Drosophila, using metabolic labeling/RNA sequencing and new mathematical models to estimate rates. The modal intron length range of 60-70 nt was found to represent a local maximum of splicing rates, but much longer exon-defined introns are spliced even faster and more accurately. Unexpectedly low variation was observed in splicing rates across introns in the same gene, suggesting the presence of gene-level influences, and multiple gene level variables associated with splicing rate were identified. Together these data suggest that developmental and stress response genes may have preferentially evolved exon definition in order to enhance the rate or accuracy of splicing (Pai, 2017).

piRNA-mediated regulation of transposon alternative splicing in the soma and germ line

Transposable elements can drive genome evolution, but their enhanced activity is detrimental to the host and therefore must be tightly regulated. The Piwi-interacting small RNA (piRNA) pathway is vital for the regulation of transposable elements, by inducing transcriptional silencing or post-transcriptional decay of mRNAs. This study shows that piRNAs and piRNA biogenesis components regulate precursor mRNA splicing of P-transposable element transcripts in vivo, leading to the production of the non-transposase-encoding mature mRNA isoform in Drosophila germ cells. Unexpectedly, it was shown that the piRNA pathway components do not act to reduce transcript levels of the P-element transposon during P-M hybrid dysgenesis, a syndrome that affects germline development in Drosophila. Instead, splicing regulation is mechanistically achieved together with piRNA-mediated changes to repressive chromatin states, and relies on the function of the Piwi-piRNA complex proteins Asterix (also known as Gtsf1) and Panoramix (Silencio), as well as Heterochromatin protein 1a [HP1a; encoded by Su(var)205]. Furthermore, this machinery, together with the piRNA Flamenco cluster, not only controls the accumulation of Gypsy retrotransposon transcripts but also regulates the splicing of Gypsy mRNAs in cultured ovarian somatic cells, a process required for the production of infectious particles that can lead to heritable transposition events. These findings identify splicing regulation as a new role and essential function for the Piwi pathway in protecting the genome against transposon mobility, and provide a model system for studying the role of chromatin structure in modulating alternative splicing during development (Teixeira, 2017).

Hybrid dysgenesis is a syndrome that affects progeny in a non-reciprocal fashion, being normally restricted to the offspring of crosses in which males carry transposable elements that females lack. In Drosophila, the dysgenic traits triggered by the P-element DNA transposon are restricted to the germ line and include chromosomal rearrangements, high rates of mutation, and sterility. The impairment is most prominent when hybrids are grown at higher temperatures, with adult dysgenic females being completely sterile at 29°C. Despite the severe phenotypes, little is known about the development of germ cells during P-M dysgenesis. To address this, germline development was characterized in the progeny obtained from reciprocal crosses between w1118 (P-element-devoid strain) and Harwich (P-element-containing strain) flies at 29°C. In non-dysgenic progeny, germline development progressed normally throughout embryonic and larval stages, leading to fertile adults. Although the development of dysgenic germline cells was not disturbed during embryogenesis, germ cells decreased in number during early larval stages, leading to animals with no germ cells by late larval stages. These results indicate that the detrimental effects elicited by P-element activity are triggered early on during primordial germ cell (PGC) development in dysgenic progeny, leading to premature germ cell death (Teixeira, 2017).

Maternally deposited small RNAs cognate to the P-element are thought to provide the 'P-cytotype' by conferring the transgenerationally inherited ability to protect developing germ cells against P-elements. Small RNA-based transposon regulation is typically mediated by either transcriptional silencing or post-transcriptional clearance of mRNAs, both of which result in a decrease in the accumulation of transposon mRNA. To understand how maternally provided small RNAs control P-elements in germ cells, this study focused on embryonic PGCs sorted from 4- to 20-h-old embryos generated from reciprocal crosses between w1118 and Harwich strains. Surprisingly, the accumulation of P-element RNA as measured by quantitative reverse transcription PCR (RT-qPCR) showed no change in dysgenic PGCs when compared to non-dysgenic PGCs. This indicates that P-cytotype small RNAs exert their function by means other than regulating P-element mRNA levels (Teixeira, 2017).

P-element activity relies on production of a functional P-element transposase protein, the expression of which requires precursor mRNA (pre-mRNA) splicing of three introns. To analyse P-element RNA splicing in germ cells during hybrid dysgenesis, primers were designed that specifically anneal to spliced mRNA transcripts. The accumulation of spliced forms for the first two introns (IVS1 and IVS2) did not show changes in dysgenic PGCs when compared to non-dysgenic PGCs. By contrast, the accumulation of spliced transcripts for the third intron (IVS3) was substantially increased in dysgenic germ cells. Given that the overall accumulation of P-element mRNA showed no changes, the results indicate that the maternally provided P-cytotype can negatively regulate P-element IVS3 splicing and therefore inhibits the production of functional P-transposase in germ cells (Teixeira, 2017).
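The comparison of spliced-transcript accumulation between dysgenic and non-dysgenic samples can be sketched with the widely used 2^(-ΔΔCt) calculation for RT-qPCR. This is an illustration only: the summary does not state how the Ct values were processed, the calculation assumes roughly 100% amplification efficiency for both primer pairs, and the Ct values in the usage note are hypothetical.

```python
def fold_change(ct_target_sample, ct_reference_sample,
                ct_target_control, ct_reference_control):
    """Relative abundance of a target amplicon by the 2^(-ddCt) method.

    The target could be, for example, an amplicon specific to IVS3-spliced
    mRNA, and the reference a normalizer transcript (both are assumptions).
    """
    delta_sample = ct_target_sample - ct_reference_sample      # dCt, test sample
    delta_control = ct_target_control - ct_reference_control   # dCt, control sample
    delta_delta = delta_sample - delta_control                 # ddCt
    return 2 ** (-delta_delta)
```

For instance, `fold_change(24, 20, 26, 20)` gives a ΔΔCt of -2 and therefore a 4-fold higher level of the target in the test sample than in the control, while equal ΔCt values in both samples give a fold change of 1 (no change).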

Analysis of publicly available small RNA sequencing data from 0-2-h-old embryos laid by Harwich females indicated that two classes of small RNAs cognate to the P-element are maternally transmitted: small interfering RNAs (siRNAs, 20-22 nucleotides long) and piRNAs (23-29 nucleotides long). To test the role of distinct small RNA populations on P-element expression, mutants uniquely affecting each small RNA biogenesis pathway were analyzed in the Harwich background. Mutations that disrupt siRNA biogenesis components Dicer-2 (Dcr-2) and Argonaute 2 (AGO2), or mutations ablating components of the piRNA biogenesis pathway, such as the Argonautes piwi, aubergine (aub), and Argonaute 3 (AGO3), as well as the RNA helicase vasa (vas) and spindle E (spn-E), did not affect P-element mRNA accumulation in adult ovaries as measured by RT-qPCR. However, mutations that disrupted piRNA biogenesis, and not the siRNA pathway, led to a strong and specific increase in the accumulation of IVS3-spliced mRNAs. RNA sequencing (RNA-seq) analysis on poly(A)-selected RNAs from aub and piwi mutant adult ovaries confirmed the specific effect on IVS3 splicing. To examine transposon expression in tissue, RNA fluorescent in situ hybridization (FISH) was performed using probes specific for the P-element and for the Burdock retrotransposon, a classic target of the germline piRNA pathway. In mutants affecting piRNA biogenesis, increased abundance of Burdock RNA was readily observed in germline tissues, with most of the signal accumulating close to the oocyte. By contrast, no difference was detected in the P-element RNA FISH signal in piRNA biogenesis mutants compared to control. Nuclear RNA foci observed in nurse cells were of similar intensity and number regardless of the genotype, and cytoplasmic signal showed no detectable difference. Therefore, the results indicate that in germ cells, piRNAs specifically modulate IVS3 splicing. This regulation is reminiscent of the well-documented mechanism that restricts P-element activity to germline tissues, which involves the expression of a host-encoded RNA binding repressor protein that negatively regulates IVS3 splicing in somatic tissues (Teixeira, 2017).

In somatic tissues, P-element alternative splicing regulation is mediated by the assembly of a splicing repressor complex on an exonic splicing silencer element directly upstream of IVS3. To test whether the P-element IVS3 and flanking exon sequences were sufficient to trigger the piRNA-mediated splicing regulation in germ cells, a transgenic reporter system for IVS3 splicing was used in which a heterologous promoter (Hsp83) drives the expression of an IVS3-lacZ-neo fusion mRNA specifically in the germ line. Using RT-qPCR, the F1 progeny from reciprocal crosses between w1118 and Harwich flies were analyzed in the presence of the hsp83-IVS3-lacZ-neo reporter. The fraction of spliced mRNAs produced from the transgenic reporter was substantially increased in dysgenic compared to non-dysgenic adult ovaries, in agreement with previously reported results. Most importantly, genetic experiments confirmed that the repression of IVS3 splicing in germ cells relies on piRNA biogenesis, as the splicing repression observed with this reporter in non-dysgenic progeny was specifically abolished in adult ovaries of aub and vas mutants (Teixeira, 2017).

Mechanistically, piRNA-mediated splicing regulation may be achieved through direct action of piRNA complexes on target pre-mRNAs carrying the IVS3 sequence or indirectly by piRNA-mediated changes in chromatin states. Piwi-interacting proteins such as Asterix (Arx) and Panoramix (Panx) are dispensable for piRNA biogenesis but are essential for establishing Piwi-mediated chromatin changes, possibly by acting as a scaffold to recruit histone-modifying enzymes and chromatin-binding proteins to target loci. To test the role of these chromatin regulators on P-element splicing, germline-specific RNA interference (RNAi) knockdown experiments were performed in the Harwich background. Similar to what was observed for the piRNA biogenesis components, germline knockdown of Arx and Panx showed no change in the accumulation of P-element RNA, but a strong and specific effect on IVS3 splicing in adult ovaries. The same pattern on IVS3 splicing was observed in the germline knockdown of HP1a and Maelstrom (Mael), both of which act downstream of Piwi-mediated targeting to modulate chromatin structure. The same genetic requirement for Panx for IVS3 splicing control was also confirmed when using the transgenic IVS3 splicing reporter, further indicating that Piwi-mediated chromatin changes at the target locus are involved in IVS3 splicing regulation. At target loci, Piwi complexes are known to mediate the deposition of the classic heterochromatin mark histone H3 lysine 9 trimethylation (H3K9me3). To assess the effect of piRNA-targeting on P-element chromatin marks directly, H3K9me3 chromatin immunoprecipitation was performed followed by sequencing (ChIP-seq) or quantitative PCR on adult ovaries of progeny from reciprocal crosses between w1118 and Harwich strains (to avoid developmental defects, ChIP was performed on F1 progeny raised at 18°C). This analysis revealed a specific loss of global H3K9me3 levels over P-element insertions in dysgenic progeny when compared to non-dysgenic progeny (Teixeira, 2017).

To analyse the chromatin structure at individual P-element insertions, DNA sequencing (DNA-seq) data was used to identify all euchromatic insertions in the Harwich strain, and RNA-seq analysis was used to define transcriptionally active insertions. At transcriptionally active P-element euchromatic insertions, the spreading of H3K9me3 into the flanking genomic regions was readily observed in non-dysgenic progeny, but was completely absent in dysgenic offspring. Similarly, a reduction in H3K9me3 modification levels was also observed over the IVS3 transgenic reporter in dysgenic progeny when compared to non-dysgenic progeny. Interestingly, euchromatic insertions with no evidence of transcriptional activity were devoid of an H3K9me3 signal in both non-dysgenic and dysgenic crosses, providing further evidence for a model initially suggested in yeast and more recently proposed for Drosophila and mammals, in which H3K9me3 deposition by piRNA complexes would require transcription of the target loci. Mechanistically different from the well-described somatic repression, the results uncovered the existence of an unexpected piRNA-mediated, chromatin-based mechanism regulating IVS3 alternative splicing in germ cells (Teixeira, 2017).

To expand the analysis, the literature was searched for other cases of transposon splicing regulation. Drosophila Gypsy elements are retrotransposons that have retrovirus-like, infective capacity owing to their envelope (Env) protein. These elements are expressed in somatic ovarian cells, in which they are regulated by the flamenco locus, a well-known piRNA cluster that is a soma-specific source of antisense piRNAs cognate to Gypsy. Interestingly, it has been shown that mutations in flamenco not only elicited the accumulation of Gypsy RNA, but also modulated pre-mRNA splicing, favouring the production of the env mRNA and therefore germline infection. To test whether the piRNA pathway, in addition to its role in regulating the accumulation of Gypsy RNA, is also responsible for modulating the splicing of Gypsy elements in somatic tissues, publicly available RNA-seq data from poly(A)-selected RNAs extracted from in vivo cultures of ovarian somatic cells (OSCs) were analyzed. The analysis indicates that piwi knockdown was sufficient to modulate Gypsy splicing, favouring the accumulation of env-encoding mRNA. In agreement with a chromatin-mediated regulation of alternative splicing, RNAi depletion of Arx, Panx, HP1a and Mael, as well as knockdown of the linker histone H1, was sufficient to favour Gypsy splicing, recapitulating the effect caused by Piwi depletion. Notably, this was also the case for the H3K9 methyltransferase Setdb1, but not for the H3K9 methyltransferases Su(var)3-9 and G9a, indicating specific genetic requirements. Taken together, the results indicate that the piRNA pathway, through its role in mediating changes in chromatin states, regulates the splicing of transposon pre-mRNAs in both somatic and germline tissues (Teixeira, 2017).

Using P-M hybrid dysgenesis as a model, this study has uncovered splicing regulation elicited by chromatin changes as a previously unknown mechanism by which the piRNA pathway protects the genome from the detrimental effects of transposon activity. Splicing control at piRNA-target loci is likely to be mechanistically different from what has been observed for germline piRNA clusters given the low enrichment of the HP1 homologue Rhino (also known as HP1D) protein, which is required for piRNA cluster RNA processing, over the endogenous P-element insertions in the Harwich genome or over the transgenic IVS3 splicing reporter in non-dysgenic and dysgenic progeny (as measured by ChIP-qPCR). Because small RNA-based systems leading to chromatin mark changes at target loci are pervasive in eukaryotes, it is expected that this new type of targeted regulation is of importance in settings far beyond the scope of the piRNA pathway and Drosophila. Indeed, small RNA-guided DNA methylation over the LINE retrotransposon Karma was recently shown to modulate alternative splicing in oil palm, disrupting nearby gene expression and ultimately affecting crop yield. In this context, small RNA-based control of chromatin structure may be crucially important in genomes with a high content of intronic transposon insertions, such as the human genome, by providing a mechanism to suppress exonization of repeat elements. Although the means by which piRNA-mediated changes in chromatin states could regulate alternative splicing remain to be determined, it is tempting to speculate that piRNA pathway components do so by co-transcriptionally modulating interactions between RNA polymerase II and the spliceosome (Teixeira, 2017).

Short cryptic exons mediate recursive splicing in Drosophila

Many long Drosophila introns are processed by an unusual recursive strategy. The presence of ~200 adjacent splice acceptor and splice donor sites, termed ratchet points (RPs), was inferred to reflect 'zero-nucleotide exons', whose sequential processing subdivides removal of long host introns. This study used CRISPR-Cas9 to disrupt several intronic RPs in Drosophila melanogaster, some of which recapitulated characteristic loss-of-function phenotypes. Unexpectedly, selective disruption of RP splice donors revealed constitutive retention of unannotated short exons. Assays using functional minigenes confirm that unannotated cryptic splice donor sites are critical for recognition of intronic RPs, demonstrating that recursive splicing involves the recognition of cryptic RP exons. This appears to be a general mechanism, because canonical, conserved splice donors are specifically enriched in a 40-80-nt window downstream of known and newly annotated intronic RPs and exhibit similar properties to a broadly expanded class of expressed RP exons. Overall, these studies unify the mechanism of Drosophila recursive splicing with that in mammals (Joseph, 2018).
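The idea of a ratchet point followed by a cryptic splice donor can be illustrated with a deliberately simplified sequence scan. The sketch assumes that an RP is approximated by the minimal juxtaposed acceptor/donor dinucleotides AG|GT and that a candidate downstream donor is any GT dinucleotide 40-80 nt past the RP junction. Real analyses score full splice-site consensus sequences and conservation, so dinucleotide matching alone would be far too permissive; the function names and the test sequence are hypothetical.

```python
import re


def find_ratchet_points(seq):
    """Return 0-based junction positions of every AG|GT motif in seq.

    The junction is reported as the index of the G that starts the GT donor.
    A lookahead is used so that overlapping motifs are not missed.
    """
    seq = seq.upper()
    return [m.start() + 2 for m in re.finditer(r"(?=AGGT)", seq)]


def downstream_donors(seq, junction, window=(40, 80)):
    """Return positions of GT dinucleotides within the window past the junction."""
    seq = seq.upper()
    start = junction + window[0]
    stop = min(junction + window[1], len(seq) - 1)
    return [i for i in range(start, stop) if seq[i:i + 2] == "GT"]
```

On a toy sequence built as `"C" * 10 + "AGGT" + "C" * 50 + "GT" + "C" * 30`, the scan reports one ratchet-point junction and one candidate donor 52 nt downstream of it, which would define a short cryptic RP exon of the kind described above.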

Proper splicing contributes to visual function in the aging Drosophila eye

Changes in splicing patterns are a characteristic of the aging transcriptome; however, it is unclear whether these age-related changes in splicing facilitate the progressive functional decline that defines aging. In Drosophila, visual behavior declines with age and correlates with altered gene expression in photoreceptors, including downregulation of genes encoding splicing factors. This study characterized the significance of these age-regulated splicing-associated genes in both splicing and visual function. To do this, differential splicing events were identified in either the entire eye or photoreceptors of young and old flies. Intriguingly, aging photoreceptors show differential splicing of a large number of visual function genes. In addition, as shown previously for aging photoreceptors, aging eyes showed increased accumulation of circular RNAs, which result from noncanonical splicing events. To test whether proper splicing was necessary for visual behavior, age-regulated splicing factors were knocked down in photoreceptors in young flies and phototaxis was examined. Notably, many of the age-regulated splicing factors tested were necessary for proper visual behavior. In addition, knockdown of individual splicing factors resulted in changes in both alternative splicing at age-spliced genes and increased accumulation of circular RNAs. Together, these data suggest that cumulative decreases in splicing factor expression could contribute to the differential splicing, circular RNA accumulation, and defective visual behavior observed in aging photoreceptors (Stegeman, 2018).

Numerous recursive sites contribute to accuracy of splicing in long introns in flies

Recursive splicing, a process by which a single intron is removed from pre-mRNA transcripts in multiple distinct segments, has been observed in a small subset of Drosophila melanogaster introns. However, detection of recursive splicing requires observation of splicing intermediates that are inherently unstable, making it difficult to study. This study developed new computational approaches to identify recursively spliced introns and applied them, in combination with existing methods, to nascent RNA sequencing data from Drosophila S2 cells. These approaches identified hundreds of novel sites of recursive splicing, expanding the catalog of recursively spliced fly introns by 4-fold. A subset of recursive sites were validated by RT-PCR and sequencing. Recursive sites occur in most very long (> 40 kb) fly introns, including many genes involved in morphogenesis and development, and tend to occur near the midpoints of introns. Suggesting a possible function for recursive splicing, it was observed that fly introns with recursive sites are spliced more accurately than comparably sized non-recursive introns (Pai, 2018).

Striking circadian neuron diversity and cycling of Drosophila alternative splicing

Although alternative pre-mRNA splicing (AS) significantly diversifies the neuronal proteome, the extent of AS is still unknown due in part to the large number of diverse cell types in the brain. To address this complexity issue, this study used an annotation-free computational method to analyze and compare the AS profiles between small specific groups of Drosophila circadian neurons. The method, the Junction Usage Model (JUM), allows the comprehensive profiling of both known and novel AS events from specific RNA-seq libraries. The results show that many diverse and novel pre-mRNA isoforms are preferentially expressed in one class of clock neuron and also absent from the more standard Drosophila head RNA preparation. These AS events are enriched in potassium channels important for neuronal firing, and there are also cycling isoforms with no detectable underlying transcriptional oscillations. The results suggest massive AS regulation in the brain that is also likely important for circadian regulation (Wang, 2018).

Tissues of the nervous and germline systems, such as brain, testes and ovaries, have more complex transcriptomes than other cell types due to extensive alternative pre-mRNA splicing or AS. The nervous system especially exhibits vast numbers of AS isoforms, many of which are novel and are only beginning to be comprehensively identified. This increase in transcript isoform complexity likely contributes to the specification and functional diversity of cell types within the nervous system (Wang, 2018).

This study applied a novel computational algorithm called JUM to characterize the transcript isoform diversity generated by alternative splicing in three circadian neuronal subtypes (LNv, LNd and DN1), as well as a non-circadian dopaminergic neuron population (TH neurons) of the Drosophila central nervous system. JUM can comprehensively analyze, quantitate and compare tissue- or cell-type-specific AS patterns without requiring a priori annotations of known transcripts or transcriptomes. The analysis revealed a previously unappreciated diversity and complexity of alternatively spliced transcript isoform patterns in these four neuronal subtypes, suggesting that they contribute to neuronal identity, connectivity, activity and circadian functions. This is because many of these novel, previously undetected and unannotated isoforms were unique to a given neuronal population and occurred in transcripts from genes implicated in neuronal activity or circadian rhythms (Wang, 2018).
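
To make the idea of annotation-free quantification concrete, here is a minimal Python sketch that groups splice junctions by a shared donor coordinate and computes the fraction of reads supporting each junction within its group. The grouping rule, the data layout, and the function name are assumptions chosen for illustration; the sketch is not JUM's implementation and omits the statistical modeling such a tool requires.

```python
from collections import defaultdict

def junction_usage(junction_counts):
    """Toy junction-usage calculation (hypothetical helper).

    junction_counts maps (chrom, donor, acceptor) -> read count, as could
    be derived from spliced RNA-seq alignments without any annotation.
    Returns a dict mapping each junction to its share of the reads among
    all junctions that use the same donor site.
    """
    totals = defaultdict(int)
    for (chrom, donor, _acceptor), count in junction_counts.items():
        totals[(chrom, donor)] += count

    usage = {}
    for (chrom, donor, acceptor), count in junction_counts.items():
        total = totals[(chrom, donor)]
        if total > 0:  # skip groups with no supporting reads
            usage[(chrom, donor, acceptor)] = count / total
    return usage

# Example: two junctions share donor 100; a third stands alone.
counts = {("2L", 100, 200): 30, ("2L", 100, 300): 10, ("2L", 500, 600): 5}
usage = junction_usage(counts)
```

Comparing such usage fractions between cell types (for example LNv versus TH neurons) would then require replicate-aware statistical testing, which is not shown here.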

For example, the kinase Shaggy and the blue light photoreceptor Cryptochrome play central roles in circadian clock regulation and have novel AS patterns in discrete subsets of the circadian neurons. In addition, nine different transcripts involved in potassium transport undergo differential AS in circadian neurons compared to non-circadian neurons. These transcripts encode six different potassium channels. Many of these genes have a complex organization known to encode populations of functionally distinct protein isoforms, which change the activation kinetics as well as calcium sensitivity of the channels. Neuronal firing is known to play a key role in the circadian circuit with recent studies illustrating that different subgroups of circadian neurons have characteristic time-of-day neuronal firing patterns. Although it is not yet fully understood which potassium channels play a critical role in each circadian neuron subgroup, several channel pre-mRNAs that undergo differential splicing in circadian neurons impact circadian behavior and sleep, such as slowpoke (slo), Shaker (Sh) and Hyperkinetic (Hk). It is therefore likely that AS adds diversity and distinct physiological properties to these protein isoforms, which then impacts neuron-specific firing patterns. From a more general perspective, AS augments transcriptional regulation in giving different circadian neurons individual identities and distinct functions (Wang, 2018).

Approximately 5% of the AS events identified in circadian neurons also undergo time-of-day dependent changes in alternative splicing (cycling splicing). It is important to note that all experiments carried out in this study were conducted under 12 hr of light and 12 hr of dark conditions, making it impossible to distinguish between light and clock control. Nonetheless, these data indicate that splicing adds a dramatic layer of gene regulation to diurnal changes in gene expression. Moreover, many of the cycling AS transcripts show constant overall mRNA levels, which suggests the existence of neuron-specific splicing factors that are expressed or activated only at specific times of the day. Indeed, this study has identified several candidate cycling neuron-enriched transcripts that encode RBPs that may help to drive cycling AS patterns (Wang, 2018).

A recent trend in biological research is to generate transcriptome profiles from single cells. For example, this strategy is part of the 'human cell atlas' project aimed at personalized genomic medicine or the 'brain initiative' project to generate profiles of all neurons in the mouse brain. One recent study was able to obtain about 20M sequence reads per isolated human iPS cell but only managed to analyze splicing patterns for the most highly expressed genes. The current study in contrast used ~100 isolated Drosophila neurons for each of the four neuron subtypes along with judicious use of both oligo-dT and random hexamer priming of the cDNA libraries. This strategy obtained about 10-30M sequence reads for each sample, including substantial information from the 5' ends of transcripts, and JUM was able to detect and classify a large number of previously unannotated pre-mRNA isoforms. Many of them are missing from the fly head RNA-seq data assayed and analyzed in parallel, indicating that these new isoforms are cell-type specific. Not surprisingly, the novel isoforms from the three circadian neuron groups fall into many gene ontology (GO) categories associated with specific circadian clock activity and function (Wang, 2018).

Taken together, the work presented in this study indicates that the number of alternative splicing events that take place in neuronal tissues is grossly underestimated, even though publicly funded genome projects, such as the NIH modENCODE projects, deeply sequenced transcriptomes from a variety of Drosophila tissues and developmental stages. This is despite the appreciation of how much AS occurs in the nervous system; for example, a recent comprehensive analysis of splicing patterns through deep sequencing of ~50 mouse and human tissues revealed about 2500 neuronally regulated alternative splicing events. It is therefore suggested that these events will need to be comprehensively evaluated by much deeper sequencing than is currently afforded by most contemporary single cell RNA-seq studies and by AS analysis software like JUM that is not constrained by a priori knowledge of known splicing events (Wang, 2018).

NineTeen Complex-subunit Salsa is required for efficient splicing of a subset of introns and dorsal-ventral patterning

The NineTeen Complex (NTC), also known as Pre-mRNA-processing factor 19 (Prp19) complex, regulates distinct spliceosome conformational changes necessary for splicing. During Drosophila midblastula transition, splicing is particularly sensitive to mutations in NTC-subunit Fandango, which suggests differential requirements of NTC during development. This study shows that NTC-subunit Salsa, the Drosophila orthologue of human RNA helicase Aquarius (CG31368), is rate-limiting for splicing of a subset of small first introns during oogenesis, including the first intron of gurken. Germ line depletion of Salsa and splice site mutations within gurken first intron both impair adult female fertility and oocyte dorsal-ventral patterning due to an abnormal expression of Gurken. Supporting causality, the fertility and dorsal-ventral patterning defects observed after Salsa depletion could be suppressed by the expression of a gurken construct without its first intron. Altogether these results suggest that one of the key rate-limiting functions of Salsa during oogenesis is to ensure the correct expression and efficient splicing of the first intron of gurken mRNA. Retention of gurken first intron compromises the function of this gene most likely because it undermines the correct structure and function of the transcript 5'UTR (Rathore, 2020).

The spliceosome is a highly dynamic molecular machine, composed of five small nuclear ribonucleoproteins (snRNPs) that sequentially associate with the precursor mRNA (pre-mRNA) during the splicing reaction. Each snRNP (U1, U2, U4, U5, and U6) contains a U-rich snRNA and a unique group of proteins. Although spliceosome assembly is ordered (U1 > U2 > U4/U5/U6 > NineTeen Complex), the splicing reaction is without an apparent irreversible and/or rate-limiting step, with commitment to splicing progressively increased as snRNPs and NTC bind to the pre-mRNA (Rathore, 2020).

The spliceosomal NineTeen Complex (NTC), also known as Pre-mRNA-processing factor 19 (Prp19) complex, regulates distinct spliceosome conformational changes necessary for efficient pre-mRNA splicing (Hogg, 2010; Chanarat, 2013). NTC composition is dynamic and comprises a subset of conserved core subunits and many transiently associated ones. NTC associates with the spliceosome during its activation and just before the first transesterification. Interestingly, NTC also has a significant role in the crosstalk between transcription, cotranscriptional processing of the nascent RNA, and DNA repair, as distinct NTC subunits have been reported to be important for transcriptional elongation and genomic stability. Human NTC-subunits PRP19, XAB2, and CDC5L are important for transcriptional elongation, transcription-coupled DNA repair, and activation of the ATM-related (ATR)-dependent DNA damage checkpoint. RNA Polymerase II (RNA Pol II) also promotes cotranscriptional splicing activation through the recruitment of NTC (Rathore, 2020 and references therein).

Human Aquarius (AQR) (also known as intron-binding protein 160, IBP160) is an ATP-dependent RNA helicase that associates with NTC during spliceosome activation and formation of the activated B complex (BACT). AQR binds to introns independently of sequence, but usually upstream of the branch-site (BS) and close to the associated U2 snRNP SF3a and SF3b proteins, being essential for intron-binding complex formation and efficient splicing. AQR has also been suggested to be important for deposition of the exon junction complex (EJC) during the splicing reaction and formation of intron-encoded snoRNAs, suggesting it regulates the cross-talk between splicing and other RNA processing events (Rathore, 2020).

Splicing during Drosophila early embryonic development is notably sensitive to mutations in NTC-subunit Fandango (Guilgur, 2014), suggesting differential requirements of NTC during development (Martinho, 2015). To test this possibility, it was decided to investigate the role of other NTC-subunits during Drosophila oogenesis and early embryonic development. Initial work focused on the uncharacterized gene CG31368, which encodes the Drosophila ortholog of human Aquarius. Since there is already a nonrelated Drosophila protease named aquarius (CG14061), CG31368 was renamed salsa. The working hypothesis is that salsa, similar to its Caenorhabditis elegans ortholog emb-4, is likely to have important developmental functions (Rathore, 2020).

During Drosophila oogenesis, gurken mRNA localizes to the posterior cortex of the developing oocyte and Gurken signal is restricted to the underlying posterior follicle cells. In response to a signal from the posterior follicle cells, there is a considerable reorganization of the cytoskeleton and a microtubule-dependent migration of the oocyte nucleus to the anterior cortex. The anteriorly localized nucleus defines the dorsal-anterior region and provides the first detectable dorsal-ventral (D/V) asymmetry of the oocyte, with the expression of both gurken mRNA and protein restricted to the cytoplasmic perinuclear region of the oocyte (Rathore, 2020).

gurken mRNA is transcribed in the supporting nurse cells and actively transported to the dorsal-anterior region of the oocyte by a dynein-mediated transport. The oocyte dorsal-anterior localization of gurken mRNA relies on multiple elements localized to the transcript 5' UTR, 3' UTR and open-reading frame. Although this localization is crucial for its efficient translation, the precise contribution of each element for RNA localization is still a matter of debate (Rathore, 2020).

D/V patterning of the developing Drosophila egg is dependent on the dorsal-anterior localization of Gurken during mid-oogenesis. Gurken is the ligand for the Epidermal growth factor receptor (Egfr) that locates to the apical surface of follicle cells that surround the developing oocyte. Activation of Egfr modifies the cell fate of the dorsal follicle cells and restricts the formation of Spätzle ligand to the ventral region of the oocyte, which is essential for normal morphogenesis of the eggshell dorsal appendages (Rathore, 2020).

This study found that Salsa, the Drosophila ortholog of AQR, is rate-limiting for efficient splicing of a subset of small first introns, including the first intron of gurken. Consistent with the functional relevance of gurken splicing defects, mutations within the splice sites of the first intron of gurken impair the function of this gene. Female germline depletion of Salsa and splice mutations within gurken first intron were both associated with a decrease in female fertility, significant D/V patterning defects of the eggshell and abnormal expression of Gurken during oogenesis. Supporting causality, expression of a gurken construct without its first intron suppressed the female fertility and D/V patterning defects observed after Salsa depletion. Altogether these results suggest that one of the key rate-limiting functions of Salsa during oogenesis is to ensure the correct expression and efficient splicing of the first intron of gurken mRNA (Rathore, 2020).

Oocyte dorsal-anterior localization of gurken mRNA relies on multiple elements localized to the transcript 5'UTR, 3'UTR and open-reading frame, yet the relative importance of each element for mRNA localization is still unclear. The 5' and 3'-UTRs of gurken were reported to be required for dorsal-anterior localization of gurken transcript. Furthermore, and using a genomic gurken construct with a lacZ reporter inserted within the gene open-reading frame, it was shown that whereas gurken 5'UTR is required for transcript oocyte accumulation, its coding region and 3'UTR are necessary for its posterior and dorsal-anterior localization. Nevertheless, it was recently reported, using an oocyte injection assay, that a small stem-loop located within the open-reading frame was necessary and sufficient for gurken transcript localization (Rathore, 2020).

The results show that efficient splicing of the first intron of gurken is required for mRNA dorsal-anterior localization and dorsal-ventral patterning. This is most likely because retention of the first intron impairs the secondary RNA structure of gurken 5'UTR, and the function of a closely located RNA element important for its localization. Splicing of the first intron of gurken is also likely to facilitate Gurken protein expression, as deletion of the first intron of gurken suppresses the dorsalization phenotype associated with increased copy number of gurken gene without affecting the levels of gurken mRNA. The results therefore fully support the role of gurken 5'UTR in mRNA localization within the oocyte, and strongly suggest that Salsa-dependent splicing of the first intron of gurken mRNA is important for the correct expression and function of this gene (Rathore, 2020).

The precise function of human Aquarius in splicing is still poorly understood. This RNA helicase is recruited to the spliceosome as a pentameric complex known as the intron-binding complex (IBC), which also contains hSyf1 (also known as Xab2), hIsy1, CypE, and CCDC16 (De, 2015). Coimmunoprecipitation experiments suggest a large interaction interface between IBC and U2 snRNP, within the activated spliceosome (Bact stage) and just before the first splicing reaction. Although the ability of Aquarius to bind and hydrolyze ATP is important for spliceosome activation and splicing efficiency, the role of its RNA unwinding activity is less clear (Rathore, 2020).

This work has identified a small subset of introns whose splicing is particularly sensitive to depletion of Salsa (the Drosophila ortholog of human Aquarius). The fact that splicing was only affected in a small number of introns is consistent with the observation that immunodepletion of human Aquarius from nuclear extracts only weakly impaired splicing in vitro. This suggests that although this RNA helicase is apparently not critical for overall splicing, during female gametogenesis there is a subset of introns whose efficient removal relies on the function of this enzyme (Rathore, 2020).

Analysis of the introns whose splicing was sensitive to Salsa depletion showed a clear bias for small first introns with weak 3'splice sites, independently of their distance to the transcription start site (TSS), 5'splice site strength and GC content. The bias for small introns suggests that Salsa is mostly rate-limiting when introns are recognized by intron definition, where the initial pairing between U1 and U2 snRNPs occurs across the intron. Furthermore, the bias for introns with weak 3'splice sites is in accordance with the extensive interaction between IBC and U2 snRNP in the activated spliceosome, and implies that depletion of Salsa is likely to impair, at least in a subset of introns, U2 snRNP function during splicing. The absence of any detectable bias for short distances between the TSS and 5'splice site, when evaluating affected and control first introns, or any bias for weak 5'splice site strength, suggests that Salsa is not likely rate-limiting for Cap-Binding Complex-mediated splicing (Rathore, 2020).

Drosophila first introns are more likely to be cotranscriptionally retained than internal and terminal introns. This is not consistent with the kinetic competition model, where the fastest processes are the ones most likely to occur, suggesting additional constraints to first intron splicing. Although the precise nature of such constraints is still poorly understood, binding of transcriptional initiation factors to the 5'splice site-associated U1 snRNP potentially restricts splicing efficiency, as it might impair the initial pairing between U1 and U2 snRNPs. The current working hypothesis is that Salsa is required for splicing of small first introns with weak 3'splice sites because this enzyme facilitates U2 snRNP function, minimizing the interference effect of transcriptional initiation factors on splicing. Future work will help define the function of this RNA helicase and its contribution to differential gene expression during development (Rathore, 2020).

Loss of the RNA trimethylguanosine cap is compatible with nuclear accumulation of spliceosomal snRNAs but not pre-mRNA splicing or snRNA processing during animal development

The 2,2,7-trimethylguanosine (TMG) cap is one of the first identified modifications on eukaryotic RNAs. TMG, synthesized by the conserved Tgs1 enzyme, is abundantly present on snRNAs essential for pre-mRNA splicing. Results from ex vivo experiments in vertebrate cells suggested that TMG ensures nuclear localization of snRNAs. Functional studies of TMG using tgs1 mutations in unicellular organisms yield results inconsistent with TMG being indispensable for either nuclear import or splicing. Utilizing a hypomorphic Tgs1 mutation in Drosophila, this study shows that TMG reduction impairs germline development by disrupting the processing, particularly of introns with smaller sizes and weaker splice sites. Unexpectedly, loss of TMG does not disrupt snRNAs localization to the nucleus, disputing an essential role of TMG in snRNA transport. Tgs1 loss also leads to defective 3' processing of snRNAs. Remarkably, stronger Tgs1 mutations cause lethality without severely disrupting splicing, likely due to the preponderance of TMG-capped snRNPs. Tgs1, a predominantly nucleolar protein in Drosophila, likely carries out splicing-independent functions indispensable for animal development. Taken together, these results suggest that nuclear import is not a conserved function of TMG. As a distinctive structure on RNA, particularly non-coding RNA, it is suggested that TMG prevents spurious interactions detrimental to the function of RNAs that it modifies (Cheng, 2020).


Alexandrov, A., Colognori, D., Shu, M. D. and Steitz, J. A. (2012). Human spliceosomal protein CWC22 plays a role in coupling splicing to exon junction complex deposition and nonsense-mediated decay. Proc Natl Acad Sci U S A 109(52): 21313-21318. PubMed ID: 23236153

Ashton-Beaucage, D., Udell, C. M., Lavoie, H., Baril, C., Lefrancois, M., Chagnon, P., Gendron, P., Caron-Lizotte, O., Bonneil, E., Thibault, P. and Therrien, M. (2010). The exon junction complex controls the splicing of MAPK and other long intron-containing transcripts in Drosophila. Cell 143: 251-262. PubMed ID: 20946983

Ashwal-Fluss, R., Meyer, M., Pamudurti, N. R., Ivanov, A., Bartok, O., Hanan, M., Evantal, N., Memczak, S., Rajewsky, N. and Kadener, S. (2014). circRNA biogenesis competes with pre-mRNA splicing. Mol Cell 56: 55-66. PubMed ID: 25242144

Barbosa, I., Haque, N., Fiorini, F., Barrandon, C., Tomasetto, C., Blanchette, M. and Le Hir, H. (2012). Human CWC22 escorts the helicase eIF4AIII to spliceosomes and promotes exon junction complex assembly. Nat Struct Mol Biol 19(10): 983-990. PubMed ID: 22961380

Bradley, T., Cook, M. E. and Blanchette, M. (2015). SR proteins control a complex network of RNA-processing events. RNA 21(1):75-92. PubMed ID: 25414008

Cheng, L., Zhang, Y., Zhang, Y., Chen, T., Xu, Y. Z. and Rong, Y. S. (2020). Loss of the RNA trimethylguanosine cap is compatible with nuclear accumulation of spliceosomal snRNAs but not pre-mRNA splicing or snRNA processing during animal development. PLoS Genet 16(10): e1009098. PubMed ID: 33085660

Conn, S. J., Pillman, K. A., Toubia, J., Conn, V. M., Salmanidis, M., Phillips, C. A., Roslan, S., Schreiber, A. W., Gregory, P. A. and Goodall, G. J. (2015). The RNA binding protein Quaking regulates formation of circRNAs. Cell 160: 1125-1134. PubMed ID: 25768908

Erkelenz, S., Stankovic, D., Mundorf, J., Bresser, T., Claudius, A. K., Boehm, V., Gehring, N. H. and Uhlirova, M. (2021). Ecd promotes U5 snRNP maturation and Prp8 stability. Nucleic Acids Res. PubMed ID: 33444449

Fox-Walsh, K. L., et al. (2005). The architecture of pre-mRNAs affects mechanisms of splice-site pairing. Proc. Natl. Acad. Sci. 102(45): 16176-81. PubMed ID: 16260721

Gibilisco, L., Zhou, Q., Mahajan, S. and Bachtrog, D. (2016). Alternative splicing within and between Drosophila species, sexes, tissues, and developmental stages. PLoS Genet 12(12): e1006464. PubMed ID: 27935948

Hansen, T. B., Jensen, T. I., Clausen, B. H., Bramsen, J. B., Finsen, B., Damgaard, C. K. and Kjems, J. (2013). Natural RNA circles function as efficient microRNA sponges. Nature 495: 384-388. PubMed ID: 23446346

Haussmann, I. U., Bodi, Z., Sanchez-Moran, E., Mongan, N. P., Archer, N., Fray, R. G. and Soller, M. (2016). m6A potentiates Sxl alternative pre-mRNA splicing for robust Drosophila sex determination. Nature 540(7632): 301-304. PubMed ID: 27919081

Joseph, B., Kondo, S. and Lai, E. C. (2018). Short cryptic exons mediate recursive splicing in Drosophila. Nat Struct Mol Biol. PubMed ID: 29632374

Kramer, M. C., Liang, D., Tatomer, D. C., Gold, B., March, Z. M., Cherry, S. and Wilusz, J. E. (2015). Combinatorial control of Drosophila circular RNA expression by intronic repeats, hnRNPs, and SR proteins. Genes Dev 29(20):2168-82. PubMed ID: 26450910

Lam, B. J. and Hertel, K. J. (2002). A general role for splicing enhancers in exon definition. RNA 8(10): 1233-41. PubMed ID: 12403462

Liu, M., Li, Y., Liu, A., Li, R., Su, Y., Du, J., Li, C. and Zhu, A. J. (2016). The exon junction complex regulates the splicing of cell polarity gene dlg1 to control Wingless signaling in development. Elife 5:e17200. PubMed ID: 27536874

Memczak, S., Jens, M., Elefsinioti, A., Torti, F., Krueger, J., Rybak, A., Maier, L., Mackowiak, S. D., Gregersen, L. H., Munschauer, M., Loewer, A., Ziebold, U., Landthaler, M., Kocks, C., le Noble, F. and Rajewsky, N. (2013). Circular RNAs are a large class of animal RNAs with regulatory potency. Nature 495: 333-338. PubMed ID: 23446348

Mohr, C., Kleiner, S., Blanchette, M., Pyrowolakis, G. and Hartmann, B. (2017). Sex-specific transcript diversity in the fly head Is established during pupal stages and adulthood and is largely independent of the mating process and the germline. Sex Dev [Epub ahead of print]. PubMed ID: 28273663

Obrdlik, A., Lin, G., Haberman, N., Ule, J. and Ephrussi, A. (2019). The transcriptome-wide landscape and modalities of EJC binding in adult Drosophila. Cell Rep 28(5): 1219-1236. PubMed ID: 31365866

Pai, A. A., Henriques, T., McCue, K., Burkholder, A., Adelman, K. and Burge, C. B. (2017). The kinetics of pre-mRNA splicing in the Drosophila genome and the influence of gene architecture. Elife 6. PubMed ID: 29280736

Pai, A. A., Paggi, J. M., Yan, P., Adelman, K. and Burge, C. B. (2018). Numerous recursive sites contribute to accuracy of splicing in long introns in flies. PLoS Genet 14(8): e1007588. PubMed ID: 30148878

Rathore, O. S., Silva, R. D., Ascensao-Ferreira, M., Matos, R., Carvalho, C., Marques, B., Tiago, M. N., Prudencio, P., Andrade, R. P., Roignant, J. Y., Barbosa-Morais, N. L. and Martinho, R. G. (2020). NineTeen Complex-subunit Salsa is required for efficient splicing of a subset of introns and dorsal-ventral patterning. RNA. PubMed ID: 32963109

Roignant, J. Y. and Treisman, J. E. (2010). Exon junction complex subunits are required to splice Drosophila MAP kinase, a large heterochromatic gene. Cell 143: 238-250. PubMed ID: 20946982

Ruan, K., Zhu, Y., Li, C., Brazill, J.M. and Zhai, R.G. (2015). Alternative splicing of Drosophila Nmnat functions as a switch to enhance neuroprotection under stress. Nat Commun 6: 10057. PubMed ID: 26616331

Salzman, J., Chen, R. E., Olsen, M. N., Wang, P. L. and Brown, P. O. (2013). Cell-type specific features of circular RNA expression. PLoS Genet 9: e1003777. PubMed ID: 24039610

Skrajna, A., Yang, X. C., Bucholc, K., Zhang, J., Hall, T. M., Dadlez, M., Marzluff, W. F. and Dominski, Z. (2017). U7 snRNP is recruited to histone pre-mRNA in a FLASH-dependent manner by two separate regions of the Stem-Loop Binding Protein. RNA 23(6):938-951. PubMed ID: 28289156

Stegeman, R., Hall, H., Escobedo, S. E., Chang, H. C. and Weake, V. M. (2018). Proper splicing contributes to visual function in the aging Drosophila eye. Aging Cell: e12817. PubMed ID: 30003673

Stoiber, M. H., Olson, S., May, G. E., Duff, M. O., Manent, J., Obar, R., Guruharsha, K., Artavanis-Tsakonas, S., Brown, J. B., Graveley, B. R. and Celniker, S. E. (2015). Extensive cross-regulation of post-transcriptional regulatory networks in Drosophila. Genome Res [Epub ahead of print]. PubMed ID: 26294687

Steckelberg, A. L., Altmueller, J., Dieterich, C. and Gehring, N. H. (2015). CWC22-dependent pre-mRNA splicing and eIF4A3 binding enables global deposition of exon junction complexes. Nucleic Acids Res 43(9): 4687-4700. PubMed ID: 25870412

Tian, M. and Maniatis, T. (1992). Positive control of pre-mRNA splicing in vitro. Science 256(5054): 237-40. PubMed ID: 1566072

Teixeira, F. K., Okuniewska, M., Malone, C. D., Coux, R. X., Rio, D. C. and Lehmann, R. (2017). piRNA-mediated regulation of transposon alternative splicing in the soma and germ line. Nature 552(7684): 268-272. PubMed ID: 29211718

Wang, M., Branco, A. T. and Lemos, B. (2018). The Y chromosome modulates splicing and sex-biased intron retention rates in Drosophila. Genetics 208(3):1057-1067. PubMed ID: 29263027

Wang, Q., Abruzzi, K. C., Rosbash, M. and Rio, D. C. (2018). Striking circadian neuron diversity and cycling of Drosophila alternative splicing. Elife 7. PubMed ID: 29863472

Westholm, J. O., et al. (2014). Genome-wide analysis of Drosophila circular RNAs reveals their structural and sequence properties and age-dependent neural accumulation. Cell Rep. 9(5): 1966-80. PubMed ID: 25544350

Home page: The Interactive Fly © 1995, 1996 Thomas B. Brody, Ph.D.

The Interactive Fly resides on the
Society for Developmental Biology's Web server.