The Interactive Fly

Zygotically transcribed genes

Splicing factors for processing pre-messenger RNA

Splicing factors


Transcription and splicing dynamics during early Drosophila development

Widespread co-transcriptional splicing has been demonstrated from yeast to human. However, most studies to date addressing the kinetics of splicing relative to transcription used either Saccharomyces cerevisiae or metazoan cultured cell lines. This study adapted native elongating transcript sequencing technology (NET-seq) to measure co-transcriptional splicing dynamics during the early developmental stages of Drosophila melanogaster embryos. These results reveal the position of RNA polymerase II (Pol II) when both canonical and recursive splicing occur. This study found heterogeneity in splicing dynamics, with some RNAs spliced immediately after intron transcription, whereas for other transcripts no splicing was observed over the first 100 nucleotides of the downstream exon. Introns that show splicing completion before Pol II has reached the end of the downstream exon are necessarily intron-defined. This study examined the splicing dynamics of both nascent pre-mRNAs transcribed in the early embryo, which have few and short introns, and pre-mRNAs transcribed later in embryonic development, which contain multiple long introns. As expected, this study found a relationship between the proportion of spliced reads and intron size. However, intron definition was observed at all intron sizes. This study further observed that genes transcribed in the early embryo tend to be isolated in the genome whereas genes transcribed later are often overlapped by a neighboring convergent gene. In isolated genes, transcription termination occurred soon after the polyadenylation site, while in overlapped genes Pol II remained associated with the DNA template after cleavage and polyadenylation of the nascent transcript. Taken together, these data reveal novel dynamic features of Pol II transcription and splicing in the developing Drosophila embryo (Prudencio, 2021).

A Candidate RNAi Screen Reveals Diverse RNA-Binding Protein Phenotypes in Drosophila Flight Muscle

The proper regulation of RNA processing is critical for muscle development and the fine-tuning of contractile ability among muscle fiber-types. RNA binding proteins (RBPs) regulate the diverse steps in RNA processing, including alternative splicing, which generates fiber-type specific isoforms of structural proteins that confer contractile sarcomeres with distinct biomechanical properties. Alternative splicing is disrupted in muscle diseases such as myotonic dystrophy and dilated cardiomyopathy and is altered after intense exercise as well as with aging. It is therefore important to understand splicing and RBP function, but currently, only a small fraction of the hundreds of annotated RBPs expressed in muscle have been characterized. This study demonstrates the utility of Drosophila as a genetic model system to investigate basic developmental mechanisms of RBP function in myogenesis. This study found that RBPs exhibit dynamic temporal and fiber-type specific expression patterns in mRNA-Seq data and display muscle-specific phenotypes. Knockdown was performed with 105 RNAi hairpins targeting 35 RBPs, and the associated lethality, flight, myofiber and sarcomere defects are reported, including flight muscle phenotypes for Doa, Rm62, mub, mbl, sbr, and clu. Knockdown phenotypes of spliceosome components, as highlighted by phenotypes for A-complex components SF1 and Hrb87F (hnRNPA1), revealed level- and temporal-dependent myofibril defects. It was further shown that splicing mediated by SF1 and Hrb87F is necessary for Z-disc stability and proper myofibril development, and strong knockdown of either gene results in impaired localization of kettin to the Z-disc. These results expand the number of RBPs with a described phenotype in muscle and underscore the diversity in myofibril and transcriptomic phenotypes associated with splicing defects. Drosophila is thus a powerful model to gain disease-relevant insight into cellular and molecular phenotypes observed when expression levels of splicing factors, spliceosome components and splicing dynamics are altered (Kao, 2021).

Specification of Drosophila neuropeptidergic neurons by the splicing component brr2

During embryonic development, a number of genetic cues act to generate neuronal diversity. While intrinsic transcriptional cascades are well-known to control neuronal sub-type cell fate, the target cells can also provide critical input to specific neuronal cell fates. Such signals, denoted retrograde signals, are known to provide critical survival cues for neurons, but have also been found to trigger terminal differentiation of neurons. One salient example of such target-derived instructive signals pertains to the specification of the Drosophila FMRFamide neuropeptide neurons, the Tv4 neurons of the ventral nerve cord. Tv4 neurons receive a BMP signal from their target cells, which acts as the final trigger to activate the FMRFa gene. A recent FMRFa-eGFP genetic screen identified several genes involved in Tv4 specification, two of which encode components of the U5 subunit of the spliceosome: brr2 (l(3)72Ab) and Prp8. This study focused on the role of RNA processing during target-derived signaling. brr2 and Prp8 were found to play crucial roles in controlling the expression of the FMRFa neuropeptide specifically in six neurons of the VNC (Tv4 neurons). Detailed analysis of brr2 revealed that this control is executed by two independent mechanisms, both of which are required for the activation of the BMP retrograde signaling pathway in Tv4 neurons: (1) Proper axonal pathfinding to the target tissue in order to receive the BMP ligand. (2) Proper RNA splicing of two genes in the BMP pathway: the thickveins (tkv) gene, encoding a BMP receptor subunit, and the Medea gene, encoding a co-Smad. These results reveal involvement of specific RNA processing in diversifying neuronal identity within the central nervous system (Monedero Cobeta, 2018).

The architecture of pre-mRNAs affects mechanisms of splice-site pairing

The exon/intron architecture of genes determines whether components of the spliceosome recognize splice sites across the intron or across the exon. Using in vitro splicing assays, this study demonstrates that splice-site recognition across introns ceases when intron size is between 200 and 250 nucleotides. Beyond this threshold, splice sites are recognized across the exon. Splice-site recognition across the intron is significantly more efficient than splice-site recognition across the exon, resulting in enhanced inclusion of exons with weak splice sites. Thus, intron size can profoundly influence the likelihood that an exon is constitutively or alternatively spliced. An EST-based alternative-splicing database was used to determine whether the exon/intron architecture influences the probability of alternative splicing in the Drosophila and human genomes. Drosophila exons flanked by long introns display an up to 90-fold-higher probability of being alternatively spliced compared with exons flanked by two short introns, demonstrating that the exon/intron architecture in Drosophila is a major determinant in governing the frequency of alternative splicing. Exon skipping is also more likely to occur when exons are flanked by long introns in the human genome. Interestingly, experimental and computational analyses show that the length of the upstream intron is more influential in inducing alternative splicing than is the length of the downstream intron. It is concluded that the size and location of the flanking introns control the mechanism of splice-site recognition and influence the frequency and the type of alternative splicing that a pre-mRNA transcript undergoes (Fox-Walsh, 2005).

Pre-mRNA splicing is an essential process that accounts for many aspects of regulated gene expression. Of the ~25,000 genes encoded by the human genome, >60% are believed to produce transcripts that are alternatively spliced. Thus, alternative splicing of pre-mRNAs can lead to the production of multiple protein isoforms from a single pre-mRNA, exponentially enriching the proteomic diversity of higher eukaryotic organisms. Because regulation of this process can determine when and where a particular protein isoform is produced, changes in alternative-splicing patterns modulate many cellular activities (Fox-Walsh, 2005).

The spliceosome assembles onto the pre-mRNA in a coordinated manner by binding to sequences located at the 5' and 3' ends of introns. Spliceosome assembly is initiated by the stable associations of the U1 small nuclear ribonucleoprotein particle with the 5' splice site, branch-point-binding protein/SF1 with the branch point, and U2 snRNP auxiliary factor with the pyrimidine tract. ATP hydrolysis then leads to the stable association of U2 snRNP at the branch-point and functional splice-site pairing (Fox-Walsh, 2005).

Intron size has been correlated with rates of evolution and the regulation of genome size. The exon/intron architecture has also been shown to influence splice-site recognition. For example, increasing the size of mammalian exons results in exon skipping. However, the same enlarged exons are included when the flanking introns are small. Thus, splice-site recognition is more efficient when introns or exons are small. Because, in the human genome, the majority of exons are short and introns are long, it is expected that the vast majority of splice sites in the human genome are recognized across the exon. Lower eukaryotes have a genomic architecture that is typified by small introns and flanking exons with variable length, suggesting that splice-site recognition occurs across the intron. Consistent with this model, expansion of small introns in yeast or Drosophila causes loss of splicing, cryptic splicing, or intron retention. Taken together, these observations suggest that splice sites are recognized across an optimal nucleotide length (Fox-Walsh, 2005).

It is unknown whether splice-site recognition across the intron or across the exon results in similar efficiencies of spliceosomal assembly and/or splice-site pairing. This study demonstrates that splice-site recognition across the intron ceases when the intron reaches a length between 200 and 250 nt. Because splice-site recognition is more efficient across the intron, alternative splicing is less likely for exons flanked by short introns. This influence is supported experimentally and by computational analyses of Drosophila and human alternative-splicing databases. It is concluded that the size and location of the flanking introns control the mechanism of splice-site recognition and influence the frequency and the type of alternative pre-mRNA splicing (Fox-Walsh, 2005).

Previous studies have suggested that genes with small introns tend to be recognized across the intron, and genes with large introns are recognized across the exon. To determine the intron length at which recognition of splice sites switches from cross-intron interactions to cross-exon interactions, advantage was taken of an in vitro kinetic splicing assay that was originally used to demonstrate that exonic splicing enhancers (ESEs), discrete sequences within exons that promote both constitutive and regulated splicing, activate both splice sites of an exon simultaneously (Lam, 2002). A number of pre-mRNAs were designed with intron lengths ranging from 120 to 425 nt. Within each set, the pre-mRNAs differ only in the presence or absence of a well characterized 13-nt ESE derived from the Drosophila doublesex and Drosophila fruitless pre-mRNAs. Each pre-mRNA harbors the same weak 5' and 3' splice sites that require the activities of ESEs for recognition in their natural context (Tian, 1992; Lam, 2003). Because splicing factors present in HeLa cell nuclear extracts activate the ESEs used (Lam, 2003), the presence of functional or mutant enhancer elements within each test substrate determines its splicing efficiency. If the splice sites are recognized across the exon, it is expected that the activation of the splice sites on each exon constitutes a different step during spliceosomal assembly, because the ESE located on each exon will only aid in the recognition of its weak splice site. Thus, the activities of the separate ESEs are expected to display synergistic kinetics, because the activation of each ESE accelerates an independent step during spliceosomal assembly. However, if the splice sites are recognized across the intron, the ESE located on each exon will aid in the recognition of both weak splice sites, because the recruited spliceosomal components define the entire intron within one step. In this scenario, the activities of the separate ESEs are expected to display additive kinetics, because the activation of each ESE accelerates the same rate-limiting step during spliceosomal assembly (Fox-Walsh, 2005).

In vitro splicing assays were performed with each of the four pre-mRNA sets over a 3-h time course to determine the apparent rates of splicing. Pre-mRNAs with an intron size of 120 nt display additive kinetics. Using Drosophila nuclear extract (Kc), it was possible to demonstrate additive kinetics for substrates containing the 120-nt intron; however, it was not possible to detect sufficient splicing for the substrates containing longer introns. These results are consistent with in vitro studies demonstrating that splicing of pre-mRNAs with long introns is supported in HeLa nuclear extract but not in Kc extract. The kinetics of pre-mRNAs containing an intron 200 nt or less in length are additive. This behavior indicates that the spliceosomal components required for the recognition of both splice sites are recruited to the intron simultaneously. However, constructs with introns >200 nt demonstrate synergistic kinetics. It is concluded that the change from splice-site recognition across the intron to splice-site recognition across the exon occurs when the intronic length is between 200 and 250 nt (Fox-Walsh, 2005).
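The additive-versus-synergistic distinction can be illustrated with a small numerical sketch. It rests on assumptions that are not taken from the study: that "additive" means the rate gains contributed by the two ESEs sum, that "synergistic" means their fold-activations multiply, and that a measured double-ESE rate can be assigned to whichever prediction is closer on a log scale. The function names and the example rates are hypothetical.

```python
import math


def expected_rates(k_none, k_up, k_down):
    """Predict the double-ESE rate under two illustrative models.

    k_none: apparent rate with no functional ESE
    k_up:   apparent rate with only the upstream-exon ESE
    k_down: apparent rate with only the downstream-exon ESE
    Returns (additive_prediction, synergistic_prediction).
    """
    if min(k_none, k_up, k_down) <= 0:
        raise ValueError("rates must be positive")
    # Assumed additive model: individual rate gains sum.
    additive = k_none + (k_up - k_none) + (k_down - k_none)
    # Assumed synergistic model: individual fold-activations multiply.
    synergistic = k_none * (k_up / k_none) * (k_down / k_none)
    return additive, synergistic


def classify_kinetics(k_none, k_up, k_down, k_both):
    """Label a measured double-ESE rate by the closer model (log scale)."""
    if k_both <= 0:
        raise ValueError("rates must be positive")
    additive, synergistic = expected_rates(k_none, k_up, k_down)
    d_add = abs(math.log(k_both / additive))
    d_syn = abs(math.log(k_both / synergistic))
    return "additive" if d_add <= d_syn else "synergistic"


# Hypothetical relative rates, not data from the study.
print(classify_kinetics(1.0, 3.0, 4.0, 6.2))
print(classify_kinetics(1.0, 3.0, 4.0, 11.0))
```

With real time-course data the apparent rates would first have to be fitted, and the two models would be compared statistically rather than by a nearest-prediction rule.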

The kinetic analysis demonstrates that the upstream 5' splice site and the downstream 3' splice site are recognized simultaneously across introns <200 nt. Significantly, in the absence of ESEs, splice-site recognition across the intron is a much more efficient process than splice-site recognition across the exon. Thus, splice-site recognition across the intron may be able to rescue the inclusion of internal exons harboring weak splice sites. To test this hypothesis, a series of pre-mRNA substrates containing three exons was designed for in vitro splicing analysis in which the internal exon contains splice sites that are insufficiently recognized in the absence of ESEs. The four substrates differed only in the length of each intron, which was either <200 or >250 nt, thus permitting or discouraging splice-site recognition across that intron. As expected, the internal exon is predominantly excluded when flanked by two long introns. However, significant inclusion of the internal exon is observed if one of the flanking introns is short enough to support splice-site recognition across the intron. In fact, two short introns increase exon inclusion ~30-fold relative to two long introns (Fox-Walsh, 2005).

To estimate the fractions of splice sites that may be recognized through cross-intron interactions, the flanking-intron lengths were recorded for every internal exon within the human and Drosophila genomes. Genome information was obtained from the Alternative Splicing Database (ASD), which contains information about the exon/intron structure and EST-verified alternative-splicing events of several thousand genes. Within the human genome, many exons are flanked by at least one short intron, creating two separate populations, separated roughly by the intron length that is proposed to represent the transition of splice-site recognition from across the intron to across the exon. As expected from previous intron-length analyses, a very different distribution is seen in the Drosophila genome, where ~85% of exons are flanked by at least one short intron. An overlay of the Drosophila and human genomes demonstrates that the minimum intron length in the human genome is at the same location that demarcates the maximum intron length of the major Drosophila exon population. This difference in genome constraint may reflect specific compositional variations between the Drosophila and human spliceosomes (Fox-Walsh, 2005).
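The tally described above, the fraction of internal exons flanked by at least one short intron, can be sketched as follows. The 200-nt and 250-nt bounds come from the kinetic results in the text. Everything else is an assumption made for illustration: treating lengths strictly between the bounds as an undetermined "transition" zone, the function names, and the toy exon list. None of it reproduces the ASD analysis.

```python
SHORT_MAX_NT = 200  # introns of this length or shorter showed additive kinetics
LONG_MIN_NT = 250   # by this length, cross-intron recognition had ceased


def recognition_mode(intron_len_nt):
    """Classify an intron by its presumed mode of splice-site recognition."""
    if intron_len_nt <= SHORT_MAX_NT:
        return "cross-intron"
    if intron_len_nt >= LONG_MIN_NT:
        return "cross-exon"
    # Assumption: lengths between the tested substrates are left undetermined.
    return "transition"


def fraction_with_short_flank(exons):
    """Fraction of exons with at least one cross-intron flanking intron.

    exons: iterable of (upstream_intron_len, downstream_intron_len) in nt.
    """
    exons = list(exons)
    if not exons:
        raise ValueError("no exons supplied")
    hits = sum(
        1
        for up, down in exons
        if "cross-intron" in (recognition_mode(up), recognition_mode(down))
    )
    return hits / len(exons)


# Hypothetical flanking-intron lengths, not values from the ASD.
toy_exons = [(120, 900), (400, 3000), (80, 70), (1500, 230)]
print(fraction_with_short_flank(toy_exons))
```

Keeping the transition zone as its own label avoids forcing a single cutoff where the text only reports a range.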

Because splice-site recognition across the intron rescues exon inclusion, how intron length influences alternative splicing within the Drosophila and human genomes was investigated. To do so, the flanking-intron information of each exon was correlated with exon-skipping and alternative-splice-site-activation events reported in the ASD to compute the probability that an exon is involved in alternative splicing, without taking into consideration the contributions made by splice-site signal strength and splicing enhancers or silencers. Thus, the correlation simply tests whether the influence of the exon/intron architecture on alternative splicing is significant enough to be detectable amid all other splicing determinants. Computational analysis of the Drosophila genome supports a significant role for intron length in defining the likelihood of alternative splicing. A striking influence of the exon/intron architecture is observed for simple exon-skipping events. Exons flanked by very long introns are up to 90-fold more likely to be skipped than exons that are flanked by two short introns. Significantly, the most drastic increase in the probability of alternative splicing (>10-fold) was observed when the length of flanking introns increased from 225 to 525 nt. In agreement with the experimental results, a greater probability that an exon is alternatively spliced was observed when the upstream intron is long. This polarity could be the consequence of coupling pre-mRNA splicing to transcription by RNA polymerase II. Even in the category of alternative 5' or 3' splice-site activation, alternative splicing is up to 10-fold more likely for exons that are flanked by long introns. It is concluded that, in Drosophila, exon skipping is a rare event for exons flanked by short introns and that the length of the upstream intron is of greater importance than the length of the downstream intron in determining whether an exon will be involved in exon skipping (Fox-Walsh, 2005).
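The correlation described above amounts to grouping exons by flanking-intron architecture and comparing the fraction that is skipped in each group. A minimal sketch of that bookkeeping follows. It assumes a single 200-nt cutoff between "short" and "long" introns and a simple ratio of probabilities as the fold difference. The names and the toy records are hypothetical, and the study's actual binning and statistics may differ.

```python
from collections import defaultdict


def architecture(up_len, down_len, short_max=200):
    """Label an exon by its upstream/downstream intron class."""
    up = "short" if up_len <= short_max else "long"
    down = "short" if down_len <= short_max else "long"
    return f"{up}/{down}"


def skip_probability_by_architecture(exons):
    """Fraction of skipped exons in each architecture class.

    exons: iterable of (upstream_len, downstream_len, is_skipped).
    """
    counts = defaultdict(lambda: [0, 0])  # class -> [skipped, total]
    for up, down, skipped in exons:
        entry = counts[architecture(up, down)]
        entry[0] += int(bool(skipped))
        entry[1] += 1
    return {cls: s / t for cls, (s, t) in counts.items()}


def fold_enrichment(probs, cls, baseline="short/short"):
    """Skip probability of a class relative to the baseline class."""
    if probs.get(baseline, 0) == 0:
        raise ValueError("baseline class is missing or has zero probability")
    return probs[cls] / probs[baseline]


# Hypothetical exon records, not data from the ASD.
toy = (
    [(100, 100, False)] * 9 + [(100, 100, True)]
    + [(1000, 1000, True)] * 5 + [(1000, 1000, False)] * 5
    + [(1000, 100, True)] * 3 + [(1000, 100, False)] * 7
)
probs = skip_probability_by_architecture(toy)
print(probs)
print(fold_enrichment(probs, "long/long"))
```

Keeping the upstream and downstream classes separate in the label makes it possible to compare "long/short" with "short/long" exons, which is how the polarity noted in the text would show up in such a tally.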

Within the human genome, a similar correlation between the exon/intron architecture and the probability of exon skipping is observed; however, the ~5-fold maximal variance calculated is significantly lower than that observed for Drosophila. As for Drosophila, the length of the upstream intron is more important in determining the frequency of alternative splicing. In the case of alternative 5' or 3' splice-site usage, the opposite distribution of alternative splicing is seen in the human genome. The activation of alternative splice sites is less likely if the flanking introns are long. It is concluded that exon/intron architecture influences the frequency and type of alternative splicing that an exon may undergo in the Drosophila and human genomes (Fox-Walsh, 2005).

These experiments support the existence of two different mechanisms for splice-site recognition: recognition across the intron and recognition across the exon. Splice-site recognition across the intron ceases once intron size exceeds a threshold of 200 nt. Importantly, splice-site recognition across the intron is more efficient and increases the inclusion of exons with weak splice sites. These results demonstrate that the distance between splice sites affects efficient spliceosomal assembly. Presumably, the pairing of cross-exon-defined splice sites requires the interaction between two sets of pre-spliceosomes across an intron of variable length. In contrast, splice-site recognition across the intron already identifies the splice sites that will be paired. It is also possible that the kinetics of splice-site pairing are slowed because longer introns associate with an increased number of hnRNP proteins. HnRNP proteins coat nascent pre-mRNAs and are thought to interfere with the splicing reaction. Therefore, larger introns may reduce splicing by decreasing the relative concentration of splicing components through competition with hnRNPs (Fox-Walsh, 2005).

Additive kinetics of splice-site activation demonstrate that splice-site recognition across the intron is achieved through the recruitment of a multicomponent complex that contains components of the splicing machinery required for 5' and 3' splice-site definition. Interestingly, the activation of a single ESE results in a significant increase in splicing activity, suggesting that ESEs influence splice-site activation of adjacent exons. As anticipated from ESE distance/activity correlations, this effect depends on intron length. Given the unique combination of splice sites and cis-acting elements, it is possible that the precise transition from splice-site recognition across the intron to splice-site recognition across the exon may vary for different substrates. The presence of strong splice sites and enhancers or silencers could modulate the cross-intron recognition by increasing or decreasing the strength of interaction between spliceosomal components and the pre-mRNA (Fox-Walsh, 2005).

The observation that increasing exon length decreases exon inclusion suggests that similar distance limitations exist for splice-site recognition across the exon. Approximately 80% of human exons are <200 bp in length, the average being 170 bp. Importantly, exon length is tightly distributed when compared with intron length. These results demonstrate that maintaining exon size in the human genome is more important to the architecture and evolution of a gene than is maintaining intron size. In contrast to the human genome, exon size varies much more than intron size in yeast. The maximum intron length of 182 nt lies well within the size limitations of splice-site recognition across the intron. Taken together, these considerations support the notion that the majority of splice sites in higher eukaryotes are recognized across the exon, whereas lower eukaryotes employ splice-site recognition across the intron (Fox-Walsh, 2005).

It is well established that several types of exon and intron elements influence splice-site choice. The most prominent include the exon/intron junction signals and splicing enhancers and silencers. The results show that the exon/intron architecture is an additional parameter that affects the efficiency of splice-site recognition and alternative pre-mRNA splicing. When compared in otherwise isogenic test substrates, splice-site recognition across the intron rescues the inclusion of a weak internal exon by >10-fold. Even though the computational analysis ignores the contributions made by variable splice sites, enhancers, and silencers, a striking increase in the probability of alternative splicing is observed for Drosophila exons, whose splice sites are recognized across the exon. Thus, the exon/intron architecture in Drosophila is a major determinant in governing the probability of alternative splicing. Within the human genome, a qualitatively similar trend was observed for exon-skipping events but with a reduced magnitude. One major difference between the Drosophila and human gene architecture is intron length. Human genes are dominated by long introns (87% of introns are >250 nt), whereas short introns are much more common in Drosophila (66% are <250 nt). One possible explanation for the small intron size in Drosophila could be the pressure to maintain a constrained genome size in these fast-replicating organisms (Fox-Walsh, 2005).

Alternative splicing is extensive in both species, supporting the argument that both species benefit from expanded proteomes generated from alternative splicing. However, genome analysis suggests that there are significant differences in the weight of the mechanisms by which alternative splicing can be induced. In Drosophila, intron length is a major determinant in promoting alternative splicing patterns. In the human, additional mechanisms of controlling alternative splicing may have gained more influence on intron expansion to maintain balanced levels of alternative splicing (Fox-Walsh, 2005).

U7 snRNP is recruited to histone pre-mRNA in a FLASH-dependent manner by two separate regions of the Stem-Loop Binding Protein

Cleavage of histone pre-mRNAs at the 3' end requires Stem-Loop Binding Protein (SLBP) and U7 snRNP that consists of U7 snRNA and a unique Sm ring containing two U7-specific proteins: Lsm10 and Lsm11. Lsm11 interacts with FLASH and together they bring a subset of polyadenylation factors to U7 snRNP, including the CPSF73 endonuclease that cleaves histone pre-mRNA. SLBP binds to a conserved stem-loop structure upstream of the cleavage site and acts by promoting an interaction between the U7 snRNP and a sequence element located downstream of the cleavage site. This study shows that both human and Drosophila SLBPs stabilize U7 snRNP on histone pre-mRNA via two regions that are not directly involved in recognizing the stem-loop structure: helix B of the RNA Binding Domain and the C-terminal region that follows the RNA Binding Domain. Stabilization of U7 snRNP binding to histone pre-mRNA by SLBP requires FLASH but not the polyadenylation factors. Thus, FLASH plays two roles in 3' end processing of histone pre-mRNAs: it interacts with Lsm11 to form a docking platform for the polyadenylation factors and it co-operates with SLBP to recruit U7 snRNP to histone pre-mRNA (Skrajna, 2017).

Although 20 years have passed since the discovery of SLBP as a factor that binds the conserved stem-loop structure in histone pre-mRNA, its precise role in processing and its interactions with the rest of the processing machinery remain incompletely understood. Initial studies in human nuclear extracts demonstrated that human SLBP promotes binding of U7 snRNP to histone pre-mRNA and that the in vitro requirement for SLBP could be bypassed by increasing the extent of complementarity between the 5' end of the U7 snRNA and the histone downstream element (HDE). In contrast, Drosophila SLBP is essential for cleavage of all five Drosophila histone pre-mRNAs in vitro, but whether it acts in the same manner as mammalian SLBP and/or has other functions in processing has not been unambiguously determined (Skrajna, 2017).

Determining the role of Drosophila SLBP in processing proved challenging, in part due to the rapid cleavage of histone pre-mRNAs and hence disruption of the processing complexes containing U7 snRNP during short incubation in Drosophila nuclear extracts. This study used a modified approach for the assembly and purification of Drosophila processing complexes and clearly demonstrates that Drosophila SLBP functionally resembles mammalian SLBP and acts by stabilizing the interaction between the U7 snRNP and histone pre-mRNA. By using a substrate containing biotin at the 3' end, it was shown that the U7 snRNP becomes partially destabilized on the HDE after cleavage, when SLBP is no longer part of the complex, further confirming the role of Drosophila SLBP in promoting stable association of U7 snRNP with histone pre-mRNA. Interestingly, the U7 snRNP that remains associated with the HDE following endonucleolytic cleavage contains FLASH and all subunits of the HCC. This result suggests that holo-U7 snRNP does not undergo a major remodeling during the course of the processing reaction (Skrajna, 2017).

Consistent with recent studies, a detectable amount of Drosophila U7 snRNP binds to histone pre-mRNA in the absence of SLBP, possibly as a result of base-pairing between the 5' end of the U7 snRNA and the HDE. The bound U7 snRNP contains all the key subunits of the HCC, including the CPSF73 endonuclease, but remains functionally inert. This is in sharp contrast to mammalian holo-U7 snRNP, which when bound to the HDE can result in cleavage of histone pre-mRNA even in the absence of SLBP. The reason for this difference between the two systems is unknown (Skrajna, 2017).

Drosophila and human SLBP use the same regions to recruit U7 snRNP to histone pre-mRNA

The results indicate that two regions in mammalian and Drosophila SLBPs are critical for the recruitment of the U7 snRNP to histone pre-mRNA: parts of the RBD that do not directly contact the SL RNA and 15-20 amino acids of the C-terminal region located immediately downstream from the RBD. SLBP mutants altered in these regions retain the ability to bind the SL RNA but are partially or completely impaired in supporting cleavage of histone pre-mRNAs. This study shows that these mutants are also impaired in recruiting U7 snRNP to histone pre-mRNA, providing a likely molecular basis for their reduced activity in processing (Skrajna, 2017).

Within the RBD, the most critical role is played by helix B, including the highly conserved D(E)/R motif. Mutating this motif itself is sufficient to strongly reduce processing activity of human SLBP (Dominski, 2001) and the recruitment of U7 snRNP by Drosophila SLBP. An important role in recruiting U7 snRNP may also be played by other amino acids of the RBD, including evolutionarily variable residues in the loop that connects helices B and C. Despite overall conservation, human and Drosophila RBDs are not functionally interchangeable, yielding chimeric proteins that are inactive in both the U7 snRNP recruitment and cleavage of histone pre-mRNAs. It is possible that these variable amino acids are largely responsible for the observed incompatibility between the two RBDs (Skrajna, 2017).

In Drosophila SLBP, the C-terminal region consists of only 17 amino acids and is highly acidic, containing four stoichiometrically phosphorylated serines alternating with four aspartic acids. In addition to these four SD motifs, the C-terminal region contains one TD motif, and the current study indicates that the threonine residue in this motif may be phosphorylated sub-stoichiometrically. The high density of negative charge in this region is critical for the activity of Drosophila SLBP in promoting stable recruitment of the U7 snRNP to histone pre-mRNA and in supporting the cleavage reaction. Interestingly, in the absence of SL RNA the acidic C terminus of Drosophila SLBP associates with helices A and C, bringing them together and preparing for maximum strength interaction with the SL RNA (Zhang, 2014). Upon binding to the RNA target, the phosphorylated C-terminal tail is repelled from the RBD (Zhang, 2014) and may become available for the independent function in the recruitment of the U7 snRNP (Skrajna, 2017).

The C-terminal region of human SLBP lacks the repeated SD motif present in the Drosophila SLBP. Interestingly, six out of 20 residues that immediately follow the RBD in human SLBP are acidic, suggesting that the overall negative charge of this segment may also be important for activity of this protein (Zhang, 2014). However, the C-terminal regions of human and Drosophila SLBP are not functionally interchangeable, indicating that they co-evolved with other component(s) of their respective processing machineries, resembling the divergent evolution of the two RBDs (Skrajna, 2017).

SLBP tightly bound to the upstream stem-loop promotes stable recruitment of U7 snRNP to histone pre-mRNA, likely by directly or indirectly interacting with a subunit(s) of the U7 snRNP. In both Drosophila and mouse nuclear extracts, SLBP is active in recruiting U7 snRNP that lacks the HCC. Thus, SLBP is unlikely to interact with any of the polyadenylation factors of the holo-U7 snRNP. In contrast, removal of FLASH from Drosophila U7 snRNP by RNAi abolishes the ability of SLBP to recruit U7 snRNP to histone pre-mRNA, and this ability can be restored by addition of the N-terminal fragment of Drosophila FLASH. Mouse nuclear extracts contain primarily core U7 snRNP (FLASH is limiting), and addition of N-terminal fragments of human FLASH to a mammalian extract stimulates the activity of SLBP in recruiting the U7 snRNP to histone pre-mRNA (Skrajna, 2017).

The N-terminal region of human and Drosophila FLASH was initially characterized as a protein segment that interacts with Lsm11, hence forming a docking platform for the HCC. This study has now identified a new role for the N-terminal region of FLASH in processing by showing that it is also essential for the SLBP-mediated 'loading' of the U7 snRNP on histone pre-mRNA. This dual FLASH function may be important for the fidelity of 3' end processing of histone pre-mRNAs in vivo. Clearly, the FLASH-bound U7 snRNP that readily associates with the HCC, forming holo-U7 snRNP, has an advantage in binding to histone pre-mRNA over functionally incompetent core U7 snRNP, likely preventing misprocessing at downstream cryptic sites by cleavage and polyadenylation (Skrajna, 2017).

Previous studies on processing in human cells suggested that the recruitment of U7 snRNP by SLBP might be mediated by ZFP100, a zinc finger protein of 100 kDa that binds both human SLBP and Lsm11. Multiple attempts to detect ZFP100 in the mouse processing complexes by silver staining and mass spectrometry failed, although CPSF100 of the same molecular weight was readily identified, arguing against the involvement of ZFP100 in the SLBP-mediated recruitment of the U7 snRNP. ZFP100 is a component of histone locus bodies and may play a role in vivo in coupling transcription of histone genes with 3' end processing of the nascent histone pre-mRNAs (Skrajna, 2017).

It is unclear whether SLBP is completely unable to act on the core U7 snRNP and whether its interaction with FLASH is direct or indirect. A model is favored in which SLBP interacts with the FLASH/Lsm11 complex rather than with FLASH alone. Alternatively, FLASH, upon binding Lsm11, induces a structural shift in part of Lsm11, making it competent to interact with SLBP. The interaction may also include Lsm10. Clearly, further studies are required to identify interactions that span across the cleavage site and bring SLBP and U7 snRNP together into a tight processing complex (Skrajna, 2017).

This study has developed a method to identify regions in human and Drosophila SLBP that are essential for the recruitment of U7 snRNP to histone pre-mRNA. In both proteins, this activity is mediated by helix B and likely other amino acids of the RBD that do not directly contact the SL RNA, and by ∼20 C-terminal amino acids that follow the RBD. The activity of SLBP in promoting stable recruitment of U7 snRNP to histone pre-mRNA depends on FLASH but not the HCC. Thus, FLASH has two functions in processing: first, it is essential for bringing the HCC to U7 snRNP, and second, it cooperates with SLBP in facilitating the interaction between U7 snRNP and histone pre-mRNA. The fact that Drosophila and human SLBP recruit U7 snRNP to histone pre-mRNA through the same regions, and that FLASH but not the HCC is essential for this activity of SLBP, suggests that both processing machineries utilize a conserved network of interactions spanning across the cleavage site (Skrajna, 2017).

Ecd promotes U5 snRNP maturation and Prp8 stability

Pre-mRNA splicing catalyzed by the spliceosome represents a critical step in the regulation of gene expression contributing to transcriptome and proteome diversity. The spliceosome consists of five small nuclear ribonucleoprotein particles (snRNPs), the biogenesis of which remains only partially understood. This study defines the evolutionarily conserved protein Ecdysoneless (Ecd) as a critical regulator of U5 snRNP assembly and Prp8 stability. Combining Drosophila genetics with proteomic approaches, this study demonstrates the Ecd requirement for the maintenance of adult healthspan and lifespan and identifies the Sm ring protein SmD3 as a novel interaction partner of Ecd. The predominant task of Ecd is to deliver Prp8 to the emerging U5 snRNPs in the cytoplasm. Ecd deficiency, on the other hand, leads to reduced Prp8 protein levels and compromised U5 snRNP biogenesis, causing loss of splicing fidelity and transcriptome integrity. Based on these findings, it is proposed that Ecd chaperones Prp8 to the forming U5 snRNP, allowing completion of the cytoplasmic part of the U5 snRNP biogenesis pathway necessary to meet the cellular demand for functional spliceosomes (Erkelenz, 2021).

The transcriptome-wide landscape and modalities of EJC binding in adult Drosophila

The exon junction complex (EJC) assembles after splicing at specific positions upstream of exon-exon junctions in mRNAs of all higher eukaryotes, affecting major regulatory events. In mammalian cell cytoplasm, the EJC is essential for efficient RNA surveillance, while in Drosophila, the EJC is essential for localization of oskar mRNA. This study has developed a method for isolation of protein complexes and associated RNA targets (ipaRt) to explore the EJC RNA-binding landscape in a transcriptome-wide manner in adult Drosophila. The EJC was found at canonical positions, preferably on mRNAs from genes comprising multiple splice sites and long introns. Moreover, EJC occupancy is highest at junctions adjacent to strong splice sites, CG-rich hexamers, and RNA structures. Highly occupied mRNAs tend to be maternally localized and derive from genes involved in differentiation or development. These modalities, which have not been reported in mammals, specify EJC assembly on a biologically coherent set of transcripts in Drosophila (Obrdlik, 2019).

The exon junction complex (EJC) consists of a heterotetramer core composed of eIF4AIII, Mago, Y14 (Tsunagi), and Barentsz (Btz) and auxiliary factors that form the EJC periphery. The complex assembles on mRNAs during splicing, 20 to 24 nt upstream of exon-exon junctions. EJC assembly is a multi-step process that begins with CWC22-mediated deposition of the DEAD-box helicase eIF4AIII on nascent pre-mRNAs (Alexandrov, 2012; Barbosa, 2012; Steckelberg, 2015) and is followed by recruitment of Mago and Y14, forming a pre-EJC intermediate. The pre-EJC is stably bound to RNA because of the ATPase-inhibiting activity of the (non-RNA-binding) Mago-Y14 heterodimer, which 'locks' eIF4AIII helicase in its RNA-bound state. Once formed, the pre-EJC is completed by recruitment of Btz, forming mature EJCs (Obrdlik, 2019).
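To make the positional convention concrete, the following minimal Python sketch computes where canonical EJC deposition windows would fall in spliced-mRNA coordinates, given only exon lengths. The function name, the 0-based half-open coordinate convention, and the example exon lengths are illustrative assumptions and are not taken from the study; only the 20 to 24 nt window comes from the text above.

```python
from typing import List, Tuple


def canonical_ejc_windows(exon_lengths: List[int],
                          near: int = 20,
                          far: int = 24) -> List[Tuple[int, int]]:
    """Return (start, end) windows, 0-based half-open, in spliced-mRNA coordinates.

    Each window covers the positions lying `near` to `far` nt upstream of an
    exon-exon junction. The junction is taken to be the index of the first
    nucleotide of the downstream exon (an assumed convention).
    """
    windows = []
    junction = 0
    # Every exon except the last one ends at an exon-exon junction.
    for length in exon_lengths[:-1]:
        junction += length
        start = max(junction - far, 0)      # clip at the transcript 5' end
        end = max(junction - near + 1, 0)   # half-open upper bound
        windows.append((start, end))
    return windows


# Hypothetical transcript with three exons of 100, 50 and 200 nt.
print(canonical_ejc_windows([100, 50, 200]))
```

Note that a window is defined relative to the junction, not to the exon, so for an upstream exon shorter than 24 nt the window can extend into the preceding exon; the sketch only clips at the start of the transcript.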

The roles of the EJC in post-transcriptional control of gene expression are manifold. In the nucleus, EJC subunits have a role in splicing, mRNA export, and nuclear retention of intron-containing RNAs. In the cytoplasm, the EJC is reported to play a role in translation, nonsense-mediated decay (NMD), and RNA localization. Although most EJC functions appear conserved, in Drosophila the EJC is not crucial for NMD, but it is essential for oskar mRNA localization within the developing oocyte. To better understand the engagement of the EJC in the fly, a strategy to stabilize mRNA binding proteins (mRBPs) associated with their RNA templates has been developed within multi-protein messenger ribonucleoprotein (mRNP) assemblies, and the EJC mRNA interactome was defined in adult Drosophila melanogaster. Through the use of the crosslinking agent dithio(bis-) succinimidylpropionate (DSP), the method captures stable and transient protein interactions in close proximity and allows definition of the binding sites of specific protein (holo-)complexes associated with their RNA templates (isolation of protein complexes and associated RNA targets [ipaRt]). This analysis of EJC-protected sites defined by ipaRt reveals that in Drosophila, EJC binding occurs at canonical deposition sites, with a median coordinate ~22 nt upstream of exon-exon junctions. Although in mammals EJC-mediated protection outside canonical sites was reported, this study finds that in Drosophila the degree of non-canonical EJC-mediated RNA protection is minimal. In Drosophila, RNA polymerase II transcripts protected primarily by the EJC derive from genes involved in differentiation or development, while mRNAs protected primarily by mRBPs derive from genes with homeostatic functions. This analysis suggests that the EJC's bias for transcripts in Drosophila is a consequence of several modalities in the genes' architecture, particularly splice site number and intron length. Moreover, EJC binding is enhanced by adjacent RNA secondary structures and CUG-rich hexamers located 3' to the EJC binding site. These modalities were not identified in previous studies of mammalian EJC binding, reflecting either greater specificity of this method for fully assembled EJCs or differences in EJC binding between flies and humans. This study provides a comprehensive transcriptome-wide view of EJC-RNA interactions in a whole organism and unravels RNA modalities that contribute to the unforeseen biological coherence of the bound transcripts (Obrdlik, 2019).

This study has profiled the landscape of EJC binding across the transcriptome of a whole animal, Drosophila melanogaster, and determined the parameters that influence the distribution of the complex on RNAs in the organism. Previous knowledge of EJC-RNA interactions was based on UV-crosslinking experiments in specific cell types grown as homogeneous cultures for the individual studies. Although UV crosslinking remains a method of choice for identification of protein binding sites on nucleic acids, because of the inefficient penetration of UV light into tissues and organisms, the method is most useful when applied to cells in culture. In contrast, this analysis of EJC distribution in the tissues of whole Drosophila flies was made possible by ipaRt, which uses the crosslinking agent DSP to freeze protein-protein interactions within otherwise dynamic RNP complexes, such as the EJC (Obrdlik, 2019).

It is shown that DSP-mediated covalent bond formation between the RNA helicase eIF4AIII and the Mago-Y14 heterodimer preserves EJCs in their 'locked' state on mRNAs and that efficient recovery of the bound RNAs does not require their crosslinking to eIF4AIII using UV light. The 'ipaRt' approach, like CLIP and iCLIP, enables highly stringent washing of the samples. In support of the robustness and reliability of the DSP-based assays, this study demonstrated high reproducibility not only among technical but also biological replicates of EJC ipaRt, as well as mRBP footprinting sequencing results (Obrdlik, 2019).

Furthermore, ipaRt enables the use of non-RNA-binding subunits of the EJC, such as Mago, as immunoprecipitation baits. This is highly relevant in the context of the EJC, as it has been shown that its RNA-binding subunit, the RNA helicase eIF4AIII, may have other, EJC-independent functions in the cell. ipaRt affords the option of using Mago as an EJC bait, and indeed this is a main reason for the high-quality definition of the EJC binding landscape in the fly cytoplasm that was achieved. The protection site reads obtained from EJC ipaRt map almost exclusively to canonical EJC deposition sites, with a median protection ~22 nt upstream of the upstream exon's 3' end. In contrast to mammalian EJC CLIP and RIP studies, in which eIF4AIII served as an immunoprecipitation bait, EJC ipaRt reads mapping to regions distant from canonical deposition sites are of low abundance and sequencing coverage. Although this discrepancy could reflect differences in EJC engagement in humans and Drosophila, it more likely reflects the choice of bait or the cell compartment in which the analysis was executed. Indeed, a recent study in human cells revealed that when the cytoplasmic EJC component Btz was chosen as the bait rather than eIF4AIII, the proportion of non-canonical EJC deposition sites was negligible (Obrdlik, 2019).

Finally, in ipaRt the DSP crosslinker is applied ex vivo during tissue disruption and does not require inhibition of translation in vivo. Therefore ipaRt is considered a method of choice for functional investigations of protein-RNA complexes in fully developed organisms and tissues (Obrdlik, 2019).

Through this analysis, factors were defined that contribute to or inhibit EJC assembly on mRNAs and at individual exon-exon junctions in Drosophila. From this it is deduced that the landscape of EJC binding to RNAs is sculpted through regulation of EJC assembly at two levels in the fly (Obrdlik, 2019).

At the upstream regulatory level, the degree to which EJCs are assembled on an mRNA is dictated by the complexity of the gene's architecture: mRNAs produced from genes of simple architecture are marked by fewer EJCs, while mRNAs from genes of complex architecture, comprising multiple splice sites and long introns, are EJC bound to a higher degree. Given that EJCs assemble on mRNAs concomitantly with splicing, it is not surprising that mRNAs of genes containing a greater number of introns are more likely to be EJC bound. However, the finding that the enhancing effect on EJC binding provoked by large introns is not restricted to flanking junctions but occurs at junctions mRNA-wide is unexpected. Loss-of-function experiments indicate that the EJC participates in exon definition during splicing of long intron-containing genes in Drosophila, particularly in definition of exons proximal to the long introns. The data exclude any significant bias toward EJC assembly in proximity to long-intron splice junctions. Instead they reveal a general enhancement of EJC binding at exon-exon junctions throughout transcripts of long-intron genes. Therefore, it is concluded that stable binding of EJCs within mRNAs of long-intron genes is not the result of EJC engagement in exon definition. Instead, it is proposed that the high degree of EJC binding to long-intron transcripts derives from the increased number and resting time of co-transcriptionally assembled spliceosomes on the nascent transcripts, which would increase the probability of CWC22-dependent eIF4AIII recruitment to pre-mRNAs during splicing (Obrdlik, 2019).

At the downstream regulatory level, after EJC assembly rates at transcripts are defined, deposition of EJCs along mRNA exon-exon junctions is modulated by the structural and sequence context of the splice sites. dsRNA stem structures in exon-exon junctions of Drosophila mRNAs either antagonize EJC assembly when present within canonical EJC deposition sites or enhance EJC assembly when located in the vicinity of the EJC deposition site. Absence of dsRNA within the EJC binding moiety is in agreement with reported preference of EJCs for ssRNA. It remains to be elucidated how and why EJC binding is positively affected when RNA stem structures are found in its direct proximity on the bound template (Obrdlik, 2019).

Although it is likely that the structural context of exon-exon junctions in Drosophila directly influences the degree of EJC assembly, sequence composition-derived effects on EJC binding to mRNA are a consequence of the assigned roles of these sequences during pre-mRNA splicing. This study has demonstrated that exon-exon junctions with strong 5' and strong 3' splice sites (SSs) are biased toward junctions with enhanced EJC binding. For the regulation of weak 5' and 3' SSs, which commonly occur at alternatively spliced junctions, cis-acting splicing regulatory elements (SREs) were shown to be of importance. In light of the negative impact of alternative splicing at the level of EJC mRNA binding, it is not surprising that conventional ESEs and ESSs hardly affect EJC binding at the level of individual exon-exon junctions. Whether the position-dependent bias mediated by the UUU-triplet- and CUG-triplet-containing hexamers toward inhibited or enhanced EJC binding that this study has discovered in the Drosophila dataset is due to a direct or indirect influence of these hexamers on splicing remains to be addressed. UUU-triplet-containing hexamers, which are strongly biased against EJC binding, could potentially function as a yet undefined 5'ESS in Drosophila. Interestingly, CUG-triplet-containing hexamers, which are strongly biased toward enhanced EJC binding, share sequence similarity with a previously predicted CUG-containing 5'ESE of short intron splice sites. It appears likely that the CUG-triplet and UUU-triplet hexamers exert their effect on EJC binding as a yet undefined class of SREs (Obrdlik, 2019).
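As a rough illustration of what "triplet-containing hexamers" means operationally, the short Python sketch below slides a 6-nt window along a sequence and counts the hexamers that contain a UUU or CUG triplet. The function name and the example sequence are hypothetical, and the sketch performs only raw counting; it does not reproduce the position-dependent enrichment statistics used in the study.

```python
from collections import Counter
from typing import Iterable


def triplet_hexamer_counts(seq: str,
                           triplets: Iterable[str] = ("UUU", "CUG")) -> Counter:
    """Count overlapping hexamers in `seq` that contain each given triplet."""
    seq = seq.upper().replace("T", "U")  # accept DNA or RNA alphabets
    triplets = tuple(triplets)
    counts = Counter({t: 0 for t in triplets})
    # Slide a 6-nt window one nucleotide at a time.
    for i in range(len(seq) - 5):
        hexamer = seq[i:i + 6]
        for t in triplets:
            if t in hexamer:
                counts[t] += 1
    return counts


# Hypothetical 12-nt sequence near a junction.
print(triplet_hexamer_counts("ACUGGAUUUUGC"))
```

In a real analysis the counts would be collected per position relative to the junction and compared between junctions with enhanced and inhibited EJC binding.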

In agreement with reports in mammals, the extent of EJC occupancy varies between mRNAs and exon-exon junctions also in Drosophila. The splice site score next to a junction correlates with increased EJC deposition in the fly, and this relationship between splicing efficiency and EJC deposition has also been proposed in mammalian studies. Analysis of published mammalian Btz iCLIP data revealed several modalities that correlate with the increased binding landscape of the EJC on mRNAs in both mammals and Drosophila, including the large number of introns, high transcript abundance, and sequence context of individual exon-exon junctions. Interestingly, the presence of long introns has a slightly negative effect and the amount of alternative splicing a slightly positive effect on EJC occupancy in mammals; the latter agrees with previous observations. Studies in cultured mammalian cells have reported that EJC-enriched junctions contain a relatively high proportion of 'non-canonical' protection sites, which were enriched for RBP consensus sequences of the SR protein family. Analysis of mammalian Btz iCLIP data confirms that presence of ESEs in upstream exons and 5'ISEs in introns correlates with enhanced EJC binding. Moreover, a group of junctions have been identified in mammals containing AGAA hexamers that are biased for enhanced EJC binding, but their effects are not especially strong near the canonical EJC deposition site. These hexamers match the AGAA-encompassing consensus sequence of the mammalian SR protein SRSF10, known to function as splicing enhancers, and have been found previously in EJC bound exon-exon junctions. Not only do the in silico results agree with these reports and support the proposed cooperative binding of EJC with SR proteins, they also partially explain the EJC's preference in mammals for alternatively spliced mRNAs (Obrdlik, 2019).

One observation deriving from this analysis of published mammalian Btz iCLIP datasets is surprising. Although junctions in Drosophila are observed to be enhanced or inhibited in EJC binding by specific base-pairing probability (bpp) profiles, thus by specific RNA folding categories, it was not possible to detect any striking difference between overall bpp profiles of exon-exon junctions with enhanced or inhibited EJC binding in mammals. Indeed, the only aspect of RNA structure shared by mammals and Drosophila is the negative effect of dsRNA when directly overlapping with the canonical EJC deposition site. In Drosophila, however, the presence of dsRNA close to canonical deposition sites enhances EJC binding, an effect that is not observed in mammalian cells (Obrdlik, 2019).

The findings regarding the differences in the RNA modalities enriched at highly occupied mammalian and Drosophila EJC sites provide insight into the expansion of functions of the EJC during eukaryotic evolution. Spliceosome-catalyzed splicing reactions are bidirectional, and efficient formation of exon-exon junctions during RNA maturation is achieved by Prp22-induced release of spliceosomes from mRNAs. The EJC is absent in organisms with low rates of RNA splicing, such as Saccharomyces cerevisiae, but present in organisms with high splicing rates, such as Schizosaccharomyces pombe. This suggests that with the increased demand for splicing accuracy in higher eukaryotes, the EJC evolved to function as an exon-exon junction 'lock' hindering spliceosome reassembly at spliced exon-exon junctions. Because EJC binding in the fly is enhanced at strong splice sites, but is not affected by splicing enhancer elements, and is not biased toward alternatively spliced mRNAs, it is proposed that the EJC preserved its primary function as such a lock in Drosophila. Two recent studies provide evidence that also in mammals bound EJCs hinder spliceosome assembly, suppressing recursive splicing (RS) of RS exons. The previously reported importance of the EJC for splicing fidelity and the current observations on the mode of EJC binding to transcripts in the fly, which reveal its independence from splicing regulatory elements, indeed support the view that the EJC's most conserved function is to ensure splicing irreversibility (Obrdlik, 2019).

The EJC further evolved to become a central component of the NMD pathway in mammals, in which more than 95% of all genes are alternatively spliced. This may explain why EJCs in mammals are enriched on alternatively spliced transcripts. In Drosophila, in which only 30% of all genes appear to be alternatively spliced, the EJC is not a component of the main NMD pathway. It is proposed that although the EJC-NMD pathway evolved before segregation of the proto- and deuterostome clades, it gained importance by complementing the faux 3'UTR-NMD pathway during the evolution of vertebrates, for which RNA surveillance and spatiotemporal control of gene expression are essential (Obrdlik, 2019).

Similarly, recruitment of the EJC and interacting proteins upon splicing to facilitate mRNA localization so far seems exclusive to Drosophila. Two Drosophila-specific features that modulate EJC binding, namely, the presence of a large intron within a gene and secondary structure near the junction, are also predictive of mRNA localization. Although the precise strength of association between these features and mRNA localization remains to be verified with larger and more quantitative datasets, previous studies with the SOLE in oskar RNA have shown that RNA structure and EJC binding are indeed crucial for oskar mRNA localization (Obrdlik, 2019).

The exon junction complex regulates the splicing of cell polarity gene dlg1 to control Wingless signaling in development

Wingless (Wg)/Wnt signaling is conserved in all metazoan animals and plays critical roles in development. The Wg/Wnt morphogen reception is essential for signal activation, whose activity is mediated through the receptor complex and a scaffold protein Dishevelled (Dsh). This study reports that the exon junction complex (EJC) activity is indispensable for Wg signaling by maintaining an appropriate level of Dsh protein for Wg ligand reception in Drosophila. Transcriptome analyses in Drosophila wing imaginal discs indicate that the EJC controls the splicing of the cell polarity gene discs large 1 (dlg1), whose encoded protein directly interacts with Dsh. Genetic and biochemical experiments demonstrate that Dlg1 protein acts independently from its role in cell polarity to protect Dsh protein from lysosomal degradation. More importantly, human orthologous Dlg protein is sufficient to promote Dvl protein stabilization and Wnt signaling activity, thus revealing a conserved regulatory mechanism of Wg/Wnt signaling by Dlg and EJC (Liu, 2016).

The EJC is known to act in several aspects of posttranscriptional regulation, including mRNA localization, translation and degradation. After transcription, the pre-mRNA associated subunit eIF4AIII is loaded onto nascent transcripts about 20-24 bases upstream of each exon junction, resulting in binding of Mago nashi (Mago)/Magoh and Tsunagi (Tsu)/Y14 proteins to form the pre-EJC core complex. The pre-EJC then recruits other proteins including Barentsz (Btz) to facilitate its diverse functions. In vertebrates, the EJC is known to ensure translation efficiency as well as to activate nonsense-mediated mRNA decay (NMD). In Drosophila, however, the EJC does not contribute to NMD. It is instead required for oskar mRNA localization to the posterior pole of the oocyte. Very recently, the pre-EJC has been shown to play an important role in alternative splicing of mRNA in Drosophila. Reduced EJC expression results in two forms of aberrant splicing. One is exon skipping, which occurs in MAPK and transcripts that contain long introns or are located at heterochromatin (Ashton-Beaucage, 2010; Roignant, 2010). The other is intron retention in piwi transcripts. Furthermore, transcriptome analyses in cultured cells indicate that the role of the EJC in alternative splicing is also conserved in vertebrates (Liu, 2016).

This study has utilized the developing Drosophila wing as an in vivo model system to investigate a new mode of regulation of Wg signaling. The pre-EJC was found to positively regulate Wg signaling through its effect on facilitating Wg morphogen reception. Further studies reveal that the basolateral cell polarity gene discs large 1 (dlg1) is an in vivo target of the pre-EJC in Wg signaling. Dlg1 acts independently from its role in cell polarity to stabilize Dsh protein, thus allowing Wg protein internalization required for signaling activation. Furthermore, it was demonstrated that human Dlg2 exhibits a similar protective role on Dvl proteins to enhance Wnt signaling in cultured human cells. Taken together, this study unveils a conserved regulatory mechanism of the EJC and Dlg in Wg/Wnt signaling (Liu, 2016).

In summary, this study uncovers a specific role of the RNA binding protein complex EJC in Drosophila wing morphogenesis. Genetic and biochemical analyses demonstrate that the pre-EJC is necessary for Wg morphogen reception to activate signal transduction. The identification of the cell polarity determinant dlg1 as one of the pre-EJC targets provides a mechanistic basis for the pre-EJC regulation of Wg signaling. Dlg1 controls the stability of the scaffold protein Dsh, which is the hub of the Wg signaling cascade. Importantly, this mode of regulation of Dvl by Dlg is conserved from flies to vertebrates (Liu, 2016).

The EJC as well as other RNA binding protein complexes are thought to function in a pleiotropic manner. However, the current data together with several recent studies argue that RNA regulatory machineries can act specifically on developmental signaling for pattern formation and organogenesis. It has been increasingly recognized that the production, transport or location of mRNA is subject to precise regulation in Wg/Wnt signaling. For example, apical localization of wg RNA is essential for signal activation in epithelial cells. The specific role of RNA machineries in cell signaling is not limited to Wg/Wnt signaling. It has been reported that the RNA-binding protein Quaking specifically binds to the 3'UTR of transcription factor gli2a mRNA to modulate Hedgehog signaling in zebrafish muscle development. The RNA binding proteins RBM5/6 and 10 could differentially control alternative splicing of a negative Notch regulator gene NUMB, thus antagonistically regulating Notch signaling activity for cancer cell proliferation. Therefore, RNA regulatory machineries generally believed to be pleiotropic emerge as important regulatory means to specifically control cell signaling and related developmental processes (Liu, 2016).

The most studied function of the EJC in development is to localize oskar mRNA to the posterior pole of the oocyte for oocyte polarity establishment and germ cell formation in Drosophila. Further study suggests that proper oskar RNA localization relies on its mRNA splicing. In light of the current study of the EJC activity on dlg1 mRNA as well as the roles of the EJC in mapk and piwi splicing, it is suspected that the EJC might regulate oskar mRNA splicing to mediate its mRNA localization. RNA-seq analyses identified several hundred candidate mRNAs whose expression may be directly or indirectly subject to EJC regulation. Apart from defects in Wg and MAPK signaling, however, altered wing patterning associated with other developmental signaling systems was not seen in EJC-defective flies, arguing that the EJC may primarily regulate Wg and MAPK signaling in patterning the developing wing (Liu, 2016).

Wg/Wnt signaling plays a fundamental role in development and tissue homeostasis in both flies and vertebrates. Its activation and maintenance rely on appropriate activity of the ternary receptor complex including Fz family proteins. In Drosophila, polarized localization of Fz and Fz2 proteins is essential for activation of non-canonical and canonical Wg signaling, respectively. Dsh, which acts as a hub mediating both canonical and non-canonical Wg signaling, however, is found at both the apical cell boundary and in the basal side of the cytoplasm. Thus, the polarized activity of Dsh must require distinct regulatory mechanisms at different sub-membrane compartments. The results provide the in vivo evidence suggesting that the basolateral polarity determinant Dlg1 may play a dominant role to control the Dsh abundance/activity in canonical Wg signaling (Liu, 2016).

Altered Dvl production or activity has been linked with several forms of cancer. The stability of Dvl proteins can be controlled through regulated protein degradation both in vertebrates and in Drosophila, as reported in this study. In HEK293T cells, Dapper1 induces whilst Myc-interacting zinc-finger protein 1 (MIZ1) antagonizes autophagic degradation of Dvl2 in the lysosome. It is also reported that the tumor suppressor CYLD, a deubiquitinase, inhibits the ubiquitination of Dvl. As Dlg1 prevents Dsh from degradation in Drosophila, it is important to investigate whether Dlg1 participates in a posttranslational regulatory network of Dvl to integrate endocytosis and autophagy. Furthermore, upregulation of dvl2 and dlg2 expression has been found in various forms of cancer as shown in the COSMIC database. The study of the interaction between Dlg1 and Dsh may aid the development of novel approaches to prevent or treat relevant diseases (Liu, 2016).

Dlg1 acts together with L(2)gl to form a basolateral complex in polarized epithelium. Dsh is known to interact with L(2)gl. On one hand, Dsh activity is required for correct localization of L(2)gl to establish apical-basal polarity in Xenopus ectoderm and Drosophila follicular epithelium. On the other hand, L(2)gl can regulate Dsh to maintain planar organization of the embryonic epidermis in Drosophila. Despite the complex interaction between L(2)gl and Dsh, not much is known about mutual regulation between Dlg1 and Dsh. A recent report suggests that Dsh binds to Dlg1 to activate Guk Holder-dependent spindle positioning in Drosophila. The current results unveil another side of the relationship in which Dlg1 controls the turnover of Dsh to ensure developmental signal propagation. Apart from its apical localization at the cell boundary, Dsh is also found in the basal side of the cytoplasm. It is likely that the interactions among Dsh, Dlg1 and L(2)gl may be dependent on their localization, and Dsh may serve as a bridge to connect cell signaling and polarity (Liu, 2016).

Developmental signaling and cell polarity intertwine to control a diverse array of cellular events. It is well known that Wg/Wnt signaling controls cell polarity in distinct ways. Non-canonical signaling acts through cytoskeletal regulators to establish planar cell polarity. Canonical signaling may also directly affect apical-basal cell polarity. On the other hand, disruption of epithelial cell polarity has a profound impact on protein endocytosis and recycling, both of which are essential regulatory steps for signal activation and maintenance. The current results add another layer of complexity by which polarity determinants could contribute to cell signaling independent of their conventional roles in polarity establishment and maintenance. Interestingly, this mode of regulation is also observed for other signaling processes. Loss of Dlg5 impairs Sonic hedgehog-induced Gli2 accumulation at the ciliary tip in mouse fibroblast cells, an effect that may not rely on cell polarity regulation. Similarly, L(2)gl regulates Notch signaling via endocytosis, independent of its role in cell polarity. It is believed that other cell polarity determinants may similarly participate in polarity-independent processes; however, the exact mechanism of how they cooperate to modulate developmental signaling awaits further investigation (Liu, 2016).

  • The EJC disassembly factor PYM is an intrinsically disordered protein and forms a fuzzy complex with RNA

    The discovery of several functional interactions where one or even both partners remain disordered has demonstrated that specific interactions do not necessarily require well-defined intermolecular interfaces. This study describes a fuzzy protein-RNA complex formed by the intrinsically unfolded protein PYM and RNA. PYM is a cytosolic protein, which has been reported to bind the exon junction complex (EJC). In the process of Oskar mRNA localization in Drosophila melanogaster, removal of the first intron and deposition of the EJC are essential, while PYM is required to recycle the EJC components after localization has been accomplished. This study demonstrates that the first 160 amino acids of PYM (PYM(1-160)) are intrinsically disordered. PYM(1-160) binds RNA independently of its nucleotide sequence, forming a fuzzy protein-RNA complex that is incompatible with PYM's function as an EJC recycling factor. It is proposed that the role of RNA binding consists in down-regulating PYM activity by blocking the EJC interaction surface of PYM until localization has been accomplished. It is suggested that the largely unstructured character of PYM may act to enable binding to a variety of diverse interaction partners, such as multiple RNA sequences and the EJC proteins Y14 and Mago (Verma, 2023).

    SR proteins control a complex network of RNA-processing events

    SR proteins are a well-conserved class of RNA-binding proteins that are essential for regulation of splice-site selection, and have also been implicated as key regulators during other stages of RNA metabolism. For many SR proteins, the complexity of the RNA targets and specificity of RNA-binding location are poorly understood. It is also unclear if general rules governing SR protein alternative pre-mRNA splicing (AS) regulation, uncovered for individual SR proteins on a few model genes, apply to the activity of all SR proteins on endogenous targets. Using RNA-seq, this study characterized the global AS regulation of the eight Drosophila SR protein family members. It was found that a majority of AS events are regulated by multiple SR proteins, and that all SR proteins can promote exon inclusion, but also exon skipping. Most coregulated targets exhibit cooperative regulation, but some AS events are antagonistically regulated. Additionally, it was found that SR protein levels can affect alternative promoter choices and polyadenylation site selection, as well as overall transcript levels. Cross-linking and immunoprecipitation coupled with high-throughput sequencing (iCLIP-seq) reveals that SR proteins bind a distinct and functionally diverse class of RNAs, which includes several classes of noncoding RNAs, uncovering possible novel functions of the SR protein family. Finally, it was found that SR proteins exhibit positional RNA binding around regulated AS events. Therefore, regulation of AS by the SR proteins is the result of combinatorial regulation by multiple SR protein family members on most endogenous targets, and SR proteins have a broader role in integrating multiple layers of gene expression regulation (Bradley, 2014).

    Combinatorial control of Drosophila circular RNA expression by intronic repeats, hnRNPs, and SR proteins

    Thousands of eukaryotic protein-coding genes are noncanonically spliced to produce circular RNAs. Bioinformatics has indicated that long introns generally flank exons that circularize in Drosophila, but the underlying mechanisms by which these circular RNAs are generated are largely unknown. This study, using extensive mutagenesis of expression plasmids and RNAi screening, revealed that circularization of the Drosophila laccase2 gene is regulated by both intronic repeats and trans-acting splicing factors. Analogous to what has been observed in humans and mice, base-pairing between highly complementary transposable elements facilitates backsplicing. Long flanking repeats (approximately 400 nucleotides [nt]) promote circularization cotranscriptionally, whereas pre-mRNAs containing minimal repeats (<40 nt) generate circular RNAs predominately after 3' end processing. Unlike the previously characterized Muscleblind (Mbl) circular RNA, which requires the Mbl protein for its biogenesis, it was found that Laccase2 circular RNA levels are not controlled by Mbl or the Laccase2 gene product but rather by multiple hnRNP (heterogeneous nuclear ribonucleoprotein) and SR (serine-arginine) proteins acting in a combinatorial manner. hnRNP and SR proteins also regulate the expression of other Drosophila circular RNAs, including Plexin A (PlexA), suggesting a common strategy for regulating backsplicing. Furthermore, the laccase2 flanking introns support efficient circularization of diverse exons in Drosophila and human cells, providing a new tool for exploring the functional consequences of circular RNA expression across eukaryotes (Kramer, 2015).

    It was long assumed that eukaryotic pre-mRNAs are always canonically spliced to generate a linear mRNA that is subsequently translated to produce a protein. However, it is now becoming increasingly clear that many genes can be noncanonically spliced to produce circular RNAs with covalently linked ends. These transcripts are almost exclusively derived from exons, accumulate in the cytoplasm, and are thought to be products of alternative splicing events known as 'backsplicing.' In contrast to canonical splicing, which joins the exons in a linear order (joining exon 1 to exon 2 to exon 3, etc.), backsplicing joins a splice donor to an upstream splice acceptor (e.g., joining the 3' end of exon 2 to the 5' end of exon 2). A handful of RNAs generated in this manner were identified in the 1990s, and recent deep sequencing studies have expanded this observation to thousands of circular RNAs expressed across eukaryotes, including humans, Caenorhabditis elegans, Drosophila (Salzman, 2013; Ashwal-Fluss, 2014; Westholm, 2014), Schizosaccharomyces pombe, and plants. Perhaps surprisingly, for some genes, the abundance of the circular RNA exceeds that of the associated linear mRNA by a factor of 10, suggesting that the major function of some protein-coding genes may be to generate circular RNAs (Kramer, 2015).
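    The distinction between canonical splicing and backsplicing described above can be illustrated with a minimal Python sketch. It assumes that splice sites are given as positions counted 5' to 3' along a single pre-mRNA; the `Junction` class, the `classify` function, and the example coordinates are hypothetical and are not taken from Kramer (2015).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Junction:
    """A splice junction on one pre-mRNA (hypothetical illustration).

    Positions are counted 5' -> 3' along the transcript.
    """
    donor: int     # splice donor: 3' end of an exon
    acceptor: int  # splice acceptor: 5' end of an exon


def classify(junction: Junction) -> str:
    """Label a junction as canonical or backsplice.

    Canonical splicing joins a donor to a DOWNSTREAM acceptor.
    Backsplicing joins a donor to an UPSTREAM acceptor, which
    yields a circular product.
    """
    if junction.acceptor < junction.donor:
        return "backsplice"
    return "canonical"


# Illustrative coordinates (assumed): exon 1 ends at 100,
# exon 2 spans positions 500-700.
exon1_to_exon2 = Junction(donor=100, acceptor=500)
exon2_circle = Junction(donor=700, acceptor=500)

print(classify(exon1_to_exon2))
print(classify(exon2_circle))
```

    In this toy layout, joining the 3' end of exon 2 to its own 5' end places the acceptor upstream of the donor, which is the defining feature of a backsplice junction.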

    Most exons in eukaryotic genomes have splicing signals at both ends and theoretically can circularize. However, only certain exons are observed in circular RNAs, and these backsplicing events often occur in a tissue-specific manner. This suggests that circular RNA biogenesis is tightly regulated. As splicing generally occurs cotranscriptionally, most introns, along with their upstream splice acceptors (which are needed for backsplicing), are rapidly removed. Therefore, for circular RNAs to be produced, canonical splicing likely must occur more slowly around these exons, and/or exon skipping events may be coupled to circular RNA biogenesis. In the latter, the circular RNA is derived from an exon-containing lariat, allowing a pre-mRNA to yield both a linear mRNA and a circular RNA comprised of the skipped exons (Kramer, 2015).

    There is little known about the splicing factors that regulate these events. In some cases, the Muscleblind (Mbl) and Quaking proteins appear to facilitate backsplicing by bridging between two introns and causing the splice sites from the intervening exons to be brought into close proximity (Ashwal-Fluss, 2014; Conn, 2015). For example, circular RNA production from the Drosophila mbl gene is triggered when the Mbl splicing factor binds to its own introns (Ashwal-Fluss, 2014). However, in humans, mice, and C. elegans, the predominant determinants of whether a pre-mRNA is subjected to backsplicing are intronic repetitive elements, such as sequences derived from transposons. Almost 90% of human circular RNAs have complementary Alu elements in their flanking introns, and, analogous to the protein-bridging mechanism, base-pairing between complementary sequences allows the intervening splice sites to be brought close together. Interestingly, repeats <40 nucleotides (nt) can drive circular RNA production in human cells, but it is clear that more than simple thermodynamics regulates circularization. For example, base-pairing interactions can be disrupted by ADAR (adenosine deaminase acting on RNA), which converts adenosines in double-stranded regions to inosines. In addition, most mammalian pre-mRNAs contain multiple intronic repeats, allowing distinct circular (or linear) RNAs to be produced depending on which repeats base-pair to one another. Therefore, other factors likely help dictate splicing outcomes by regulating these exon circularization events (Kramer, 2015).

    Despite key regulatory roles for intronic repeats in multiple eukaryotes, it has been suggested that circular RNA biogenesis in Drosophila melanogaster is not driven by base-pairing interactions (Westholm, 2014). Instead, a positive correlation between the length of the flanking introns and circular RNA abundance was identified in Drosophila (Westholm, 2014). However, the effect of modulating intron lengths on backsplicing has not yet been directly addressed. It is also completely unknown how Drosophila circular RNAs besides Mbl, of which there are >2500 annotated circular RNAs derived from other genomic loci, are generated or post-transcriptionally regulated. Therefore, it is still unclear whether circular RNA biogenesis strategies are conserved across eukaryotes or whether species such as Drosophila use unique mechanisms to determine which exons should be backspliced (Kramer, 2015).

    Once produced, circular RNAs are stable transcripts that are naturally resistant to degradation by exonucleases. Two circular RNAs (ciRS7/CDR1as and Sry) modulate the activity of specific microRNAs (Hansen, 2013; Memczak, 2013), but most other RNA circles (in species other than Drosophila) contain few microRNA-binding sites and likely function differently. For example, it has been proposed that many circular RNAs may regulate neuronal functions, and artificial circular RNAs containing an IRES (internal ribosome entry site) can be translated. However, the lack of efficient methods for modulating circular RNA levels or ectopically expressing circular RNAs has limited the ability to define functions for these transcripts (Kramer, 2015).

    This study focused on the Drosophila laccase2 gene, as it produces an abundant circular RNA in vitro and in vivo. Evidence is provided that intronic repeats collaborate with trans-acting splicing factors to regulate circularization in flies. Mechanistically, it was found that miniature introns (<150 nt) containing the splice sites and inverted repeats were sufficient to support Laccase2 circular RNA production. The intronic repeats must base-pair to one another for circularization to occur, as has been observed in other eukaryotes. Furthermore, it was found that the strength of these base-pairing interactions dictates whether backsplicing occurs co- or post-transcriptionally: Long flanking repeats appear to allow cotranscriptional processing. Screening a panel of genes, this study found that multiple hnRNP (heterogeneous nuclear ribonucleoprotein) and SR (serine–arginine) family proteins regulate Laccase2 circular RNA levels in a combinatorial manner. Comparisons with the mbl locus suggest that the circularization mechanisms are distinct, as the Laccase2 circular RNA was not regulated by the Mbl or Laccase2 gene products. Additional circular RNAs were identified that are regulated by unique combinations of hnRNP and SR proteins, suggesting that combinatorial control may be a common regulatory strategy that modulates circular RNA levels. This led to a test of whether this biogenesis mechanism is active in human cells, and it was found that the laccase2 introns can indeed robustly generate circular RNAs. It is thus now possible to efficiently generate "designer" circular RNAs in cells with minimal linear RNA production. In total, the results reveal new insights into how trans-acting factors and intronic repeats collaborate to regulate circular RNA biogenesis across eukaryotes as well as provide new tools for exploring the functions of circular RNAs (Kramer, 2015).

    This study demonstrates that intronic repeats and trans-acting hnRNPs and SR proteins combinatorially regulate circularization of the Drosophila laccase2 gene. Base-pairing between transposable elements in the flanking introns facilitates circularization, and the strength of these interactions likely dictates whether backsplicing occurs co- or post-transcriptionally. This mechanism is distinct from the one that regulates Drosophila Mbl circular RNA production (Ashwal-Fluss, 2014) but is similar to that used to generate many circles in humans, mice, and C. elegans. This suggests that base-pairing between intronic repeats may be a major mechanism promoting exon circularization across eukaryotes. Moreover, this study found that the laccase2 exon is dispensable, allowing the laccase2 introns to be used to efficiently generate 'designer' circular RNAs from plasmids in diverse organisms. Altogether, the results suggest that circular RNA biogenesis strategies are conserved across eukaryotes and provide new tools for exploring the functions of circular RNAs (Kramer, 2015).

    The current results on the laccase2 locus indicate that base-pairing between complementary intronic sequences efficiently promotes RNA circularization in flies. As the DNAREP1_DM repeats closely flank exon 2 of the laccase2 gene, a model is proposed in which the repeats base-pair to one another, bringing the intervening splice sites into close proximity and facilitating catalysis. The Laccase2 circular RNA then accumulates as one of the most abundant circular RNAs in Drosophila (fifth most abundant across >100 Drosophila RNA sequencing libraries). At the endogenous laccase2 gene locus, the long introns that flank this exon likely slow the overall speed of cotranscriptional splicing, thereby allowing the backsplicing reaction to effectively compete with canonical splicing. Indeed, it was found that the strength of the base-pairing interactions between the flanking introns dictates how quickly backsplicing can occur. When very stable interactions are present, it is possible that exon definition is improved, allowing the rapid and cotranscriptional generation of a circular RNA. Nevertheless, further studies are still required to clarify the exact role that long flanking introns may play in regulating circularization (Kramer, 2015).

    Upon examining the introns that flank other abundant Drosophila circular RNAs, this study identified other examples in which complementary regions >60 nt in length flank circularizing exons, including CaMKI, CG11155, CG2052, Parp, and PlexA (which are among the top 25 most abundant Drosophila circular RNAs). Interestingly, the Semaphorin-2b (CG33960) circular RNA (39th most abundant circular RNA) is flanked by introns containing short (CA)n simple repeats that are complementary to each other over a <30-nt region. Upon cloning a 980-nt region of the Semaphorin-2b pre-mRNA downstream from the pMT, circular RNA production from the plasmid was observed in DL1 cells. Removal of either of the (CA)n simple repeats, however, strongly reduced circularization. This suggests that diverse inverted repeat sequences, including short simple repeats, may play a general role in facilitating circularization in Drosophila (Kramer, 2015).
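    As a rough illustration of how complementary regions in flanking introns might be scanned for, the following Python sketch reports the longest stretch of an upstream intron that is perfectly reverse-complementary to a stretch of the downstream intron. This is not the method used by Kramer (2015); the function names and toy sequences are hypothetical, and a real analysis would likely use an alignment tool that tolerates mismatches and G-U wobble pairs.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def longest_complementary_stretch(upstream: str, downstream: str) -> int:
    """Length of the longest perfectly base-paired stretch between introns.

    A segment of `upstream` can pair with `downstream` if it equals a
    segment of the reverse complement of `downstream`, so this is a
    longest-common-substring search (simple dynamic programming).
    """
    up = upstream.upper()
    target = reverse_complement(downstream)
    best = 0
    previous = [0] * (len(target) + 1)
    for i in range(1, len(up) + 1):
        current = [0] * (len(target) + 1)
        for j in range(1, len(target) + 1):
            if up[i - 1] == target[j - 1]:
                # Extend the matching run that ended at (i-1, j-1).
                current[j] = previous[j - 1] + 1
                best = max(best, current[j])
        previous = current
    return best


# Toy introns (assumed): a (CA)n-like repeat upstream and its
# complementary (TG)n-like repeat downstream.
upstream_intron = "GGCACACACACATT"
downstream_intron = "AATGTGTGTGTGCC"
print(longest_complementary_stretch(upstream_intron, downstream_intron))
```

    A length threshold could then be applied to the returned value, in the spirit of the >60 nt and <30 nt regions mentioned above, although any such cutoff would be an analysis choice rather than something fixed by the study.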

    Complementary repeats, however, are not observed at all Drosophila loci that generate circular RNAs. Furthermore, many exons that do not circularize are flanked by complementary repeats, so there must be other mechanisms that regulate circularization. This has been most notably demonstrated at the Drosophila mbl locus, which requires the Mbl splicing factor for its circularization. When Mbl protein is in excess, an intricate feedback mechanism is induced: The Mbl protein decreases the production of its own mRNA by binding its pre-mRNA. This blocks canonical splicing and promotes the biogenesis of the Mbl circular RNA, which further functions as a sponge that binds and sequesters the excess Mbl protein. However, this Mbl-driven mechanism appears to be specific for the mbl locus, as this study found that knockdown of the Mbl linear mRNA had no effect on Laccase2, PlexA, or a panel of other circular RNAs. Knockdown of the Laccase2 linear mRNA likewise did not affect Laccase2 circular RNA levels, indicating that the laccase2 locus is not subjected to a similar direct cis-acting feedback mechanism. Instead, it was found that other splicing factors, including hnRNPs and SR proteins, regulate Laccase2 RNA levels (Kramer, 2015).

    At the laccase2 locus, it is proposed that hnRNPs (e.g., Hrb27C and Hrb87F) and SR proteins (e.g., SF2 [SRSF1], SRp54 [SRSF11], and B52 [SRSF6]) add an additional layer of control on top of the DNAREP1_DM intronic repeats. Base-pairing between the intronic repeats promotes circularization, but protein binding likely helps ensure that the appropriate ratio of linear to circular Laccase2 RNA is produced. Depletion of any one of these splicing factors alters Laccase2 circle levels, and additive effects were observed when multiple factors were depleted. This suggests combinatorial control, with each protein playing a nonredundant role. Furthermore, Laccase2 circular RNA production does not appear to be linked to exon skipping, and thus these proteins may specifically modulate spliceosome assembly, the speed of splicing, and/or the stability of the mature circular RNA. Notably, it does not seem that Hrb27C, SF2, SRp54, or B52 affects Laccase2 circular RNA stability, as depletion of these factors did not cause the expression of a plasmid-derived Laccase2 circular RNA to increase. It is thus instead proposed that these hnRNPs and SR proteins regulate Laccase2 circular RNA biogenesis (e.g., by binding to the flanking introns or exons), but further studies are required to understand exactly how the intronic repeats and trans-acting factors collaboratively dictate the splicing outcome. Nevertheless, the same SR proteins that regulate the laccase2 locus also regulate the PlexA circular RNA but not the Mbl circular RNA. Since the laccase2 and PlexA exons are both flanked by inverted repeats, it is hypothesized that intronic repeats may generally provide the opportunity for circularization to occur. This is then further regulated by trans-acting factors that combinatorially fine-tune the amount of each circular RNA that the cell ultimately produces (Kramer, 2015).

    Catalogs of circular RNAs expressed in various species and cell types have been reported, but the functions for nearly all of these transcripts, including Laccase2, are currently unknown. This is due in part to the current lack of methods for efficiently generating circular RNAs in cells. For example, the circular RNA expression plasmids that have been described all generally produce circular transcripts at a low efficiency (often 20% or less). These plasmids instead generate abundant amounts of linear RNA, which limits their utility for defining circular RNA functions. Using the Drosophila laccase2 and human ZKSCAN1 introns, this study largely overcame this hurdle and generated circular RNAs (ranging in size from 300 to 1500 nt) at a high efficiency in human and fly cells. These transcripts accumulate in the cytoplasm, are resistant to RNase R treatment, and are likely translated when an IRES is present. Furthermore, easy-to-use restriction sites are present in the plasmids, allowing any desired sequence to be queried. Beyond allowing ectopic expression of circular RNAs, these plasmids can be designed to sponge microRNAs or proteins as well as identify novel IRES sequences (Kramer, 2015).

    In summary, the current findings provide key insights into how trans-acting factors and intronic repeats regulate circular RNA biogenesis as well as provide new tools for exploring the functions of circular RNAs across eukaryotes. From humans to flies, repetitive elements in introns can act to facilitate backsplicing, but it is still largely unclear why circular RNAs accumulate only in certain tissues. It is hypothesized that base-pairing between repeats is only one part of the "splicing code", and it is ultimately a combination of cis-acting elements and trans-acting splicing factors, including hnRNPs and SR proteins, that dictates whether canonical splicing or backsplicing occurs. Nevertheless, this study has defined a minimal set of elements that is sufficient for promoting efficient exon circularization, which should facilitate the prediction of circular RNAs as well as enable the functions of many circular RNAs to be revealed. Considering that a surprisingly large number of protein-coding genes generates circular RNAs, these previously overlooked transcripts likely represent key ways that gene functions are expanded and modulated (Kramer, 2015).

    m6A potentiates Sxl alternative pre-mRNA splicing for robust Drosophila sex determination

    N6-methyladenosine (m6A) is the most common internal modification of eukaryotic messenger RNA (mRNA) and is decoded by YTH domain proteins. The Drosophila mRNA m6A methylosome consists of Ime4 (Inducer of meiosis 4), KAR4 (Karyogamy protein 4), Female-lethal (2)d (Fl(2)d) and Virilizer (Vir). In Drosophila, fl(2)d and vir are required for sex-dependent regulation of alternative splicing of the sex determination factor Sex lethal (Sxl). However, the functions of m6A in introns in the regulation of alternative splicing remain uncertain. This study shows that m6A is absent in the mRNA of Drosophila lacking Ime4. In contrast to mouse and plant knockout models, Drosophila Ime4-null mutants remain viable, though flightless, and show a sex bias towards maleness. This is because m6A is required for female-specific alternative splicing of Sxl, which determines female physiognomy, but also translationally represses male-specific lethal 2 (msl-2) to prevent dosage compensation in females. The m6A reader protein YT521-B decodes m6A in the sex-specifically spliced intron of Sxl, as its absence phenocopies Ime4 mutants. Loss of m6A also affects alternative splicing of additional genes, predominantly in the 5' untranslated region, and has global effects on the expression of metabolic genes. The requirement of m6A and its reader YT521-B for female-specific Sxl alternative splicing reveals that this hitherto enigmatic mRNA modification constitutes an ancient and specific mechanism to adjust levels of gene expression (Haussmann, 2016).

    In mature mRNA the m6A modification is most prevalently found around the stop codon as well as in 5' untranslated regions (UTRs) and in long exons in mammals, plants and yeast. Since methylosome components predominantly localize to the nucleus, it has been speculated that m6A localized in pre-mRNA introns could have a role in alternative splicing regulation in addition to such a role when present in long exons. This prompted the authors to investigate whether m6A is required for Sxl alternative splicing, which determines female sex and prevents dosage compensation in females. A null allele of the Drosophila METTL3 methyltransferase homologue Ime4 was generated by imprecise excision of a P element inserted in the promoter region. The excision allele Δ22-3 deletes most of the protein-coding region, including the catalytic domain, and is thus referred to as Ime4null. These flies are viable and fertile but flightless, and this phenotype can be rescued by a genomic construct restoring Ime4. Ime4 shows increased expression in the brain and, as in mammals and plants, localizes to the nucleus (Haussmann, 2016).

    Following RNase T1 digestion and 32P end-labelling of RNA fragments, this study detected m6A after guanosine (G) in poly(A) mRNA of adult flies at relatively low levels compared to other eukaryotes (m6A/A ratio: 0.06%), but at higher levels in unfertilized eggs (0.18%). After enrichment with an anti-m6A antibody, m6A is readily detected in poly(A) mRNA, but absent from Ime4null flies (Haussmann, 2016).

    As found in other systems, and consistent with a potential role in translational regulation, m6A was detected in polysomal mRNA (0.1%), but not in the poly(A)-depleted rRNA fraction. This also confirmed that any m6A modification in rRNA is not after G in Drosophila (Haussmann, 2016).

    Consistent with the hypothesis that m6A plays a role in sex determination and dosage compensation, the number of Ime4null females was reduced to 60% compared to the number of males, whereas in the control strain female viability was 89%. The key regulator of sex determination in Drosophila is the RNA-binding protein Sxl, which is specifically expressed in females. Sxl positively auto-regulates expression of itself and its target transformer (tra) through alternative splicing to direct female differentiation. In addition, Sxl suppresses translation of msl-2 to prevent upregulation of transcription on the X chromosome for dosage compensation; full suppression also requires maternal factors. Accordingly, female viability was reduced to 13% by removal of maternal m6A together with zygotic heterozygosity for Sxl and Ime4. Female viability of this genotype is completely rescued by a genomic construct or by preventing ectopic activation of dosage compensation by removal of msl-2. Hence, females are non-viable owing to insufficient suppression of msl-2 expression, resulting in upregulation of gene expression on the X chromosome from reduced Sxl levels. In the absence of msl-2, disruption of Sxl alternative splicing resulted in females with sexual transformations displaying male-specific features such as sex combs, which were mosaic to various degrees, indicating that Sxl threshold levels are affected early during establishment of sexual identities of cells and/or their lineages. In the presence of maternal Ime4, Sxl and Ime4 do not genetically interact. In addition, Sxl is required for germline differentiation in females and its absence results in tumorous ovaries. Consistent with this, tumorous ovaries were detected in Sxl7B0/+;Ime4null/+ daughters from Ime4null females (22%), but not in homozygous Ime4null or heterozygous Sxl7B0 females (Haussmann, 2016).

    Furthermore, levels of the Sxl female-specific splice form were reduced to approximately 50%, consistent with a role for m6A in Sxl alternative splicing. As a result, female-specific splice forms of tra and msl-2 were also significantly reduced in adult females (Haussmann, 2016).

    To obtain more comprehensive insights into Sxl alternative splicing defects in Ime4null females, splice junction reads were examined from RNA-seq. Besides the significant increase in inclusion of the male-specific Sxl exon in Ime4null females, cryptic splice sites and increased numbers of intronic reads were detected in the regulated intron. Consistent with reverse transcription polymerase chain reaction (RT-PCR) analysis of tra, the reduction of female splicing in the RNA sequencing is modest, and as a consequence, alternative splicing differences of Tra targets dsx and fru were not detected in whole flies, suggesting that m6A provides cell-type-specific fine-tuning to generate splicing robustness rather than acting as an obligatory regulator. In agreement with dosage-compensation defects as a main consequence of Sxl dysregulation in Ime4null mutants, X-linked, but not autosomal, genes are significantly upregulated in Ime4null females compared to controls (Haussmann, 2016).

    Furthermore, Sxl mRNA is enriched in pull-downs with an m6A antibody compared to m6A-deficient yeast mRNA added for quantification. This enrichment is comparable to what was observed for m6A-pull-down from yeast mRNA (Haussmann, 2016).

    To map m6A sites in the intron of Sxl, an in vitro m6A methylation assay was employed using Drosophila nuclear extracts and labelled substrate RNA. m6A methylation activity was detected in the vicinity of alternatively spliced exons. Further fine-mapping localized m6A in RNAs C and E to the proximity of Sxl-binding sites. Likewise, the female-lethal single amino acid substitution alleles fl(2)d1 and vir2F interfere with Sxl recruitment, resulting in impaired Sxl auto-regulation and inclusion of the male-specific exon. Female lethality of these alleles can be rescued by Ime4null heterozygosity, further demonstrating the involvement of the m6A methylosome in Sxl alternative splicing (Haussmann, 2016).

    Next, alternative splicing changes were examined in Ime4null females compared to the wild-type control strain. A statistically significant reduction in female-specific alternative splicing of Sxl was observed. In addition, 243 alternative splicing events in 163 genes were significantly different in Ime4null females, equivalent to around 2% of alternatively spliced genes in Drosophila. Six genes for which the alternative splicing products could be distinguished on agarose gels were confirmed by RT-PCR. Notably, lack of Ime4 did not affect global alternative splicing and no specific type of alternative splicing event was preferentially affected. However, alternative first exon (18% versus 33%) and mutually exclusive exon (2% versus 15%) events were reduced in Ime4null compared to a global breakdown of alternative splicing in wild-type Drosophila, mostly to the extent of retained introns (16% versus 6%), alternative donor (16% versus 9%) and unclassified events (14% versus 6%). Notably, the majority of affected alternative splicing events in Ime4null were located in the 5' UTR, and these genes had a significantly higher number of AUG start codons in their 5' UTR compared to the 5' UTRs of all genes. Such a feature has been shown to be relevant to translational control under stress conditions (Haussmann, 2016).

    The majority of the 163 differentially alternatively spliced genes in Ime4null females are broadly expressed (59%), while most of the remainder are expressed in the nervous system (33%), consistent with higher expression of Ime4 in this tissue. Accordingly, Gene Ontology analysis revealed a highly significant enrichment for genes involved in synaptic transmission (Haussmann, 2016).

    Since the absence of m6A affects alternative splicing, m6A marks are probably deposited co-transcriptionally before splicing. Co-staining of polytene chromosomes with antibodies against haemagglutinin (HA)-tagged Ime4 and RNA Pol II revealed broad co-localization of Ime4 with sites of transcription, but not with condensed chromatin visualized with antibodies against histone H4. Furthermore, localization of Ime4 to sites of transcription is RNA-dependent, as staining for Ime4, but not for RNA Pol II, was reduced in an RNase-dependent manner (Haussmann, 2016).

    Although m6A levels after G are low in Drosophila compared to other eukaryotes, broad co-localization of Ime4 to sites of transcription suggests profound effects on the gene expression landscape. Indeed, differential gene expression analysis revealed 408 differentially expressed genes where 234 genes were significantly upregulated and 174 significantly downregulated in neuron-enriched head/thorax of adult Ime4null females. Cataloguing these genes according to function reveals prominent effects on gene networks involved in metabolism, including reduced expression of 17 genes involved in oxidative phosphorylation. Notably, overexpression of the m6A mRNA demethylase FTO in mice leads to an imbalance in energy metabolism resulting in obesity (Haussmann, 2016).

    Next, whether either of the two substantially divergent YTH proteins, YT521-B and CG6422, decodes m6A marks in Sxl mRNA was tested. When transiently transfected into male S2 cells, YT521-B localizes to the nucleus, whereas CG6422 is cytoplasmic. Nuclear YT521-B can switch Sxl alternative splicing to the female mode and also binds to the Sxl intron in S2 cells. In vitro binding assays with the YTH domain of YT521-B demonstrate increased binding of m6A-containing RNA. In vivo, YT521-B also localizes to the sites of transcription (Haussmann, 2016).

    To further examine the role of YT521-B in decoding m6A, the Drosophila strain YT521-BMI02006, where a transposon in the first intron disrupts YT521-B, was analyzed. This allele is also viable, and phenocopies the flightless phenotype and the female Sxl splicing defect of Ime4null flies. Likewise, removal of maternal YT521-B together with zygotic heterozygosity for Sxl and YT521-B reduces female viability and results in sexual transformations such as male abdominal pigmentation. In addition, overexpression of YT521-B results in male lethality, which can be rescued by removal of Ime4, further reiterating the role of m6A in Sxl alternative splicing. Since YT521-B phenocopies Ime4 for Sxl splicing regulation, it is the main nuclear factor for decoding m6A present in the proximity of the Sxl-binding sites. YT521-B bound to m6A assists Sxl in repressing inclusion of the male-specific exon, thus providing robustness to this vital gene regulatory switch (Haussmann, 2016).

    Nuclear localization of m6A methylosome components suggested a role for this 'fifth' nucleotide in alternative splicing regulation. The discovery of the requirement of m6A and its reader YT521-B for female-specific Sxl alternative splicing has important implications for understanding the fundamental biological function of this enigmatic mRNA modification. Its key role in providing robustness to Sxl alternative splicing to prevent ectopic dosage compensation and female lethality, together with localization of the core methylosome component Ime4 to sites of transcription, indicates that the m6A modification is part of an ancient, yet unexplored mechanism to adjust gene expression. Hence, the recently reported role of m6A methylosome components in human dosage compensation further supports such a role and suggests that m6A-mediated adjustment of gene expression might be a key step to allow for the development of the diverse sex determination mechanisms found in nature (Haussmann, 2016).

    Early Divergence of the C-Terminal Variable Region of Troponin T Via a Pair of Mutually Exclusive Alternatively Spliced Exons Followed by a Selective Fixation in Vertebrate Heart

    Troponin T (TnT) is the thin filament anchoring subunit of the troponin complex and plays an organizer role in the Ca(2+)-regulation of striated muscle contraction. From an ancestral gene that emerged ~700 million years ago in Bilateria, three homologous genes have evolved in vertebrates to encode muscle type-specific isoforms of TnT. Alternative splicing variants of TnT are present in vertebrate and invertebrate muscles to add functional diversity. While the C-terminal region of TnT is largely conserved, it contains an alternatively spliced segment that emerged early in C. elegans, which has evolved into a pair of mutually exclusive exons in arthropods (10A and 10B of Drosophila TpnT gene) and vertebrates (16 and 17 of fast skeletal muscle Tnnt3 gene). The C-terminal alternatively spliced segment of TnT interfaces with the other two subunits of troponin with functional significance. The vertebrate cardiac TnT gene that emerged from duplication of the fast TnT gene has eliminated this alternative splicing by the fixation of an exon 17-like constitutive exon, indicating a functional value in slower and rhythmic contractions. The vertebrate slow skeletal muscle TnT gene that emerged from duplication of the cardiac TnT gene has the exon 17-like structure conserved, indicating its further function in sustained and fatigue resistant contractions. This functionality-based evolution is consistent with the finding that the exon 10B-encoded segment of Drosophila TnT, homologous to the exon 17-encoded segment of vertebrate fast TnT, is selectively expressed in insect heart and leg muscles. The evolution of the C-terminal variable region of TnT demonstrates a submolecular mechanism in modifying striated muscle contractility and for the treatment of muscle and heart diseases (Cao, 2022).

    Deep Splicer: A CNN Model for Splice Site Prediction in Genetic Sequences

    The genes of higher eukaryotic organisms contain coding sequences, known as exons, and non-coding sequences, known as introns, which are removed at splice sites after the DNA is transcribed into RNA. Genome annotation is the process of identifying the location of coding regions and determining their function. This process is fundamental for understanding gene structure; however, it is time-consuming and expensive when done by biochemical methods. With technological advances, splice site detection can be done computationally. Although various software tools have been developed to predict splice sites, they need to improve accuracy and reduce false-positive rates. The main goal of this research was to generate Deep Splicer, a deep learning model to identify splice sites in the genomes of humans and other species. This model has good performance metrics and a lower false-positive rate than the currently existing tools. Deep Splicer achieved an accuracy between 93.55% and 99.66% on the genetic sequences of different organisms, while Splice2Deep, another splice site detection tool, had an accuracy between 90.52% and 98.08%. Splice2Deep surpassed Deep Splicer on the accuracy obtained after evaluating C. elegans genomic sequences (97.88% vs. 93.62%) and A. thaliana (95.40% vs. 94.93%); however, Deep Splicer's accuracy was better for H. sapiens (98.94% vs. 97.15%) and D. melanogaster (97.14% vs. 92.30%). The rate of false positives was 0.11% for human genetic sequences and 0.25% for other species' genetic sequences. Another splice prediction tool, Splice Finder, had between 1% and 3% of false positives for human sequences, while other species' sequences had around 4% and 10% (Fernandez-Castillo, 2022).
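
    The accuracy and false-positive figures quoted above are ratios of classification counts. The following is a minimal sketch of how such metrics are commonly computed from a confusion matrix; the class name, function names and example counts are hypothetical and are not taken from the Deep Splicer study, whose exact definition of the false-positive rate may differ.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Counts from comparing predicted splice sites with annotated ones."""
    tp: int  # true sites predicted as sites
    fp: int  # non-sites predicted as sites
    tn: int  # non-sites predicted as non-sites
    fn: int  # true sites predicted as non-sites


def accuracy(c: ConfusionCounts) -> float:
    """Fraction of all predictions that are correct."""
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


def false_positive_rate(c: ConfusionCounts) -> float:
    """Fraction of true non-sites that were wrongly called splice sites."""
    return c.fp / (c.fp + c.tn)


# Hypothetical counts, for illustration only.
example = ConfusionCounts(tp=480, fp=5, tn=495, fn=20)
print(f"accuracy = {accuracy(example):.2%}")
print(f"false-positive rate = {false_positive_rate(example):.2%}")
```

    A tool can score well on accuracy while still producing many false positives when true splice sites are rare relative to candidate positions, which is why the two figures are reported separately.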

    ELAV/Hu RNA binding proteins determine multiple programs of neural alternative splicing

    ELAV/Hu factors are conserved RNA binding proteins (RBPs) that play diverse roles in mRNA processing and regulation. The founding member, Drosophila Elav, was recognized as a vital neural factor 35 years ago. Nevertheless, little was known about its impacts on the transcriptome, and potential functional overlap with its paralogs. Building on recent findings that neural-specific lengthened 3' UTR isoforms are co-determined by ELAV/Hu factors, this study addressed their impacts on splicing. While only a few splicing targets of Drosophila Elav are known, ectopic expression of each of the three family members (Elav, Fne and Rbp9) alters hundreds of cassette exon and alternative last exon (ALE) splicing choices. Reciprocally, double mutants of elav/fne, but not elav alone, exhibit opposite effects on both classes of regulated mRNA processing events in larval CNS. While manipulation of Drosophila ELAV/Hu RBPs induces both exon skipping and inclusion, characteristic ELAV/Hu motifs are enriched only within introns flanking exons that are suppressed by ELAV/Hu factors. Moreover, the roles of ELAV/Hu factors in global promotion of distal ALE splicing are mechanistically linked to terminal 3' UTR extensions in neurons, since both processes involve bypass of proximal polyadenylation signals linked to ELAV/Hu motifs downstream of cleavage sites. This study corroborates the direct action of Elav in diverse modes of mRNA processing using RRM-dependent Elav-CLIP data from S2 cells. Finally, evidence is provided for conservation in mammalian neurons, which undergo broad programs of distal ALE and APA lengthening, linked to ELAV/Hu motifs downstream of regulated polyadenylation sites. Overall, ELAV/Hu RBPs orchestrate multiple broad programs of neuronal mRNA processing and isoform diversification in Drosophila and mammalian neurons (Lee, 2021).

    Mammalian ELAV/Hu RBPs have been extensively connected to alternative splicing of cassette exons, but only to selected alternative polyadenylation (APA) events. In contrast, only a handful of Drosophila genes were known to be alternative splicing targets of Elav, of which only two loci (Dscam1 and arm) harbor regulated cassette exons. Thus, it was unclear to what extent there are conserved utilities of this RBP family in mRNA processing (Lee, 2021).

    This study shows that all three ELAV/Hu members specify hundreds of alternative splicing events. Endogenous relevance was shown by demonstrating that dual deletion of elav and fne causes reciprocal changes to splice isoform accumulation. Notably, this study revealed the endogenous breadth of splicing control by ELAV/Hu RBPs by analyzing dissected larval CNS, which contains more mature neurons than embryos and also removes the expression of non-neuronal isoforms outside of the nervous system from consideration. In particular, elav null L1-CNS has only mild effects on alternative splicing, despite its lethality, and analysis of fne nulls showed no effects on specific targets. Thus, the combined activity of ELAV/Hu RBPs, likely involving a hierarchical suppression of Fne nuclear localization via exon-exclusion of fne splicing by Elav, is critical to broadly determine neuronal mRNA isoforms (Lee, 2021).

    Until now, evidence for roles of Rbp9 in mRNA processing has been based on ectopic expression. Even though Drosophila ELAV/Hu RBPs exhibit distinct subcellular preferences, all of them exhibit similar binding capacities in vitro, and have overlapping regulatory capacities in ectopic assays. Since triple mutant larvae of Drosophila ELAV/Hu members could not be obtained, it was not possible to assay a nervous system devoid of this RBP family. This may require creative conditional genetics to achieve the requisite conditions, especially in pupal and/or adult stages, when Rbp9 is expressed at much higher levels in the nervous system (Lee, 2021).

    Substantial differences were observed in the flanking intronic content of exon classes that are regulated by ELAV/Hu RBPs. Their exclusion targets are substantially enriched for characteristic U-rich ELAV/Hu binding motifs, and have elevated Elav-CLIP signal, but such features were not observed with their inclusion targets. In general, little is known of the mechanism of splicing control by ELAV/Hu RBPs. In mammals, exclusion of a Fas cassette exon by HuR was reported to involve competition with U2AF65 at the upstream 3' splice site. A competition model is potentially consistent with the fly data, since substantially higher density of ELAV/Hu RBP motifs was observed upstream of excluded exons. However, this study also observed enrichment of ELAV/Hu RBP motifs downstream, although to a lesser extent. Exons that are preferentially included in the presence of ELAV/Hu members might still depend on binding that is below the sensitivity of these analyses. Another possibility is that these exons might involve additional regulatory factors, which is hinted at by enrichment for A-rich motifs located downstream of regulated exons. It was noted that PABP, PABP2 (PABPN1), ZC3H14/dNab2, and hnRNP-Q (Syncrip) proteins associate with qualitatively similar A-rich motifs, and include known neuronal splicing regulators. The discovery of extensive ELAV/Hu-mediated cassette exon targets, including the finding that individual ELAV/Hu proteins can robustly induce exon exclusion and inclusion in an ectopic context, provides a framework for future mechanistic dissection (Lee, 2021).
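
    The comparison of motif density upstream and downstream of regulated exons can be illustrated with a simple count of motif occurrences per kilobase of flanking intron. This is only a sketch under stated assumptions: the study used de novo motif discovery and Elav-CLIP maps, whereas the function name, the placeholder pattern `UUUUU` and the toy sequences below are hypothetical.

```python
import re


def motif_density_per_kb(seq: str, motif_regex: str) -> float:
    """Overlapping motif occurrences per 1000 nt of sequence.

    DNA input is accepted; T is converted to U before matching.
    """
    rna = seq.upper().replace("T", "U")
    if not rna:
        return 0.0
    # A zero-width lookahead counts overlapping matches, one per start position.
    hits = len(re.findall(f"(?=(?:{motif_regex}))", rna))
    return 1000.0 * hits / len(rna)


# Placeholder U-rich pattern and toy flanking sequences (assumptions).
U_RICH = "UUUUU"
upstream_intron = "ACGUUUUUUACGUUUUUAGC"
downstream_intron = "ACGUACGGCAUCGAUCGGAC"

print(motif_density_per_kb(upstream_intron, U_RICH))
print(motif_density_per_kb(downstream_intron, U_RICH))
```

    Comparing such densities between excluded exons, included exons and unregulated control exons is one way to express the enrichment described above; a real analysis would also need a statistical test and a matched background set.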

    Many studies in the literature have treated ALEs and tandem UTRs separately, since ALEs may be regulated by splicing while tandem UTRs are only regulated by alternative polyadenylation. Nevertheless, distal ALE and downstream tandem APA usage are correlated in mammals, with directionality toward more distal/longer isoforms in neurons. The underlying mechanisms have not been specifically defined. It is known that telescripting via U1 snRNP suppresses premature 3'-end cleavage and polyadenylation. While this can occur in intronic regions and terminal 3' UTRs, the dominant usage of this mechanism seems to be for U1 to inhibit the usage of cryptic polyadenylation signals that are especially abundant within long introns, and U1/telescripting has not yet been shown to have a broad impact on endogenous tissue-specific implementation of 3' isoforms (Lee, 2021).

    Drosophila Elav was linked to both isoform regulatory programs, since it was originally shown to promote distal ALE switching by suppressing 3' end usage of proximal internal last exons at ewg and nrg and later shown to mediate neuronal 3' extension of select loci. Likewise, regulation of APA was shown for all four Hu proteins in suppressing an intronic polyA site in the calcitonin/CGRP gene, and HuR autoregulates by APA. In addition, HuR regulates 3'-end processing of several membrane proteins. These individual cases raise the possibility that ELAV/Hu RBPs may coregulate these programs (Lee, 2021).

    This work has established that the three Drosophila ELAV/Hu members (Elav/Fne/Rbp9) are individually sufficient to induce the neural extended 3' UTR landscape, and that endogenous overlapping activities of Drosophila Elav and its paralog Fne are critical to determine the extended 3' UTR landscape of the larval CNS, as also shown in the embryo. This study extends this to reveal broad catalogs of directional alternative last exon (ALE) isoform switches by ELAV/Hu factors. Using mechanistic tests and genomic analyses of de novo motifs and RRM-dependent Elav-CLIP maps, this study was able to unify the rationale for distinct neuronal mRNA processing programs. In particular, Drosophila ELAV/Hu RBPs are necessary and sufficient to specify broad switching to distal alternative last exons, analogous to broad lengthening of terminal 3' UTRs via usage of distal pA sites. In both settings, ELAV/Hu RBPs suppress proximal pA sites via U-rich sequences/ELAV motifs downstream of cleavage sites, and promote distal isoform usage by acting within newly-synthesized, chromatin-associated transcripts. Since this study also found that ELAV/Hu proteins are broadly involved in exon exclusion, via overt enrichment of their sites near regulated exons, broad analogies are suggested for ELAV/Hu RBPs to promote isoform diversity by suppression of processing sites used outside of the nervous system (Lee, 2021).

    Importantly, it is suggested that a similar regulatory rationale applies to the implementation of both neuronal ALE and APA in mammalian neurons. In particular, this study provides evidence that ELAV/Hu RBPs are poised to regulate both classes of 3' ends using similar mechanisms (i.e. polyA bypass mediated through U-rich sequences). Mammalian ELAV/Hu factors are well-known to mediate diverse regulatory outputs, ranging from mRNA stability and translation, to splicing and terminal APA regulation of selected loci. However, they are not yet documented to have broad roles in directional selection of alternative last exons or pA sites within terminal 3' UTRs. These genomic analyses now lend strong support to this notion (Lee, 2021).

    Given that Elav paralogs have strongly compensatory activity that masks the effects of single elav mutants, and only double mutants of mammalian neural Elav factors have been examined to date, it is suggested that other multiple-knockout conditions may reveal greater collective impacts of ELAV/Hu factors on the neural transcriptome. More generally, the data argue that these classes of 3' ends can be broadly coregulated and that they may be just two versions of the same process (with splicing playing a comparatively minor role in ALE regulation compared to polyadenylation). This may underlie the observation that global ALE-APA and TUTR-APA utilization are broadly correlated in mammals, and may be coregulated by other RBPs (Lee, 2021).

    Extensive cross-regulation of post-transcriptional regulatory networks in Drosophila

    In eukaryotic cells, RNAs exist as ribonucleoprotein particles (RNPs). Despite the importance of these complexes in many biological processes including splicing, polyadenylation, stability, transportation, localization, and translation, their compositions are largely unknown. Twenty distinct RNA binding proteins (RBPs) were immunopurified from cultured Drosophila melanogaster cells under native conditions, and both the RNA and protein compositions of these RNP complexes were determined. "High occupancy target" (HOT) RNAs were identified that interact with the majority of the RBPs surveyed. HOT RNAs encode components of the nonsense-mediated decay and splicing machinery as well as RNA binding and translation initiation proteins. The RNP complexes contain proteins and mRNAs involved in RNA binding and post-transcriptional regulation. Genes with the capacity to produce hundreds of mRNA isoforms, ultra-complex genes, interact extensively with heterogeneous nuclear ribonucleoproteins (hnRNPs). These data are consistent with a model in which subsets of RNPs include mRNA and protein products from the same gene, indicating the widespread existence of auto-regulatory RNPs. From the simultaneous acquisition and integrative analysis of protein and RNA constituents of RNPs this study identified extensive cross-regulatory and hierarchical interactions in post-transcriptional control (Stoiber, 2015).
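
    The definition of HOT RNAs as transcripts bound by the majority of the surveyed RBPs can be expressed as a simple occupancy count. The sketch below assumes that "majority" means strictly more than half of the RBPs; the function name, the threshold parameter and the toy RBP and RNA identifiers are hypothetical, and the study's actual criteria may have been more elaborate.

```python
from collections import Counter
from typing import Dict, Iterable, Set


def high_occupancy_targets(
    targets_by_rbp: Dict[str, Iterable[str]],
    min_fraction: float = 0.5,
) -> Set[str]:
    """Return RNAs bound by more than `min_fraction` of the surveyed RBPs."""
    n_rbps = len(targets_by_rbp)
    if n_rbps == 0:
        return set()
    occupancy = Counter()
    for targets in targets_by_rbp.values():
        # Deduplicate so each RBP contributes at most one count per RNA.
        occupancy.update(set(targets))
    return {rna for rna, k in occupancy.items() if k / n_rbps > min_fraction}


# Toy immunopurification results (hypothetical identifiers).
toy_targets = {
    "RBP_A": ["rna1", "rna2", "rna3"],
    "RBP_B": ["rna1", "rna2"],
    "RBP_C": ["rna1", "rna4"],
    "RBP_D": ["rna3"],
}
print(sorted(high_occupancy_targets(toy_targets)))
```

    In this toy input only `rna1` is bound by more than half of the four RBPs; `rna2` and `rna3` are each bound by exactly half and therefore fall below the strict majority threshold.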

    Drosophila Nmnat functions as a switch to enhance neuroprotection under stress

    Nicotinamide mononucleotide adenylyltransferase (NMNAT) is a conserved enzyme in the NAD synthetic pathway. It has also been identified as an effective and versatile neuroprotective factor. However, it remains unclear how healthy neurons regulate the dual functions of NMNAT and achieve self-protection under stress. This study shows that Drosophila Nmnat (DmNmnat) is alternatively spliced into two mRNA variants, RA and RB, which translate to protein isoforms with divergent neuroprotective capacities against spinocerebellar ataxia 1-induced neurodegeneration. Isoform PA/PC translated from RA is nuclear-localized with minimal neuroprotective ability, and isoform PB/PD translated from RB is cytoplasmic and has robust neuroprotective capacity. Under stress, RB is preferably spliced in neurons to produce the neuroprotective PB/PD isoforms. These results indicate that alternative splicing functions as a switch that regulates the expression of functionally distinct DmNmnat variants. Neurons respond to stress by driving the splicing switch to produce the neuroprotective variant and therefore achieve self-protection (Ruan, 2015).

    Alternative splicing within and between Drosophila species, sexes, tissues, and developmental stages

    Alternative pre-mRNA splicing ("AS") greatly expands proteome diversity. The transcriptomes from several tissues and developmental stages were studied in males and females from four species across the Drosophila genus. 20-37% of multi-exon genes were found to be alternatively spliced. While males generally express a larger number of genes, AS is more prevalent in females, suggesting that the sexes adopt different expression strategies for their specialized function. The proportion of expressed genes that are alternatively spliced is highest in the very early embryo, before the onset of zygotic transcription. This indicates that females deposit a diversity of isoforms into the egg, consistent with abundant AS found in ovary. Cluster analysis by gene expression levels shows mostly stage-specific clustering in embryonic samples, and tissue-specific clustering in adult tissues. Clustering embryonic stages and adult tissues based on AS profiles results in stronger species-specific clustering, suggesting that diversification of splicing contributes to lineage-specific evolution in Drosophila. Most sex-biased AS found in flies is due to AS in gonads, with little sex-specific splicing in somatic tissues (Gibilisco, 2016).

    Protein composition of catalytically active U7-dependent processing complexes assembled on histone pre-mRNA containing biotin and a photo-cleavable linker

    3' end cleavage of metazoan replication-dependent histone pre-mRNAs requires the multi-subunit holo-U7 snRNP and the stem-loop binding protein (SLBP). The exact composition of the U7 snRNP and details of SLBP function in processing remain unclear. To identify components of the U7 snRNP in an unbiased manner, a novel approach was developed for purifying processing complexes from Drosophila and mouse nuclear extracts. In this method, catalytically active processing complexes are assembled in vitro on a cleavage-resistant histone pre-mRNA containing biotin and a photo-sensitive linker, and eluted from streptavidin beads by UV irradiation for direct analysis by mass spectrometry. In the purified processing complexes, Drosophila and mouse U7 snRNP have a remarkably similar composition, always being associated with CPSF73, CPSF100, symplekin and CstF64. Many other proteins previously implicated in the U7-dependent processing are not present. Drosophila U7 snRNP bound to histone pre-mRNA in the absence of SLBP contains the same subset of polyadenylation factors but is catalytically inactive and addition of recombinant SLBP is sufficient to trigger cleavage. This result suggests that Drosophila SLBP promotes a structural rearrangement of the processing complex, resulting in juxtaposition of the CPSF73 endonuclease with the cleavage site in the pre-mRNA substrate (Skrajna, 2018).

    In metazoans, 3' end processing of replication-dependent histone pre-mRNAs occurs through a single endonucleolytic cleavage, generating mature histone mRNAs that lack a poly(A) tail. This specialized 3' end processing reaction depends on the U7 snRNP, the core of which consists of a ~60-nt U7 snRNA and a unique heptameric Sm ring. In the ring, the spliceosomal subunits SmD1 and SmD2 are replaced by the related Lsm10 and Lsm11 proteins, whereas the remaining subunits (SmB, SmD3, SmE, SmF and SmG) are shared with the spliceosomal snRNPs (Skrajna, 2018).

    Lsm11 contains an extended N-terminal region that interacts with the N-terminal region of the 220 kDa protein FLASH. Together, they recruit a specific subset of the proteins that participate in 3' end processing of canonical pre-mRNAs by cleavage and polyadenylation, resulting in formation of the holo-U7 snRNP (Skrajna, 2014). This subset of polyadenylation factors is referred to as the histone pre-mRNA cleavage complex (HCC) and in mammalian nuclear extracts includes symplekin, all subunits of CPSF (CPSF160, WDR33, CPSF100, CPSF73, Fip1 and CPSF30) and CstF64 as the only CstF subunit. The remaining components of the cleavage and polyadenylation machinery, including CstF50 and CstF77, the two CF Im subunits of 68 and 25 kDa, and the two subunits of CF IIm (Clp1 and Pcf11) were consistently absent in the HCC. A similar subset of polyadenylation factors is associated with the Drosophila holo-U7 snRNP (Skrajna, 2018 and references therein).

    The substrate specificity in the processing reaction is provided by the U7 snRNA, which through its 5' terminal region base pairs with the histone downstream element (HDE), a sequence in histone pre-mRNA located downstream of the cleavage site. This interaction is assisted by the stem-loop binding protein (SLBP), which binds the highly conserved stem-loop structure located upstream of the cleavage site (Wang, 1996; Martin, 1997; Tan, 2013) and stabilizes the complex of U7 snRNP with histone pre-mRNA (Dominski, 1999), likely by contacting FLASH and Lsm11 (Skrajna, 2017). In mammalian nuclear extracts, histone pre-mRNAs that form a strong duplex with the U7 snRNA are cleaved efficiently in the absence of SLBP. In contrast, Drosophila nuclear extracts lacking SLBP are inactive in cleaving histone pre-mRNAs, suggesting that Drosophila SLBP plays an essential role in processing in addition to stabilizing binding of the U7 snRNP to histone pre-mRNA (Skrajna, 2018 and references therein).

    Within the HCC, CPSF73 is the endonuclease, acting in a close partnership with its catalytically inactive homolog, CPSF100, and the heat-labile scaffolding protein symplekin. RNAi-mediated depletion of these three HCC subunits in Drosophila cultured cells results in generation of polyadenylated histone mRNAs, an indication of their essential role in the U7-dependent processing. Depletion of the remaining components of the HCC had no effect on the 3' end of histone mRNAs and their function in the U7 snRNP, if any, is less clear. Previous in vivo studies implicated multiple other proteins, in addition to SLBP and components of the U7 snRNP, in generation of correctly processed histone pre-mRNAs. These proteins include ZFP100, CDC73/parafibromin, NELF E, Ars2, CDK9, CF Im68 and RNA-binding protein FUS/TLS (Fused in Sarcoma/Translocated in Sarcoma). ZFP100, CF Im68 and FUS were shown to interact with Lsm11, whereas Ars2 was shown to interact with FLASH, raising the possibility that they may be essential components of the cleavage machinery (Skrajna, 2018).

    To determine which factors are required for the cleavage reaction, a novel method for purification of in vitro assembled Drosophila and mouse processing complexes was developed. In this method, histone pre-mRNAs containing biotin and a photo-cleavable linker in either cis or trans are incubated with a nuclear extract and the assembled processing complexes are immobilized on streptavidin beads, washed and released into solution by irradiation with long wave UV. This approach yielded remarkably pure processing complexes that were suitable for direct and unbiased analysis by mass spectrometry, providing a complete view of the holo-U7 snRNP and other proteins that associate with histone pre-mRNA for 3' end processing (Skrajna, 2018).

    In this method, processing complexes were assembled in a nuclear extract on a synthetic histone pre-mRNA containing biotin and a photo-cleavable linker at the 5' end. The major cleavage site and the two neighboring nucleotides on each side were modified with a 2'O-methyl group, hence preventing endonucleolytic cleavage of the pre-mRNA and increasing the efficiency of capturing intact processing complexes. Following immobilization on streptavidin beads, the pre-mRNA and the bound proteins were washed and released to solution by irradiation with long wave UV. This UV-elution step, by eliminating all background proteins non-specifically bound to streptavidin beads, resulted in isolation of remarkably pure processing complexes that were suitable for direct analysis by mass spectrometry. This is the first successful use of the photo-cleavable linker and the UV-elution step for purification of an in vitro assembled RNA/protein complex. Parallel experiments with pre-mRNA substrates lacking 2'O-methyl nucleotides at the cleavage site demonstrated that the immobilized processing complexes retain catalytic activity. Thus, the mass spectrometry analysis of the UV-eluted material is likely to provide a global and unbiased view of all essential proteins that associate with histone pre-mRNA for 3' end processing (Skrajna, 2018).

    Since chemical synthesis of RNAs containing covalently attached biotin and the photo-cleavable linker (cis configuration) is both expensive and limited to sequences not exceeding 60-70 nt, longer histone pre-mRNAs generated by T7 transcription were tested. Biotin and the photo-cleavable linker can be attached to the 3' end of these pre-mRNAs in trans via a short complementary oligonucleotide. This modification makes the UV-elution method more cost effective and potentially applicable for purification of RNA-protein complexes that require longer RNA binding targets, including spliceosomes and complexes involved in cleavage and polyadenylation (Skrajna, 2018).

    In the UV-eluted mouse and Drosophila processing complexes, mass spectrometry identified SLBP and all known subunits of the U7-specific Sm ring, including Lsm10 and Lsm11. Readily detectable in mouse and Drosophila processing complexes were also FLASH and subunits of the HCC. The HCC is remarkably similar in composition between the two species, with symplekin, CPSF100, CPSF73 and CstF64 being most abundant and present in close to stoichiometric amounts, as determined by both silver staining and emPAI value analysis. The remaining CPSF subunits (CPSF160, WDR33, Fip1 and CPSF30) are present in lower amounts, suggesting that they are substoichiometric, being stably associated only with a fraction of the U7 snRNP (Skrajna, 2018).
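
    The emPAI (exponentially modified protein abundance index) mentioned above estimates relative protein abundance from mass spectrometry data. In its standard definition, emPAI = 10^(observed peptides / observable peptides) - 1, and the approximate molar fraction of a protein is its emPAI divided by the sum of emPAI values in the sample. The sketch below applies that definition; the function names and peptide counts are invented for illustration and are not data from the study, whose software settings are not stated here.

```python
from typing import Dict, Tuple


def empai(observed_peptides: int, observable_peptides: int) -> float:
    """emPAI = 10 ** (observed / observable) - 1."""
    if observable_peptides <= 0:
        raise ValueError("observable_peptides must be positive")
    return 10 ** (observed_peptides / observable_peptides) - 1


def molar_fractions(peptides: Dict[str, Tuple[int, int]]) -> Dict[str, float]:
    """Approximate molar fraction of each protein from (observed, observable) counts."""
    values = {name: empai(obs, total) for name, (obs, total) in peptides.items()}
    total_empai = sum(values.values())
    if total_empai == 0:
        return {name: 0.0 for name in values}
    return {name: v / total_empai for name, v in values.items()}


# Hypothetical (observed, observable) peptide counts, for illustration only.
toy_counts = {
    "protein_A": (20, 40),
    "protein_B": (15, 30),
    "protein_C": (3, 30),
}
for name, fraction in molar_fractions(toy_counts).items():
    print(f"{name}: {fraction:.1%}")
```

    Proteins with similar ratios of observed to observable peptides receive similar emPAI values, which is the sense in which subunits can be described as present in close to stoichiometric amounts, while a markedly lower value points to a substoichiometric component.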

    In both mouse and Drosophila experiments, SLBP and the components of the U7 snRNP were the only proteins that consistently failed to bind histone pre-mRNAs in the presence of processing competitors: SL RNA and αU7 oligonucleotide. Other proteins were detected both in the samples containing processing complexes and in the matching negative controls, where formation of processing complexes was blocked. Among them, the most prevalent were non-specific RNA binding proteins, including hnRNP Q in mouse nuclear extracts, and IGF2BP1 in Drosophila nuclear extracts. All these proteins likely bind to sites in histone pre-mRNAs unoccupied by SLBP and U7 snRNP, and play no essential role in processing (Skrajna, 2018).

    CstF50 and CstF77 were not detected in the UV-eluted mouse processing complexes and were present only in some Drosophila complexes, always with low scores, consistent with a previous conclusion that of the three CstF subunits only CstF64 stably associates with the U7 snRNP. No peptides were detected for CF Im (68 and 25 kDa) and CF IIm (Clp1 and Pcf11) in any of the mouse experiments, suggesting that these factors are also uniquely involved in cleavage and polyadenylation. Mass spectrometry identified the orthologues of the 68 and 25 kDa subunits in some Drosophila experiments, but they were clearly contaminants, persisting in the presence of the SL RNA and αU7 oligonucleotide. CF Im68 was previously reported to interact with Lsm11 and to co-purify with U7 snRNP. Based on this analysis, this subunit is unlikely to interact with Lsm11 in the processing complex (Skrajna, 2018).

    Catalytically active mouse processing complexes also lacked ZFP100 (ZN473), a zinc finger protein that co-localizes with Lsm11 and stimulates expression of a reporter gene containing U7-dependent processing signals. ZFP100 was initially identified by the yeast two-hybrid system as a protein interacting with SLBP bound to the SL RNA and suggested to function as a bridging factor in the SLBP-mediated recruitment of the U7 snRNP to histone pre-mRNA. However, the absence of ZFP100 in the UV-eluted mouse processing complexes containing both SLBP and U7 snRNP strongly argues against this function. ZFP100 may instead participate in a different aspect of histone gene expression in vivo, perhaps acting as a coupling factor that integrates transcription of histone genes with 3' end processing of the nascent histone pre-mRNAs (Skrajna, 2018).

    A similar role in vivo may be played by the multi-functional protein FUS and other proteins previously linked to 3' end processing of histone pre-mRNAs in mammalian cells, including Ars2, CDC73/parafibromin, NELF E and CDK9. These factors were never specifically detected in the UV-eluted mouse processing complexes, suggesting that they have no direct role in processing in vitro. Their downregulation by RNAi results in production of a small amount of polyadenylated histone mRNAs, which may be due to a defect in coupling of histone gene transcription with processing and/or cell-cycle progression (Skrajna, 2018).

    Although this study identified several polyadenylation subunits in a stable association with the U7 snRNP, the experiments do not directly address which of them are essential for processing of histone pre-mRNAs. In Drosophila cultured cells, RNAi-mediated depletion of each of only three U7-associated polyadenylation subunits, symplekin, CPSF100 and CPSF73, consistently resulted in accumulation of histone mRNAs terminated with a poly(A) tail, an indication of a defect in the U7-dependent processing mechanism. Depletion of the remaining HCC subunits had no effect, suggesting that their association with the U7 snRNP is not essential for 3' end processing of histone pre-mRNAs. Symplekin, CPSF100 and CPSF73 are present in Drosophila cells as a stable sub-complex and likely act together as an autonomous cleavage module recruited for processing to either histone or canonical pre-mRNAs by specialized RNA recognition sub-complexes. For canonical pre-mRNAs, this role is played by the remaining CPSF subunits, CPSF160, WDR33, Fip1 and CPSF30, recently shown to co-operate in recognizing the AAUAAA signal during the polyadenylation step. In 3' end processing of histone pre-mRNAs, the recruitment of the cleavage sub-complex is mediated by the U7 snRNA, which recognizes the substrate by the base pairing interaction, further arguing that CPSF160, WDR33, Fip1 and CPSF30 are likely non-essential bystanders in the U7 snRNP (Skrajna, 2018).

    A less clear role in 3' end processing of histone pre-mRNAs is played by CstF64, which in spite of being relatively abundant in Drosophila U7 snRNP can be depleted from Drosophila cells without causing a detectable misprocessing of histone pre-mRNAs. A defect in the U7-dependent processing was however observed in human cells partially depleted of CstF64, suggesting that in mammalian cells this subunit may play a more critical role, perhaps helping to stabilize the three essential subunits of the HCC on the FLASH/Lsm11 complex. Clearly, determining which subunits are essential for cleavage will require reconstitution of a catalytically active processing complex from recombinant components (Skrajna, 2018).

    This study brings a new perspective on the essential role of Drosophila SLBP in processing. It was recently demonstrated that Drosophila SLBP, like its mammalian counterpart, enhances the recruitment of U7 snRNP to histone pre-mRNA. A small amount of U7 snRNP binds to histone pre-mRNA in the absence of Drosophila SLBP but the bound U7 snRNP, in spite of containing all major HCC subunits, is catalytically inactive. This study now shows that processing complexes assembled in the absence of SLBP can be activated for cleavage by simply adding recombinant WT SLBP, providing evidence that SLBP is the only missing factor in the assembled complexes. A mutant Drosophila SLBP that is deficient in recruiting U7 snRNP to histone pre-mRNA is also unable to activate the assembled complex for cleavage. Based on these results, it is proposed that the interaction of Drosophila SLBP with the U7 snRNP promotes an essential structural rearrangement of the entire processing complex that juxtaposes the catalytic site of CPSF73 with the pre-mRNA (see A hypothetical model explaining essential role of Drosophila in processing). It is possible that higher metazoans developed an additional positioning mechanism for the CPSF73 endonuclease, resulting in efficient cleavage in the absence of SLBP (Skrajna, 2018).

    Sex-specific transcript diversity in the fly head is established during pupal stages and adulthood and is largely independent of the mating process and the germline

    Alternative splicing (AS), the process which generates multiple RNA and protein isoforms from a single pre-mRNA, greatly contributes to transcript diversity and compensates for the fact that the gene number does not scale with organismal complexity. A number of genomic approaches have established that the extent of AS is much higher than previously expected, raising questions on its spatio-temporal regulation and function. The present study addresses AS in the context of sex-specific neuronal development in the model Drosophila melanogaster. At least 47 genes display sex-specific AS in the adult fly head. Unlike targets of the classical Sex lethal-dependent sex determination cascade, sex-specific isoforms of the vast majority of these genes are not present during larval development but start accumulating during metamorphosis or later, indicating the existence of novel mechanisms in the induction of sex-specific AS. It was also established that sex-specific AS in the adult fly head is largely independent of the germline or the mating process. Finally, the role of sex-specific AS of the sulfotransferase Tango13 pre-mRNA was investigated and first evidence is provided that differential expression of certain isoforms of this protein significantly affects courtship and mating behavior in male flies (Mohr, 2017).

    The Y chromosome modulates splicing and sex-biased intron retention rates in Drosophila

    The Drosophila Y chromosome is a 40 Mb segment of mostly repetitive DNA; it harbors a handful of protein coding genes and a disproportionate amount of satellite repeats, transposable elements, and multicopy DNA arrays. Intron retention (IR) is a type of alternative splicing (AS) event by which one or more introns remain within the mature transcript. IR recently emerged as a deliberate cellular mechanism to modulate gene expression levels and has been implicated in multiple biological processes. However, the extent of sex differences in IR and the contribution of the Y chromosome to the modulation of alternative splicing and intron retention rates have not been addressed. This study showed pervasive IR in the fruit fly Drosophila melanogaster with thousands of novel IR events, hundreds of which displayed extensive sex-bias. The data also revealed an unsuspected role for the Y chromosome in the modulation of alternative splicing and intron retention. The majority of sex-biased IR events introduced premature termination codons and the magnitude of sex-bias was associated with gene expression differences between the sexes. Surprisingly, an extra Y chromosome in males (X^YY genotype) or the presence of a Y chromosome in females (X^XY genotype) significantly modulated IR and recapitulated natural differences in IR between the sexes. These results highlight the significance of sex-biased IR in tuning sex differences and the role of the Y chromosome as a source of variable IR rates between the sexes. Modulation of splicing and intron retention rates across the genome represents a new and unexpected outcome of the Drosophila Y chromosome (Wang, 2018).

    The kinetics of pre-mRNA splicing in the Drosophila genome and the influence of gene architecture

    Production of most eukaryotic mRNAs requires splicing of introns from pre-mRNA. The splicing reaction requires definition of splice sites, which are initially recognized in either intron-spanning ('intron definition') or exon-spanning ('exon definition') pairs. To understand how exon and intron length and splice site recognition mode impact splicing, splicing rates were measured genome-wide in Drosophila, using metabolic labeling/RNA sequencing and new mathematical models to estimate rates. The modal intron length range of 60-70 nt was found to represent a local maximum of splicing rates, but much longer exon-defined introns are spliced even faster and more accurately. Unexpectedly low variation was observed in splicing rates across introns in the same gene, suggesting the presence of gene-level influences, and multiple gene level variables associated with splicing rate were identified. Together these data suggest that developmental and stress response genes may have preferentially evolved exon definition in order to enhance the rate or accuracy of splicing (Pai, 2017).

    piRNA-mediated regulation of transposon alternative splicing in the soma and germ line

    Transposable elements can drive genome evolution, but their enhanced activity is detrimental to the host and therefore must be tightly regulated. The Piwi-interacting small RNA (piRNA) pathway is vital for the regulation of transposable elements, by inducing transcriptional silencing or post-transcriptional decay of mRNAs. This study shows that piRNAs and piRNA biogenesis components regulate precursor mRNA splicing of P-transposable element transcripts in vivo, leading to the production of the non-transposase-encoding mature mRNA isoform in Drosophila germ cells. Unexpectedly, it was shown that the piRNA pathway components do not act to reduce transcript levels of the P-element transposon during P-M hybrid dysgenesis, a syndrome that affects germline development in Drosophila. Instead, splicing regulation is mechanistically achieved together with piRNA-mediated changes to repressive chromatin states, and relies on the function of the Piwi-piRNA complex proteins Asterix (also known as Gtsf1) and Panoramix (Silencio), as well as Heterochromatin protein 1a [HP1a; encoded by Su(var)205]. Furthermore, this machinery, together with the piRNA Flamenco cluster, not only controls the accumulation of Gypsy retrotransposon transcripts but also regulates the splicing of Gypsy mRNAs in cultured ovarian somatic cells, a process required for the production of infectious particles that can lead to heritable transposition events. These findings identify splicing regulation as a new role and essential function for the Piwi pathway in protecting the genome against transposon mobility, and provide a model system for studying the role of chromatin structure in modulating alternative splicing during development (Teixeira, 2017).

    Hybrid dysgenesis is a syndrome that affects progeny in a non-reciprocal fashion, being normally restricted to the offspring of crosses in which males carry transposable elements but which females lack. In Drosophila, the dysgenic traits triggered by the P-element DNA transposon are restricted to the germ line and include chromosomal rearrangements, high rates of mutation, and sterility. The impairment is most prominent when hybrids are grown at higher temperatures, with adult dysgenic females being completely sterile at 29°C. Despite the severe phenotypes, little is known about the development of germ cells during P-M dysgenesis. To address this, germline development was characterized in the progeny obtained from reciprocal crosses between w1118 (P-element-devoid strain) and Harwich (P-element-containing strain) flies at 29°C. In non-dysgenic progeny, germline development progressed normally throughout embryonic and larval stages, leading to fertile adults. Although the development of dysgenic germline cells was not disturbed during embryogenesis, germ cells decreased in number during early larval stages, leading to animals with no germ cells by late larval stages. These results indicate that the detrimental effects elicited by P-element activity are triggered early on during primordial germ cell (PGC) development in dysgenic progeny, leading to premature germ cell death (Teixeira, 2017).

    Maternally deposited small RNAs cognate to the P-element are thought to provide the 'P-cytotype' by conferring the transgenerationally inherited ability to protect developing germ cells against P-elements. Small RNA-based transposon regulation is typically mediated by either transcriptional silencing or post-transcriptional clearance of mRNAs, both of which result in a decrease in the accumulation of transposon mRNA. To understand how maternally provided small RNAs control P-elements in germ cells, this study focused on embryonic PGCs sorted from 4- to 20-h-old embryos generated from reciprocal crosses between w1118 and Harwich strains. Surprisingly, the accumulation of P-element RNA as measured by quantitative reverse transcription PCR (RT-qPCR) showed no change in dysgenic PGCs when compared to non-dysgenic PGCs. This indicates that P-cytotype small RNAs exert their function by means other than regulating P-element mRNA levels (Teixeira, 2017).

    P-element activity relies on production of a functional P-element transposase protein, the expression of which requires precursor mRNA (pre-mRNA) splicing of three introns. To analyse P-element RNA splicing in germ cells during hybrid dysgenesis, primers were designed that specifically anneal to spliced mRNA transcripts. The accumulation of spliced forms for the first two introns (IVS1 and IVS2) did not show changes in dysgenic PGCs when compared to non-dysgenic PGCs. By contrast, the accumulation of spliced transcripts for the third intron (IVS3) was substantially increased in dysgenic germ cells. Given that the overall accumulation of P-element mRNA showed no changes, the results indicate that the maternally provided P-cytotype can negatively regulate P-element IVS3 splicing and therefore inhibits the production of functional P-transposase in germ cells (Teixeira, 2017).

    Analysis of publicly available small RNA sequencing data from 0-2-h-old embryos laid by Harwich females indicated that two classes of small RNAs cognate to the P-element are maternally transmitted: small interfering RNAs (siRNAs, 20-22 nucleotides long) and piRNAs (23-29 nucleotides long). To test the role of distinct small RNA populations on P-element expression, mutants uniquely affecting each small RNA biogenesis pathway were analyzed in the Harwich background. Mutations that disrupt siRNA biogenesis components Dicer-2 (Dcr-2) and Argonaute 2 (AGO2), or mutations ablating components of the piRNA biogenesis pathway, such as the Argonautes piwi, aubergine (aub), and Argonaute 3 (AGO3), as well as the RNA helicases vasa (vas) and spindle E (spn-E), did not affect P-element mRNA accumulation in adult ovaries as measured by RT-qPCR. However, mutations that disrupted piRNA biogenesis, and not the siRNA pathway, led to a strong and specific increase in the accumulation of IVS3-spliced mRNAs. RNA sequencing (RNA-seq) analysis on poly(A)-selected RNAs from aub and piwi mutant adult ovaries confirmed the specific effect on IVS3 splicing. To examine transposon expression in tissue, RNA fluorescent in situ hybridization (FISH) was performed using probes specific for the P-element and for the Burdock retrotransposon, a classic target of the germline piRNA pathway. In mutants affecting piRNA biogenesis, increased abundance of Burdock RNA was readily observed in germline tissues, with most of the signal accumulating close to the oocyte. By contrast, no difference was detected in the P-element RNA FISH signal in piRNA biogenesis mutants compared to control. Nuclear RNA foci observed in nurse cells were of similar intensity and number regardless of the genotype, and cytoplasmic signal showed no detectable difference. Therefore, the results indicate that in germ cells, piRNAs specifically modulate IVS3 splicing. This regulation is reminiscent of the well-documented mechanism that restricts P-element activity to germline tissues, which involves the expression of a host-encoded RNA binding repressor protein that negatively regulates IVS3 splicing in somatic tissues (Teixeira, 2017).

    In somatic tissues, P-element alternative splicing regulation is mediated by the assembly of a splicing repressor complex on an exonic splicing silencer element directly upstream of IVS3. To test whether the P-element IVS3 and flanking exon sequences were sufficient to trigger the piRNA-mediated splicing regulation in germ cells, a transgenic reporter system for IVS3 splicing was used in which a heterologous promoter (Hsp83) drives the expression of an IVS3-lacZ-neo fusion mRNA specifically in the germ line. Using RT-qPCR, the F1 progeny from reciprocal crosses between w1118 and Harwich flies were analyzed in the presence of the hsp83-IVS3-lacZ-neo reporter. The fraction of spliced mRNAs produced from the transgenic reporter was substantially increased in dysgenic compared to non-dysgenic adult ovaries, in agreement with previously reported results. Most importantly, genetic experiments confirmed that the repression of IVS3 splicing in germ cells relies on piRNA biogenesis, as the splicing repression observed with this reporter in non-dysgenic progeny was specifically abolished in adult ovaries of aub and vas mutants (Teixeira, 2017).

    Mechanistically, piRNA-mediated splicing regulation may be achieved through direct action of piRNA complexes on target pre-mRNAs carrying the IVS3 sequence or indirectly by piRNA-mediated changes in chromatin states. Piwi-interacting proteins such as Asterix (Arx) and Panoramix (Panx) are dispensable for piRNA biogenesis but are essential for establishing Piwi-mediated chromatin changes, possibly by acting as a scaffold to recruit histone-modifying enzymes and chromatin-binding proteins to target loci. To test the role of these chromatin regulators on P-element splicing, germline-specific RNA interference (RNAi) knockdown experiments were performed in the Harwich background. Similar to what was observed for the piRNA biogenesis components, germline knockdown of Arx and Panx showed no change in the accumulation of P-element RNA, but a strong and specific effect on IVS3 splicing in adult ovaries. The same pattern on IVS3 splicing was observed in the germline knockdown of HP1a and Maelstrom (Mael), both of which act downstream of Piwi-mediated targeting to modulate chromatin structure. The same genetic requirement for Panx for IVS3 splicing control was also confirmed when using the transgenic IVS3 splicing reporter, further indicating that Piwi-mediated chromatin changes at the target locus are involved in IVS3 splicing regulation. At target loci, Piwi complexes are known to mediate the deposition of the classic heterochromatin mark histone H3 lysine 9 trimethylation (H3K9me3). To assess the effect of piRNA-targeting on P-element chromatin marks directly, H3K9me3 chromatin immunoprecipitation was performed followed by sequencing (ChIP-seq) or quantitative PCR on adult ovaries of progeny from reciprocal crosses between w1118 and Harwich strains (to avoid developmental defects, ChIP was performed on F1 progeny raised at 18°C). This analysis revealed a specific loss of global H3K9me3 levels over P-element insertions in dysgenic progeny when compared to non-dysgenic progeny (Teixeira, 2017).

    To analyse the chromatin structure at individual P-element insertions, DNA sequencing (DNA-seq) data was used to identify all euchromatic insertions in the Harwich strain, and RNA-seq analysis was used to define transcriptionally active insertions. At transcriptionally active P-element euchromatic insertions, the spreading of H3K9me3 into the flanking genomic regions was readily observed in non-dysgenic progeny, but was completely absent in dysgenic offspring. Similarly, a reduction in H3K9me3 modification levels was also observed over the IVS3 transgenic reporter in dysgenic progeny when compared to non-dysgenic progeny. Interestingly, euchromatic insertions with no evidence of transcriptional activity were devoid of an H3K9me3 signal in both non-dysgenic and dysgenic crosses, providing further evidence for a model initially suggested in yeast and more recently proposed for Drosophila and mammals, in which H3K9me3 deposition by piRNA complexes would require transcription of the target loci. Mechanistically different from the well-described somatic repression, the results uncovered the existence of an unexpected piRNA-mediated, chromatin-based mechanism regulating IVS3 alternative splicing in germ cells (Teixeira, 2017).

    To expand the analysis, the literature was searched for other cases of transposon splicing regulation. Drosophila Gypsy elements are retrotransposons that have retrovirus-like, infective capacity owing to their envelope (Env) protein. These elements are expressed in somatic ovarian cells, in which they are regulated by the flamenco locus, a well-known piRNA cluster that is a soma-specific source of antisense piRNAs cognate to Gypsy. Interestingly, it has been shown that mutations in flamenco not only elicited the accumulation of Gypsy RNA, but also modulated pre-mRNA splicing, favouring the production of the env mRNA and therefore germline infection. To test whether the piRNA pathway, in addition to its role in regulating the accumulation of Gypsy RNA, is also responsible for modulating the splicing of Gypsy elements in somatic tissues, publicly available RNA-seq data from poly(A)-selected RNAs extracted from in vivo cultures of ovarian somatic cells (OSCs) were analyzed. The analysis indicates that piwi knockdown was sufficient to modulate Gypsy splicing, favouring the accumulation of env-encoding mRNA. In agreement with a chromatin-mediated regulation of alternative splicing, RNAi depletion of Arx, Panx, HP1a and Mael, as well as knockdown of the histone linker H1, was sufficient to favour Gypsy splicing, recapitulating the effect caused by Piwi depletion. Notably, this was also the case for the H3K9 methyltransferase Setdb1, but not for the H3K9 methyltransferases Su(var)3-9 and G9a, indicating specific genetic requirements. Taken together, the results indicate that the piRNA pathway, through its role in mediating changes in chromatin states, regulates the splicing of transposon pre-mRNAs in both somatic and germline tissues (Teixeira, 2017).

    Using P-M hybrid dysgenesis as a model, this study has uncovered splicing regulation elicited by chromatin changes as a previously unknown mechanism by which the piRNA pathway protects the genome from the detrimental effects of transposon activity. Splicing control at piRNA-target loci is likely to be mechanistically different from what has been observed for germline piRNA clusters given the low enrichment of the HP1 homologue Rhino (also known as HP1D) protein, which is required for piRNA cluster RNA processing, over the endogenous P-element insertions in the Harwich genome or over the transgenic IVS3 splicing reporter in non-dysgenic and dysgenic progeny (as measured by ChIP-qPCR). Because small RNA-based systems leading to chromatin mark changes at target loci are pervasive in eukaryotes, it is expected that this new type of targeted regulation is of importance in settings far beyond the scope of the piRNA pathway and Drosophila. Indeed, small RNA-guided DNA methylation over the LINE retrotransposon Karma was recently shown to modulate alternative splicing in oil palm, disrupting nearby gene expression and ultimately affecting crop yield. In this context, small RNA-based control of chromatin structure may be crucially important in genomes with a high content of intronic transposon insertions, such as the human genome, by providing a mechanism to suppress exonization of repeat elements. Although the means by which piRNA-mediated changes in chromatin states could regulate alternative splicing remain to be determined, it is tempting to speculate that piRNA pathway components do so by co-transcriptionally modulating interactions between RNA polymerase II and the spliceosome (Teixeira, 2017).

    Short cryptic exons mediate recursive splicing in Drosophila

    Many long Drosophila introns are processed by an unusual recursive strategy. The presence of ~200 adjacent splice acceptor and splice donor sites, termed ratchet points (RPs), was inferred to reflect 'zero-nucleotide exons', whose sequential processing subdivides removal of long host introns. This study used CRISPR-Cas9 to disrupt several intronic RPs in Drosophila melanogaster, some of which recapitulated characteristic loss-of-function phenotypes. Unexpectedly, selective disruption of RP splice donors revealed constitutive retention of unannotated short exons. Assays using functional minigenes confirm that unannotated cryptic splice donor sites are critical for recognition of intronic RPs, demonstrating that recursive splicing involves the recognition of cryptic RP exons. This appears to be a general mechanism, because canonical, conserved splice donors are specifically enriched in a 40-80-nt window downstream of known and newly annotated intronic RPs and exhibit similar properties to a broadly expanded class of expressed RP exons. Overall, these studies unify the mechanism of Drosophila recursive splicing with that in mammals (Joseph, 2018).
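
    As a rough illustration of the sequence features described above, the following Python sketch scans an intron sequence for candidate ratchet points and for possible downstream cryptic donors. It is only a minimal sketch under simplifying assumptions that are not taken from the study: a candidate RP is reduced to the bare tetranucleotide AG|GT (a splice acceptor dinucleotide immediately followed by a donor dinucleotide), a donor is reduced to any GT dinucleotide, and the 40-80-nt window is measured from the AG|GT junction. The function name and its parameters are hypothetical. Real RP annotation also relies on the polypyrimidine tract, extended donor consensus, conservation and splicing-intermediate reads, none of which are modeled here.

```python
import re


def find_ratchet_point_candidates(intron_seq, window=(40, 80)):
    """Return a list of (junction, downstream_donors) tuples.

    junction: 0-based index of the first base after the 'AG' of an
        'AGGT' match, i.e. the position of the putative zero-nucleotide exon.
    downstream_donors: 0-based start positions of 'GT' dinucleotides lying
        window[0]..window[1] nt downstream of the junction (inclusive).

    Assumption (illustrative only): motifs are matched as bare dinucleotides,
    which will over-call candidates in any real intron.
    """
    # Accept RNA or DNA input, in either case.
    seq = intron_seq.upper().replace("U", "T")
    candidates = []
    # A lookahead is used so that overlapping matches are not skipped.
    for match in re.finditer(r"(?=AGGT)", seq):
        junction = match.start() + 2
        first = junction + window[0]
        last = junction + window[1]
        # Slices past the end of the sequence are simply shorter than 2 nt
        # and therefore never equal "GT".
        donors = [i for i in range(first, last + 1) if seq[i:i + 2] == "GT"]
        candidates.append((junction, donors))
    return candidates


if __name__ == "__main__":
    # Toy sequence: one AG|GT site followed by a GTAAGT ~50 nt downstream.
    toy = "CCCC" + "AGGT" + "A" * 50 + "GTAAGT" + "C" * 10
    for junction, donors in find_ratchet_point_candidates(toy):
        print(junction, donors)
```

    Candidates with at least one downstream donor in the window would be the ones compatible with a short cryptic RP exon; in practice the list would need filtering against splice-site strength and expression evidence.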

    Proper splicing contributes to visual function in the aging Drosophila eye

    Changes in splicing patterns are a characteristic of the aging transcriptome; however, it is unclear whether these age-related changes in splicing facilitate the progressive functional decline that defines aging. In Drosophila, visual behavior declines with age and correlates with altered gene expression in photoreceptors, including downregulation of genes encoding splicing factors. This study characterized the significance of these age-regulated splicing-associated genes in both splicing and visual function. To do this, differential splicing events were identified in either the entire eye or photoreceptors of young and old flies. Intriguingly, aging photoreceptors show differential splicing of a large number of visual function genes. In addition, as shown previously for aging photoreceptors, aging eyes showed increased accumulation of circular RNAs, which result from noncanonical splicing events. To test whether proper splicing was necessary for visual behavior, age-regulated splicing factors were knocked down in photoreceptors in young flies and phototaxis was examined. Notably, many of the age-regulated splicing factors tested were necessary for proper visual behavior. In addition, knockdown of individual splicing factors resulted in changes in both alternative splicing at age-spliced genes and increased accumulation of circular RNAs. Together, these data suggest that cumulative decreases in splicing factor expression could contribute to the differential splicing, circular RNA accumulation, and defective visual behavior observed in aging photoreceptors (Stegeman, 2018).

    Numerous recursive sites contribute to accuracy of splicing in long introns in flies

    Recursive splicing, a process by which a single intron is removed from pre-mRNA transcripts in multiple distinct segments, has been observed in a small subset of Drosophila melanogaster introns. However, detection of recursive splicing requires observation of splicing intermediates that are inherently unstable, making it difficult to study. This study developed new computational approaches to identify recursively spliced introns and applied them, in combination with existing methods, to nascent RNA sequencing data from Drosophila S2 cells. These approaches identified hundreds of novel sites of recursive splicing, expanding the catalog of recursively spliced fly introns by 4-fold. A subset of recursive sites were validated by RT-PCR and sequencing. Recursive sites occur in most very long (> 40 kb) fly introns, including many genes involved in morphogenesis and development, and tend to occur near the midpoints of introns. Suggesting a possible function for recursive splicing, it was observed that fly introns with recursive sites are spliced more accurately than comparably sized non-recursive introns (Pai, 2018).

    Striking circadian neuron diversity and cycling of Drosophila alternative splicing

    Although alternative pre-mRNA splicing (AS) significantly diversifies the neuronal proteome, the extent of AS is still unknown due in part to the large number of diverse cell types in the brain. To address this complexity issue, this study used an annotation-free computational method to analyze and compare the AS profiles between small specific groups of Drosophila circadian neurons. The method, the Junction Usage Model (JUM), allows the comprehensive profiling of both known and novel AS events from specific RNA-seq libraries. The results show that many diverse and novel pre-mRNA isoforms are preferentially expressed in one class of clock neuron and also absent from the more standard Drosophila head RNA preparation. These AS events are enriched in potassium channels important for neuronal firing, and there are also cycling isoforms with no detectable underlying transcriptional oscillations. The results suggest massive AS regulation in the brain that is also likely important for circadian regulation (Wang, 2018).

    Tissues of the nervous and germline systems, such as brain, testes and ovaries, have more complex transcriptomes than other cell types due to extensive alternative pre-mRNA splicing or AS. The nervous system especially exhibits vast numbers of AS isoforms, many of which are novel and are only beginning to be comprehensively identified. This increase in transcript isoform complexity likely contributes to the specification and functional diversity of cell types within the nervous system (Wang, 2018).

    This study applied a novel computational algorithm called JUM to characterize the transcript isoform diversity generated by alternative splicing in three circadian neuronal subtypes (LNv, LNd and DN1), as well as a non-circadian dopaminergic neuron population (TH neurons) of the Drosophila central nervous system. JUM can comprehensively analyze, quantitate and compare tissue- or cell-type-specific AS patterns without requiring a priori annotations of known transcripts or transcriptomes. The analysis revealed a previously unappreciated diversity and complexity of alternatively spliced transcript isoform patterns in these four neuronal subtypes, suggesting that they contribute to neuronal identity, connectivity, activity and circadian functions. This is because many of these novel, previously undetected and unannotated isoforms were unique to a given neuronal population and occurred in transcripts from genes implicated in neuronal activity or circadian rhythms (Wang, 2018).

    For example, the kinase Shaggy and the blue light photoreceptor Cryptochrome play central roles in circadian clock regulation and have novel AS patterns in discrete subsets of the circadian neurons. In addition, nine different transcripts involved in potassium transport undergo differential AS in circadian neurons compared to non-circadian neurons. These transcripts encode six different potassium channels. Many of these genes have a complex organization known to encode populations of functionally distinct proteins isoforms, which change the activation kinetics as well as calcium sensitivity of the channels. Neuronal firing is known to play a key role in the circadian circuit with recent studies illustrating that different subgroups of circadian neurons have characteristic time-of-day neuronal firing patterns. Although it is not yet fully understood which potassium channels play a critical role in each circadian neuron subgroup, several channel pre-mRNAs that undergo differential splicing in circadian neurons impact circadian behavior and sleep, such as slowpoke (slo), Shaker (Sh) and Hyperkinetic (Hk). It is therefore likely that AS adds diversity and distinct physiological properties to these protein isoforms, which then impacts neuron-specific firing patterns. From a more general perspective, AS augments transcriptional regulation in giving different circadian neurons individual identities and distinct functions (Wang, 2018).

    Approximately 5% of the AS events identified in circadian neurons also undergo time-of-day dependent changes in alternative splicing (cycling splicing). It is important to note that all experiments carried out in this study were conducted under 12 hr of light and 12 hr of dark conditions, making it impossible to distinguish between light and clock control. Nonetheless, these data indicate that splicing adds a dramatic layer of gene regulation to diurnal changes in gene expression. Moreover, many of the cycling AS transcripts show constant overall mRNA levels, which suggests the existence of neuron-specific splicing factors that are expressed or activated only at specific times of the day. Indeed, this study has identified several candidate cycling neuron-enriched transcripts that encode RBPs that may help to drive cycling AS patterns (Wang, 2018).

    A recent trend in biological research is to generate transcriptome profiles from single cells. For example, this strategy is part of the 'human cell atlas' project aimed at personalized genomic medicine or the 'brain initiative' project to generate profiles of all neurons in the mouse brain. One recent study was able to obtain about 20M sequence reads per isolated human iPS cell but only managed to analyze splicing patterns for the most highly expressed genes. The current study in contrast used ~100 isolated Drosophila neurons for each of the four neuron subtypes along with judicious use of both oligo-dT and random hexamer priming of the cDNA libraries. This strategy obtained about 10-30M sequence reads for each sample, including substantial information from the 5' ends of transcripts, and JUM was able to detect and classify a large number of previously unannotated pre-mRNA isoforms. Many of them are missing from the fly head RNA-seq data assayed and analyzed in parallel, indicating that these new isoforms are cell-type specific. Not surprisingly, the novel isoforms from the three circadian neuron groups fall into many gene ontology (GO) categories associated with specific circadian clock activity and function (Wang, 2018).

    Taken together, the work presented in this study indicates that the number of alternative splicing events that take place in neuronal tissues is grossly underestimated, even though publicly funded genome projects, such as the NIH modENCODE projects, deeply sequenced transcriptomes from a variety of Drosophila tissues and developmental stages. This is despite the appreciation of how much AS occurs in the nervous system; for example, a recent comprehensive analysis of splicing patterns through deep sequencing of ~50 mouse and human tissues revealed about 2500 neuronally-regulated alternative splicing events. It is therefore suggested that these events will need to be comprehensively evaluated by much deeper sequencing than is currently afforded by most contemporary single cell RNA-seq studies and by AS analysis software like JUM that is not constrained by a priori knowledge of known splicing events (Wang, 2018).

    NineTeen Complex-subunit Salsa is required for efficient splicing of a subset of introns and dorsal-ventral patterning

    The NineTeen Complex (NTC), also known as Pre-mRNA-processing factor 19 (Prp19) complex, regulates distinct spliceosome conformational changes necessary for splicing. During Drosophila midblastula transition, splicing is particularly sensitive to mutations in NTC-subunit Fandango, which suggests differential requirements of NTC during development. This study shows that NTC-subunit Salsa, the Drosophila orthologue of human RNA helicase Aquarius (CG31368), is rate-limiting for splicing of a subset of small first introns during oogenesis, including the first intron of gurken. Germ line depletion of Salsa and splice site mutations within gurken first intron both impair adult female fertility and oocyte dorsal-ventral patterning due to an abnormal expression of Gurken. Supporting causality, the fertility and dorsal-ventral patterning defects observed after Salsa depletion could be suppressed by the expression of a gurken construct without its first intron. Altogether these results suggest that one of the key rate-limiting functions of Salsa during oogenesis is to ensure the correct expression and efficient splicing of the first intron of gurken mRNA. Retention of gurken first intron compromises the function of this gene most likely because it undermines the correct structure and function of the transcript 5'UTR (Rathore, 2020).

    The spliceosome is a highly dynamic molecular machine, composed of five small nuclear ribonucleoproteins (snRNPs) that sequentially associate with the precursor mRNA (pre-mRNA) during the splicing reaction. Each snRNP (U1, U2, U4, U5, and U6) contains a U-rich snRNA and a unique group of proteins. Although spliceosome assembly is ordered (U1 > U2 > U4/U5/U6 > NineTeen Complex), the splicing reaction has no apparent irreversible and/or rate-limiting step, with commitment to splicing progressively increasing as snRNPs and NTC bind to the pre-mRNA (Rathore, 2020).

    The spliceosomal NineTeen Complex (NTC), also known as Pre-mRNA-processing factor 19 (Prp19) complex, regulates distinct spliceosome conformational changes necessary for efficient pre-mRNA splicing (Hogg, 2010; Chanarat, 2013). NTC composition is dynamic and comprises a subset of conserved core subunits and many transiently associated ones. NTC associates with the spliceosome during its activation and just before the first transesterification. Interestingly, NTC also has a significant role in the crosstalk between transcription, cotranscriptional processing of the nascent RNA, and DNA repair, as distinct NTC subunits have been reported to be important for transcriptional elongation and genomic stability. Human NTC-subunits PRP19, XAB2, and CDC5L are important for transcriptional elongation, transcription-coupled DNA repair, and activation of the ATM-related (ATR)-dependent DNA damage checkpoint. RNA Polymerase II (RNA Pol II) also promotes cotranscriptional splicing activation through the recruitment of NTC (Rathore, 2020 and references therein).

    Human Aquarius (AQR) (also known as intron-binding protein 160, IBP160) is an ATP-dependent RNA helicase that associates with NTC during spliceosome activation and formation of the activated B complex (BACT). AQR binds to introns independently of sequence, but usually upstream of the branch-site (BS) and close to the associated U2 snRNP SF3a and SF3b proteins, being essential for intron-binding complex formation and efficient splicing. AQR has also been suggested to be important for deposition of the exon junction complex (EJC) during the splicing reaction and formation of intron-encoded snoRNAs, suggesting it regulates the cross-talk between splicing and other RNA processing events (Rathore, 2020).

    Splicing during Drosophila early embryonic development is notably sensitive to mutations in NTC-subunit Fandango (Guilgur, 2014), suggesting differential requirements of NTC during development (Martinho, 2015). To test this possibility, it was decided to investigate the role of other NTC-subunits during Drosophila oogenesis and early embryonic development. Initial work focused on the uncharacterized gene CG31368, which encodes the Drosophila ortholog of human Aquarius. Since there is already an unrelated Drosophila protease named aquarius (CG14061), CG31368 was renamed salsa. The working hypothesis is that salsa, similar to its Caenorhabditis elegans ortholog emb-4, is likely to have important developmental functions (Rathore, 2020).

    During Drosophila oogenesis, gurken mRNA localizes to the posterior cortex of the developing oocyte and Gurken signal is restricted to the underlying posterior follicle cells. In response to a signal from the posterior follicle cells, there is a considerable reorganization of the cytoskeleton and a microtubule-dependent migration of the oocyte nucleus to the anterior cortex. The anteriorly localized nucleus defines the dorsal-anterior region and provides the first detectable dorsal-ventral (D/V) asymmetry of the oocyte, with the expression of both gurken mRNA and protein restricted to the cytoplasmic perinuclear region of the oocyte (Rathore, 2020).

    gurken mRNA is transcribed in the supporting nurse cells and actively transported to the dorsal-anterior region of the oocyte by a dynein-mediated transport. The oocyte dorsal-anterior localization of gurken mRNA relies on multiple elements localized to the transcript 5' UTR, 3' UTR and open-reading frame. Although this localization is crucial for its efficient translation, the precise contribution of each element for RNA localization is still a matter of debate (Rathore, 2020).

    D/V patterning of the developing Drosophila egg is dependent on the dorsal-anterior localization of Gurken during mid-oogenesis. Gurken is the ligand for the Epidermal growth factor receptor (Egfr) that locates to the apical surface of follicle cells that surround the developing oocyte. Activation of Egfr modifies the cell fate of the dorsal follicle cells and restricts the formation of Spätzle ligand to the ventral region of the oocyte, which is essential for normal morphogenesis of the eggshell dorsal appendages (Rathore, 2020).

    This study found that Salsa, the Drosophila ortholog of AQR, is rate-limiting for efficient splicing of a subset of small first introns, including the first intron of gurken. Consistent with the functional relevance of gurken splicing defects, mutations within the splice sites of the first intron of gurken impair the function of this gene. Female germline depletion of Salsa and splice mutations within gurken first intron were both associated with a decrease in female fertility, significant D/V patterning defects of the eggshell and abnormal expression of Gurken during oogenesis. Supporting causality, expression of a gurken construct without its first intron suppressed the female fertility and D/V patterning defects observed after Salsa depletion. Altogether these results suggest that one of the key rate-limiting functions of Salsa during oogenesis is to ensure the correct expression and efficient splicing of the first intron of gurken mRNA (Rathore, 2020).

    Oocyte dorsal-anterior localization of gurken mRNA relies on multiple elements localized to the transcript 5'UTR, 3'UTR and open-reading frame, yet the relative importance of each element for mRNA localization is still unclear. The 5' and 3'-UTRs of gurken were reported to be required for dorsal-anterior localization of gurken transcript. Furthermore, and using a genomic gurken construct with a lacZ reporter inserted within the gene open-reading frame, it was shown that whereas gurken 5'UTR is required for transcript oocyte accumulation, its coding region and 3'UTR are necessary for its posterior and dorsal-anterior localization. Nevertheless, it was recently reported, using an oocyte injection assay, that a small stem-loop located within the open-reading frame was necessary and sufficient for gurken transcript localization (Rathore, 2020).

    The results show that efficient splicing of the first intron of gurken is required for mRNA dorsal-anterior localization and dorsal-ventral patterning. This is most likely because retention of the first intron impairs the secondary RNA structure of gurken 5'UTR, and the function of a closely located RNA element important for its localization. Splicing of the first intron of gurken is also likely to facilitate Gurken protein expression, as deletion of the first intron of gurken suppresses the dorsalization phenotype associated with increased copy number of gurken gene without affecting the levels of gurken mRNA. The results therefore fully support the role of gurken 5'UTR in mRNA localization within the oocyte, and strongly suggest that Salsa-dependent splicing of the first intron of gurken mRNA is important for the correct expression and function of this gene (Rathore, 2020).

    The precise function of human Aquarius in splicing is still poorly understood. This RNA helicase is recruited to the spliceosome as a pentameric complex known as the intron-binding complex (IBC), which also contains hSyf1 (also known as Xab2), hIsy1, CypE, and CCDC16 (De, 2015). Coimmunoprecipitation experiments suggest a large interaction interface between IBC and U2 snRNP, within the activated spliceosome (Bact stage) and just before the first splicing reaction. Although the ability of Aquarius to bind and hydrolyze ATP is important for spliceosome activation and splicing efficiency, the role of its RNA unwinding activity is less clear (Rathore, 2020).

    This work has identified a small subset of introns whose splicing is particularly sensitive to depletion of Salsa (the Drosophila ortholog of human Aquarius). The fact that splicing was only affected in a small number of introns is consistent with the observation that immunodepletion of human Aquarius from nuclear extracts only weakly impaired splicing in vitro. This suggests that although this RNA helicase is apparently not critical for overall splicing, during female gametogenesis there is a subset of introns whose efficient removal relies on the function of this enzyme (Rathore, 2020).

    Analysis of the introns whose splicing was sensitive to Salsa depletion showed a clear bias for small first introns with weak 3'splice sites, independently of their distance to the transcription start site (TSS), 5'splice site strength and GC content. The bias for small introns suggests that Salsa is mostly rate-limiting when introns are recognized by intron definition, where the initial pairing between U1 and U2 snRNPs occurs across the intron. Furthermore, the bias for introns with weak 3'splice sites is in accordance with the extensive interaction between IBC and U2 snRNP in the activated spliceosome, and implies that depletion of Salsa is likely to impair, at least in a subset of introns, U2 snRNP function during splicing. The absence of any detectable bias for short distances between the TSS and 5'splice site, when evaluating affected and control first introns, or any bias for weak 5'splice site strength, suggests that Salsa is not likely rate-limiting for Cap-Binding Complex-mediated splicing (Rathore, 2020).

    Drosophila first introns are more likely to be cotranscriptionally retained than internal and terminal introns. This is not consistent with the kinetic competition model, where the fastest processes are the ones most likely to occur, suggesting additional constraints to first intron splicing. Although the precise nature of such constraints is still poorly understood, binding of transcriptional initiation factors to the 5'splice site-associated U1 snRNP potentially restricts splicing efficiency, as it might impair the initial pairing between U1 and U2 snRNPs. The current working hypothesis is that Salsa is required for splicing of small first introns with weak 3'splice sites because this enzyme facilitates U2 snRNP function, minimizing the interference effect of transcriptional initiation factors on splicing. Future work will help define the function of this RNA helicase and its contribution to differential gene expression during development (Rathore, 2020).

    Border cell polarity and collective migration require the spliceosome component Cactin

    Border cells are an in vivo model for collective cell migration. This study identifies the gene cactin as essential for border cell cluster organization, delamination, and migration. In Cactin-depleted cells, the apical proteins aPKC and Crumbs (Crb) become abnormally concentrated, and overall cluster polarity is lost. Apically tethering excess aPKC is sufficient to cause delamination defects, and relocalizing apical aPKC partially rescues delamination. Cactin is conserved from yeast to humans and has been implicated in diverse processes. In border cells, Cactin's evolutionarily conserved spliceosome function is required. Whole transcriptome analysis revealed alterations in isoform expression in Cactin-depleted cells. Mutations in two affected genes, Sec23 and Sec24CD, which traffic Crb to the apical cell surface, partially rescue border cell cluster organization and migration. Overexpression of Rab5 or Rab11, which promote Crb and aPKC recycling, similarly rescues. Thus, a general splicing factor is specifically required for coordination of cluster polarity and migration, and migrating border cells are particularly sensitive to splicing and cell polarity disruptions (Miao, 2022).

    Loss of the RNA trimethylguanosine cap is compatible with nuclear accumulation of spliceosomal snRNAs but not pre-mRNA splicing or snRNA processing during animal development

    The 2,2,7-trimethylguanosine (TMG) cap is one of the first identified modifications on eukaryotic RNAs. TMG, synthesized by the conserved Tgs1 enzyme, is abundantly present on snRNAs essential for pre-mRNA splicing. Results from ex vivo experiments in vertebrate cells suggested that TMG ensures nuclear localization of snRNAs. Functional studies of TMG using tgs1 mutations in unicellular organisms yield results inconsistent with TMG being indispensable for either nuclear import or splicing. Utilizing a hypomorphic Tgs1 mutation in Drosophila, this study shows that TMG reduction impairs germline development by disrupting the processing of introns, particularly those with smaller sizes and weaker splice sites. Unexpectedly, loss of TMG does not disrupt snRNA localization to the nucleus, disputing an essential role of TMG in snRNA transport. Tgs1 loss also leads to defective 3' processing of snRNAs. Remarkably, stronger Tgs1 mutations cause lethality without severely disrupting splicing, likely due to the preponderance of TMG-capped snRNPs. Tgs1, a predominantly nucleolar protein in Drosophila, likely carries out splicing-independent functions indispensable for animal development. Taken together, these results suggest that nuclear import is not a conserved function of TMG. As a distinctive structure on RNA, particularly non-coding RNA, it is suggested that TMG prevents spurious interactions detrimental to the function of RNAs that it modifies (Cheng, 2020).

    Comprehensive Identification and Alternative Splicing of Microexons in Drosophila

    Interrupted exons in the pre-mRNA transcripts are ligated together through RNA splicing, which plays a critical role in the regulation of gene expression. Exons with a length < 30 nt are defined as microexons, which pose particular challenges for identification. Microexons, especially those shorter than 8 nt, have not been well studied in many organisms due to difficulties in mapping short segments from sequencing reads. This study analyzed mRNA-seq data from a variety of Drosophila samples with a newly developed bioinformatic tool, ce-TopHat. In addition to the microexons already annotated in FlyBase, 465 new microexons were identified. Differentially alternatively spliced (AS) microexons were investigated between the Drosophila tissues (head, body, and gonad) and genders. Most of the AS microexons were found in the head and two AS microexons were identified in the sex-determination pathway gene fruitless (Pang, 2021).
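
    The length thresholds quoted above (microexons < 30 nt, with a harder-to-map subclass shorter than 8 nt) can be illustrated with a minimal sketch. This is not the ce-TopHat method, which works on read mapping; the Exon record, the half-open coordinate convention, and the category labels are assumptions introduced only for illustration.

```python
from typing import NamedTuple

# Thresholds taken from the text; everything else here is hypothetical.
MICROEXON_MAX_NT = 30   # exons with length < 30 nt are microexons
VERY_SHORT_MAX_NT = 8   # microexons shorter than 8 nt are hardest to map


class Exon(NamedTuple):
    """Hypothetical exon record with half-open coordinates [start, end)."""
    name: str
    start: int
    end: int


def exon_length(exon: Exon) -> int:
    """Return the exon length in nucleotides."""
    return exon.end - exon.start


def classify_exon(exon: Exon) -> str:
    """Label an exon by length using the thresholds quoted in the text."""
    length = exon_length(exon)
    if length <= 0:
        raise ValueError(f"exon {exon.name!r} has non-positive length")
    if length < VERY_SHORT_MAX_NT:
        return "microexon (<8 nt)"
    if length < MICROEXON_MAX_NT:
        return "microexon"
    return "exon"


if __name__ == "__main__":
    # Made-up coordinates, not real annotations.
    for exon in (Exon("e1", 100, 105), Exon("e2", 200, 229), Exon("e3", 300, 330)):
        print(exon.name, exon_length(exon), classify_exon(exon))
```

    Because the comparison is strict, a 30 nt exon is not counted as a microexon in this sketch, matching the "< 30 nt" wording.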

    Overlapping activities of ELAV/Hu family RNA binding proteins specify the extended neuronal 3' UTR landscape in Drosophila

    The tissue-specific deployment of highly extended neural 3' UTR isoforms, generated by alternative polyadenylation (APA), is a broad and conserved feature of metazoan genomes. However, the factors and mechanisms that control neural APA isoforms are not well understood. This study shows that three ELAV/Hu RNA binding proteins (Elav, Rbp9, and Fne) have similar capacities to induce a lengthened 3' UTR landscape in an ectopic setting. These factors promote accumulation of chromatin-associated, 3' UTR-extended, nascent transcripts, through inhibition of proximal polyadenylation site (PAS) usage. Notably, Elav represses an unannotated splice isoform of fne, switching the normally cytoplasmic Fne toward the nucleus in elav mutants. This study used genomic profiling to reveal strong and broad loss of neural APA in elav/fne double mutant CNS, the first genetic background to largely abrogate this distinct APA signature. Overall, this study demonstrates how regulatory interplay and functionally overlapping activities of neural ELAV/Hu RBPs drive the neural APA landscape (Wei, 2020).

    The 3' untranslated region (UTR) is the major hub for post-transcriptional control and harbors elements that direct regulation by RNA binding proteins (RBPs), miRNAs, and RNA modifications. Such regulatory elements can be rendered conditional by alternative polyadenylation (APA), which yields 3' UTR diversity from an individual locus. Most eukaryotic genes accumulate distinct 3' UTR isoforms, and this can be influenced by differentiation status, tissue identity, and environmental and metabolic conditions. Moreover, APA is broadly dysregulated in disease and cancer and may help to drive aberrant gene expression states (Wei, 2020).

    Many tissues generate characteristic APA landscapes, implying that developmental factors regulate 3' UTR programs. A striking example involves the nervous system, where many hundreds of genes express substantially longer 3' UTRs compared to other tissues. Many of these neural 3' UTR extensions are extremely lengthy, and stable isoforms bearing 20 kb 3' UTRs have been documented in flies and mice by Northern blot. Despite the breadth and conservation of this phenomenon and functional studies that link neural-specific 3' UTRs to splicing choice, transcript localization, local translation, and miRNA regulation, relatively little is known of mechanisms that determine neural-extended 3' UTR isoforms (Wei, 2020).

    Several identified APA mechanisms modulate the levels or activities of cleavage and polyadenylation factors. For example, interaction of U1 snRNP with poly(A) factors plays a major role in inhibiting premature 3' end processing. Other mechanisms that impact poly(A) site choice include recruitment of poly(A) factors at promoters and RNA Pol II speed. However, there is growing appreciation that local recruitment of RBPs can affect poly(A) site recognition or regulate later steps to inhibit cleavage and polyadenylation (Wei, 2020).

    Among RBPs with roles in APA are certain members of the ELAV/Hu family, of which there are four in human (HuR and HuB-D) and three in Drosophila (Elav, Fne, and Rbp9). All are expressed in neurons, but HuB and Rbp9 are also expressed in gonads and HuR is ubiquitous. Drosophila Elav was shown to regulate APA at erect wing (ewg), where it binds U-rich motifs distal of the cleavage site and inhibits 3' end processing. Likewise, all four mammalian Hu proteins suppress an intronic poly(A) site in calcitonin/CGRP, and HuR autoregulates by APA. In addition, HuR regulates 3' end processing of several membrane proteins. Given the predominant neuronal expression of many ELAV/Hu members, these proteins are candidate regulators of CNS-specific 3' UTR extensions. Elav mediates neural 3' UTR extensions of certain genes, but the breadth of Elav involvement in the neuronal APA landscape has not been investigated (Wei, 2020).

    To gain a comprehensive understanding of ELAV/Hu RBPs in 3' UTR isoform regulation, genomic approaches were applied, using gain and loss-of-function genetics. Surprisingly, it was found that elav knockouts are not strictly embryonic lethal, as long believed, nor is Elav essential for most neural 3' UTR extensions to accumulate. Using a heterologous system this study found all three Drosophila ELAV/Hu RBPs (Elav, Fne, and Rbp9) have similar capacities to broadly induce a neural 3' UTR extension landscape. They do so by promoting bypass of proximal polyadenylation signals (PAS) in nascent transcripts. Although Elav is normally the predominant nuclear Hu factor in Drosophila, this study found that in elav-null CNS, the normally cytoplasmic Fne protein becomes substantially nuclear, owing to induction of a previously unrecognized splice isoform. Accordingly, genomic analyses of elav/fne double mutant CNS reveal strong loss of neural 3' UTR extensions. Overall, this study demonstrates critical overlapping roles for ELAV/Hu RBPs to generate the neural-extended 3' UTR landscape (Wei, 2020).

    The accumulation of substantially extended 3' UTR isoforms in the nervous system represents a broad and conserved phenomenon. This phenomenon was associated with activity of Elav, a neuronally enriched RBP that has been shown to block proximal PAS usage by binding to U-rich sequences. However, the evidence was limited to a handful of loci. Therefore, the endogenous contribution of Hu RBPs to the general neural 3' UTR extended landscape, and the mechanism of their regulatory impacts, were largely unknown. Indeed, initial studies challenged the notion that Elav alone is critical for this process, since analysis of full knockout elav larval CNS showed they still broadly express neural 3' UTR extensions (Wei, 2020).

    This study resolved this conundrum with two main lines of evidence. First, it was shown that a family of neural Hu family RBPs in Drosophila all have capacity to broadly induce neural 3' UTR extensions, largely by promoting the bypass of proximal PAS to permit continued transcription of extension regions. Second, it was revealed that there is substantial endogenous functional overlap of the Hu RBPs Elav and Fne in broadly driving endogenous neural 3' UTR lengthening. Since Fne proteins accumulate modestly in embryos, later time points were essential to better reveal their genetic interactions. Although many cells and tissues exhibit characteristic 3' UTR profiles, the mechanisms are little known. This work reveals the first demonstration of wholesale loss of a tissue-specific APA landscape, revealed upon co-deletion of elav and fne (Wei, 2020).

    Many hundreds of genes acquire distinct presumably regulatory capacity as a result of neural APA, which can add miRNA and RBP sites and change overall 3' UTR structures. However, until experimental interventions are performed, it is difficult to say how important these extensions are for normal gene regulation, cell behavior, or organismal phenotype. Recently, CRISPR engineering was used to show that neural 3' UTR extension of homothorax contains an array of binding sites for miR-iab-4/8 that control its protein output and are critical for normal adult behavior (Garaulet, 2020). In particular, deletion of the mir-iab-4/8 locus, surgical mutation of their binding sites in the homothorax 3' UTR, and specific deletion of the homothorax neural 3' UTR extension all derepress Homothorax in a specific region of the abdominal ventral nerve cord and induce defective virgin female behavior (Garaulet, 2020). Notably, the current data show that the homothorax 3' UTR extension is largely maintained in elav mutant CNS but is completely lost in elav/fne double mutant CNS. Thus, ELAV/Hu-RBPs are upstream regulators to this newly recognized behavioral switch, and their combinatorial activities are presumably relevant to other neural-specific 3' UTR biology, since they maintain hundreds of neural 3' UTR extensions (Wei, 2020).

    ELAV family proteins have been assigned gene-specific roles in regulating RNA processing at all levels, including alternative splicing, APA, target stability, translation, and subcellular mRNA localization. It was initially thought that individual ELAV/Hu family members would adopt distinct RNA processing functions based on cellular localization. Despite a preferred cellular localization, however, they shuttle between the nucleus and the cytoplasm, and localization also depends on cell type. Accordingly, Drosophila Fne and Rbp9 can regulate the Elav targets ewg, nrg, and arm (Zaharieva, 2015). Such functional overlap was not anticipated as Fne and Rbp9 are normally cytoplasmic (Zaharieva, 2015). The current data suggest that modest levels of nuclear ELAV/Hu proteins can promote genomically widespread neural 3' UTR extensions, since Fne comprises a small fraction of total ELAV/Hu proteins in larval CNS. Conversely, while Elav is largely utilized as a nuclear marker, this study documented it also has ubiquitous cytoplasmic accumulation, so it may conceivably overlap with cytoplasmic Fne/Rbp9 activities (Wei, 2020).

    Complex regulatory interactions among the Drosophila Hu factors have been documented, since misexpression of Fne results in downregulation of endogenous Elav and Fne (Samson, 2003), and misexpression of a NLS-tagged nuclear variant of Rbp9 results in relocalization of endogenous Elav into the cytoplasm (Zaharieva, 2015). This study now documents multiple additional cross-regulatory mechanisms that control total nuclear levels of ELAV/Hu proteins in Drosophila. First, Elav represses fne transcript levels, which may be associated with the strong control of fne neural 3' UTR extension by Elav. Second, Elav represses an alternative splice isoform of Fne that is preferentially localized to the nucleus. This Fne microexon, while not previously annotated, is deeply conserved in insects and may reflect the sole ELAV/Hu protein in other arthropods that is likely to carry out both nuclear and cytoplasmic activities (Samson, 2008). By contrast, even though Drosophila elav is the only lethal member of the family, it is intronless and is presumably a derived retrogene copy that originated in the Drosophilid ancestor. The Fne microexon inserts sequence adjacent to the octapeptide in the hinge region, which is known to be involved in nuclear localization. As the hinge region is not sufficient for nuclear localization, other parts of the ELAV/Hu protein may also contribute to its subcellular control (Wei, 2020).

    Cross-over in their regulatory functions is facilitated by the highly overlapping in vitro target specificities of ELAV/Hu factors, including Elav/Fne/Rbp9 (Ray, 2013). Consistent with this, Elav/Fne/Rbp9-repressed cleavage sites were found to be enriched for similar U-rich motifs. Interestingly, the same motif was identified as a high-affinity initiator for forming a larger and saturable megadalton Elav complex (Soller, 2005). In addition, the same motif is the main conserved element in Drosophila virilis about 100 bp distal of the regulated poly(A) site in an otherwise very distinct extended binding sequence in ewg (Wei, 2020).

    These data suggest that Rbp9 may also play a role in neural APA, since it has very similar gain-of-function activities as Elav and Fne. However, its impact may be masked by the earlier accumulation of Elav and Fne proteins in neurons. Because of apparent embryonic lethality of available elav/fne/rbp9 triple mutant genotypes, it was not possible to analyze this genotype at a developmentally relevant post-embryonic time point (i.e., in 2nd instar larval CNS when Rbp9 protein is more detectably accumulated). As it is suspected that simple RNAi approaches will be insufficient to eliminate the relevant activities, FLP-out systems or somatic CRISPR might be investigated to bypass early lethality of elav mutants (Wei, 2020).

    ELAV and FNE determine neuronal transcript signatures through exon-activated rescue

    The production of alternative RNA variants contributes to the tissue-specific regulation of gene expression. In the animal nervous system, a systematic shift toward distal sites of transcription termination produces transcript signatures that are crucial for neuron development and function. This study reports that, in Drosophila, the highly conserved protein ELAV globally regulates all sites of neuronal 3' end processing and directly binds to proximal polyadenylation sites of target mRNAs in vivo. An endogenous strategy of functional gene rescue was uncovered that safeguards neuronal RNA signatures in an ELAV loss-of-function context. When not directly repressed by ELAV, the transcript encoding the ELAV paralog FNE acquires a mini-exon, generating a new protein able to translocate to the nucleus and rescue ELAV-mediated alternative polyadenylation and alternative splicing. It is proposed that exon-activated functional rescue is a more widespread mechanism that ensures robustness of processes regulated by a hierarchy, rather than redundancy, of effectors (Carrasco, 2020).

    Most metazoan genes express multiple transcript isoforms through the use of alternative polyadenylation (poly(A)) sites that signal transcription termination. Alternative cleavage and polyadenylation (APA) generates mRNA isoforms that differ in their coding sequence (CDS-APA) or, more commonly, their 3' untranslated region (3' UTR-APA). Because 3' UTRs control mRNA fate through regulation of translation, degradation, and subcellular localization, APA profoundly impacts gene expression and the resulting cell behavior. Disrupted patterns of polyadenylation as well as specific APA events have been associated with human diseases, including cancer, autoimmune disorders, and neuropathological diseases (Carrasco, 2020).

    Widespread changes in 3' end isoform usage also occur in a tissue-specific manner. In animals from flies to humans, hundreds of genes undergo a shift toward the distal poly(A) site exclusively in neurons, giving rise to sometimes extremely long 3' UTRs. Systematic changes in poly(A) site usage are understood to be caused by alterations in the expression of core 3' end processing factors. However, neuronal 3' UTR extension occurs in an exquisitely synchronous, specific, and robust manner, indicating that other, neuron-specific regulators are involved (Carrasco, 2020).

    Neuronal ELAV-like proteins are highly conserved RNA-binding proteins (RBPs) that serve as gold-standard markers for neuronal commitment across model organisms. In flies and mammals, neuronal ELAV/Hu proteins have been shown to regulate transcript stability, alternative splicing, CDS-APA, and, more recently, UTR-APA of individual genes. While ELAV/Hu proteins are prominent for their role in numerous neurological diseases and are required for neuronal differentiation, their molecular function is not well understood. This study postulates that ELAV represents the central effector of neuron-specific transcriptome signatures in vivo (Carrasco, 2020).

    This study demonstrates that two neuronal proteins, ELAV and FNE, globally mediate neuron-specific alternative 3' end processing, thereby shaping the distinct identity of the complex neuronal transcriptome. The drastic physiological consequences of aberrant neuronal APA are immediately evident in cases in which protein-coding sequences are affected, effectively causing the loss of essential neuron-specific proteins such as EWG and giant Ankyrin. The effects of aberrant 3' UTR extension, which constitutes the majority of ELAV/nFNE-mediated APA events, are less well understood. Accumulating evidence indicates that long, neuron-specific 3' UTR isoforms perform specific and important functions in neurogenesis, both globally and individually. The finding that ELAV/nFNE mediate neuronal APA and/or alternative splicing (AS) in hundreds of genes showcases the impact of ELAV-family proteins in neurogenesis and neuronal function. In mammals, ELAV/Hu proteins, though best known for their role in mRNA stabilization in the cytoplasm, also act in AS and APA; it will be interesting to study a global loss of neuronal APA in the mammalian brain (Carrasco, 2020).

    The ELAV/nFNE genetic interaction described in this study is the first documented example of exon-activated rescue. It is proposed that this mode of context-specific protein activation ensures robustness of other biological processes that depend on one central regulator. Such regulators must hold the potential to alter the coding isoform of a secondary effector; candidates include splicing and APA factors, but can be expanded to transcription factors, chromatin regulators, and RNA editing and modification enzymes (Carrasco, 2020).

    Interestingly, the n-fne mini-exon is conserved. In other insects, including some distantly related Drosophila species, nFNE homologs are naturally expressed and coexist with FNE and ELAV. In mammals, neuronal ELAV proteins are both nuclear and cytoplasmic, and hinge region exons regulate protein localization. In those species, nFNE and ELAV homologs coexist in wild-type conditions, and exon-activated functional rescue may occur under normal circumstances, arguing that redundancy, rather than functional rescue, is at play. In D. melanogaster, functional redundancy between ELAV proteins seems to have been evolutionarily suppressed in favor of hierarchization. Spatial compartmentalization, and more generally, specialization of a protein into a main effector may increase specificity and synchrony of systematic processes like neuronal APA. In such a hierarchy, the activation of a substitute effector represents a safeguarding mechanism to ensure function (Carrasco, 2020).

    Intron-targeted mutagenesis reveals roles for Dscam1 RNA pairing architecture-driven splicing bias in neuronal wiring

    Drosophila melanogaster Down syndrome cell adhesion molecule (Dscam1) can generate 38,016 different isoforms through largely stochastic, yet highly biased, alternative splicing. These isoforms are required for nervous system function. However, the functional significance of splicing bias remains unknown. This study provides evidence that Dscam1 splicing bias is required for mushroom body (MB) axonal wiring. Mutant flies were generated with normal overall protein levels and an identical number of isoforms, but with global changes in exon 4 or exon 9 isoform bias (DscamΔ4D(-/-) and DscamΔ9D(-/-), respectively). In contrast to DscamΔ4D(-/-), DscamΔ9D(-/-) exhibits remarkable MB defects, suggesting a variable domain-specific requirement for isoform bias. Importantly, changes in isoform bias cause axonal defects but do not influence the self-avoidance of axonal branches. It is concluded that, in contrast to the isoform number that provides the molecular basis for neurite self-avoidance, isoform bias may play a role in MB axonal wiring by influencing non-repulsive signaling (Hong, 2021).

    Hidden RNA pairings counteract the "first-come, first-served" splicing principle to drive stochastic choice in Dscam1 splice variants

    Drosophila melanogaster Dscam1 encodes 38,016 isoforms via mutually exclusive splicing; however, the regulatory mechanism behind this is not fully understood. This study found a set of hidden RNA secondary structures that balance the stochastic choice of Dscam1 splice variants (designated balancer RNA secondary structures). In vivo mutational analyses revealed the dual function of these balancer interactions in driving the stochastic choice of splice variants, through enhancement of the inclusion of distal exon 6s by cooperating with docking site-selector pairing to form a stronger multidomain pre-mRNA structure and through simultaneous repression of the inclusion of proximal exon 6s by antagonizing their docking site-selector pairings. Thus, this study provides an elegant molecular model based on competition and cooperation between two sets of docking site-selector and balancer pairings, which counteracts the "first-come, first-served" principle. These findings provide conceptual and mechanistic insight into the dynamics and functions of long-range RNA secondary structures (Dong, 2022).

    Pre-mRNA alternative splicing is a major source of proteomic and functional diversity. In humans, ~95% of multi-exon genes are subjected to alternative splicing, and splicing defects are associated with a variety of genetic diseases. Common alternative splicing mechanisms include exon skipping, intron retention, alternative 5' or 3' splice site usage, and mutually exclusive splicing. Mutually exclusive splicing occurs when only one exon from a cluster of variable exons is spliced into a specific mRNA product. The most astonishing example of this can be found in the Drosophila melanogaster Down's syndrome cell adhesion molecule 1 (Dscam1) gene, which potentially generates 38,016 different isoforms through mutually exclusive splicing of exon clusters 4, 6, 9, and 17. Growing evidence has revealed that the notable diversity of Dscam1 isoforms is required for both neuronal wiring and immune defense (Dong, 2022).

    In an attempt to explain how only one exon variant of Dscam1 is selected at a time from a cluster of exons, several models based on competing RNA secondary structures have been proposed. This mechanism was initially found in the exon 6 cluster of Dscam1, where the intronic docking site downstream of exon 5 can pair competitively with selector sequences upstream of each exon 6 variant. Similar structural arrangements have been identified in the exon 4 and exon 9 clusters of Dscam1; however, in contrast to the docking site located upstream of the exon 6 cluster, the docking sites for the exon 4 and exon 9 clusters are located within their respective downstream introns. The heterogeneous nuclear ribonucleoprotein 36 (hrp36) also ensures splicing fidelity in the exon 6 cluster but not in the exon 4 or exon 9 cluster. Moreover, a locus control region immediately upstream of the docking site is involved in the inclusion of the nearest exon when the docking site is paired with its upstream selector sequence (Dong, 2022).

    A long-standing question regarding mutually exclusive splicing is how docking site-selector pairing is regulated. Proximal selector sequences are only a few hundred nucleotides away from the docking site, whereas distal selector sequences are located over 10,000 nucleotides away from the docking site. If docking site-selector pairing were dictated on a cotranscriptional 'first-come, first-served' basis, there would be a bias toward docking site-proximal exon 6 variants; however, this is not observable in all tissues. These splicing outcomes in terms of exon 6 inclusion cannot be explained by compensation from splice site strength, because distal exon 6s do not display a preference for stronger splice sites compared to proximal exon 6s. The predicted thermodynamic stability of docking site-selector pairing does not correlate with the selection frequency of the associated exons. The frequency of some exon 6 variants actually increased following deletion of the docking site in a previous study examining D. melanogaster/Drosophila virilis (Dme/Dvi) Dscam1 mutant constructs. It is therefore likely that unknown mechanisms regulate selection of the Dscam1 exon 6 cluster (Dong, 2022).

    The current study found a previously unknown RNA secondary structure hidden within the exon 6 cluster that balances the stochastic selection of Dscam1 splice variants (designated balancer RNA secondary structures). Targeted mutational analyses using a CRISPR-Cas9 system revealed that hidden RNA pairings drive the stochastic choice of splice variants by simultaneously enhancing the inclusion of distal exon 6s by facilitating docking site-selector pairing and by repressing the inclusion of proximal exon 6s. The balancers and docking site-selector base pairings cooperated to form a strong multidomain structure that enhanced distal exon inclusion. From this, a molecular model was developed for the regulation of the stochastic, mutually exclusive splicing of Dscam1 exon 6 based on competition and cooperation between two sets of docking site-selector and balancer RNA secondary structures. This work suggests that Dscam1 has evolved compensatory mechanisms that balance the distance and strength of docking site-selector base pairing to drive the stochastic selection of Dscam1 splice variants and provides an explanation for the lack of obvious 5' to 3' preference during exon 6 variant selection. Moreover, genetic analysis indicated that disruption of docking site pairing or balancer base pairing led to growth and neuronal anomalies, suggestive of their physiological significance. This work provides an additional framework for the regulation of complex, mutually exclusive splicing and new insight into the role of long-range RNA secondary structures in gene regulatory networks (Dong, 2022).

    At first glance, these previously unknown balancer RNA secondary structures appeared to be similar to the docking site and selector sequence base pairing structures. Both structures were composed of two types of conserved elements: an upstream element, which was located in the intron downstream of the constitutive exon 5, and multiple downstream elements, which were located in the exon 6 cluster body. The upstream element could pair competitively with various downstream elements. However, the two types of competing RNA secondary structures were fundamentally different in both their function and mechanism. In contrast to the base pairing between the docking site and various exon 6 upstream selectors, the balancer RNA secondary structures tended to be located in the central and 3' regions of the exon 6 cluster. Moreover, the balancer RNA secondary structures themselves did not promote the inclusion of exon 6 variants but rather enhanced the inclusion of distal exon 6s by strengthening docking site-selector base pairing or inhibited the inclusion of proximal exon 6s by antagonizing docking site-selector base pairing. These data suggest that the regulatory function of balancer RNA secondary structures was dependent on the base pairing between the docking site and selector sequence. Furthermore, the formation of balancer RNA secondary structures enhanced the inclusion of multiple exon 6s in a proximity-dependent manner. For example, the Ubs6.1-Dbs6.40 RNA secondary structure may act as a fulcrum that guides the docking site to pair with the selector upstream of exon 6.41 or 6.42. Thus, base pairing between the docking site and a selector sequence may activate selection of the most proximal exon 6 variant, whereas the balancer sites are thought to be effective over a long range. It is concluded that while docking site-selector base pairing interactions determined which exon 6 variant was selected, the balancer base pairings drove the stochastic choice of exon 6 variants (Dong, 2022).

    This study provides a reasonable explanation for the long-standing puzzle in the choice of exon 6 variants of Dscam1. By combining various techniques including informatics, evolutionary analyses, and in vivo mutagenesis experiments, a new model was generated depicting the regulation of exon 6 variant inclusion. According to the proposed model (see Model for balancer RNA secondary structures in balancing the stochastic choice of Dscam1 splice variants), for a 5'-distal exon 6 (i.e., exon 6.z) to be included in mature Dscam1 mRNA, the docking site-selector interaction may cooperate with multiple balancer base pairings to form a stronger multidomain RNA secondary structure; balancer base pairings stabilize the docking site-selector interaction, thereby increasing the frequency of exon 6.z inclusion. For a 5'-proximal exon 6 (i.e., exon 6.x) to be included in mature mRNA, the docking site may pair with the selector sequence to form short RNA secondary structures containing few domains. Therefore, 5'-proximal exon 6 inclusion relies upon docking site-selector base pairing strength. Simultaneously, balancer secondary structures also inhibit the inclusion of 5'-proximal exon 6s by antagonizing their docking site-selector base pairing interactions. For a middle exon 6 to be included in the mature mRNA, strong intermediate RNA secondary structures may be formed. Although docking site-selector pairings did not exhibit an obvious preference for docking site-distal exon 6s, the combined overall base pairings exhibited a low-to-high strength gradient from docking site-proximal to docking site-distal exon 6s. In this study, it was found that docking site deletion led to a bias toward the inclusion of docking site-distal exon 6s (except for exon 6.1), whereas deleting upstream balancers led to increased docking site-proximal exon 6 inclusion. Therefore, it is speculated that the gradient of pairing strength compensates for the docking site-proximal to docking site-distal preference of the exon 6 variants caused by the first-come, first-served principle. Thus, the final splicing outcome did not exhibit an obvious docking site-proximal preference in exon 6 variants. This finding suggests that flies have evolved an intricate compensatory pairing mechanism that drives the stochastic choice of Dscam1 splice variants (Dong, 2022).

    The conservation of balancer RNA secondary structures and genetic analyses highlight their regulatory benefits and physiological roles. First, it would be advantageous to increase balancer base pairing interactions alongside the duplication of exon clusters. Moreover, balancer pairing compensation could provide specific advantages to large exon clusters (i.e., Dscam1) through a variety of mechanisms. An intriguing possibility is that the balancer base pairing interaction induces the juxtaposition of the docking site and its downstream selector sequences, thereby enhancing their pairing interaction. For example, Ubs6.1-Dbs6.40 base pairing acts as a fulcrum that guides the docking site to pair with multiple selector sequences. Balancer base pairing interactions not only reduce the base pairing distance across large RNA molecules but also help with avoiding the formation of nonspecific secondary structures and heterogeneous nuclear ribonucleoprotein complexes that might affect the accuracy or efficiency of specific docking site-selector base pairings. Thus, balancer base pairing strategies may have been evolutionarily favorable in complex alternative splicing units such as Dscam1, in which long-distance base pairing is required. Coordination and competition between diverse balancer and docking site-selector base pairing interactions have increased the flexibility and efficiency of folding-mediated splicing. Phenotypic analyses revealed obvious defects in these mutants bearing disruption of two types of RNA secondary structures, suggesting that two types of RNA secondary structures coordinate to ensure the proper frequency of exon 6 variants, which is required for normal development (Dong, 2022).

    Are functional RNA secondary structures of this type common in vivo? Competing RNA secondary structures were initially identified in the exon 6 cluster of Dscam1, which at that time was believed to be unique. Similar structural codes have recently been revealed in several exon clusters, such as Drosophila 14-3-3Ξ, exon 4 and 9 clusters in Drosophila Dscam1, srp, RIC-3, Branchiostoma MRP, human dynamin 1, and CD55. In addition, this docking site-mediated model governs the regulation of 3'-end alternative splicing in Drosophila PGRP-LC, CG42235, and Pip genes. Therefore, competing RNA secondary structures may be a broadly applicable mechanism to regulate mutually exclusive splicing. In addition to mutually exclusive exons, competing RNA secondary structures may regulate variable exon skipping of human SF1 and alternative backsplicing of human POLR2A. Moreover, this study observed similar balancer RNA secondary structures in the Dscam1 exon 9 cluster and MRP1. Analogous nested RNA secondary structures have been found in the catalog of pairs of conserved complementary regions (PCCRs) in human protein-coding genes. Likewise, two groups of RNA structural modules have been shown to operate together to dynamically regulate alternative splicing in the human Ate1 gene. It is therefore likely that the framework of balancer RNA secondary structures developed in this study is widely applicable to the regulation of mutually exclusive splicing and other RNA processing events (Dong, 2022).

    Moreover, some questions related to the functional regulation of RNA structures should be further addressed. First, why are the exons within the stem-loop not spliced, in contrast to the flanking exons? Previous studies have shown that RNA pairing can repress the splicing of exons within a loop. Moreover, Dscam1 exon 6 is kept silent through weak splice sites in combination with splicing repressors, such as the heterogeneous nuclear ribonucleoprotein hrp36. When the docking site pairs with a certain selector sequence, the distant enhancers specifically activate the proximal alternative exon by promoting recognition of the splicing site and/or dissociation of the repressors, while other exons in a stem-loop and downstream exons away from the RNA loop are repressed. Second, can the balancer and docking site-selector base pairings form a long-range pseudoknot? It is likely that two sets of RNA secondary structures form a long-range pseudoknot. However, the mutational evidence in the present study supports the notion that balancer secondary structures act to antagonize docking site-selector base pairing of proximal exon 6s. Therefore, it is proposed that these elements function by forming RNA secondary structures rather than pseudoknot structures (Dong, 2022).

    This study has provided sufficient evidence for the splicing regulation of docking site-selector and balancer RNA secondary structures in the Dscam1 gene. Future challenges include directly demonstrating the existence of the purported and/or unknown base pairs and mapping RNA-protein interactions and the effects on the choice of different exons. It will also be important to test whether these variable exons are spliced cotranscriptionally and, if so, how far the RNA polymerase progresses before splicing is detected (Dong, 2022).

    Self-avoidance alone does not explain the function of Dscam1 in mushroom body axonal wiring

    Alternative splicing of Drosophila Dscam1 into 38,016 isoforms provides neurons with a unique molecular code for self-recognition and self-avoidance. A canonical model suggests that the homophilic binding of identical Dscam1 isoforms on the sister branches of mushroom body (MB) axons supports segregation with high fidelity, even when only a single isoform is expressed. This study generated a series of mutant flies with a single exon 4, 6, or 9 variant, encoding 1,584, 396, or 576 potential isoforms, respectively. Surprisingly, most of the mutants in the latter two groups exhibited obvious defects in the growth, branching, and segregation of MB axonal sister branches. This demonstrates that the repertoires of 396 and 576 Dscam1 isoforms were not sufficient for the normal patterning of axonal branches. Moreover, reducing Dscam1 levels largely reversed the defects caused by reduced isoform diversity, suggesting a functional link between Dscam1 expression levels and isoform diversity. Taken together, these results indicate that canonical self-avoidance alone does not explain the function of Dscam1 in MB axonal wiring (Dong, 2022).
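    The isoform numbers quoted above follow from multiplying the number of alternative variants in each exon cluster. The minimal sketch below reproduces them, assuming the commonly cited cluster sizes of 12 (exon 4), 48 (exon 6), 33 (exon 9), and 2 (exon 17); these sizes are not stated in the text and are an assumption here, as are the names VARIANTS and isoform_count. The single-variant totals of 1,584, 396, and 576 are matched only when the exon 17 choice is left out of the product, which suggests that they count combinations of the exon 4, 6, and 9 clusters alone.

```python
from math import prod

# Commonly cited numbers of alternative variants per Dscam1 exon cluster.
# These sizes are an assumption: they are not given in the text above.
VARIANTS = {"exon4": 12, "exon6": 48, "exon9": 33, "exon17": 2}


def isoform_count(variants, fixed=(), ignore=()):
    """Multiply cluster sizes.

    Clusters named in `fixed` contribute a single variant (as in the
    single-variant mutants); clusters named in `ignore` are left out
    of the product altogether.
    """
    return prod(
        1 if name in fixed else size
        for name, size in variants.items()
        if name not in ignore
    )


# Full repertoire across all four clusters.
total = isoform_count(VARIANTS)

# Repertoire left when one cluster is reduced to a single variant,
# counted over exons 4, 6 and 9 only (exon 17 ignored).
single_variant = {
    cluster: isoform_count(VARIANTS, fixed=(cluster,), ignore=("exon17",))
    for cluster in ("exon4", "exon6", "exon9")
}

print(total)
print(single_variant)
```

    Under these assumptions, total corresponds to the 38,016 isoforms of the full repertoire, and single_variant holds the reduced repertoires for the exon 4, exon 6, and exon 9 single-variant mutants.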

    Role of Hakai in m6A modification pathway in Drosophila

    N6-methyladenosine (m6A), the most abundant internal modification in eukaryotic mRNA, is installed by a multi-component writer complex; however, the exact roles of each component remain poorly understood. This study shows that a potential E3 ubiquitin ligase, Hakai, colocalizes and interacts with other m6A writer components, and that Hakai mutants exhibit typical m6A pathway defects in Drosophila, such as lowered m6A levels in mRNA, aberrant Sxl alternative splicing, and wing and behavior defects. Hakai, Vir, Fl(2)d and Flacc form a stable complex, and disruption of any one of Hakai, Vir or Fl(2)d led to the degradation of the other three components. Furthermore, MeRIP-seq indicates that the effective m6A modification is mostly distributed in 5' UTRs in Drosophila, in contrast to the mammalian system. Interestingly, it was demonstrated that m6A modification is deposited onto the Sxl mRNA in a sex-specific fashion, which depends on the m6A writer. Together, this work not only advances the understanding of the mechanism and regulation of the m6A writer complex, but also provides insights into how Sxl cooperates with the m6A pathway to control its own splicing (Wang, 2021).

    There are a variety of chemical modifications on biological macromolecules, such as proteins, nucleic acids, and glycolipids. Like DNA methylation and histone modification, RNA modification represents an extra layer of epigenetic regulatory mechanism. More than 150 chemical modifications in RNA have been discovered, and their biological functions are only starting to be revealed. Chemical modifications of RNA exist in all organisms and for all forms of RNA, including tRNA, rRNA, mRNA, and long noncoding RNA. Common RNA modifications include N6-methyladenosine (m6A), N6,2’-O-dimethyladenosine (m6Am), N1-methyladenosine (m1A), 5-methylcytidine (m5C), N4-acetylcytidine (ac4C), 7-methylguanosine (m7G), and pseudouridine (Ψ). Among them, m6A is the most abundant internal modification of mRNA in eukaryotes. Although m6A in mRNA was found more than 40 years ago, it was only recently that the field has made extensive progress owing to technological and experimental breakthroughs. By combining m6A-specific antibody and high-throughput sequencing, MeRIP-Seq or m6A-Seq allows m6A mapping at the whole transcriptome level, thereby providing the possibility to correlate RNA modifications with their biological functions. These and subsequent studies revealed that m6A sites contain a consensus motif RRACH (R = G/A; H = U/A/C), and m6A peaks are enriched in the 3' untranslated region (UTR) and near the stop codon in yeast and mammals. In Arabidopsis, m6A is enriched not only in 3'UTRs and near the stop codon but also in 5'UTRs and around the start codon. In mammalian cells, m6A also accumulates in the 5'UTR region in response to stress conditions such as heat shock. The distribution of m6A is important since it implies the mechanism by which m6A modification regulates its mRNA (Wang, 2021).
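    The RRACH consensus defined above (R = G/A; H = U/A/C) translates directly into a pattern that can be scanned along a transcript. The following is a minimal sketch, not the peak-calling procedure used in the study; the function name find_rrach and the example sequences are hypothetical. A motif match only marks a candidate site and says nothing about whether that adenosine is methylated in vivo.

```python
import re

# RRACH with R = G/A and H = U/A/C, as defined in the text.
# The candidate m6A is the central A (third base of the 5-mer).
# A lookahead is used so that overlapping motif instances are all reported.
RRACH = re.compile(r"(?=([GA][GA]AC[UAC]))")


def find_rrach(sequence):
    """Return (position, motif) pairs for every RRACH match.

    `position` is the 0-based index of the candidate methylated adenosine.
    DNA-style input is accepted: T is converted to U before scanning.
    """
    rna = sequence.upper().replace("T", "U")
    return [(m.start() + 2, m.group(1)) for m in RRACH.finditer(rna)]


# Hypothetical example sequences, for illustration only.
print(find_rrach("UUGGACUAA"))
print(find_rrach("AAACAAACA"))
```

    The second example contains two motif instances that share one base; a plain non-overlapping scan would report only the first of them, which is why the pattern is wrapped in a lookahead.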

    Another major breakthrough is the gradual elucidation of the m6A modification pathway by biochemical and genetic studies. The m6A is deposited by a multicomponent methyltransferase complex ('writers'), mainly recognized by YTH domain-containing 'readers', and can be removed by FTO and ALKBH5 'erasers', although FTO was also indicated as an m6Am demethylase. The key catalytic component of the m6A writer complex, Mettl3, was purified and cloned in the 1990s. Since then, studies from yeast, Arabidopsis, Drosophila, and mammalian cells have identified several core components of the writer complex, including Mettl14, WTAP (Fl(2)d), VIRMA (Virilizer), RBM15/15B (Spenito), ZC3H13 (Flacc or Xio), and Hakai. Interestingly, Fl(2)d, Virilizer (Vir), Spenito (Nito), and Xio were first identified from Drosophila sex determination screens and later realized as part of the writer complex. They regulate Drosophila sex determination by controlling the alternative splicing of the master regulatory gene Sex-lethal (Sxl). Recently, Mettl3, Mettl14, as well as the reader Ythdc1, were also shown to be involved in this process. However, the detailed mechanism of how the m6A modification cooperates with Sxl protein to modulate its own splicing is still unclear. Thus, Drosophila can serve as a unique system to screen components in the m6A pathway and pinpoints a critical role for m6A in regulating splicing. Other than Sxl splicing, Drosophila m6A genes are highly expressed in the nervous system and exhibit similar wing and behavior defects when mutated. Mutants of several fly m6A factors are viable and thus provide an ideal model to study other processes, such as metabolism and immunity, in the future (Wang, 2021).

    Hakai, also known as CBLL1, was found as an interacting protein with several m6A writer components in proteomic studies. It encodes a RING finger-type E3 ubiquitin ligase and was originally identified as an E-cadherin-binding protein in human cell lines. It was proposed that Hakai ubiquitinates E-cadherin at the plasma membrane and induces its endocytosis, thus playing a negative role post-translationally. Due to the key role of E-cadherin in tumor metastasis, especially epithelial-mesenchymal transition, Hakai has been extensively studied, mainly using cell culture and overexpression systems, but a previous study using the Drosophila model did not observe an increase of E-cadherin level in Hakai mutants. In Arabidopsis, Hakai mutants show partially reduced m6A levels and the mutant phenotypes are weaker than those of other writer components. Importantly, the in vivo role of Hakai as a core m6A writer component has not been studied in any animal species. This study analyzed the role of Hakai in the Drosophila m6A modification pathway. The results demonstrated that Hakai is a bona fide member of the m6A writer complex, with its mutants showing reduced global m6A levels, typical m6A mutant phenotypes, and commonly-regulated gene sets. The study obtained a high-quality fly m6A methylome using stringent MeRIP-seq, discovered a female-specific m6A methylation pattern for Sxl mRNA, characterized the role of Hakai in the m6A writer complex, and finally revisited the function of Hakai in E-cadherin regulation (Wang, 2021).

    m6A modification has been known for more than 40 years but has recently gained great attention due to the emergence of technologies to map m6A methylome, as well as the identification of the writers, readers, and erasers in this pathway. Since the initial purification of the key methyltransferase Mettl3, other components of the writer complex were gradually identified through biochemical experiments and genetic screens. It is now known that m6A writer complex is comprised of multiple components including Mettl3, Mettl14, WTAP, VIRMA, RBM15/15B, ZC3H13. Hakai was first indicated as a WTAP interaction protein and was shown later to be required for full m6A methylation in Arabidopsis; however, its role in the m6A pathway in animals has not been studied. This study shows that Hakai interacts with other m6A writer subunits, and Hakai mutants exhibit characteristic m6A pathway phenotypes, such as lowered m6A levels in mRNA, aberrant alternative splicing of Sxl and other genes, held-out wings, and flightless flies, as well as reduced m6A peaks shared with Mettl3 and Mettl14 mutants in MeRIP-seq. Altogether, these data unambiguously argue that Hakai is the seventh, and likely last core component of the conserved m6A writer complex (Wang, 2021).

    Each component in the m6A writer complex plays a role in mRNA methylation but their exact roles are not well understood. This systematic analysis of several m6A writer subunits has provided insights into the mechanism of this important complex. Fl(2)d, Vir, Hakai, and Flacc were found to form a stable complex, and knocking down any one of Fl(2)d, Vir, or Hakai led to the degradation of the other three components. Mettl3, Mettl14, and Nito were not affected by the disruption of Fl(2)d, Vir or Hakai, suggesting that they have separate functions. Knocking down Flacc resulted in less nuclear staining of Fl(2)d, consistent with a role in nuclear localization of the writer complex. Based on these results, a model is proposed for the m6A methyltransferase complex. Mettl3 and Mettl14 form a stable heterodimer to catalyze the addition of the methyl group to mRNA. Nito/RBM15 contains three RRM domains and binds to positions adjacent to m6A sites, and thus may provide target specificity for the m6A writer complex. Fl(2)d-Vir-Hakai-Flacc form a platform to connect different components and may integrate environmental and cellular signals to regulate m6A methylation (Wang, 2021).

    Hakai is a potential E3 ubiquitin ligase with an intact C3HC4 RING domain and a C2H2 domain. Its absence led to the degradation, rather than the accumulation, of other m6A writer subunits, indicating that it may not act as an E3 ubiquitin ligase in this complex. Hakai was initially identified as an E-cadherin binding protein that downregulates E-cadherin levels, and the role of Hakai in cell proliferation and tumor progression was extensively studied in cell culture. However, the current in vivo analysis using various genetic tools did not find a role of Hakai in E-cadherin regulation. In addition, Hakai appeared as a ubiquitous nuclear protein showing little co-localization with E-cadherin in the membrane. Consistently, Hakai was shown to interact with PTB-associated splicing factor (PSF), a nuclear protein, and to affect its RNA-binding ability. Thus, the role of Hakai in E-cadherin regulation needs to be further investigated using the knockout mouse model, and whether Hakai has other substrates for its E3 ligase activity also needs to be determined (Wang, 2021).

    Recent studies suggest that m6A is involved in numerous developmental processes and human diseases, mainly by regulating mRNA stability, translation, or splicing. Pioneering work has established the framework for the m6A pathway in Drosophila. However, the only published Drosophila m6A methylomes were obtained from S2R+ cells or embryos and were not compared against writer mutants. Other than Sxl, few m6A target loci have been firmly mapped. By performing MeRIP-seq in wild-type adult flies as well as Mettl3, Mettl14, and Hakai mutants, this study demonstrated that although most m6A peaks are distributed in 3'UTRs, the functional peaks responding to the loss of m6A writers are mainly located in 5'UTRs. This finding indicates a major difference between the Drosophila m6A methylome and the mammalian one, in which m6A mainly occurs in 3'UTRs, and is in agreement with a recently published manuscript using miCLIP. Interestingly, LC-MS data show that the overall level of m6A modification in Drosophila only accounted for 10-20% of that in mammalian cells. Mettl3 or Mettl14 mutants are embryonic lethal in mice while they develop into adults in flies. It is possible that the m6A pathway acquired additional functions during evolution (Wang, 2021).

    m6A modification in 3'UTRs usually causes mRNA instability and m6A in 5'UTRs is linked to translation enhancement. In agreement with the view that functional m6A peaks are located in 5'UTRs in Drosophila, this study did not observe an increase in mRNA half-life of m6A targets in Mettl3 mutants compared to wild-type. These results imply that the major role of m6A modification in Drosophila is not on mRNA degradation, but possibly on translation upregulation, which can be tested by combining ribosome profiling and functional analysis of a single transcript in the future. The current data by combining MeRIP-seq and splicing analysis shed light on how the m6A modification contributes to splicing regulation. In all five cases analyzed, four (Dsp1, CG8929, fl(2)d, Aldh-III) in 5'UTRs and one (Sxl) in exon/intron, reduction of m6A modification was correlated with enhanced splicing, arguing that the normal role of these modifications might be to repress splicing events nearby (Wang, 2021).

    Last, and probably most interesting, this work demonstrates female-specific m6A modification around Sxl exon3. Sxl is a textbook paradigm for studying alternative splicing and has been intensively investigated for more than thirty years. Sxl protein binds to its own mRNA to control its alternative splicing, but its binding sites are located ~200 nucleotides downstream or upstream of the male exon, meaning other regulators should be involved. Recently, the m6A modification pathway was shown to modulate Sxl alternative splicing, but the detailed mechanism has not been resolved. The MeRIP-seq data revealed that several m6A peaks were deposited only in females on and around Sxl exon3, and they were in the vicinity of Sxl-binding sites. This finding was further validated by independent m6A-IP-qPCR, which also showed that these modifications were reduced in Mettl3 mutant females. This unexpected finding suggests a model in which one main function of Sxl may be to recruit the m6A writer complex that methylates nearby m6A sites. The m6A reader Ythdc1 in turn binds to these sites and might interact with the splicing machinery to repress splicing. Future experiments, such as testing interactions between Sxl and Mettl3/Mettl14 and between Ythdc1 and general splicing factors, mapping the exact m6A methylation sites in Sxl at single-nucleotide resolution, and comparing transcriptome-wide binding sites of Sxl with m6A modification sites, will be required to firmly prove the model (Wang, 2021).

    Molecular and genetic dissection of recursive splicing

    Intronic ratchet points (RPs) are abundant within long introns in the Drosophila genome and consist of juxtaposed splice acceptor and splice donor (SD) sites. Although they appear to encompass zero-nucleotide exons, it was recently clarified that intronic recursive splicing (RS) requires a cryptic exon at the RP (an RS-exon), which is subsequently always skipped and thus absent from mRNA. In addition, Drosophila encodes a smaller set of expressed exons bearing features of RS. This study investigated mechanisms that regulate the choice between RP and RS-exon SDs. First, analysis of Drosophila RP SD mutants demonstrates that SD competition suppresses inclusion of cryptic exons in endogenous contexts. Second, characterization of RS-exon reporters implicates exonic sequences as influencing choice of RS-exon usage. Using RS-exon swap and mutagenesis assays, it was shown that exonic sequences can determine RS-exon inclusion. Finally, evidence is provided that splicing can suppress utilization of RP SDs to enable RS-exon expression. Overall, multiple factors can influence splicing of Drosophila RS-exons, which usually result in their complete suppression as zero-nucleotide RPs, but occasionally yield translated RS-exons (Joseph, 2021).

    SRP54 mediates circadian rhythm-related, temperature-dependent gene expression in Drosophila

    Recent studies have shown that alternative splicing (AS) plays an important role in regulating circadian rhythm. However, it is not clear whether clock neuron-specific AS is circadian rhythm dependent and what genetic and environmental factors mediate the circadian control of AS. By genome-wide RNA sequencing, SRP54 was identified as one of the Clock (Clk)-dependent alternative splicing factors. Genetic interaction between Clock and SRP54 alleles showed that the enhancement of the circadian phenotype increased with temperature, being strongest at 29 °C and weakest at 18 °C. The alternative splicing and differential gene expression profiles of Clock and SRP54 overlapped with the circadian-related gene profiles identified in various genome-wide studies, indicating that SRP54 is involved in circadian rhythm. Analysis of the RNA-seq results at different temperatures showed that the roles of Clock and SRP54 are temperature dependent. Multiple novel temperature-dependent transcripts not documented in current databases were also found (Li, 2022).

    Mutations in the splicing regulator Prp31 lead to retinal degeneration in Drosophila

    Retinitis pigmentosa (RP) is a clinically heterogeneous disease affecting 1.6 million people worldwide. The second-largest group of genes causing autosomal dominant RP in human encodes regulators of the splicing machinery. Yet, how defects in splicing factor genes are linked to the aetiology of the disease remains largely elusive. To explore possible mechanisms underlying retinal degeneration caused by mutations in regulators of the splicing machinery, mutations were induced in Drosophila Prp31, the orthologue of human PRPF31, mutations in which are associated with RP11. Flies heterozygous mutant for Prp31 are viable and develop normal eyes and retina. However, photoreceptors degenerate under light stress, thus resembling the human disease phenotype. Degeneration is associated with increased accumulation of the visual pigment rhodopsin 1 and increased mRNA levels of twinfilin, a gene associated with rhodopsin trafficking. Reducing rhodopsin levels by raising animals in a carotenoid-free medium not only attenuates rhodopsin accumulation, but also retinal degeneration. Given a similar importance of proper rhodopsin trafficking for photoreceptor homeostasis in human, results obtained in flies presented in this study will also contribute to further unravel molecular mechanisms underlying the human disease (Hebbar, 2021).

    Retinitis pigmentosa is a clinically heterogeneous group of retinal dystrophies, which affects more than one million people worldwide. It often starts with night blindness in early childhood, continues with the loss of the peripheral visual field (tunnel vision), and progresses to complete blindness in later life due to gradual degeneration of photoreceptor cells (PRCs). RP is a genetically heterogeneous disease and can be inherited as autosomal dominant (adRP), autosomal recessive (arRP) or X-linked (xlRP) disease. So far >90 genes have been identified that are causally related to non-syndromic RP. Affected genes are functionally diverse. Some of them are expressed specifically in PRCs and encode, among others, transcription factors (e.g., CRX, an otx-like photoreceptor homeobox gene), components of the light-induced signalling cascade, including the visual pigment rhodopsin (Rho/RHO in Drosophila/human), or genes controlling vitamin A metabolism (e.g., RLBP-1, encoding Retinaldehyde-binding protein). Other genes are associated with a more general control of cellular homeostasis, for example genes involved in trafficking or cell polarity (e.g. CRB1). Interestingly, the second-largest group of genes causing adRP, comprising 7 of 25 genes known, encodes regulators of the splicing machinery. So far, mutations in five pre-mRNA processing factor (PRPF) genes, PRPF3, PRPF4, PRPF6, PRPF8 and PRPF31, have been linked to adRP, namely RP18, RP70, RP60, RP13 and RP11, respectively. Pim1-associated protein (PAP1) and small nuclear ribonucleoprotein-200 (SNRNP200), two genes also involved in splicing, have been suggested to be associated with RP9 and RP33, respectively. The five PRPF genes encode components regulating the assembly of the U4/U6.U5 tri-snRNP, a major module of the pre-mRNA spliceosome machinery. Several hypotheses have been put forward to explain why mutations in ubiquitously expressed components of the general splicing machinery show a dominant phenotype only in the retina. One hypothesis suggests that PRCs with only half the copy number of a gene encoding a general splicing component cannot cope with the elevated demand of RNA-/protein synthesis required to maintain the exceptionally high metabolic rate of PRCs in comparison to other tissues. Hence, halving their gene dose eventually results in apoptosis. Although this model is currently favoured, other mechanisms, such as impaired splicing of PRC-specific mRNAs or toxic effects caused by accumulation of mutant proteins, have been discussed and may contribute to the disease phenotype. More recent data obtained from retinal organoids established from RP11 patients showed that removing one copy of PRPF31 affects the splicing machinery specifically in retinal and retinal pigment epithelial (RPE) cells, but not in patient-derived fibroblasts or iPS cells (Hebbar, 2021).

    The observation that all adRP-associated genes involved in splicing are highly conserved from yeast to human allows use of model organisms to unravel the genetic and cell biological functions of these genes in order to obtain mechanistic insight into the origin of the diseases. In the case of RP11, the disease caused by mutations in PRPF31, three mouse models have been generated by knock-in and knock-out approaches. Unexpectedly, mice heterozygous mutant for a null allele or a point mutation that recapitulates a mutation in the corresponding human gene did not show any sign of retinal degeneration in 12- and 18-month-old mice, respectively. Further analyses revealed that the retinal pigment epithelium, rather than the PRCs, is the primary tissue affected in Prpf31 heterozygous mice. Other data show that homozygous PRPF31 mice are not viable. Morpholino-induced knockdown of zebrafish Prpf31 results in strong defects in PRC morphogenesis and survival. Defects induced by retina-specific expression of zebrafish Prpf31 constructs that encode proteins with the same mutations as those mapped in RP11 patients (called AD5 and SP117) were explained to occur by either haplo-insufficiency or by a dominant-negative effect of the mutant protein. In Drosophila, no mutations in the orthologue Prp31 have been identified so far. RNAi-mediated knockdown of Prp31 in the Drosophila eye has been shown to result in abnormal eye development, ranging from smaller eyes to complete absence of the eye, including loss of PRCs and pigment cells (Hebbar, 2021).

    To gain better insight into the mechanisms by which Prp31 prevents retinal degeneration, this study aimed to establish a meaningful Drosophila model for RP11-associated retinal degeneration. To this end, two mutant alleles of Prp31, Prp31P17 and Prp31P18, were isolated, which carry missense mutations affecting conserved amino acids. Flies heterozygous for either of these mutations are viable and develop normally. Strikingly, when exposed to constant light, these mutant flies undergo retinal degeneration, thus mimicking the disease phenotype of RP11 patients. Degeneration of mutant PRCs is associated with accumulation and abnormal distribution of the visual pigment rhodopsin, Rh1, in PRCs. Reduction of dietary vitamin A, a precursor of the chromophore 11-cis-3-hydroxyretinal, which binds to opsin to generate the functional rhodopsin, mitigates both aspects of the mutant phenotype, rhodopsin accumulation and retinal degeneration. From this it is concluded that Rh1 accumulation and/or misdistribution reflect a degeneration-prone condition in the Prp31 mutant retina (Hebbar, 2021).

    The results reveal that mutations in the Drosophila orthologue Prp31 induce PRC degeneration under light stress, thus mimicking features of RP11-associated symptoms. Similar to those in human, mutations in Drosophila Prp31 are haplo-insufficient and lead to retinal degeneration when heterozygous. This is in stark contrast to mice heterozygous for Prpf31, which did not show any signs of PRC degeneration, but rather late-onset defects in the retinal pigment epithelium (Hebbar, 2021).

    By using three different genetic approaches this study provides convincing evidence that the knockdown of Prp31 is the cause of the retinal degeneration observed. (1) The two Prp31 alleles induced by TILLING (Prp31P17 and Prp31P18) carry missense mutations in conserved amino acids of the coding region, which are predicted to be damaging. (2) Flies heterozygous for any of three deletions, which completely remove the Prp31 locus, exhibit phenotypes comparable to those of flies heterozygous for Prp31 point mutations. (3) RNAi-mediated knockdown of Prp31 results in light-induced retinal degeneration. The results obtained suggest that the two missense mutations mapped in Prp31P17 and Prp31P18 are strong hypomorphic alleles. First, the two Drosophila alleles characterised in this study are hemizygous (Prp31/deficiency) and homozygous (in the case of Prp31P18) viable and fertile. Second, mutations in the two established Prp31 fly lines are missense mutations, one located N-terminal to the NOSIC domain in Prp31P17 (G90R) and the other in the Nop domain in Prp31P18 (P277L), which most likely result in a reduced function of the respective proteins. Whether protein levels are also decreased cannot be answered due to the lack of specific antibodies. The mutated amino acid residue in Drosophila Prp31P18 (P277L) lies within the snoRNA binding domain (Nop domain). Interestingly, many point mutations in human PRPF31, which are linked to RP11, have been mapped to the Nop domain. As in yeast, the Nop domain in human PRPF31 is involved in an essential step in the formation of the U4/U6-U5 tri-snRNP by building a complex of the U4 snRNA and a 15.5K protein, thus stabilising the U4/U6 snRNA junction. The mutated proline in Drosophila Prp31P18 precedes a histidine (H278), which corresponds to amino acid H270 in the human protein. Mutations in H270 in the Nop domain of human PRPF31 result in a reduced affinity of PRPF31 to the complex formed by a stem-loop structure of the U4 snRNA and the 15.5K protein. Therefore, it is tempting to speculate that the Drosophila P277L mutation could similarly weaken, but not abolish, the corresponding interaction of the mutant Prp31 protein with the U4/U6 complex. Further experiments are required to determine the functional consequences of the molecular lesions identified in Drosophila Prp31 (Hebbar, 2021).

    It was noticed that the retinal phenotype observed upon reduction of Prp31 is more variable than that observed upon loss of crb. This could be due to the fact that all Prp31 conditions analysed represent hypomorphic conditions, possibly retaining some residual protein function(s). However, the expressivity of the mutant phenotype is not increased in Prp31/deficiency flies (carrying only one mutant copy) in comparison to that of Prp31/+ flies, which carry one mutant and one wild-type allele. Interestingly, human RP11 patients heterozygous for mutations in Prpf31 show an unusually high degree of phenotypic non-penetrance and can even be asymptomatic. Various causes have been uncovered to explain this feature. These include a highly variable expression level of the remaining wild-type Prpf31 allele, possibly due to changes in the expression levels of trans-acting regulators. In addition, mutant PRPF31 proteins can form cytoplasmic aggregates in RPE cells, thus reducing the amount of protein entering the nucleus, or can impair overall transcription or splicing, as described in Prpf31 zebrafish models. Finally, mutations in unlinked genes have been suggested to modify the disease severity of patients (Hebbar, 2021).

    Not only in flies, but also in human, mutations in PRPF31 affect only the retina, despite the importance of this splicing regulator in all cells. Recently published data show that impaired PRPF31 function can affect the splicing of target genes in a cell-type specific manner. Strikingly, retinal cells isolated from RP11 patient-derived retinal organoids exhibit mis-splicing of genes that encode components of the splicing machinery itself. This was not observed in fibroblasts or iPS cells derived from the same patients. Similar results were obtained from the retina and the RPE of Prpf31/+ mice. Mutant RPE cells additionally revealed splicing defects in transcripts of genes with functions in ciliogenesis, cell polarity and cellular adhesion, which could explain the previously described RPE defects in these mice (Hebbar, 2021).

    In the retina of flies lacking one functional copy of Prp31, PRCs showed increased levels of Rh1, both in the rhabdomeres and in the cytoplasm, as revealed by immunostaining and confirmed by western blot analysis. However, increased Rh1 levels did not affect rhabdomere size or structure. This is in contrast to results obtained in the mouse, where transgenic overexpression of either wild-type bovine or human rhodopsin induced an increase in outer segment volume of rod PRCs. In several other Drosophila mutants, accumulation of Rh1 in endocytic compartments has been suggested to cause retinal degeneration due to its toxicity. For example, dominant mutations in Drosophila ninaE result in ER accumulation of misfolded Rh1 due to impaired protein maturation. This, in turn, causes an overproduction of ER cisternae and induces the unfolded protein response (UPR), which eventually leads to apoptosis of PRCs, both in flies and in mammals (Hebbar, 2021).

    Interestingly, mis-localisation of rhodopsin in human PRCs to sites other than the outer segment is a common characteristic of adRP induced by mutations in rhodopsin and is considered to contribute to the pathological severity. The current data suggest that increased accumulation of rhodopsin contributes to degeneration in Prp31 mutant retinas. Reduction of Rh1 by depletion of dietary carotenoid not only obliterated increased Rh1 immunoreactivity and opsin retention in perinuclear compartments in Prp31 mutants, but also reduced the degree of PRC degeneration. However, whether increased Rh1 accumulation in the rhabdomere or in the cytoplasm contributes to light-dependent PRC degeneration of Prp31 mutant flies needs to be explored in the future (Hebbar, 2021).

    The data further suggest that Prp31 regulates, directly or indirectly, Rh1 levels at a posttranscriptional level, since no increase of RNA levels was detected in heads of Prp31/+ flies. This result is different from that obtained in primary murine retinal cell cultures, where expression of a mutant Prpf31 gene reduced rhodopsin expression, as a result of impaired splicing of the rhodopsin pre-mRNA. Similarly, siRNA-mediated knockdown of PRPF31 function in human organotypic retinal flat-mount cultures (HORFC) reduced mRNAs encoding genes involved in phototransduction and photoreceptor structure, including rhodopsin. Interestingly, the Prp31 mutants described in this study show increased mRNA levels of an evolutionary conserved actin monomer binding protein called twinfilin (twf), which inhibits actin polymerisation. Knockdown of twf results in excessive cytoplasmic Rh1 staining, suggesting defects in its trafficking. In Prp31 mutants, an increase in rhabdomeric Rh1 was observed as well as increased twf mRNA. From this correlation it is hypothesised that upregulation of twf mRNA in Prp31 might be in part responsible for at least the rhabdomeric Rh1 accumulation. Rh1 also accumulates in the cytoplasm of Prp31 mutant PRCs. The current data exclude the role of Rab11-mediated targeting of Rh1 in this accumulation. Now, it remains to be determined if the deregulation of other trafficking routes or the upregulation of twf contributes to the increased Rh1 in the cytoplasm. In the future, it may be interesting to explore the link between increased Rh1 levels as observed in Drosophila Prp31 mutants, increased mRNA levels of twinfilin and impaired Rh1 trafficking. Additionally, a detailed transcriptome analysis should elucidate possible defects in transcription and/or splicing of target genes, thus also allowing a better understanding of the aetiology of the human disease (Hebbar, 2021).

    Reverse complementary matches simultaneously promote both back-splicing and exon-skipping

    Circular RNAs (circRNAs) play diverse roles in different biological and physiological environments and are always expressed in a tissue-specific manner. Especially, circRNAs are enriched in the brain tissues of almost all investigated species, including humans, mice, Drosophila, etc. Through large-scale neuron isolation from the first larval (L1) stage of C. elegans followed by RNA sequencing with ribosomal RNA depletion, the neuronal circRNA data in C. elegans were obtained. Hundreds of novel circRNAs were annotated with high accuracy. circRNAs were highly expressed in the neurons of C. elegans and were positively correlated to the levels of their cognate linear mRNAs. Disruption of reverse complementary match (RCM) sequences in circRNA flanking introns effectively abolished circRNA formation. In the zip-2 gene, deletion of either upstream or downstream RCMs almost eliminated the production of both the circular and the skipped transcript. Interestingly, the 13-nt RCM in zip-2 is highly conserved across five nematode ortholog genes, which show conserved exon-skipping patterns. Finally, through in vivo one-by-one mutagenesis of all the splicing sites and branch points required for exon-skipping and back-splicing in the zip-2 gene, this study showed that back-splicing still happened without exon-skipping, and vice versa. Through protocol optimization, total RNA obtained from sorted neurons is increased to hundreds of nanograms. circRNAs highly expressed in the neurons of C. elegans are more likely to be derived from genes also highly expressed in the neurons. RCMs are abundant in circRNA flanking introns, and RCM-deletion is an efficient way to knockout circRNAs. More importantly, these RCMs are not only required for back-splicing but also promote the skipping of exon(s) to be circularized. Finally, RCMs in circRNA flanking introns can directly promote both exon-skipping and back-splicing, providing a new explanation for the correlation between them (Cao, 2021).

    Wilms' Tumor 1-Associating Protein complex regulates alternative splicing and polyadenylation at potential G-quadruplex-forming splice site sequences

    Wilms' tumor 1-associating protein (WTAP) is a core component of the N6-methyladenosine (m6A)-methyltransferase complex, along with VIRMA, CBLL1, ZC3H13 (KIAA0853), RBM15/15B, and METTL3/14, which generate m6A, a key RNA modification that affects various processes of RNA metabolism. WTAP also interacts with splicing factors; however, despite strong evidence suggesting a role of Drosophila WTAP homolog fl(2)d in alternative splicing (AS), its role in splicing regulation in mammalian cells remains elusive. This study demonstrates using RNAi coupled with RNA-seq that WTAP, VIRMA, CBLL1, and ZC3H13 modulate AS, promoting exon skipping and intron retention in AS events that involve short introns/exons with higher GC content and introns with weaker polypyrimidine-tract and branch points. Further analysis of GC-rich sequences involved in AS events regulated by WTAP, together with minigene assay analysis, revealed potential G-quadruplex formation at splice sites where WTAP has an inhibitory effect. This study also found that several AS events occur in the last exon of one isoform of MSL1 and WTAP, leading to competition for polyadenylation. Proteomic analysis also suggested that WTAP/CBLL1 interaction promotes recruitment of the 3'-end processing complex. Taken together, these results indicate that the WTAP complex regulates AS and alternative polyadenylation via inhibitory mechanisms in GC-rich sequences (Horiuchi, 2021).

    A splicing variant of Charlatan, a Drosophila REST-like molecule, preferentially localizes to axons

    Neuron-restrictive silencing factor (NRSF), also known as RE-1 silencing transcription factor (REST), has pivotal functions in many neuron-specific genes. Previous studies revealed that neuron-specific alternative splicing (AS) of REST produces divergent forms of REST variants and provides regulatory complexity in the nervous system. However, the biological significance of these variants in the regulation of neuronal activities remains to be clarified. This study revealed that Charlatan (Chn), a Drosophila REST-like molecule, is also regulated by neuron-specific AS. Neuron-specific AS produced six divergent variants of Chn proteins, one of which preferentially localized to axons. A small sequence of this variant was especially important for the axonal localization. These data suggest that some variants have roles beyond the transcriptional regulation of neuronal activities (Yamasaki, 2021).

    Two oppositely-charged sf3b1 mutations cause defective development, impaired immune response, and aberrant selection of intronic branch sites in Drosophila

    SF3B1 mutations occur in many cancers, and the highly conserved His662 residue is one of the hotspot mutation sites. To address effects on splicing and development, strains were constructed carrying point mutations at the corresponding residue His698 in Drosophila using the CRISPR-Cas9 technique. Two mutations, H698D and H698R, were selected due to their frequent presence in patients and notable opposite charges. Both the sf3b1-H698D and -H698R mutant flies exhibit developmental defects, including reduced egg-laying, decreased hatching rates, delayed morphogenesis and shorter lifespans. Interestingly, the H698D mutant has decreased resistance to fungal infection, while the H698R mutant shows impaired climbing ability. Consistent with these phenotypes, further analysis of RNA-seq data finds altered expression of immune response genes and changed alternative splicing of muscle and neural-related genes in the two mutants, respectively. Expression of Mef2-RB, an isoform of the Mef2 gene that was downregulated due to splicing changes caused by H698R, partly rescues the climbing defects of the sf3b1-H698R mutant. Lariat sequencing reveals that the two sf3b1-H698 mutations cause aberrant selection of multiple intronic branch sites, with the H698R mutant using far upstream branch sites in the changed alternative splicing events. This study provides in vivo evidence from Drosophila that elucidates how these SF3B1 hotspot mutations alter splicing and their consequences in development and in the immune system (Zhang, 2021).

    The Hox transcription factor Ultrabithorax binds RNA and regulates co-transcriptional splicing through an interplay with RNA polymerase II

    Transcription factors (TFs) play a pivotal role in cell fate decision by coordinating gene expression programs. Although most TFs act at the DNA layer, a few TFs bind RNA and modulate splicing. Yet, the mechanistic cues underlying TF activity in splicing remain elusive. Focusing on the Drosophila Hox TF Ultrabithorax (Ubx), this work sheds light on a novel layer of Ubx function at the RNA level. Transcriptome and genome-wide binding profiles in embryonic mesoderm and Drosophila cells indicate that Ubx regulates mRNA expression and splicing to promote distinct outcomes in defined cellular contexts. These results demonstrate a new RNA-binding ability of Ubx. The N51 amino acid of the DNA-binding Homeodomain is non-essential for RNA interaction in vitro, but is required for RNA interaction in vivo and Ubx splicing activity. Moreover, mutation of the N51 amino acid weakens the interaction between Ubx and active RNA Polymerase II (Pol II). These results reveal that Ubx regulates elongation-coupled splicing (Carnesecchi, 2022).

    Mutations equivalent to Drosophila mago nashi mutants imply reduction of Magoh protein incorporation into exon junction complex

    Pre-mRNA splicing imprints mRNAs by depositing multi-protein complexes, termed exon junction complexes (EJCs). The EJC core consists of four proteins, eIF4AIII, MLN51, Y14 and Magoh. Magoh is a human homologue of Drosophila Mago nashi protein, which is involved in oskar mRNA localization in Drosophila oocytes. This study determined the effects of Magoh mutations equivalent to those of Drosophila mago nashi mutant proteins that cause mis-localization of oskar mRNA. It was found that Magoh I90T mutation caused mis-localization of Magoh protein in the cytoplasm by reducing its binding activity to Y14. On the other hand, G18R mutation did not affect its binding to Y14, but this mutation reduced its association with spliced mRNAs. These results strongly suggest that Magoh mutations equivalent to Drosophila mago nashi mutants cause improper EJC formation by reducing incorporation of Magoh into EJC (Oshizuki, 2022).

    Crystal structure of SFPQ-NONO heterodimer

    The Drosophila behavior/human splicing (DBHS) protein family is composed of the three members SFPQ, NONO and PSPC1. These proteins share a strong sequence and structural homology within the core-structured domains forming obligate homo- and heterodimers. This feature may lead to the simultaneous existence of six different dimeric complexes that sustain their function in many cellular processes such as pre-mRNA splicing, innate immunity, and transcriptional regulation. In order to perform a complete structural analysis of all possible DBHS dimers, this study solved the crystal structure of the missing DBHS heterodimer SFPQ-NONO at 3.0 Å resolution. Subtle changes were identified in amino acid composition and local secondary structure of the NOPS region orientation that may modulate affinity between complexes. Interestingly, this area is found mutated in aggressive skin cancers and adenocarcinomas (Schell, 2022).

    Origins and evolution of human tandem duplicated exon substitution events

    The mutually exclusive splicing of tandem duplicated exons produces protein isoforms that are identical save for a homologous region that allows for the fine tuning of protein function. Tandem duplicated exon substitution events are rare, yet highly important alternative splicing events. Most events are ancient, their isoforms are highly expressed, and they have significantly more pathogenic mutations than other splice events. This study analysed the physicochemical properties and functional roles of the homologous polypeptide regions produced by the 236 tandem duplicated exon substitutions annotated in the human gene set. Most important structural and functional residues in these homologous regions are maintained, and most changes are conservative rather than drastic. Three quarters of the isoforms produced from tandem duplicated exon substitution events are tissue specific, particularly in nervous and cardiac tissues, and tandem duplicated exon substitution events are enriched in functional terms related to structures in the brain and skeletal muscle. Considerable evidence was found for the convergent evolution of tandem duplicated exon substitution events in vertebrates, arthropods and nematodes. Twelve human gene families have orthologues with tandem duplicated exon substitution events in both Drosophila melanogaster and Caenorhabditis elegans. Six of these gene families are ion transporters, suggesting that tandem exon duplication in genes that control the flow of ions into the cell has an adaptive benefit. The ancient origins, the strong indications of tissue-specific functions, and the evidence of convergent evolution suggest that these events may have played important roles in the evolution of animal tissues and organs (Martinez-Gomez, 2022).

    Sex-lethal regulates back-splicing and generation of the sex-differentially expressed circular RNAs

    In contrast to canonical splicing, back-splicing connects the upstream 3' splice site (SS) with a downstream 5'SS and generates exonic circular RNAs (circRNAs) that are widely identified and have regulatory functions in eukaryotic gene expression. However, sex-specific back-splicing in Drosophila has not been investigated and its regulation remains unclear. This study performed multiple RNA analyses of a variety of sex-specific Drosophila samples and identified over ten thousand circular RNAs, of which hundreds are sex-differentially and -specifically back-spliced. Intriguingly, expression of Sxl, an RNA-binding protein encoded by Sex-lethal (Sxl), the master Drosophila sex-determination gene that is only spliced into functional proteins in females, promoted back-splicing of many female-differential circRNAs in the male S2 cells, whereas expression of a Sxl mutant (SXLRRM) did not promote those events. Using a monoclonal antibody, this study further obtained the transcriptome-wide RNA-binding sites of Sxl through PAR-CLIP. After splicing assay of mini-genes with mutations in the Sxl-binding sites, it was revealed that Sxl-binding on flanking exons and introns of pre-mRNAs facilitates back-splicing, whereas Sxl-binding on the circRNA exons inhibits back-splicing. This study provides strong evidence that Sxl has a regulatory role in back-splicing to generate sex-specific and -differential circRNAs, as well as in the initiation of the sex-determination cascade through canonical forward-splicing (Fan, 2023).

  • References

    Alexandrov, A., Colognori, D., Shu, M. D. and Steitz, J. A. (2012). Human spliceosomal protein CWC22 plays a role in coupling splicing to exon junction complex deposition and nonsense-mediated decay. Proc Natl Acad Sci U S A 109(52): 21313-21318. PubMed ID: 23236153

    Ashton-Beaucage, D., Udell, C. M., Lavoie, H., Baril, C., Lefrancois, M., Chagnon, P., Gendron, P., Caron-Lizotte, O., Bonneil, E., Thibault, P. and Therrien, M. (2010). The exon junction complex controls the splicing of MAPK and other long intron-containing transcripts in Drosophila. Cell 143: 251-262. PubMed ID: 20946983

    Ashwal-Fluss, R., Meyer, M., Pamudurti, N. R., Ivanov, A., Bartok, O., Hanan, M., Evantal, N., Memczak, S., Rajewsky, N. and Kadener, S. (2014). circRNA biogenesis competes with pre-mRNA splicing. Mol Cell 56: 55-66. PubMed ID: 25242144

    Barbosa, I., Haque, N., Fiorini, F., Barrandon, C., Tomasetto, C., Blanchette, M. and Le Hir, H. (2012). Human CWC22 escorts the helicase eIF4AIII to spliceosomes and promotes exon junction complex assembly. Nat Struct Mol Biol 19(10): 983-990. PubMed ID: 22961380

    Bradley, T., Cook, M. E. and Blanchette, M. (2015). SR proteins control a complex network of RNA-processing events. RNA 21(1):75-92. PubMed ID: 25414008

    Cao, D. (2021). Reverse complementary matches simultaneously promote both back-splicing and exon-skipping. BMC Genomics 22(1): 586. PubMed ID: 34344317

    Cao, T., Akhter, S. and Jin, J. P. (2022). Early Divergence of the C-Terminal Variable Region of Troponin T Via a Pair of Mutually Exclusive Alternatively Spliced Exons Followed by a Selective Fixation in Vertebrate Heart. J Mol Evol 90(6): 452-467. PubMed ID: 36171395

    Carrasco, J., Rauer, M., Hummel, B., Grzejda, D., Alfonso-Gonzalez, C., Lee, Y., Wang, Q., Puchalska, M., Mittler, G. and Hilgers, V. (2020). ELAV and FNE determine neuronal transcript signatures through exon-activated rescue. Mol Cell 80(1): 156-163. PubMed ID: 33007255

    Cheng, L., Zhang, Y., Zhang, Y., Chen, T., Xu, Y. Z. and Rong, Y. S. (2020). Loss of the RNA trimethylguanosine cap is compatible with nuclear accumulation of spliceosomal snRNAs but not pre-mRNA splicing or snRNA processing during animal development. PLoS Genet 16(10): e1009098. PubMed ID: 33085660

    Conn, S. J., Pillman, K. A., Toubia, J., Conn, V. M., Salmanidis, M., Phillips, C. A., Roslan, S., Schreiber, A. W., Gregory, P. A. and Goodall, G. J. (2015). The RNA binding protein Quaking regulates formation of circRNAs. Cell 160: 1125-1134. PubMed ID: 25768908

    Carnesecchi, J., Boumpas, P., van Nierop, Y. S. P., Domsch, K., Pinto, H. D., Borges Pinto, P. and Lohmann, I. (2022). The Hox transcription factor Ultrabithorax binds RNA and regulates co-transcriptional splicing through an interplay with RNA polymerase II. Nucleic Acids Res 50(2): 763-783. PubMed ID: 34931250

    Dong, H., Guo, P., Zhang, J., Wu, L., Fu, Y., Li, L., Zhu, Y., Du, Y., Shi, J., Zhang, S., Li, G., Xu, B., Bian, L., Zhu, X., You, W., Shi, F., Yang, X., Huang, J. and Jin, Y. (2022). Self-avoidance alone does not explain the function of Dscam1 in mushroom body axonal wiring. Curr Biol. PubMed ID: 35659864

    Erkelenz, S., Stankovic, D., Mundorf, J., Bresser, T., Claudius, A. K., Boehm, V., Gehring, N. H. and Uhlirova, M. (2021). Ecd promotes U5 snRNP maturation and Prp8 stability. Nucleic Acids Res. PubMed ID: 33444449

    Fan, Y. J., Ding, Z., Zhang, Y., Su, R., Yue, J. L., Liang, A. M., Huang, Q. W., Meng, Y. R., Li, M., Xue, Y. and Xu, Y. Z. (2023). Sex-lethal regulates back-splicing and generation of the sex-differentially expressed circular RNAs. Nucleic Acids Res. PubMed ID: 37070178

    Fernandez-Castillo, E., Barbosa-Santillan, L. I., Falcon-Morales, L. and Sanchez-Escobar, J. J. (2022). Deep Splicer: A CNN Model for Splice Site Prediction in Genetic Sequences. Genes (Basel) 13(5). PubMed ID: 35627292

    Fox-Walsh, K. L., et al. (2005). The architecture of pre-mRNAs affects mechanisms of splice-site pairing. Proc. Natl. Acad. Sci. 102(45): 16176-81. PubMed ID: 16260721

    Gibilisco, L., Zhou, Q., Mahajan, S. and Bachtrog, D. (2016). Alternative splicing within and between Drosophila species, sexes, tissues, and developmental stages. PLoS Genet 12(12): e1006464. PubMed ID: 27935948

    Hansen, T. B., Jensen, T. I., Clausen, B. H., Bramsen, J. B., Finsen, B., Damgaard, C. K. and Kjems, J. (2013). Natural RNA circles function as efficient microRNA sponges. Nature 495: 384-388. PubMed ID: 23446346

    Haussmann, I. U., Bodi, Z., Sanchez-Moran, E., Mongan, N. P., Archer, N., Fray, R. G. and Soller, M. (2016). m6A potentiates Sxl alternative pre-mRNA splicing for robust Drosophila sex determination. Nature 540(7632): 301-304. PubMed ID: 27919081

    Hebbar, S., Lehmann, M., Behrens, S., Halsig, C., Leng, W., Yuan, M., Winkler, S. and Knust, E. (2021). Mutations in the splicing regulator Prp31 lead to retinal degeneration in Drosophila. Biol Open 10(1). PubMed ID: 33495354

    Hong, W., Zhang, J., Dong, H., Shi, Y., Ma, H., Zhou, F., Xu, B., Fu, Y., Zhang, S., Hou, S., Li, G., Wu, Y., Chen, S., Zhu, X., You, W., Shi, F., Yang, X., Gong, Z., Huang, J. and Jin, Y. (2021). Intron-targeted mutagenesis reveals roles for Dscam1 RNA pairing architecture-driven splicing bias in neuronal wiring. Cell Rep 36(2): 109373. PubMed ID: 34260933

    Horiuchi, K., Kawamura, T. and Hamakubo, T. (2021). Wilms' Tumor 1-Associating Protein complex regulates alternative splicing and polyadenylation at potential G-quadruplex-forming splice site sequences. J Biol Chem: 101248. PubMed ID: 34582888

    Joseph, B., Kondo, S. and Lai, E. C. (2018). Short cryptic exons mediate recursive splicing in Drosophila. Nat Struct Mol Biol. PubMed ID: 29632374

    Joseph, B., Scala, C., Kondo, S. and Lai, E. C. (2022). Molecular and genetic dissection of recursive splicing. Life Sci Alliance 5(1). PubMed ID: 34759052

    Kao, S. Y., Nikonova, E., Chaabane, S., Sabani, A., Martitz, A., Wittner, A., Heemken, J., Straub, T. and Spletter, M. L. (2021). A Candidate RNAi Screen Reveals Diverse RNA-Binding Protein Phenotypes in Drosophila Flight Muscle. Cells 10(10). PubMed ID: 34685485

    Kramer, M. C., Liang, D., Tatomer, D. C., Gold, B., March, Z. M., Cherry, S. and Wilusz, J. E. (2015). Combinatorial control of Drosophila circular RNA expression by intronic repeats, hnRNPs, and SR proteins. Genes Dev 29(20):2168-82. PubMed ID: 26450910

    Lam, B. J. and Hertel, K. J. (2002). A general role for splicing enhancers in exon definition. RNA 8(10): 1233-41. PubMed ID: 12403462

    Lee, S., Wei, L., Zhang, B., Goering, R., Majumdar, S., Wen, J., Taliaferro, J. M. and Lai, E. C. (2021). ELAV/Hu RNA binding proteins determine multiple programs of neural alternative splicing. PLoS Genet 17(4): e1009439. PubMed ID: 33826609

    Li, Y., Yang, X., Zhao, Z. and Du, J. (2022). SRP54 mediates circadian rhythm-related, temperature-dependent gene expression in Drosophila. Genomics 114(6): 110512. PubMed ID: 36273743

    Liu, M., Li, Y., Liu, A., Li, R., Su, Y., Du, J., Li, C. and Zhu, A. J. (2016). The exon junction complex regulates the splicing of cell polarity gene dlg1 to control Wingless signaling in development. Elife 5:e17200. PubMed ID: 27536874

    Martinez-Gomez, L., Cerdan-Velez, D., Abascal, F. and Tress, M. L. (2022). Origins and evolution of human tandem duplicated exon substitution events. Genome Biol Evol. PubMed ID: 36346145

    Memczak, S., Jens, M., Elefsinioti, A., Torti, F., Krueger, J., Rybak, A., Maier, L., Mackowiak, S. D., Gregersen, L. H., Munschauer, M., Loewer, A., Ziebold, U., Landthaler, M., Kocks, C., le Noble, F. and Rajewsky, N. (2013). Circular RNAs are a large class of animal RNAs with regulatory potency. Nature 495: 333-338. PubMed ID: 23446348

    Miao, G., Guo, L. and Montell, D. J. (2022). Border cell polarity and collective migration require the spliceosome component Cactin. J Cell Biol 221(7). PubMed ID: 35612426

    Mohr, C., Kleiner, S., Blanchette, M., Pyrowolakis, G. and Hartmann, B. (2017). Sex-specific transcript diversity in the fly head is established during pupal stages and adulthood and is largely independent of the mating process and the germline. Sex Dev [Epub ahead of print]. PubMed ID: 28273663

    Monedero Cobeta, I., Stadler, C. B., Li, J., Yu, P., Thor, S. and Benito-Sipos, J. (2018). Specification of Drosophila neuropeptidergic neurons by the splicing component brr2. PLoS Genet 14(8): e1007496. PubMed ID: 30133436

    Obrdlik, A., Lin, G., Haberman, N., Ule, J. and Ephrussi, A. (2019). The transcriptome-wide landscape and modalities of EJC binding in adult Drosophila. Cell Rep 28(5): 1219-1236. PubMed ID: 31365866

    Oshizuki, S., Matsumoto, E., Tanaka, S. and Kataoka, N. (2022). Mutations equivalent to Drosophila mago nashi mutants imply reduction of Magoh protein incorporation into exon junction complex. Genes Cells. PubMed ID: 35430764

    Pai, A. A., Henriques, T., McCue, K., Burkholder, A., Adelman, K. and Burge, C. B. (2017). The kinetics of pre-mRNA splicing in the Drosophila genome and the influence of gene architecture. Elife 6. PubMed ID: 29280736

    Pai, A. A., Paggi, J. M., Yan, P., Adelman, K. and Burge, C. B. (2018). Numerous recursive sites contribute to accuracy of splicing in long introns in flies. PLoS Genet 14(8): e1007588. PubMed ID: 30148878

    Pang, T. L., Ding, Z., Liang, S. B., Li, L., Zhang, B., Zhang, Y., Fan, Y. J. and Xu, Y. Z. (2021). Comprehensive Identification and Alternative Splicing of Microexons in Drosophila. Front Genet 12: 642602. PubMed ID: 33859668

    Prudencio, P., Savisaar, R., Rebelo, K., Goncalo Martinho, R. and Carmo-Fonseca, M. (2021). Transcription and splicing dynamics during early Drosophila development. RNA. PubMed ID: 34667107

    Rathore, O. S., Silva, R. D., Ascensao-Ferreira, M., Matos, R., Carvalho, C., Marques, B., Tiago, M. N., Prudencio, P., Andrade, R. P., Roignant, J. Y., Barbosa-Morais, N. L. and Martinho, R. G. (2020). NineTeen Complex-subunit Salsa is required for efficient splicing of a subset of introns and dorsal-ventral patterning. RNA. PubMed ID: 32963109

    Roignant, J. Y. and Treisman, J. E. (2010). Exon junction complex subunits are required to splice Drosophila MAP kinase, a large heterochromatic gene. Cell 143: 238-250. PubMed ID: 20946982

    Ruan, K., Zhu, Y., Li, C., Brazill, J.M. and Zhai, R.G. (2015). Alternative splicing of Drosophila Nmnat functions as a switch to enhance neuroprotection under stress. Nat Commun 6: 10057. PubMed ID: 26616331

    Salzman, J., Chen, R. E., Olsen, M. N., Wang, P. L. and Brown, P. O. (2013). Cell-type specific features of circular RNA expression. PLoS Genet 9: e1003777. PubMed ID: 24039610

    Schell, B., Legrand, P. and Fribourg, S. (2022). Crystal structure of SFPQ-NONO heterodimer. Biochimie 198: 1-7. PubMed ID: 35245601

    Skrajna, A., Yang, X. C., Bucholc, K., Zhang, J., Hall, T. M., Dadlez, M., Marzluff, W. F. and Dominski, Z. (2017). U7 snRNP is recruited to histone pre-mRNA in a FLASH-dependent manner by two separate regions of the Stem-Loop Binding Protein. RNA 23(6):938-951. PubMed ID: 28289156

    Stegeman, R., Hall, H., Escobedo, S. E., Chang, H. C. and Weake, V. M. (2018). Proper splicing contributes to visual function in the aging Drosophila eye. Aging Cell: e12817. PubMed ID: 30003673

    Stoiber, M. H., Olson, S., May, G. E., Duff, M. O., Manent, J., Obar, R., Guruharsha, K., Artavanis-Tsakonas, S., Brown, J. B., Graveley, B. R. and Celniker, S. E. (2015). Extensive cross-regulation of post-transcriptional regulatory networks in Drosophila. Genome Res [Epub ahead of print]. PubMed ID: 26294687

    Steckelberg, A. L., Altmueller, J., Dieterich, C. and Gehring, N. H. (2015). CWC22-dependent pre-mRNA splicing and eIF4A3 binding enables global deposition of exon junction complexes. Nucleic Acids Res 43(9): 4687-4700. PubMed ID: 25870412

    Tian, M. and Maniatis, T. (1992). Positive control of pre-mRNA splicing in vitro. Science 256(5054): 237-40. PubMed ID: 1566072

    Teixeira, F. K., Okuniewska, M., Malone, C. D., Coux, R. X., Rio, D. C. and Lehmann, R. (2017). piRNA-mediated regulation of transposon alternative splicing in the soma and germ line. Nature 552(7684): 268-272. PubMed ID: 29211718

    Verma, D., Hegde, V., Kirkpatrick, J. and Carlomagno, T. (2023). The EJC disassembly factor PYM is an intrinsically disordered protein and forms a fuzzy complex with RNA. Front Mol Biosci 10: 1148653. PubMed ID: 37065448

    Wang, M., Branco, A. T. and Lemos, B. (2018). The Y chromosome modulates splicing and sex-biased intron retention rates in Drosophila. Genetics 208(3):1057-1067. PubMed ID: 29263027

    Wang, Q., Abruzzi, K. C., Rosbash, M. and Rio, D. C. (2018). Striking circadian neuron diversity and cycling of Drosophila alternative splicing. Elife 7. PubMed ID: 29863472

    Wang, Y., Zhang, L., Ren, H., Ma, L., Guo, J., Mao, D., Lu, Z., Lu, L. and Yan, D. (2021). Role of Hakai in m6A modification pathway in Drosophila. Nat Commun 12(1): 2159. PubMed ID: 33846330

    Wei, L., Lee, S., Majumdar, S., Zhang, B., Sanfilippo, P., Joseph, B., Miura, P., Soller, M. and Lai, E. C. (2020). Overlapping activities of ELAV/Hu family RNA binding proteins specify the extended neuronal 3' UTR landscape in Drosophila. Mol Cell 80(1): 140-155. PubMed ID: 33007254

    Westholm, J. O., et al. (2014). Genome-wide analysis of Drosophila circular RNAs reveals their structural and sequence properties and age-dependent neural accumulation. Cell Rep. 9(5): 1966-80. PubMed ID: 25544350

    Yamasaki, Y., Lim, Y. M., Minami, R. and Tsuda, L. (2021). A splicing variant of Charlatan, a Drosophila REST-like molecule, preferentially localizes to axons. Biochem Biophys Res Commun 578: 35-41. PubMed ID: 34536827

    Zhang, B., Ding, Z., Li, L., Xie, L. K., Fan, Y. J. and Xu, Y. Z. (2021). Two oppositely-charged sf3b1 mutations cause defective development, impaired immune response, and aberrant selection of intronic branch sites in Drosophila. PLoS Genet 17(11): e1009861. PubMed ID: 34723968

    Home page: The Interactive Fly © 1995, 1996 Thomas B. Brody, Ph.D.

    The Interactive Fly resides on the
    Society for Developmental Biology's Web server.