Drosophila genes listed by biochemical function
RNA binding proteins and proteins involved in post-transcriptional regulation

The architecture of pre-mRNAs affects mechanisms of splice-site pairing

abstrakt
DEAD-box subfamily ATP-dependent helicase protein

Adar
double-stranded RNA adenosine deaminase

apontic
novel bZIP transcription factor and RNA-binding protein

Argonaute 1
PAZ domain protein involved in post-transcriptional gene silencing - mutants exhibit defects in the embryonic nervous system

armitage
RNA helicase involved in posttranscriptional gene silencing - mutations disrupt mRNA translational silencing of oskar in the oocyte and silencing of Stellate in male germ cells

arrest
ribonucleoprotein-type RNA-binding protein

Ars2
RNA-binding protein - key component of Drosophila antiviral immunity - interacts with Dcr-2 - required for siRNA-mediated silencing -
plays an essential role in miRNA-mediated silencing

Ataxin-2
multi-functional protein that binds DEAD box helicases of the Me31B family that are associated with Argonaute and microRNA function -
required for microRNA function and synapse-specific long-term olfactory habituation -
assembles with polyribosomes and poly(A)-binding protein, a key regulator of mRNA translation

aubergine
related to eukaryotic translation initiation factor 2C - involved in post-transcriptional gene silencing

bag of marbles
functions as a translation repressor by interfering with translation initiation

benign gonial cell neoplasm
cofactor of Bag of marbles that directly inhibits Pumilio repression of Nanos mRNA activity
to promote differentiation of germ line stem cells

Bicaudal C
RNA-binding protein that regulates expression of specific germline mRNAs by controlling their poly(A)-tail length

bicoid
transcription factor - homeodomain - RNA binding protein

boule
RRM motif protein involved in spermatogenesis

brain tumor
involved in post-transcriptional regulation of Hunchback mRNA

bruno (preferred name: arrest)
ribonucleoprotein-type RNA-binding protein

crooked neck
mRNA splicing factor that participates in the assembly and control of the splicing machinery

cup
translational repressor - represses oskar translation - physically interacts with Bruno

Dicer-1
ribonuclease III family, double-stranded RNA binding domain, DEAD/DEAH box helicase, PAZ domain -
an enzyme involved in degrading RNA - involved in double-stranded RNA interference (RNAi) and post-transcriptional gene silencing (PTGS)

Dicer-2
DEAD/DEAH box helicase - mutants are defective in processing small interfering RNAs

egalitarian
RNA binding protein - participates along with BicaudalD in transport of mRNA during oogenesis - salivary gland morphogenesis

Eukaryotic initiation factor 4E
binds to the mRNA 5' cap thus controlling a crucial step in translation initiation - required for cell growth -
promotes dedifferentiation of neuroblasts back to a stem cell-like state thus functioning as an oncogene -
a target of Ago2 in translational repression - functions as a splice factor for msl-2 and Sxl pre-mRNAs

embryonic lethal abnormal vision (common alternative name: elav)
RNA binding protein

exuperantia
a core component of a large protein complex involved in localizing mRNAs both within nurse cells and the developing oocyte

female sterile (3) homeless (preferred name: spindle E)
DE-H family of RNA-dependent ATPases

Fmr1
KH domain RNA-binding protein - homolog of mammalian Fragile X mental retardation gene - represses Futsch translation - mutants have synaptic structural defects

4EHP
Eukaryotic initiation factor 4E - a cap binding protein that inhibits hunchback, caudal and bicoid mRNA translation

gawky
RNA-binding protein - glycine-tryptophan (GW) repeat protein required for P-body integrity -
promotes mRNA deadenylation, decapping and degradation

half pint (preferred name: poly-U-binding splicing factor)
RNA recognition motif protein - functions in both constitutive and alternative splicing - required during oogenesis

Helicase at 25E
DExH/D RNA helicase required in the nucleus for mRNA export - required in the cytoplasm for remodeling
the transacting factors that dictate mRNA cytoplasmic destination

hephaestus
polypyrimidine tract binding protein - controls alternative splicing - required to attenuate Notch activity after ligand-dependent activation during wing development

held out wings
KH motif

hiiragi (also known as poly(A) polymerase)
involved in both nuclear and cytoplasmic polyadenylation of mRNA

IGF-II mRNA-binding protein
zipcode-binding protein - regulator of RNA transport in oocytes and neurons - regulates aging of the Drosophila testis stem-cell niche -
contributes to the localized expression of gurken mRNA in the oocyte - counteracts endogenous small interfering RNAs to stabilize mRNAs

let-7 (preferred name: microRNA encoding gene let-7)
encodes an RNA species involved in translational silencing of target mRNAs

loquacious
a component of a functional pre-miRNA processing complex - stimulates and directs pre-microRNA processing activity

maelstrom
HMG box protein - spindle class protein - potential regulator of RNA processing or subcellular localization

maleless
DEAH-box subfamily ATP-dependent helicase

maternal expression at 31B
A DEAD-box helicase, part of a ribonuclear protein complex, that restricts translation of oocyte-localizing RNAs -
in neurons Me31B acts to promote translational repression and/or mRNA degradation in response to miRNAs

mei-P26
a conserved translational regulator that facilitates the switch from proliferation to differentiation - associates with miRNA
pathway components to repress the translation of target mRNAs - cooperates with Bam, Bgcn and Sxl
to promote early germline development in the Drosophila ovary - a target of Vasa in promoting stem cell differentiation

modulo
RRM-containing domain - modifier of PEV promoting chromatin compaction and inactivation - controls cellular growth rate downstream of dMYC

musashi
RNP-1 and RNP-2 motifs

muscleblind
RNA splice factor involved in terminal muscle and eye differentiation - mutants model features of myotonic dystrophy

nanos
RNA binding protein in oocyte - zinc finger protein

Negative elongation factor E
RNA-binding protein - along with other factors, NELF causes polymerase to pause in the promoter proximal region of heat shock genes

Nucleolar protein at 60B
pseudouridylate synthase - enzymatically modifies ribosomal RNA - required for maintenance of germ-line stem cells

off-schedule
functions as translation initiation factor eIF4G during spermatogenesis to coordinate the initiation of meiotic division and differentiation

orb
RRM - RNA binding protein

orb2
cytoplasmic polyadenylation element-binding protein - RNA binding protein - forms amyloid-like oligomers enriched in the synaptic membrane - critical for the persistence of long-term memory

oskar
novel - assembles germ plasm - probably does not bind RNA directly

Pabp2
poly(A) binding protein - functions in nuclear polyadenylation - cytoplasmic PABP2 acts to shorten the poly(A) tails of specific mRNAs

Painting of fourth
RNA-binding protein that increases transcription output from chromosome 4, targets specific loci on the X chromosome

partner of drosha
double-stranded RNA-binding protein - essential for the biogenesis of canonical miRNAs - required for imaginal disc growth

pelota
RNA binding motif protein - controls germ-line stem cell self-renewal by repressing a differentiation pathway, possibly through regulating translation

Pcf11
RNA binding motif protein - dismantles elongation complexes by a Pol II C-terminal domain (CTD) dependent mechanism - forms a bridge between the CTD and RNA

pitchoune
DEAD-box RNA helicase

polyA-binding protein
RNA-binding protein involved in translational regulation and nonsense-mediated mRNA decay

poly-U-binding splicing factor
RNA recognition motif protein - functions in both constitutive and alternative splicing - required during oogenesis

pumilio
novel - binds hunchback mRNA

r2d2
double-stranded RNA-binding protein - bridges the initiation and effector steps of the Drosophila RNAi pathway
by facilitating siRNA passage from Dicer, which carries out the initiation step, to RISC, which carries out the effector step

sans fille (also known as U1AsnRNP)
splicing factor - sex determination

Sex lethal
RNA binding and splicing

smaug
RNA binding protein - repressor of NOS translation

squid
hnRNP D homolog

staufen
Double stranded RNA binding protein

spindle E
DE-H family of RNA-dependent ATPases

split ends
codes for RRM motif RNA-binding protein - mutants have defects in Notch and Egf receptor signaling resulting in defects in cell-fate and in axon guidance

Spt6
transcription elongation factor implicated in RNA processing and degradation of improperly processed pre-mRNA

survival motor neuron
RNA binding protein involved, along with Gemins, in the assembly of the small nuclear ribonucleoproteins that constitute the spliceosome -
neuromuscular junction protein required in both neurons and muscle for normal junctional morphology

trailer hitch
conserved protein involved in mRNA localization that interfaces with the secretory pathway to promote efficient protein trafficking in the cell

transformer
RNA splice factor - Sex determination - productively spliced in the presence of Sex lethal

transformer 2
RNA splice factor - along with Transformer directs female specific splicing of doublesex RNA

twin
degrades mRNA poly(A) tails - CCR4 component of an enzyme complex catalyzing mRNA deadenylation

U1snRNP (preferred name: sans fille)
splicing factor - sex determination

Upstream of N-ras
RNA-binding protein that interacts with Rox RNAs to repress dosage compensation complex formation in female
and promote its assembly on the male X chromosome

U2 small nuclear riboprotein auxiliary factor 50
splicing factor required for the ATP-dependent association of U2 snRNP with pre-mRNA branchpoints - forms a heterodimer
with the small splice factor U2af38 - U2af50 interacts with the intronic 3' polypyrimidine tract - the small subunit functions
in recognition of the 3' AG dinucleotide

vasa
maternal - RNA helicase

virilizer
transmembrane protein regulating Sex lethal splicing

vreteno
Tudor domain protein involved in regulation of Piwi-interacting RNAs (piRNAs) in gonadal tissues thus regulating mobile genetic elements

ypsilon schachtel
cold-shock domain protein involved in post-transcriptional regulation of Oskar mRNA

The architecture of pre-mRNAs affects mechanisms of splice-site pairing

The exon/intron architecture of genes determines whether components of the spliceosome recognize splice sites across the intron or across the exon. Using in vitro splicing assays, this study demonstrates that splice-site recognition across introns ceases when intron size is between 200 and 250 nucleotides. Beyond this threshold, splice sites are recognized across the exon. Splice-site recognition across the intron is significantly more efficient than splice-site recognition across the exon, resulting in enhanced inclusion of exons with weak splice sites. Thus, intron size can profoundly influence the likelihood that an exon is constitutively or alternatively spliced. An EST-based alternative-splicing database was used to determine whether the exon/intron architecture influences the probability of alternative splicing in the Drosophila and human genomes. Drosophila exons flanked by long introns display an up to 90-fold-higher probability of being alternatively spliced compared with exons flanked by two short introns, demonstrating that the exon/intron architecture in Drosophila is a major determinant in governing the frequency of alternative splicing. Exon skipping is also more likely to occur when exons are flanked by long introns in the human genome. Interestingly, experimental and computational analyses show that the length of the upstream intron is more influential in inducing alternative splicing than is the length of the downstream intron. It is concluded that the size and location of the flanking introns control the mechanism of splice-site recognition and influence the frequency and the type of alternative splicing that a pre-mRNA transcript undergoes (Fox-Walsh, 2005).

Pre-mRNA splicing is an essential process that accounts for many aspects of regulated gene expression. Of the ~25,000 genes encoded by the human genome, >60% are believed to produce transcripts that are alternatively spliced. Thus, alternative splicing of pre-mRNAs can lead to the production of multiple protein isoforms from a single pre-mRNA, exponentially enriching the proteomic diversity of higher eukaryotic organisms. Because regulation of this process can determine when and where a particular protein isoform is produced, changes in alternative-splicing patterns modulate many cellular activities (Fox-Walsh, 2005).

The spliceosome assembles onto the pre-mRNA in a coordinated manner by binding to sequences located at the 5' and 3' ends of introns. Spliceosome assembly is initiated by the stable associations of the U1 small nuclear ribonucleoprotein particle with the 5' splice site, branch-point-binding protein/SF1 with the branch point, and U2 snRNP auxiliary factor with the pyrimidine tract. ATP hydrolysis then leads to the stable association of U2 snRNP at the branch-point and functional splice-site pairing (Fox-Walsh, 2005).

Intron size has been correlated with rates of evolution and the regulation of genome size. The exon/intron architecture has also been shown to influence splice-site recognition. For example, increasing the size of mammalian exons results in exon skipping. However, the same enlarged exons are included when the flanking introns are small. Thus, splice-site recognition is more efficient when introns or exons are small. Because, in the human genome, the majority of exons are short and introns are long, it is expected that the vast majority of splice sites in the human genome are recognized across the exon. Lower eukaryotes have a genomic architecture that is typified by small introns and flanking exons with variable length, suggesting that splice-site recognition occurs across the intron. Consistent with this model, expansion of small introns in yeast or Drosophila causes loss of splicing, cryptic splicing, or intron retention. Taken together, these observations suggest that splice sites are recognized across an optimal nucleotide length (Fox-Walsh, 2005).

It is unknown whether splice-site recognition across the intron or across the exon results in similar efficiencies of spliceosomal assembly and/or splice-site pairing. This study demonstrates that splice-site recognition across the intron ceases when the intron reaches a length between 200 and 250 nt. Because splice-site recognition is more efficient across the intron, alternative splicing is less likely for exons flanked by short introns. This influence is supported experimentally and by computational analyses of Drosophila and human alternative-splicing databases. It is concluded that the size and location of the flanking introns control the mechanism of splice-site recognition and influence the frequency and the type of alternative pre-mRNA splicing (Fox-Walsh, 2005).

Previous studies have suggested that genes with small introns tend to be recognized across the intron, and genes with large introns are recognized across the exon. To determine the distance at which recognition of splice sites switches from cross-intron interactions to cross-exon interactions, advantage was taken of an in vitro kinetic splicing assay that was originally used to demonstrate that exonic splicing enhancers (ESEs), discrete sequences within exons that promote both constitutive and regulated splicing, activate both splice sites of an exon simultaneously (Lam, 2002). A number of pre-mRNAs were designed with intron lengths ranging from 120 to 425 nt. Within each set, the pre-mRNAs differ only in the presence or absence of a well characterized 13-nt ESE derived from the Drosophila doublesex and Drosophila fruitless pre-mRNAs. Each pre-mRNA harbors the same weak 5' and 3' splice sites that require the activities of ESEs for recognition in their natural context (Tian, 1992; Lam, 2003). Because splicing factors present in HeLa cell nuclear extracts activate the ESEs used (Lam, 2003), the presence of functional or mutant enhancer elements within each test substrate determines its splicing efficiency. If the splice sites are recognized across the exon, it is expected that the activation of the splice sites on each exon constitutes a different step during spliceosomal assembly, because the ESE located on each exon will only aid in the recognition of its weak splice site. Thus, the activities of the separate ESEs are expected to display synergistic kinetics, because the activation of each ESE accelerates an independent step during spliceosomal assembly. However, if the splice sites are recognized across the intron, the ESE located on each exon will aid in the recognition of both weak splice sites, because the recruited spliceosomal components define the entire intron within one step. In this scenario, the activities of the separate ESEs are expected to display additive kinetics, because the activation of each ESE accelerates the same rate-limiting step during spliceosomal assembly (Fox-Walsh, 2005).
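
The additive-versus-synergistic distinction described above can be illustrated with a short sketch. It assumes, for illustration only, that "additive" means the rate gains contributed by the two ESEs sum, whereas "synergistic" means their fold activations multiply. The function name, the nearest-prediction rule, and the example rates are hypothetical and are not taken from Fox-Walsh (2005).

```python
def classify_ese_kinetics(k_none, k_up, k_down, k_both):
    """Label two-enhancer splicing kinetics as 'additive' or 'synergistic'.

    k_none : apparent splicing rate with no functional ESE
    k_up   : rate with only the upstream-exon ESE functional
    k_down : rate with only the downstream-exon ESE functional
    k_both : rate with both ESEs functional

    Assumption (not from the source): additive = rate gains sum,
    synergistic = fold activations multiply.
    """
    if min(k_none, k_up, k_down, k_both) <= 0:
        raise ValueError("all rates must be positive")

    # Express every rate as fold activation over the enhancer-free substrate.
    fold_up = k_up / k_none
    fold_down = k_down / k_none
    fold_both = k_both / k_none

    # Prediction if both ESEs accelerate the same rate-limiting step.
    expected_additive = fold_up + fold_down - 1.0
    # Prediction if each ESE accelerates an independent step.
    expected_synergistic = fold_up * fold_down

    # Pick the model whose prediction lies closer to the observation.
    # Ties (e.g. when one ESE has no effect) are reported as additive.
    d_add = abs(fold_both - expected_additive)
    d_syn = abs(fold_both - expected_synergistic)
    return "additive" if d_add <= d_syn else "synergistic"


# Hypothetical rates in arbitrary units.
print(classify_ese_kinetics(1.0, 3.0, 4.0, 6.0))
print(classify_ese_kinetics(1.0, 3.0, 4.0, 12.0))
```

In practice the apparent rates would come from fitting time-course data, and a real analysis would account for measurement error rather than simply choosing the nearer prediction.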

In vitro splicing assays were performed with each of the four pre-mRNA sets over a 3-h time course to determine the apparent rates of splicing. Pre-mRNAs with an intron size of 120 nt display additive kinetics. Using Drosophila nuclear extract (Kc), it was possible to demonstrate additive kinetics for substrates containing the 120-nt intron; however, it was not possible to detect sufficient splicing for the substrates containing longer introns. These results are consistent with in vitro studies demonstrating that splicing of pre-mRNAs with long introns is supported in HeLa nuclear extract but not in Kc extract. The kinetics of pre-mRNAs containing an intron 200 nt or less in length are additive. This behavior indicates that the spliceosomal components required for the recognition of both splice sites are recruited to the intron simultaneously. However, constructs with introns >200 nt demonstrate synergistic kinetics. It is concluded that the change from splice-site recognition across the intron to splice-site recognition across the exon occurs when the intronic length is between 200 and 250 nt (Fox-Walsh, 2005).

The kinetic analysis demonstrates that the upstream 5' splice site and the downstream 3' splice site are recognized simultaneously across introns <200 nt. Significantly, in the absence of ESEs, splice-site recognition across the intron is a much more efficient process than splice-site recognition across the exon. Thus, splice-site recognition across the intron may be able to rescue the inclusion of internal exons harboring weak splice sites. To test this hypothesis, a series of pre-mRNA substrates containing three exons was designed for in vitro splicing analysis in which the internal exon contains splice sites that are insufficiently recognized in the absence of ESEs. The four substrates differed only in the length of each intron, which was set to either <200 or >250 nt, thus permitting or discouraging splice-site recognition across the intron. As expected, the internal exon is predominantly excluded when flanked by two long introns. However, significant inclusion of the internal exon is observed if one of the flanking introns is short enough to support splice-site recognition across the intron. In fact, exon inclusion is ~30-fold greater with two short introns than with two long introns (Fox-Walsh, 2005).
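
As a minimal sketch of the length rule used to design these substrates, the snippet below assigns a recognition mode to an intron and counts how many introns flanking an internal exon permit cross-intron recognition. The 200-nt and 250-nt cut-offs follow the text; the function names, the treatment of the exact boundaries, and the "transition" label for the 200 to 250 nt window are assumptions made for illustration.

```python
CROSS_INTRON_MAX_NT = 200  # introns of 200 nt or less showed additive kinetics
CROSS_EXON_MIN_NT = 250    # assumed lower bound for cross-exon recognition


def recognition_mode(intron_len_nt):
    """Return the expected splice-site recognition mode for one intron."""
    if intron_len_nt <= 0:
        raise ValueError("intron length must be positive")
    if intron_len_nt <= CROSS_INTRON_MAX_NT:
        return "cross-intron"
    if intron_len_nt < CROSS_EXON_MIN_NT:
        # The switch occurs somewhere in this window; it is not resolved further.
        return "transition"
    return "cross-exon"


def count_cross_intron_flanks(upstream_len_nt, downstream_len_nt):
    """Count flanking introns short enough for cross-intron recognition."""
    modes = (recognition_mode(upstream_len_nt),
             recognition_mode(downstream_len_nt))
    return modes.count("cross-intron")


# Hypothetical lengths for the four three-exon architectures.
for up, down in [(120, 120), (120, 425), (425, 120), (425, 425)]:
    print(up, down, count_cross_intron_flanks(up, down))
```

Under the results described above, an internal exon with a count of zero would be expected to be predominantly skipped, while a count of one or two would favor inclusion.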

To estimate the fractions of splice sites that may be recognized through cross-intron interactions, the flanking-intron lengths were recorded for every internal exon within the human and Drosophila genomes. Genome information was obtained from the Alternative Splicing Database (ASD), which contains information about the exon/intron structure and EST-verified alternative-splicing events of several thousand genes. Within the human genome, many exons are flanked by at least one short intron, creating two separate populations, separated roughly by the intron length that is proposed to represent the transition of splice-site recognition from across the intron to across the exon. As expected from previous intron-length analyses, a very different distribution is seen in the Drosophila genome, where ~85% of exons are flanked by at least one short intron. An overlay of the Drosophila and human genomes demonstrates that the minimum intron length in the human genome is at the same location that demarcates the maximum intron length of the major Drosophila exon population. This difference in genome constraint may reflect specific compositional variations between the Drosophila and human spliceosomes (Fox-Walsh, 2005).
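
Recording flanking-intron lengths for internal exons can be sketched as follows. The input layout is an assumption, not the ASD format: each transcript is given as a list of exon (start, end) pairs in 1-based inclusive coordinates on the plus strand. For a minus-strand gene the upstream and downstream values would need to be swapped.

```python
def flanking_intron_lengths(exons):
    """Return (exon, upstream_intron_nt, downstream_intron_nt) for internal exons.

    exons : list of (start, end) tuples, 1-based inclusive, plus strand
            (hypothetical input layout, not the ASD file format).
    First and last exons are skipped because they have only one flanking intron.
    """
    ordered = sorted(exons)
    records = []
    for i in range(1, len(ordered) - 1):
        prev_end = ordered[i - 1][1]
        start, end = ordered[i]
        next_start = ordered[i + 1][0]
        upstream = start - prev_end - 1      # nt between previous exon and this one
        downstream = next_start - end - 1    # nt between this exon and the next
        records.append(((start, end), upstream, downstream))
    return records


# Hypothetical three-exon transcript with a 120-nt and a 425-nt intron.
print(flanking_intron_lengths([(1, 100), (221, 320), (746, 900)]))
```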

Because splice-site recognition across the intron rescues exon inclusion, how intron length influences alternative splicing within the Drosophila and human genomes was investigated. To do so, the flanking-intron information of each exon was correlated with exon-skipping and alternative-splice-site-activation events reported in the ASD to compute the probability that an exon is involved in alternative splicing, without taking into consideration the contributions made by splice-site signal strength and splicing enhancers or silencers. Thus, the correlation simply tests whether the influence of the exon/intron architecture on alternative splicing is significant enough to be detectable amid all other splicing determinants. Computational analysis of the Drosophila genome supports a significant role for intron length in defining the likelihood of alternative splicing. A striking influence of the exon/intron architecture is observed for simple exon-skipping events. Exons flanked by very long introns are up to 90-fold more likely to be skipped than exons that are flanked by two short introns. Significantly, the most drastic increase in the probability of alternative splicing (>10-fold) was observed when the length of flanking introns increased from 225 to 525 nt. In agreement with the experimental results, a greater probability that an exon is alternatively spliced was observed when the upstream intron is long. This polarity could be the consequence of coupling pre-mRNA splicing to transcription by RNA polymerase II. Even in the category of alternative 5' or 3' splice-site activation, alternative splicing is up to 10-fold more likely for exons that are flanked by long introns. It is concluded that, in Drosophila, exon skipping is a rare event for exons flanked by short introns and that the length of the upstream intron is of greater importance than the length of the downstream intron in determining whether an exon will be involved in exon skipping (Fox-Walsh, 2005).
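
The kind of correlation described above can be sketched by grouping exons according to whether each flanking intron is short or long and computing the fraction of skipped exons in each group. This is a simplified illustration: the single 250-nt cut-off, the input tuples, and the toy counts are assumptions, and the study's actual binning of intron lengths is not reproduced here.

```python
from collections import defaultdict


def skipping_probability_by_class(exons, threshold_nt=250):
    """Fraction of skipped exons per (upstream, downstream) intron class.

    exons : iterable of (upstream_len_nt, downstream_len_nt, is_skipped)
    threshold_nt : assumed cut-off separating "short" from "long" introns
    """
    counts = defaultdict(lambda: [0, 0])  # class -> [skipped, total]
    for up, down, skipped in exons:
        key = ("long" if up >= threshold_nt else "short",
               "long" if down >= threshold_nt else "short")
        counts[key][1] += 1
        if skipped:
            counts[key][0] += 1
    return {key: s / t for key, (s, t) in counts.items()}


def fold_change(probs, numerator_class, denominator_class):
    """Ratio of skipping probabilities between two architecture classes."""
    if probs[denominator_class] == 0:
        raise ValueError("no skipping events in the reference class")
    return probs[numerator_class] / probs[denominator_class]


# Toy data, not from the ASD: 1 of 10 short/short exons skipped, 5 of 10 long/long.
toy = ([(100, 100, False)] * 9 + [(100, 100, True)]
       + [(1000, 1000, True)] * 5 + [(1000, 1000, False)] * 5)
probs = skipping_probability_by_class(toy)
print(probs)
print(fold_change(probs, ("long", "long"), ("short", "short")))
```

Keeping the upstream and downstream classes separate in the key makes it possible to compare ("long", "short") with ("short", "long") and so examine the polarity noted in the text.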

Within the human genome, a similar correlation between the exon/intron architecture and the probability of exon skipping is observed; however, the ~5-fold maximal variance calculated is significantly lower than that observed for Drosophila. As for Drosophila, the length of the upstream intron is more important in determining the frequency of alternative splicing. In the case of alternative 5' or 3' splice-site usage, the opposite distribution of alternative splicing is seen in the human genome. The activation of alternative splice sites is less likely if the flanking introns are long. It is concluded that exon/intron architecture influences the frequency and type of alternative splicing that an exon may undergo in the Drosophila and human genomes (Fox-Walsh, 2005).

These experiments support the existence of two different mechanisms for splice-site recognition: recognition across the intron and recognition across the exon. Splice-site recognition across the intron ceases when the intron size reaches the threshold length of >200 nt. Importantly, splice-site recognition across the intron is more efficient and increases the inclusion of exons with weak splice sites. These results demonstrate that the distance between splice sites affects efficient spliceosomal assembly. Presumably, the pairing of cross-exon-defined splice sites requires the interaction between two sets of pre-spliceosomes across an intron of variable length. In contrast, splice-site recognition across the intron already identifies the splice sites that will be paired. It is also possible that the kinetics of splice-site pairing are slowed because longer introns associate with an increased number of hnRNP proteins. HnRNP proteins coat nascent pre-mRNAs and are thought to interfere with the splicing reaction. Therefore, larger introns may reduce splicing by decreasing the relative concentration of splicing components through competition with hnRNPs (Fox-Walsh, 2005).

Additive kinetics of splice-site activation demonstrate that splice-site recognition across the intron is achieved through the recruitment of a multicomponent complex that contains components of the splicing machinery required for 5' and 3' splice-site definition. Interestingly, the activation of a single ESE results in a significant increase in splicing activity, suggesting that ESEs influence splice-site activation of adjacent exons. As anticipated from ESE distance/activity correlations, this effect depends on intron length. Given the unique combination of splice sites and cis-acting elements, it is possible that the precise transition from splice-site recognition across the intron to splice-site recognition across the exon may vary for different substrates. The presence of strong splice sites and enhancers or silencers could modulate the cross-intron recognition by increasing or decreasing the strength of interaction between spliceosomal components and the pre-mRNA (Fox-Walsh, 2005).

The observation that increasing exon length decreases exon inclusion suggests that similar distance limitations exist for splice-site recognition across the exon. Approximately 80% of human exons are <200 bp in length, the average being 170 bp. Importantly, exon length is tightly distributed when compared with intron length. These results demonstrate that maintaining exon size in the human genome is more important to the architecture and evolution of a gene than is maintaining intron size. In contrast to the human genome, exon size varies much more than intron size in yeast. The maximum intron length of 182 nt lies well within the size limitations of splice-site recognition across the intron. Taken together, these considerations support the notion that the majority of splice sites in higher eukaryotes are recognized across the exon, whereas lower eukaryotes employ splice-site recognition across the intron (Fox-Walsh, 2005).

It is well established that several types of exon and intron elements influence splice-site choice. The most prominent include the exon/intron junction signals and splicing enhancers and silencers. The results show that the exon/intron architecture is an additional parameter that affects the efficiency of splice-site recognition and alternative pre-mRNA splicing. When compared in otherwise isogenic test substrates, splice-site recognition across the intron rescues the inclusion of a weak internal exon by >10-fold. Even though the computational analysis ignores the contributions made by variable splice sites, enhancers, and silencers, a striking increase in the probability of alternative splicing is observed for Drosophila exons, whose splice sites are recognized across the exon. Thus, the exon/intron architecture in Drosophila is a major determinant in governing the probability of alternative splicing. Within the human genome, a qualitatively similar trend was observed for exon-skipping events but with a reduced magnitude. One major difference between the Drosophila and human gene architecture is intron length. Human genes are dominated by long introns (87% of introns are >250 nt), whereas short introns are much more common in Drosophila (66% are <250 nt). One possible explanation for the small intron size in Drosophila could be the pressure to maintain a constrained genome size in these fast-replicating organisms (Fox-Walsh, 2005).

Alternative splicing is extensive in both species, supporting the argument that both species benefit from expanded proteomes generated from alternative splicing. However, genome analysis suggests that there are significant differences in the weight of the mechanisms by which alternative splicing can be induced. In Drosophila, intron length is a major determinant in promoting alternative splicing patterns. In the human, additional mechanisms of controlling alternative splicing may have gained more influence on intron expansion to maintain balanced levels of alternative splicing (Fox-Walsh, 2005).



References

Fox-Walsh, K. L., et al. (2005). The architecture of pre-mRNAs affects mechanisms of splice-site pairing. Proc. Natl. Acad. Sci. 102(45): 16176-81. PubMed ID: 16260721

Lam, B. J. and Hertel, K. J. (2002). A general role for splicing enhancers in exon definition. RNA 8(10): 1233-41. PubMed ID: 12403462

Tian, M. and Maniatis, T. (1992). Positive control of pre-mRNA splicing in vitro. Science 256(5054): 237-40. PubMed ID: 1566072


Home page: The Interactive Fly © 1995, 1996 Thomas B. Brody, Ph.D.

The Interactive Fly resides on the
Society for Developmental Biology's Web server.