Dscam


REGULATION

Alternative Splicing of Dscam

The Drosophila Dscam gene encodes an axon guidance receptor that can express 38,016 different mRNAs by virtue of alternative splicing. The Dscam gene contains 95 alternative exons that are organized into four clusters of 12, 48, 33, and 2 exons each. Although numerous Dscam mRNA isoforms can be synthesized, it remains to be determined whether different Dscam isoforms are synthesized at different times in development or in different tissues. The alternative splicing of the Dscam exon 4 cluster, which contains 12 mutually exclusive alternative exons, has been investigated and found to be developmentally regulated. The most highly regulated exon, 4.2, is infrequently used in early embryos but is the predominant exon 4 variant used in adults. Moreover, the developmental regulation of exon 4.2 alternative splicing is conserved in D. yakuba. In addition, different adult tissues express distinct collections of Dscam mRNA isoforms. Given the role of Dscam in neural development, these results suggest that the regulation of alternative splicing plays an important role in determining the specificity of neuronal wiring. In addition, this work provides a framework to determine the mechanisms by which complex alternative splicing events are regulated (Celotto, 2001).
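The isoform count quoted above follows from choosing one exon from each cluster. The short Python sketch below reproduces the arithmetic, assuming the choice in each cluster is independent of the others; the cluster labels (exon 4, 6, 9, and 17) are taken from the exon numbering used later in this section, and the function name is only illustrative.

```python
from math import prod

# Cluster sizes as stated in the text; labels follow the exon numbering
# used in later passages of this section.
CLUSTER_SIZES = {"exon 4": 12, "exon 6": 48, "exon 9": 33, "exon 17": 2}


def count_isoforms(cluster_sizes):
    """Number of combinations when one exon is picked per cluster."""
    return prod(cluster_sizes.values())


# Total number of alternative exons across the four clusters.
total_alternative_exons = sum(CLUSTER_SIZES.values())

# All mRNA isoforms (ectodomain plus transmembrane choice).
total_isoforms = count_isoforms(CLUSTER_SIZES)

# Ectodomains only: exclude the exon 17 (transmembrane) cluster.
ectodomains = count_isoforms(
    {name: size for name, size in CLUSTER_SIZES.items() if name != "exon 17"}
)

print(total_alternative_exons, total_isoforms, ectodomains)
```

The three values correspond to the 95 alternative exons, the 38,016 mRNA isoforms, and the 19,008 ectodomains cited elsewhere in this section.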

All of the exon 4 variants are very similar in size, ranging from 159 to 171 nucleotides (nt; 4.11 = 159 nt; 4.1, 4.2, 4.3, 4.5, 4.6, 4.7 = 162 nt; 4.9 = 168 nt; and 4.4, 4.8, 4.10, 4.12 = 171 nt). Likewise, the 12 possible RT-PCR products obtained using primers in the constant exons 3 and 5 differ from one another by at most 12 nt. As a result, traditional methods such as agarose gel electrophoresis cannot be used to analyze the alternative splicing of exon 4. By separating the RT-PCR products on a single-strand conformation polymorphism (SSCP) gel, which separates molecules based on conformational differences, the majority of the 12 RT-PCR products could be distinguished from one another. The identity of each band was assigned in two ways: (1) standards were generated using PCR products amplified from 12 cDNA clones, each containing one exon 4 variant spliced to exons 3 and 5 (lanes 2-13); (2) each band from a reaction was excised from the SSCP gel, cloned, and sequenced (Celotto, 2001).
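The size argument can be checked directly from the exon lengths listed above. The following Python sketch groups the 12 variants by length and reports how many distinct product sizes exist and how far apart they are; it is only an illustration of why length-based separation is insufficient, and the variable names are not from the source.

```python
# Exon 4 variant lengths in nucleotides, as listed in the text.
EXON4_VARIANTS_BY_LENGTH = {
    159: ["4.11"],
    162: ["4.1", "4.2", "4.3", "4.5", "4.6", "4.7"],
    168: ["4.9"],
    171: ["4.4", "4.8", "4.10", "4.12"],
}

# Total number of variants across all length classes.
n_variants = sum(len(v) for v in EXON4_VARIANTS_BY_LENGTH.values())

# Number of distinct lengths: the most bands a purely size-based gel
# could resolve, even with perfect resolution.
n_distinct_lengths = len(EXON4_VARIANTS_BY_LENGTH)

# Largest length difference between any two variants. Because the
# flanking primer regions are constant, the RT-PCR products differ by
# the same amount.
max_difference_nt = max(EXON4_VARIANTS_BY_LENGTH) - min(EXON4_VARIANTS_BY_LENGTH)

print(n_variants, n_distinct_lengths, max_difference_nt)
```

With only four length classes for 12 variants, a separation method sensitive to sequence-dependent conformation, such as SSCP, is needed to tell most products apart.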

Although the majority of the RT-PCR products migrate at a distinct position in the gel, the RT-PCR products obtained from some mRNAs comigrate. Specifically, RT-PCR products containing exons 4.3 and 4.12 migrate as a single band as do the RT-PCR products synthesized from mRNAs containing exons 4.5, 4.7, and 4.9. Using this method, the relative frequency with which the majority of the exon 4 variants are utilized within each RNA sample can be determined (Celotto, 2001).

To determine if Dscam alternative splicing is regulated, the frequency at which each Dscam exon 4 variant is utilized throughout development was measured. Total RNA harvested from flies at various stages of development was used as a template for RT-PCR reactions with primers in exons 3 and 5 and the RT-PCR products were resolved on SSCP gels. The frequency at which most of the exon 4 variants are utilized does not change significantly throughout development. However, the splicing of two exons, 4.2 and 4.8, appears to be highly regulated (Celotto, 2001).

Exon 4.2 displays the most striking developmental changes. Only ~1% of the Dscam transcripts in 0- to 12-hr embryos contain exon 4.2. However, in first instar larvae (L1), exon 4.2-containing transcripts make up ~20% of the total Dscam mRNAs. An analysis of RNA isolated hourly from embryos raised at 22° reveals that Dscam transcripts containing exon 4.2 first appear at hour 12, which corresponds to embryonic stage 15. The relative abundance of Dscam transcripts containing exon 4.2 remains high throughout the remainder of development and, in adults, ~44% of the total Dscam mRNAs contain exon 4.2. This represents a 40- to 50-fold increase in exon 4.2 utilization between embryos and adults (Celotto, 2001).

The expression of Dscam mRNAs containing exon 4.8 is the opposite of the expression pattern of exon 4.2-containing transcripts. Approximately 20% of all Dscam mRNAs in embryos contain exon 4.8. The abundance of exon 4.8-containing transcripts decreases throughout the remainder of development and in adults only ~1% of the total Dscam mRNAs contain exon 4.8. It is concluded that alternative splicing of some of the Dscam exon 4 variants is dramatically regulated throughout development (Celotto, 2001).
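The opposing trends for exons 4.2 and 4.8 can be expressed as fold changes computed from the approximate percentages quoted above. This is a minimal Python sketch using only those rounded values; the dictionary keys and the helper function are illustrative names.

```python
def fold_change(start_percent, end_percent):
    """Ratio of the later usage to the earlier usage."""
    return end_percent / start_percent


# Approximate share of total Dscam transcripts, as quoted in the text.
exon_4_2 = {"embryo_0_12h": 1.0, "adult": 44.0}
exon_4_8 = {"embryo": 20.0, "adult": 1.0}

# Increase in exon 4.2 usage from early embryos to adults.
increase_4_2 = fold_change(exon_4_2["embryo_0_12h"], exon_4_2["adult"])

# Decrease in exon 4.8 usage, expressed as embryo usage over adult usage.
decrease_4_8 = fold_change(exon_4_8["adult"], exon_4_8["embryo"])

print(f"exon 4.2: ~{increase_4_2:.0f}-fold increase")
print(f"exon 4.8: ~{decrease_4_8:.0f}-fold decrease")
```

The exon 4.2 value falls within the 40- to 50-fold range reported for the embryo-to-adult transition.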

The diversity of Dscam proteins generated by alternative splicing is thought to play an important role in determining the specificity of neuronal wiring. One prediction of this model is that neurons in different tissues would express different Dscam isoforms to direct their axons to specific addresses. To begin testing this model, the relative abundance of each exon 4-containing Dscam mRNA isoform was examined in different adult tissues (Celotto, 2001).

RNA was harvested from antennae, heads, wings, and legs dissected from adult flies. These RNA samples were subjected to RT-PCR and the products were separated on SSCP gels. These results show that the collection of Dscam transcripts is significantly different in each body part examined. For example, whereas ~45% of the total Dscam transcripts in legs and ~42% of the transcripts in wings contain exon 4.2, only ~16% of Dscam transcripts isolated from heads contain exon 4.2 (lanes 2-4). Similar differences are observed for many of the other exon 4 variants. It is concluded that the alternative splicing of the Dscam exon 4 variants is regulated in a tissue-specific manner (Celotto, 2001).

The most striking change observed is the developmental regulation of exon 4.2, which is not utilized in early embryos. To determine whether the alternative splicing of exon 4.2 has been conserved in other Drosophila species, the genomic DNA encompassing the exon 3-5 region of the Dscam gene was cloned from D. yakuba, which is estimated to have diverged from D. melanogaster 7-15 million years ago. The sequence of the exon 4 region of the D. yakuba Dscam gene is similar to the D. melanogaster gene throughout its length. Like D. melanogaster, the D. yakuba Dscam gene contains 12 variants of exon 4. The exon 4 variants are on average 95% identical between D. melanogaster and D. yakuba at the nucleotide level. Most of the exonic nucleotide changes are silent third-position changes. As a result, the protein sequences encoded by these exons are nearly identical between the two species (Celotto, 2001).

As expected, the sequences of the intron are more divergent than the exons. The nucleotide sequence of the introns separating the exon 4 variants is an average of 82% identical between D. melanogaster and D. yakuba. The introns between exons 3 and 4.1 and between exons 4.12 and 5 are 78% and 77% identical between the two species (Celotto, 2001).

A comparison of the splice site sequences flanking each of the exon 4 variants reveals that the 5' splice sites are more conserved between the two species than the 3' splice sites. For example, between the two species there are only 2 nucleotide changes among the 12 exon 4 5' splice sites, whereas there are 25 nucleotide changes among the 12 exon 4 3' splice sites (Celotto, 2001).

Whether the developmental pattern of exon 4.2 alternative splicing observed in D. melanogaster is conserved in D. yakuba was tested. As with D. melanogaster, it was found that D. yakuba Dscam transcripts containing exon 4.2 are not expressed in embryos but are expressed in both larvae and adults. The relative abundance of the exon 4.8-containing transcripts decreases throughout D. yakuba development as in D. melanogaster. However, in both cases, the magnitude of the changes is lower in D. yakuba than in D. melanogaster. It is concluded that a similar developmental pattern of Dscam exon 4-regulated alternative splicing occurs in both D. melanogaster and D. yakuba (Celotto, 2001).

Stochastic yet biased expression of multiple Dscam splice variants by individual cells

Drosophila Dscam is essential for axon guidance and has 38,016 possible alternative splice forms. This diversity can potentially be used to distinguish cells. The Dscam mRNA isoforms expressed by different cell types and individual cells were analyzed. The choice of splice variants expressed is regulated both spatially and temporally. Different subtypes of photoreceptors express broad yet distinctive spectra of Dscam isoforms. Single-cell RT-PCR has documented that individual cells express several different Dscam isoforms and allows an estimation of the diversity that is present. For example, it is estimated that each R3/R4 photoreceptor cell expresses 14-50 distinct mRNAs chosen from the spectrum of thousands of splice variants distinctive of the R3/R4 cell type. Thus, the Dscam repertoire of each cell is different from those of its neighbors, providing a potential mechanism for generating unique cell identity in the nervous system and elsewhere (Neves, 2004).

Transmembrane/juxtamembrane domain-dependent Dscam distribution and function during mushroom body neuronal morphogenesis

Besides 19,008 possible ectodomains, Drosophila Dscam contains two alternative transmembrane/juxtamembrane segments, respectively, derived from exon 17.1 and exon 17.2. Would specific Dscam isoforms mediate formation and segregation of axonal branches in the Drosophila mushroom bodies (MBs)? Removal of various subsets of the 12 different exon 4 variants does not affect MB neuronal morphogenesis, while expression of a Dscam transgene only partially rescues Dscam mutant phenotypes. Interestingly, differential rescuing effects are observed between two Dscam transgenes that each possess one of the two possible versions of exon 17. Axon bifurcation/segregation abnormalities are better rescued by the exon 17.2-containing transgene, but coexpression of both transgenes is required for rescuing mutant viability. Meanwhile, exon 17.1 targets ectopically expressed Dscam-GFP to dendrites while Dscam[exon 17.2]-GFP is enriched in axons; only Dscam[exon 17.2] affects MB axons. These results suggest that exon 17.1 is minimally involved in axonal morphogenesis and that morphogenesis of MB axons probably involves multiple distinct exon 17.2-containing Dscam isoforms (Wang, 2004).

MB alpha/ß neurons homozygous for the C22-1 deficiency undergo normal morphogenesis despite loss of three-quarters of Dscam isoforms. It was therefore wondered whether a single Dscam isoform was sufficient for supporting MB morphogenesis. This possibility was assessed by supplementing Dscam null mutant MB neurons with specific Dscam isoforms. One challenge in such rescuing experiments is to drive expression of Dscam transgenes in a physiologically relevant manner (Wang, 2004).

Fortunately, a 4.5 kb genomic fragment that lies immediately 5′ to the Dscam start codon appears to be sufficient for driving transgene expression in the endogenous Dscam expression pattern. Fusing the 4.5 kb Dscam genomic fragment with GAL4, it was observed that this GAL4 driver can selectively and efficiently induce expression of UAS-controlled transgenes in both the peripheral and central nervous systems through different developmental stages. To examine transgene expression in more detail and to compare its pattern directly with endogenous Dscam's protein distribution pattern, the 4.5 kb Dscam genomic fragment was then fused with a Dscam cDNA that had been modified to encode a chimeric protein with GFP at Dscam's carboxyl terminal. In the wandering larval CNS, Dscam-GFP is broadly enriched in neuropil-like structures. Interestingly, endogenous Dscam is distributed in a similar pattern, as revealed by immunostaining with an anti-Dscam peptide antibody, encouraging use of the isolated 4.5 kb genomic fragment in driving the expression of various engineered Dscam transgenes in the rescuing experiments (Wang, 2004).

Dscam's ectodomain contains three variable regions that are encoded by exon 4, exon 6, and exon 9, respectively. Analysis of expressed Dscam sequences has revealed differential expression of distinct exon 4 alternatives in different tissues and at different developmental stages. For instance, the most highly regulated exon, 4.2, is rarely used in early embryos but is the predominant exon 4 variant present in adults. However, no requirement was detected for any specific Dscam exon 4 variant during MB morphogenesis. Given that the usage of exon alternatives is most regulated in the exon 9 cluster, it will be interesting to determine whether, in contrast with exon 4, specific exon 9 alternatives are required for normal MB morphogenesis. Nevertheless, consistent with the notion that the identities of individual Dscams' ectodomains might not be critical for MB neuronal morphogenesis, it was found that Dscam isoforms with a fixed ectodomain can mediate divergent segregation of axonal branches in most Dscam mutant MB neurons. It is possible that Dscam isoforms with another ectodomain may not rescue Dscam mutant MB neurons' morphogenetic defects at all, but this possibility can be largely ignored since similar rescuing results have been obtained when Dscam null mutant MB neurons were supplemented with various single-isoform Dscam transgenes (Zhan, 2004). Based on these results, it is likely that Dscam proteins with different ectodomains are equally potent in governing MB neuronal morphogenesis. However, multiple isoforms with distinct ectodomains are still needed to fully support normal MB morphogenesis, given that Dscam isoforms with one fixed ectodomain significantly but only partially rescue Dscam mutant phenotypes. Similar arguments could explain why Dscam isoforms with one fixed ectodomain are sufficient for rescuing organism lethality but fail to mediate normal brain development in rescued Dscam flies.
However, partial rescue can be alternatively explained by other possibilities. For instance, driven by an arbitrary Dscam promoter, Dscam cDNA-genomic hybrid transgenes might not be expressed in identical spatiotemporal patterns as endogenous Dscam. Nevertheless, the involvement of multiple distinct Dscam isoforms is further suggested by the demonstration (Zhan, 2004) that multiple distinct Dscam ectodomains are expressed in any given MB neuron examined via single-cell RT-PCR (Wang, 2004).

Dscam's transmembrane/juxtamembrane domain is encoded by either exon 17.1 or exon 17.2. Interestingly, three independent lines of experiments have all demonstrated the possible involvement of Dscam proteins with different transmembrane/juxtamembrane segments in the morphogenesis of dendrites versus axons. (1) Only one of the two Dscam cDNA-genomic hybrid transgenes, which specifically vary in exon 17, significantly rescues Dscam mutant axonal morphogenetic defects. (2) Ectopically expressed Dscam is either localized to dendrites or enriched in axons, depending on the exon 17-encoding juxtamembrane/transmembrane variable segment. (3) Ectopic expression of Dscam isoforms with different exon 17 alternatives disrupts different developmental processes. All of these results are consistent with the notion that exon 17.1-containing Dscam isoforms are selectively targeted to dendrites and, thus, have minimal effects on axonal morphogenesis. In vivo, it is possible that basic morphogenesis of MB axons exclusively involves exon 17.2-containing Dscam isoforms and that Dscams with exon 17.1, which are specifically targeted to dendrites, might regulate morphogenesis and/or functions of dendrites. Axonal morphogenesis normally takes place before complex dendritic elaboration followed by synapse formation. Interestingly, exon 17.2-containing Dscam isoforms, but not exon 17.1 Dscams, can rescue early larval lethality in Dscam mutant organisms. These results imply little involvement of exon 17.1-containing Dscams in early neuronal morphogenetic processes and indirectly suggest possible roles for dendritic-targeted Dscams in the maturation of dendrites and/or synapse formation and modulation. These notions are further supported by the fact that Dscams with exon 17.1 are required for helping exon 17.2-containing Dscam isoforms to rescue Dscam mutants into the adult stage. 
However, it remains to be shown that endogenous Dscam proteins with exon 17.1 are selectively localized in dendrites, and Dscam's roles in dendritic morphogenesis and/or functions remain to be elucidated. In addition, although several cell surface proteins are known to exhibit polarized distribution in neurons and their sorting signals are being gradually identified, the exon 17.1-encoding juxtamembrane/transmembrane domain likely carries a novel dendrite-targeting motif based on its amino acid composition. A possible axon-targeting signal is likewise present in the juxtamembrane/transmembrane segment encoded by the Dscam exon 17.2 (Wang, 2004).

In summary, this study shows that specific Dscam isoforms are either targeted to dendrites or enriched in axons, raising the possibility that every single neuron might have distinct sets of Dscam molecules located in dendrites versus axons. Thus, simply by coupling different Dscam ectodomains with exon 17.1 versus exon 17.2, individual neurons could simultaneously send different messages to and/or respond differentially to their upstream and downstream neurons. Formation and modulation of neuronal connections might be fine tuned through the regulation of the compositions of Dscam proteins across synapses. In addition, it is suggested that only exon 17.2-containing Dscam isoforms are involved in governing axonal morphogenesis during early development of the nervous system. Finally, although no single ectodomain appears to be indispensable and various ectodomains might be functionally exchangeable, normal development of the Drosophila brain probably needs multiple distinct Dscam ectodomains (Wang, 2004).

Alternative splicing of Drosophila Dscam generates axon guidance receptors that exhibit isoform-specific homophilic binding

Dscam is an immunoglobulin (Ig) superfamily protein required for the formation of neuronal connections in Drosophila. Through alternative splicing, Dscam potentially gives rise to 19,008 different extracellular domains linked to one of two alternative transmembrane segments, resulting in 38,016 isoforms. All isoforms share the same domain structure but contain variable amino acid sequences within three Ig domains in the extracellular region. Different isoforms exhibit different binding specificity. Each isoform binds to itself but does not bind or binds poorly to other isoforms. The amino acid sequences of all three variable Ig domains determine binding specificity. Even closely related isoforms sharing nearly identical amino acid sequences exhibit isoform-specific binding. It is proposed that this preferential homophilic binding specificity regulates interactions between cells and contributes to the formation of complex patterns of neuronal connections (Wojtowicz, 2004).

This study demonstrates that a set of 11 different Dscam isoforms show surprising homophilic binding specificity. Each isoform preferentially binds to itself over different isoforms. Should this binding property extend to the entire spectrum of Dscam isoforms, this would provide enormous potential for regulating interactions between neurites during the establishment of neuronal connections (Wojtowicz, 2004).

Several lines of evidence support the view that Dscam proteins on opposing cell surfaces bind to each other. Mammalian Dscams have been shown to promote cell aggregation when transfected into cultured mouse cells. Dscam mediates interactions between cells in vivo. The trajectory of interneurons overexpressing a single isoform of Dscam is disrupted upon encountering midline cells that also overexpress the same Dscam isoform. That this reflects direct interactions between Dscam proteins on opposing cell surfaces is supported by the biochemical experiments presented in this paper. Dscam binding has been localized to the N-terminal eight Ig domains. Since this region contains the three variable Ig domains, it raised the possibility that differences within these domains could modulate interactions between isoforms (Wojtowicz, 2004).

All Dscam isoforms tested exhibited preferential binding to self over other isoforms. Isoforms differing in any one of the three variable Ig domains do not bind to each other, or show marked differences in binding. While binding between different isoforms was undetectable in the pull-down assay from S2 cell extracts, the possibility remains that weak interactions between different isoforms exist that are below the limit of detection of this assay. Quantification of the sensitivity of the pull-down assay demonstrated that 10-fold less protein on the Western blots would not have been reliably detected. Therefore, if heterophilic binding occurs between the isoforms tested, it is significantly weaker than the isoform-specific homophilic interaction. That interactions between different isoforms do occur under milder conditions (e.g., no detergent) is underscored by the binding of two isoforms of Dscam differing in seven amino acids in the bead binding-to-cells assay. Hence, it is speculated that, while each isoform preferentially binds to itself, isoforms also exhibit a range of weaker binding interactions with other isoforms (Wojtowicz, 2004).

All three variable Ig domains played a crucial role in binding specificity; swapping any one resulted in a marked reduction or a complete loss of binding. It is anticipated that future biochemical and structural studies will provide insights into the molecular basis of isoform-specific recognition. In the meantime, the simplest model for the 'matching' of alternative Ig domains is that each variable Ig domain interacts with the same variable Ig domain in an opposing molecule. The binding of all three Ig domains is likely required to stabilize otherwise weak interactions between individual variable domains (Wojtowicz, 2004).

How might isoform-specific binding contribute to wiring the fly brain? While the notion that each neuron may express one or only a few isoforms that specify interactions with other neurons is attractive, recent studies argue that single neurons express multiple isoforms and that even neurons of the same class express different and largely nonoverlapping sets of them. Hence, it is highly unlikely that any two neurons will express an identical set of isoforms. This ensures that the only neurites that express an identical set of isoforms are those from the same neuron (Wojtowicz, 2004).
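The claim that two neurons are very unlikely to share an identical isoform set can be given a rough quantitative feel. The Python sketch below assumes, purely for illustration, that each neuron draws its isoforms uniformly at random and without replacement from a common pool. The source describes isoform choice as biased rather than uniform, so this is a simplification; the pool size of 5,000 is a hypothetical stand-in for the "thousands" of variants mentioned for R3/R4 photoreceptors, and 50 isoforms per cell is the upper end of the 14-50 estimate (Neves, 2004).

```python
from math import comb


def prob_identical_set(pool_size, per_cell):
    """Chance that a second cell draws exactly the same set as the first."""
    return 1 / comb(pool_size, per_cell)


def prob_no_shared_isoform(pool_size, per_cell):
    """Chance that two cells' sets do not overlap at all."""
    return comb(pool_size - per_cell, per_cell) / comb(pool_size, per_cell)


def expected_shared_isoforms(pool_size, per_cell):
    """Mean number of isoforms two cells have in common."""
    return per_cell * per_cell / pool_size


# Hypothetical parameters (see the assumptions stated above).
POOL_SIZE = 5000   # assumed number of isoforms available to the cell type
PER_CELL = 50      # upper estimate of isoforms expressed per cell

p_identical = prob_identical_set(POOL_SIZE, PER_CELL)
p_disjoint = prob_no_shared_isoform(POOL_SIZE, PER_CELL)
mean_shared = expected_shared_isoforms(POOL_SIZE, PER_CELL)

print(f"P(identical sets)       = {p_identical:.3e}")
print(f"P(no shared isoform)    = {p_disjoint:.3f}")
print(f"Expected shared isoforms = {mean_shared:.2f}")
```

Under these assumptions, identical sets are vanishingly rare while partial overlap between neighbors is small, which is in line with the "largely nonoverlapping" sets described in the text. Biased isoform choice would raise the overlap relative to this uniform model.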

Studies on developing mushroom body (MB) neurons provide support for the view that interactions between identical isoforms play a crucial role in mediating interactions between two neurites of the same cell. MB neurons extend axons that bifurcate at a common branch point, and the resulting sister branches segregate to different pathways. A prominent feature of the loss-of-function phenotype in these neurons is a failure of sister branches to segregate. A simple model to account for this is that identical isoforms of Dscam on sister branches bind to each other and induce a contact-dependent repulsive interaction perhaps analogous to repulsive interactions between Eph receptors and ephrin ligands. Prior to bifurcation, MB axons project together within a fascicle, and, following bifurcation, each sister branch also extends within a fascicle with other MB axon branches. To allow fasciculation, it is likely important that the array of Dscam isoforms expressed on each MB axon is different from its neighbors. Indeed, expression analysis reveals that, as with other neuronal subclasses, individual MB neurons express multiple isoforms and largely nonoverlapping arrays of them. Hence, while the specific isoforms of Dscam expressed in MB neurons may be unimportant, it may be crucial that neighboring MB axons express different isoforms. In support of this view, expression of a single Dscam isoform in multiple MB neurons induces a dominant phenotype characterized by defasciculation of MB axons, whereas expression of a single isoform in a single mutant neuron rescues the defect in the segregation of sister branches. The notion that interactions between identical isoforms induces a repellent response is consistent with other loss- and gain-of-function studies (Wojtowicz, 2004 and references therein).

It seems unlikely that Dscam acts only in a cell-autonomous fashion to mediate interactions between processes of the same neuron. In the absence of Dscam, defasciculation of axons has been observed both in the developing mushroom body and in Bolwig's nerve. Perhaps weaker signals resulting from interactions between different isoforms or between a small fraction of identical isoforms expressed on different neurons may promote adhesive interactions leading to fasciculation. Interestingly, recent studies have argued that different levels of Eph/ephrin signaling result in qualitatively different responses; high levels induce contact-dependent repulsion, while lower levels promote contact-dependent attraction (Wojtowicz, 2004 and references therein).

In summary, it is proposed that the nature of the interactions between Dscam isoforms on the surface of neurites produces qualitatively or quantitatively different intracellular signals influencing the development of neurites. Signaling may be modulated by the number of identical isoforms shared by two neurites, the level of expression of each isoform, and the binding affinity or avidity of different isoforms. For instance, high signaling levels produced by interactions between neurites of the same neuron expressing an identical array of Dscam isoforms would induce repulsion. Conversely, lower signals produced by weaker interactions between neurites of different cells expressing few or no identical Dscam isoforms would promote growth along one another, thereby allowing fasciculation. Since other Ig superfamily proteins have been shown to interact with multiple proteins, it remains possible that other Dscam phenotypes may reflect interactions with additional cell surface or soluble ligands that may or may not exhibit isoform-specific interactions (Wojtowicz, 2004).

Alternative splicing of Dscam has been highly conserved over some 250 million years separating the fly, the mosquito, and the bee. This observation, combined with the biochemistry reported in this study and genetic data establishing a role for Dscam in neuronal connectivity, supports the hypothesis that Dscam isoforms function as molecular tags contributing to the formation of precise patterns of neuronal connections (Wojtowicz, 2004 and references therein).

While Dscam diversity has been highly conserved during insect evolution, the mouse and human Dscam genes do not undergo extensive alternative splicing. In addition to Dscam, there are a number of genes in the fly genome with arrays of three or more alternatives for a given exon that encode related amino acid sequences. It is striking, however, that no mammalian genes appear to share a similar arrangement, although genes containing only two alternatives for a given exon are common in the mammalian genome. This suggests that diversification of gene function in the mammalian genome has not occurred through the massive cassette-like strategy utilized to generate biochemically distinct isoforms of Drosophila Dscam. Other mechanisms may have evolved in mammals to generate comparable diversity in neuronal cell surface proteins. These may include the use of large families of related proteins encoded by separate genes (e.g., odorant receptors), smaller families of proteins used in a combinatorial fashion (e.g., CNRs, MHC class II, classical cadherins), gradients of receptors and ligands (e.g., Ephrins and Eph receptors), or a combination of multiple genes, alternative transcription start sites, and alternative splicing, as in the case of neurexins (Wojtowicz, 2004 and references therein).

It is concluded that Dscam plays a widespread role in regulating the formation of neuronal connections in Drosophila. Recent expression studies revealed that different neurons express different combinations of Dscam isoforms endowing each neuron with a discrete molecular identity. The biochemical studies described in this study demonstrate that different Dscam isoforms have striking differences in binding specificity. It is proposed that a general function of Dscam diversity is to promote repellent interactions between neurites from the same cell expressing the same array of Dscam isoforms in a cell-autonomous fashion. Differences in the arrays of isoforms expressed in different neurons may also contribute to the patterning of neuronal connections. Whether Dscam diversity is indeed crucial to patterning neuronal connections in flies awaits additional analyses in which the number and type of Dscam isoforms expressed in different neurons are systematically manipulated (Wojtowicz, 2004).

Extensive diversity of Ig-superfamily proteins in the immune system of insects

The extensive somatic diversification of immune receptors is a hallmark of higher vertebrates. However, whether molecular diversity contributes to immune protection in invertebrates is unknown. Evidence is presented that Drosophila immune-competent cells have the potential to express more than 18,000 isoforms of the Ig-superfamily receptor Down syndrome cell adhesion molecule (Dscam). Secreted protein isoforms of Dscam were detected in the hemolymph and hemocyte-specific loss of Dscam impaired the efficiency of phagocytic uptake of bacteria, possibly due to reduced bacterial binding. Importantly, the molecular diversity of Dscam transcripts generated through a mechanism of alternative splicing is highly conserved across major insect orders, suggesting an unsuspected molecular complexity of the innate immune system of insects (Watson, 2005).

Immunoglobulin-domain-containing proteins constitute the largest repertoire of surface receptors in animals and serve many functions in molecular recognition, cell adhesion and signaling. Most striking is the exceptional diversity of antigen-specific receptors of the adaptive immune system in higher vertebrates, which depends on somatic gene rearrangement and clonal selection. However, somatic rearrangement of highly diverse immune receptors has been considered to exist in a relatively small number of animal species restricted to the jawed vertebrates (Watson, 2005).

A single Drosophila Dscam gene has been identified as a member of the Ig-superfamily and its essential function in neuronal wiring has been characterized. The Dscam gene comprises clusters of variable exons flanked by constant exons. Although mechanistically entirely different from somatic rearrangements, alternative splicing of the Dscam gene combines constant and variable exons by mutually exclusive splicing, and potentially generates as many as 19,008 different extracellular domains. Therefore, it is conceivable that a large protein-isoform repertoire with the potential for recognizing diverse ligands and epitopes could be generated. To explore this, a comparative and functional analysis of Dscam expression in immune-competent cells of flies and other insects was undertaken (Watson, 2005).

Fat body cells and hemocytes (i.e., insect blood cells) constitute important cells of the insect immune system. Most proteins in insect hemolymph, the insect equivalent of blood serum, are produced in fat body cells which also secrete anti-microbial peptides that constitute an important component of the humoral immune defense. In contrast, hemocytes are involved in cellular defense strategies such as phagocytosis and wound repair (Watson, 2005 and references therein).

In situ hybridization of tissue from third instar Drosophila larvae with a Dscam-specific probe revealed Dscam expression in fat body cells. For a comparison of Dscam expression in immune and neural tissue, mRNA was isolated from larval hemocytes, fat body, and brain tissue. Hemocyte-specific GFP expression allowed for the purification of hemocytes by fluorescence-activated cell sorting (FACS). RT-PCR analysis and sequencing of ~50 cDNAs revealed that the majority of Dscam mRNAs in hemocytes, fat body, and brain contain unique exon 4-6 combinations (Watson, 2005).

For a global assessment of alternative splicing in different cell types, custom-made oligo-arrays were used. Microarrays contained specific 50-mer oligo-probes for all alternatively spliced exons. Dscam mRNA sequences were amplified by RT-PCR and cDNAs were fluorescently labeled and hybridized to the microarrays. 59 of the 60 alternative exon 4 and exon 6 sequences were found to be expressed in all 3 cell types. In brain tissue, 32 exon 9 sequences were expressed. However, only a subset (total of 14) was expressed in fat body and a slightly different subset (total of 15) was expressed in hemocytes. Based on relative expression levels, it is estimated that 80%-90% of all Dscam mRNAs in hemocytes and fat body contain one of exons 9.6, 9.9, 9.13, 9.30, or 9.31, demonstrating that exon 9 splice variants in fat body and hemocytes are distinct from those found in brain. Considering all of the alternative exons detected (12 exon 4, 47 exon 6, 16 exon 9, and 2 exon 17), it is calculated that this potentially allows for the generation of more than 18,000 diverse receptor isoforms in fat body cells and hemocytes (Watson, 2005).
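
The isoform totals quoted in this section follow from simple multiplication, because exactly one variant is chosen from each cluster. The short Python sketch below reproduces that arithmetic from the exon counts given in the text. The dictionary and function names are illustrative only and do not come from the cited studies.

```python
from math import prod

# Number of alternative exons per cluster, as quoted in the text.
GENOME_ENCODED = {"exon 4": 12, "exon 6": 48, "exon 9": 33, "exon 17": 2}
# Exons detected in fat body and hemocytes by microarray (Watson, 2005).
DETECTED_IMMUNE = {"exon 4": 12, "exon 6": 47, "exon 9": 16, "exon 17": 2}


def isoform_count(variants_per_cluster):
    """One variant is used per cluster, so the per-cluster counts multiply."""
    return prod(variants_per_cluster.values())


def ectodomain_count(variants_per_cluster):
    """Count combinations of exons 4, 6 and 9 only.

    The 19,008 'extracellular domains' figure in the text corresponds to
    these three clusters; the exon 17 cluster is left out of this count.
    """
    return prod(n for name, n in variants_per_cluster.items() if name != "exon 17")


if __name__ == "__main__":
    print(isoform_count(GENOME_ENCODED))     # 38016
    print(ectodomain_count(GENOME_ENCODED))  # 19008
    print(isoform_count(DETECTED_IMMUNE))    # 18048
```

The last value, 18,048, is the basis for the statement that more than 18,000 isoforms are possible in fat body cells and hemocytes.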

Antibodies were raised against extracellular (D-ex1, D-ex2) and intracellular (D-cy) domains of Dscam. All antibodies recognized an ~210 kDa endogenous form of Dscam in extracts from cultured S2 cells, a cell line thought to be derived from embryonic hemocytes and shown to share many characteristics with hemocytes. A 210 kDa form of Dscam was also confirmed in purified larval hemocytes, fat body tissue, and at comparatively high levels in brain. Immunoprecipitations from fat body extracts revealed three Dscam forms possibly representing truncated forms generated by proteolytic cleavage. Unexpectedly, it was found that S2 cell conditioned medium contains a soluble Dscam protein of ~160 kDa and secreted Dscam protein of the same molecular weight is also present in hemolymph serum (Watson, 2005).

Liquid chromatography and tandem mass spectrometry (LC-MS/MS) directly confirmed that S2 cells secrete Dscam isoforms. Coverage of the secreted forms by the identified peptides amounts to more than 50% of the entire extracellular part of Dscam. Importantly, some of the identified peptides confirmed the presence of alternatively spliced sequences including five Ig-2 sequences and at least twelve Ig-3 sequences. In agreement with the expression profiling of exon 9 (Ig7) sequences, three distinct Ig7 domains (i.e., Ig7 variants 6, 9 and 13) were identified, which correspond to the most abundantly expressed exon 9 sequences. Considering the protein sequencing results, the presence of secreted Dscam in hemolymph, and the large pool of diverse Dscam mRNAs in fat body or hemocytes, it is possible that thousands of Dscam isoforms circulate in the hemolymph of Drosophila (Watson, 2005).

Attempts were made to determine if Dscam proteins are functionally required in immune-competent cells. However, since animals with homozygous amorphic mutations in Dscam die as embryos, it was not possible to directly test this in null mutant animals. Nevertheless, it was possible to purify GFP-labeled hemocytes from Dscam mutant larvae that carry a trans-allelic combination of hypomorphic (Dscam39) and amorphic (Dscam20) mutations. Immunoblotting showed that Dscam20/Dscam39 animals have a strong overall reduction in protein level. One important function of hemocytes is the ingestion of bacterial pathogens by phagocytosis. Wild-type and Dscam-deficient hemocytes were challenged with heat-killed fluorescently labeled E. coli and the number of hemocytes containing fluorescent bacteria was determined ('Phagocytic Index'). Normal hemocytes exhibited highly efficient phagocytosis and 85%-90% had taken up bacteria after 10 minutes. In contrast, only 55% of Dscam mutant cells had taken up bacteria (Watson, 2005).

To investigate more directly the possible role of Dscam in immune defenses, three questions were addressed. (1) Is Dscam cell autonomously required for phagocytosis in hemocytes? (2) Can antibodies that specifically bind extracellular Ig-domains of Dscam acutely interfere with phagocytosis? (3) Can Dscam isoforms directly bind to pathogens (Watson, 2005)?

Expression of double-stranded RNA (i.e., RNAi) was used to suppress Dscam expression in transgenic flies. A hemolectin promoter region-GAL4 fusion, termed Hml-GAL4, was used for activating expression exclusively in embryonic and larval hemocytes. Hemocytes with Dscam-specific knock-down showed a significantly reduced rate of phagocytosis, with less than 60% of the cells containing bacteria. This partial inhibition may reflect RNAi-mediated knock-down in only a subset of the highly heterogeneous cell population of larval hemocytes. Therefore, S2 cells, which represent a less heterogeneous cell population also capable of phagocytosis, were examined, and anti-Dscam antibodies were used to block Dscam function. It was reasoned that the short application of anti-Dscam antibodies, in contrast to continuous RNAi, may be less likely to influence general hemocyte characteristics or development. Treatment of S2 cells with polyclonal anti-Dscam serum D-ex1 resulted in a 30% decrease in the phagocytic index. It is possible that the anti-Dscam antibody may not directly block Dscam-bacteria interactions, or may have additional indirect influences on the process of phagocytosis. However, the reduction of phagocytosis is consistent with the loss-of-function in vivo analysis and in vitro binding studies. Taken together, partial but significant reduction in phagocytosis could be achieved by genetic inhibition of expression in hemocytes and by blocking Dscam protein interactions (Watson, 2005).

Flow cytometry was used to measure whether different Dscam isoforms are capable of binding directly to bacteria. Validity of a standard binding assay was tested using a polyclonal antibody that specifically recognizes E. coli epitopes, and the same assay was used to test binding of different recombinant Dscam isoforms. All isoforms contained C-terminal Fc tags, which were used for detection using fluorescently labeled protein A. Isoforms are designated by the combination of alternative variable Ig domains. Dscam-1.30.30-Fc and Dscam-7.27.25-Fc contain all of the extracellular domains, whereas Dscam-7.27.13-Fc contains only the N-terminal 9 Ig plus the first FNIII domain. Dscam-7.27.25-Fc and Dscam-7.27.13-Fc could bind to live DH5alpha E. coli bacteria. Binding of Dscam-7.27.13-Fc to E. coli suggests that the 10 N-terminal domains containing all three variable Ig domains are sufficient for binding. In contrast, binding of isoform Dscam-1.30.30-Fc to E. coli is barely detectable, and therefore similar to Fc-peptides alone or control Ig-domains containing anti-heavy chain (mouse) antibodies. It is possible that lack of binding of Dscam-1.30.30-Fc is unique to just this isoform and it may not generally reflect the presence of distinct pools of binding and non-binding isoforms (Watson, 2005).

Therefore, it remains an important task to examine in future studies binding properties of other isoforms. Importantly, the molecular basis of Dscam binding to bacteria is presently unknown and an assessment of binding specificity will crucially depend on the identification of potentially distinct epitopes on bacteria (Watson, 2005).

Although the detailed molecular basis of Dscam function in immune-competent cells is not known, the results are consistent with the possibility that Dscam acts as a signaling receptor or co-receptor during phagocytosis. In addition, binding of Dscam isoforms to bacteria may reflect the possibility that diverse secreted Dscam isoforms are involved in opsonizing invading pathogens in the hemolymph. Comparative genomic analysis of Dscam-like sequences shows high conservation of orthologous Dscam genes in the orders Diptera and Hymenoptera. To explore Dscam expression and alternative splicing in other insect orders, Dscam gene structure and expression in the flour beetle Tribolium castaneum (Coleoptera) and the silk moth Bombyx mori (Lepidoptera) were examined. Orthologous genes were identified in both species and all Dscam-like domains were found to be highly conserved. The expression of alternative Dscam isoforms was confirmed by cloning and characterizing 32 cDNAs from Tribolium RNA (Tr-Dscam). Alternatively spliced mRNA segments of Tr-Dscam matched corresponding Ig-2, Ig-3 and Ig-7 segments of Drosophila Dscam. RT-PCR and sequencing of Dscam mRNA extracted from fat body tissue of Tribolium larvae revealed nine different isoform sequences (out of 16 cDNAs). These results suggest that expression of diverse Dscam isoforms in immune-competent fat body cells is conserved among highly diverged insect species (Watson, 2005).

This study provides evidence for a potentially extensive repertoire of thousands of Ig-domain-containing proteins in immune-competent cells of insects, which represent an estimated 60% of metazoan species. Recently, novel and diverse receptor sequences have been identified in jawless vertebrates (lamprey), in protochordates (amphioxus), and in mollusks (freshwater snail). It has also been reported that a large class of scavenger receptors (with an estimated 1,200 scavenger receptor cysteine-rich (SRCR) domains) is expressed in putative immune effector cells (coelomocytes) of echinoderms. Similarly, immune responses of crustaceans apparently utilize an extensive set of diverse antimicrobial peptides. Although most animals have not acquired adaptive immunity, this apparently broad conservation of receptor diversity strongly suggests important functions, and future studies will have to further address whether the presence of diverse immune receptors in invertebrates increases the effectiveness of immune responses of individual animals. Alternatively, given the relatively short life span of many invertebrates, it may be that immune receptor diversity is less important ontogenetically but rather enhances the adaptive potential of animal populations to changing environmental and pathogenic threats (Watson, 2005).

Mutually exclusive splicing of the insect Dscam pre-mRNA directed by competing intronic RNA secondary structures

Drosophila Dscam encodes 38,016 distinct axon guidance receptors through the mutually exclusive alternative splicing of 95 variable exons. Importantly, known mechanisms that ensure the mutually exclusive splicing of pairs of exons cannot explain this phenomenon in Dscam. Two classes of conserved elements have been identified in the Dscam exon 6 cluster, which contains 48 alternative exons -- the docking site, located in the intron downstream of constitutive exon 5, and the selector sequences, which are located upstream of each exon 6 variant. Strikingly, each selector sequence is complementary to a portion of the docking site, and this pairing juxtaposes one, and only one, alternative exon to the upstream constitutive exon. The mutually exclusive nature of the docking site:selector sequence interactions suggests that the formation of these competing RNA structures is a central component of the mechanism guaranteeing that only one exon 6 variant is included in each Dscam mRNA (Graveley, 2005).

Comparative sequence analysis was used to identify RNA sequence elements that could potentially be involved in the regulation of Dscam alternative splicing. The sequences of the Dscam genes of 16 different insects were extracted from GenBank as either preassembled genes or as individual sequence reads from the trace archives that were subsequently assembled into a contig covering the gene. The organisms analyzed consisted of 13 Dipteran species, including 11 Drosophila species (D. melanogaster, D. simulans, D. yakuba, D. erecta, D. ananassae, D. pseudoobscura, D. persimilis, D. willistoni, D. mojavensis, D. virilis, and D. grimshawi) and two mosquito species (Anopheles gambiae [malaria mosquito] and Aedes aegypti [yellow fever mosquito]), the Lepidopteran Bombyx mori (silk worm), the Hymenopteran Apis mellifera (honeybee), and the Coleopteran Tribolium castaneum (red flour beetle). Together these organisms encompass four major taxonomic groups of insects that last shared a common ancestor at least 300 million years ago. The Dscam gene of each organism can potentially generate tens of thousands of isoforms by alternative splicing, though the exact number of alternative exons differs in most species (Graveley, 2005).

Multiple sequence alignment of the Dscam genes from these 16 species revealed a number of conserved intronic elements, the majority of which were located in the exon 6 cluster. The most highly conserved element in the entire Dscam gene, which is greater than 60,000 bp in D. melanogaster, is located in the intron between the constitutive exon 5 and exon 6.1 and will be referred to as the docking site. The docking site is a 66 nt sequence element in D. melanogaster that is 90%-100% identical in the 10 other Drosophila species examined. Moreover, the central 24 nt of the docking site is nearly invariant in all 16 species. The only exceptions are D. willistoni, which contains a 2 nt insertion at position 34 of the docking site, and A. mellifera, which contains two T to C transitions at positions 22 and 28. A docking site consensus sequence can be derived from the first 37 nt of the alignment (Graveley, 2005).

The second class of conserved elements that were identified will be referred to as selector sequences. The initial selector sequences were identified as relatively conserved sequences in the introns upstream of some of the exon 6 variants. Some of these elements are related to one another. For instance, the selector sequences upstream of exons 6.5, 6.19, and 6.43 all contain the sequence CAGGCAG, while the selector sequences upstream of exons 6.28, 6.36, and 6.44 contain sequences that deviate from CAGGCAG by only one nucleotide. However, this is not universally true since the exon 6.12 selector sequence does not contain this motif. By searching the remaining exon 6 cluster for sequences that are similar to but not identical to the initially identified selector sequences, a potential selector sequence was identified upstream of each exon 6 variant that was also similar in other Drosophila species. An alignment of the selector sequences located upstream of all 48 D. melanogaster exon 6 variants, together with some flanking sequence, revealed that all of the selector sequences overlap with one another to a certain extent. This alignment was used to generate a consensus selector sequence. Importantly, this consensus sequence does not resemble any known splicing regulatory elements or splice site sequences (Graveley, 2005).

Strikingly, the central 28 nt of the consensus selector sequence is complementary to the docking site consensus sequence. Moreover, all 48 predicted D. melanogaster selector sequences are complementary to the docking site. Because the selector sequences all overlap with one another to some extent, the docking site is predicted to interact with only one selector sequence at a time. Thus, the docking site:selector sequence interactions would simultaneously juxtapose exon 5 with the exon 6 variant that is to be included and could explain how the alternative splicing of these 48 exons is mutually exclusive (Graveley, 2005).
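
The competition between selector sequences for a single docking site can be illustrated with a minimal Python sketch. Everything below is a toy under stated assumptions: the docking site string is hypothetical and is not the real Dscam sequence, only the CAGGCAG motif is taken from the text, pairing is restricted to perfect Watson-Crick complementarity (G-U wobble pairs and mismatches are ignored), and the function names are invented for illustration.

```python
# Watson-Crick partners for RNA; G-U wobble pairing is ignored in this sketch.
PARTNER = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the sequence that pairs with `rna` in antiparallel orientation."""
    return "".join(PARTNER[base] for base in reversed(rna))


def pairing_site(docking_site, selector):
    """0-based start of the docking-site stretch the selector pairs with, or -1."""
    return docking_site.find(reverse_complement(selector))


def selectors_compete(docking_site, selector_a, selector_b):
    """True if both selectors can pair and their footprints overlap."""
    start_a = pairing_site(docking_site, selector_a)
    start_b = pairing_site(docking_site, selector_b)
    if start_a < 0 or start_b < 0:
        return False
    end_a = start_a + len(selector_a)
    end_b = start_b + len(selector_b)
    return start_a < end_b and start_b < end_a


# Hypothetical toy docking site (NOT the real Dscam docking site).
TOY_DOCKING_SITE = "AAGCUGCCUGUUACGGAUCA"
SELECTOR_A = "CAGGCAG"  # motif quoted in the text for several selectors
SELECTOR_B = "UAACAGG"  # hypothetical second selector with an overlapping footprint

if __name__ == "__main__":
    print(pairing_site(TOY_DOCKING_SITE, SELECTOR_A))
    print(pairing_site(TOY_DOCKING_SITE, SELECTOR_B))
    print(selectors_compete(TOY_DOCKING_SITE, SELECTOR_A, SELECTOR_B))
```

Because the two footprints share docking-site nucleotides, only one of the two duplexes can form at a time, which is the property the model relies on to juxtapose a single exon 6 variant with exon 5.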

The interactions between the selector sequences and the docking site are supported by a number of observations. The docking site is nearly invariant, consistent with the notion that it engages in multiple mutually exclusive interactions. Any mutation in the docking site would affect the interaction with most, if not all, of the selector sequences and therefore interfere with the splicing of the entire exon 6 cluster. In contrast, mutations within a selector sequence would only affect the splicing of the downstream exon 6 variant. Consistent with this, the selector sequences are much less conserved than the docking site. Nonetheless, orthologous selector sequences that contain nucleotide differences can still form similar interactions with the docking site (Graveley, 2005).

Although the docking sites of the non-Drosophila species have diverged from the Drosophila docking site to some extent, potential selector sequences exist upstream of each exon 6 variant in these other species. This is particularly striking given the apparent high rate of recombination within the exon 6 region. The docking site in the honey bee A. mellifera is the most divergent from the docking site in D. melanogaster and contains two U to C changes in the most highly conserved portion. However, putative selector sequences exist upstream of each A. mellifera exon 6 variant and are predicted to interact with the A. mellifera docking sequence with a thermodynamic stability similar to those in D. melanogaster. Most importantly, the two nucleotides that are different in the A. mellifera docking site engage in base-pairing interactions in the majority of the predicted docking site:selector sequence secondary structures. Many of the docking site:selector sequence structures in all species other than the honeybee contain U-A base pairs at these positions, while C-G base pairs exist at these positions in the honeybee structures. This provides several independent examples of compensatory double mutations that maintain the structural integrity of the docking site:selector sequence interactions. Together, these observations strongly support a model in which the selector sequences interact with the docking site in a mutually exclusive manner (Graveley, 2005).
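
The logic of a compensatory double mutation can be made concrete with a small Python sketch. The 8-nt sequences below are hypothetical and were chosen only so that two U-A pairs in a "fly-like" duplex become two C-G pairs in a "bee-like" duplex; they are not the real Dscam sequences. G-U wobble pairs are deliberately not counted as paired, which is a simplifying assumption.

```python
# Allowed Watson-Crick pairs; G-U wobble is deliberately excluded here.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def is_fully_paired(strand, partner):
    """True if two 5'->3' RNA strands form an uninterrupted antiparallel duplex."""
    if len(strand) != len(partner):
        return False
    return all((x, y) in WATSON_CRICK for x, y in zip(strand, reversed(partner)))


# Hypothetical sequences for illustration only.
FLY_DOCKING = "GCUGAUGC"   # U at two positions
FLY_SELECTOR = "GCAUCAGC"  # A opposite each U (U-A pairs)
BEE_DOCKING = "GCCGACGC"   # both U changed to C
BEE_SELECTOR = "GCGUCGGC"  # both A changed to G (C-G pairs)

if __name__ == "__main__":
    print(is_fully_paired(FLY_DOCKING, FLY_SELECTOR))  # ancestral-like duplex
    print(is_fully_paired(BEE_DOCKING, BEE_SELECTOR))  # duplex restored by the second change
    print(is_fully_paired(BEE_DOCKING, FLY_SELECTOR))  # docking change alone leaves C-A mismatches
```

A change on one strand alone disrupts the duplex, while the matching change on the other strand restores it. This is why covarying positions are read as evidence that the two elements actually pair.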

Several mechanisms have been identified that serve to guarantee that pairs of alternative exons are spliced in a mutually exclusive manner. However, none of the known mechanisms can explain how the alternative splicing of genes containing more than two mutually exclusive exons occurs such that only one exon is included. The Dscam gene is an extreme example of this since the exon 4, 6, and 9 clusters contain 12, 48, and 33 exons, respectively. This study describes the docking site and the selector sequences -- two classes of conserved sequence elements within the Dscam exon 6 cluster that have the potential to engage in base-pairing interactions. The mutually exclusive nature of the interactions of the selector sequences with the docking site suggests that the formation of these structures is a central component of the mechanism ensuring that only one of the exon 6 variants is included (Graveley, 2005).

It is quite intriguing that each of the Dscam mRNAs isolated from the fly contains only one of the 48 exon 6 variants despite the fact that each exon is flanked by what appear to be functional splice sites. Thus, the mechanism that exists to prevent multiple exon 6 variants from being included must operate with a high degree of fidelity. A protein has been identified in an RNAi screen that appears to function to prevent all of the exon 6 variants from being spliced together -- when depleted by RNAi, multiple, even adjacent, exon 6 variants are included in the mRNA and they are accurately spliced together (Y. Savva, J. Park, and B.R.G., unpublished data reported in Graveley, 2005). This finding demonstrates that the exon 6 variants are in fact capable of being spliced together but that protein factors exist that function to repress this reaction (Graveley, 2005).

Based on these two sets of observations, a model can be proposed to explain how the alternative splicing of the exon 6 cluster is mutually exclusive. A key component of this model is that a protein(s) acts to both repress the splicing of each exon 6 variant and to prevent the exon 6 variants from being spliced together. It is proposed that the selector sequence upstream of the exon 6 variant that is to be included interacts with the docking site and that this interaction somehow relieves the repression on the downstream exon 6 variant, and as a result, it can be spliced to exon 5. Finally, the exon 6 variant that is then spliced to exon 5 could only be spliced to exon 7 because the exon 6 variants downstream of the included exon would still be repressed. As a result, only one exon 6 variant would be included in the mRNA (Graveley, 2005).
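
The proposed model can be restated as a very small Python sketch: every exon 6 variant starts out repressed, the docking site pairs with exactly one selector, and only the exon downstream of that selector is released for splicing. This is a schematic of the verbal model only. The function name and the data representation are assumptions, and the sketch says nothing about how the paired selector is chosen.

```python
def splice_exon6_cluster(paired_selector, n_variants=48):
    """Return the exon order of the mRNA segment spanning exons 5 to 7.

    paired_selector: 1-based index of the exon 6 variant whose selector
    sequence is paired with the docking site.
    """
    if not 1 <= paired_selector <= n_variants:
        raise ValueError("selector index out of range")
    # In the model, a repressor keeps every exon 6 variant silent by default.
    repressed = {i: True for i in range(1, n_variants + 1)}
    # The docking site can pair with only one selector at a time; that
    # pairing relieves repression of the exon immediately downstream.
    repressed[paired_selector] = False
    included = [f"6.{i}" for i, is_repressed in repressed.items() if not is_repressed]
    # Exon 5 joins the single derepressed variant, which then joins exon 7
    # because all other variants remain repressed.
    return ["5", *included, "7"]


if __name__ == "__main__":
    print(splice_exon6_cluster(12))
```

Whichever selector is paired, the output contains exactly one exon 6 variant, which is the behavior the model is meant to explain.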

Although the docking site:selector sequence interactions are strongly supported by their evolutionary conservation and some compensatory mutations in the honeybee A. mellifera, this model will obviously need to be experimentally tested with mutations and compensatory mutations that disrupt and restore the docking site:selector sequence interactions. Due to the size (14,000 bp in D. melanogaster) and complexity (48 exons) of the exon 6 cluster, several attempts have been made to generate minigene constructs that lack several of the alternative exons. However, none of the constructs made to date are accurately spliced in tissue culture cells. Thus, these experiments may need to be conducted in the fly using the entire exon 6 cluster or perhaps even the entire Dscam gene. Nonetheless, once a system is in place, it will be interesting to test whether the strength of the docking site:selector sequence interactions contributes to the frequency at which each exon 6 variant is used. At first glance, however, it does not appear that the predicted thermodynamic stability of each docking site:selector sequence interaction correlates with the frequency with which each exon 6 variant is used in flies. Moreover, contrary to what one would expect if splicing occurs cotranscriptionally and the docking site:selector sequence interactions are the driving force of exon 6 selection, the exon 6 variants closest to exon 5 are not chosen more frequently than other exons. Thus, the mechanism involved in selecting a specific exon 6 variant may be distinct from the interaction between the selector sequence and the docking site (Graveley, 2005).

It will also be interesting to determine precisely what the docking site:selector sequence interaction does. The structures of the docking site:selector sequence interactions are somewhat reminiscent of those that direct site-specific RNA editing. Though it is formally possible that some components of the RNA editing machinery could play a role in Dscam alternative splicing, RNAi depletion of ADAR does not affect alternative splicing of Dscam. An alternate possibility is that the docking site:selector sequence structures serve as binding sites for a protein that somehow inactivates the repression of the downstream exon 6 variant. An intriguing possibility is that the interaction juxtaposes the exon 6 variant to a splicing regulatory element upstream of the docking site. Interestingly, an additional highly conserved sequence element is located immediately adjacent to the docking site and is predicted to form a 20 bp stem-loop structure that is supported by multiple compensatory mutations. However, the function and relevance of this stem-loop structure are not immediately obvious (Graveley, 2005).

Are competing base-pairing interactions a common mechanism that evolved to negotiate the splicing of genes containing multiple mutually exclusive exons? At first glance, it does not appear so. Conserved elements similar to the docking site and selector sequences are not readily apparent in either the exon 4 or exon 9 clusters of Dscam, nor in other genes containing multiple mutually exclusive exons (C. elegans unc-32 and D. melanogaster Myosin heavy chain, ATPα, GluClα, slowpoke, hephaestus, and Thiolester containing protein II). Thus, the use of competing base-pairing interactions may be unique to the Dscam exon 6 cluster. Moreover, additional experimental and comparative genomic work suggests that the mechanisms of mutually exclusive splicing of each cluster in Dscam are quite possibly different. This suggests that multiple distinct and independent mechanisms to ensure the mutually exclusive splicing of clusters of three or more exons may have evolved multiple times. This is not entirely surprising, however, since multiple, distinct mechanisms are known to exist to guarantee that only one exon is included when only two alternative exons need to be chosen from (Graveley, 2005).

Curiously, vertebrate genes that contain a region with more than two mutually exclusive exons have not been identified. This suggests that the vertebrate spliceosome may have lost the ability to negotiate pre-mRNAs containing more than two mutually exclusive exons. Alternatively, insects and worms (and perhaps other metazoans) may have evolved the ability to cope with the challenge of including only one alternative exon among a multitude of possible choices after they last shared a common ancestor with higher eukaryotes. Because multiple mutually exclusive exons can be successfully used to generate such a tremendous diversity of proteins from a single gene, it is striking that genes with this organization are not more common in general and appear to be altogether absent from vertebrates (Graveley, 2005).

The iStem, a long-range RNA secondary structure element required for efficient exon inclusion in the Drosophila Dscam pre-mRNA

The Drosophila Dscam gene encodes 38,016 different proteins, due to alternative splicing of 95 of its 115 exons, that function in axon guidance and innate immunity. The alternative exons are organized into four clusters, and the exons within each cluster are spliced in a mutually exclusive manner. This study describes an evolutionarily conserved RNA secondary structure that is called the Inclusion Stem (iStem) that is required for efficient inclusion of all 12 variable exons in the exon 4 cluster. Although the iStem governs inclusion or exclusion of the entire exon 4 cluster, it does not play a significant role in determining which variable exon is selected. Thus, the iStem is a novel type of regulatory element that simultaneously controls the splicing of multiple alternative exons (Kreahling, 2005).

To begin characterizing the RNA sequence elements involved in Dscam exon 4 alternative splicing, an exon 4 minigene was generated. A portion of the Dscam gene beginning in exon 3 and ending in the intron downstream of exon 5 was cloned into a Drosophila expression vector containing the inducible metallothionein promoter to generate pDscamWT. This vector was transiently transfected into Drosophila S2 cells, and transcription was induced by the addition of CuSO4. Splicing was then analyzed by RT-PCR using a primer in exon 5 and a minigene-specific primer that anneals upstream of exon 3. Analysis of these PCR products on denaturing polyacrylamide gels revealed that the majority of the transcripts contain one of the exon 4 variants. However, approximately 15% of the transcripts lacked an exon 4 variant and instead contained exon 3 spliced directly to exon 5. The profile of exon 4 variants that are utilized was analyzed both by sequencing cloned RT-PCR products and by resolving the RT-PCR products on a single-strand conformational polymorphism gel, which separates the molecules based on conformation rather than size. The majority of the transcripts from pDscamWT contain exons 4.12, 4.1, and 4.11, although exons 4.8, 4.10, 4.6, and 4.4 were also detectably utilized. Thus, in S2 cells, transcripts derived from the minigene are spliced in a mutually exclusive manner and multiple exon 4 variants are selected. This system is therefore well suited for identifying and analyzing cis-acting sequences involved in Dscam exon 4 splicing (Kreahling, 2005).

RNA sequences required for various aspects of exon 4 splicing were identified by generating deletions throughout the entire minigene and testing their effects on splicing in transfection experiments. Several of these deletions had profound effects on exon 4 splicing. This study focuses on an interesting set of deletions that removed portions of the 1,412-nucleotide (nt) intron between exons 3 and 4.1. Deletion of a 670-nt fragment encompassing nt 422 to 1090 of the intron (pDscamDelta1) had no effect on splicing of the minigene. However, extending the 5' boundary of this deletion to position 224 of the intron (pDscamDelta2) resulted in a dramatic increase in exon 4 skipping, as compared to pDscamWT results. Additional deletions that progressively decrease the 3' boundary of the deletion defined a 105-nt region (pDscamDelta5) that, when deleted, also resulted in a significant increase in exon 4 skipping. In contrast, a slightly smaller 58-nt deletion encompassing nucleotides 224 to 280 (pDscamDelta6) displayed no more exon 4 skipping than the wild-type construct. Thus, an element located between nt 280 and 422 of this intron (defined by the deletion boundaries in pDscamDelta1 and pDscamDelta6) is required for efficient exon 4 inclusion (Kreahling, 2005).
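
The deletion-mapping argument above amounts to set arithmetic on intron coordinates: positions removed by a deletion that increases skipping are candidates, and positions removed by a deletion with no effect are excluded. The Python sketch below applies this to the three deletions whose boundaries are stated in the text. Treating the quoted coordinates as inclusive is an assumption, pDscamDelta5 is omitted because its exact boundaries are not given here, and the function and variable names are illustrative.

```python
def positions(start, end):
    """Intron positions removed by a deletion (coordinates treated as inclusive)."""
    return set(range(start, end + 1))


# (deleted positions, whether the deletion increased exon 4 skipping)
DELETIONS = {
    "pDscamDelta1": (positions(422, 1090), False),  # no effect on splicing
    "pDscamDelta2": (positions(224, 1090), True),   # dramatic increase in skipping
    "pDscamDelta6": (positions(224, 280), False),   # no more skipping than wild type
}


def candidate_region(deletions):
    """Positions removed by every disruptive deletion and by no neutral deletion."""
    disruptive = [deleted for deleted, skips in deletions.values() if skips]
    neutral = [deleted for deleted, skips in deletions.values() if not skips]
    candidates = set.intersection(*disruptive)
    for deleted in neutral:
        candidates -= deleted
    return candidates


if __name__ == "__main__":
    region = candidate_region(DELETIONS)
    print(min(region), max(region))  # 281 421
```

The remaining positions, 281 to 421, match the interval between nt 280 and 422 that the text identifies as required for efficient exon 4 inclusion.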

The sequence of the 105-nt segment deleted in pDscamDelta5 contains a pyrimidine-rich region and resembles a 3' splice site. Although both U2 snRNP and U2AF can bind to this element in nuclear extracts, mutations that disrupt the binding of these splicing factors have little if any effect on exon 4 skipping. This led to a hypothesis that this element may not function as a protein binding site in vivo. Detailed sequence analysis revealed that a 27-nt segment of this element could potentially base pair with a sequence located 18 nt downstream of exon 3, forming a structure consisting of a 27-bp stem containing a 2-nt internal bulge and a 275-nt loop. It was hypothesized that if the sequence element initially identified functions to promote exon 4 inclusion by forming this stem-loop structure, disrupting this structure should affect the efficiency of exon 4 inclusion. This was tested by disrupting the stem with deletions that removed either the 5' or 3' half of the stem. Indeed, a dramatic increase was observed in exon 4 skipping: deleting the 27 nt that make up either the 5' or 3' half of the stem resulted in ~55-fold or ~60-fold increases in the ratio of exon 4 exclusion/inclusion, respectively. This suggests that disrupting the formation of this RNA secondary structure has a significant effect on exon 4 splicing and that this structure is important for efficiently including an exon 4 variant. Hereafter, this RNA structural element will be referred to as the iStem (Kreahling, 2005).

Due to the evolutionarily conserved proximity of the iStem to exon 3 and the fact that the iStem affects the inclusion of all 12 exon 4 variants equally, it seems most likely that the iStem acts on the 5' splice site of exon 3. One possibility is that the iStem promotes the assembly of a specific protein complex at the 5' splice site of exon 3 that confers upon exon 3 the ability to splice to one of the exon 4 variants. The iStem could do this by serving as a binding site for a splicing regulator or splicing regulatory complex (Kreahling, 2005).

What types of regulators could recognize the iStem? If a protein or complex interacts with the iStem, it would need to do so in a sequence-independent manner. An RNA interference screen was recently conducted to identify proteins that regulate Dscam alternative splicing. Although none of the Drosophila double-stranded RNA binding proteins tested had an impact on the splicing of exon 4, depletion of several DExH/D-box proteins resulted in an increase in exon 4 skipping (Park, 2004). One of these DExH/D-box proteins identified in the screen is Rm62, the Drosophila homolog of the human p68 helicase. Interestingly, p68 helicase has been shown to modulate the binding of U1 snRNP to 5' splice sites and functions as an alternative splicing regulator. Thus, it is possible that the iStem serves as a binding site for a DExH/D-box protein (such as Rm62) that interacts with U1 snRNP bound to the 5' splice site of exon 3, resulting in a complex that is competent to splice to one of the exon 4 variants. In the absence of the iStem, or the DExH/D-box protein, the complex would not assemble at the 5' splice site of exon 3 and, as a result, the exon 4 variants would be skipped. Testing this model will require the development of an in vitro splicing system for Dscam; such a system is currently unavailable (Kreahling, 2005).

Tracking the evolution of alternatively spliced exons within the Dscam family

The Dscam gene in the fruit fly Drosophila melanogaster contains twenty-four exons, four of which are composed of tandem arrays that each undergo mutually exclusive alternative splicing, potentially generating 38,016 protein isoforms. This degree of transcript diversity has not been found in mammalian homologs of Dscam. This study examines the molecular evolution of exons within this gene family to locate the point of divergence for this alternative splicing pattern. Using the fruit fly Dscam exons 4, 6, 9 and 17 as seed sequences, sixteen genomes were iteratively searched for homologs, and phylogenetic analyses of the resulting sequences were then performed to examine their evolutionary history. Homologs were found in the nematode, arthropod and vertebrate genomes, including homologs in several vertebrates where Dscam had not been previously annotated. Among these, only the arthropods contain homologs arranged in tandem arrays indicative of mutually exclusive splicing. No homologs to these exons were found within the Arabidopsis, yeast, tunicate or sea urchin genomes, but homologs to several constitutive exons from fly Dscam were present within the tunicate and sea urchin genomes. Comparing the rate of turnover within the tandem arrays of the insect taxa (fruit fly, mosquito and honeybee), it was found that the variants within exons 4 and 17 are well conserved in number and spatial arrangement despite 248-283 million years of divergence. In contrast, the variants within exons 6 and 9 have undergone considerable turnover since these taxa diverged, as indicated by deeply branching taxon-specific lineages. These results suggest that at least one Dscam exon array may be an ancient duplication that predates the divergence of deuterostomes from protostomes, but there is no evidence for the presence of arrays in the common ancestor of vertebrates. The different patterns of conservation and turnover among the Dscam exon arrays provide a striking example of how a gene can evolve in a modular fashion rather than as a single unit (Crayton, 2006).
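The figure of 38,016 isoforms follows from multiplying the number of mutually exclusive variants in each of the four arrays. The short sketch below reproduces that arithmetic using the cluster sizes given earlier in this document (12, 48, 33 and 2). Assigning the last three counts to exons 6, 9 and 17 in that order is an assumption based on the order in which the clusters are listed, and the variable names are illustrative only.

```python
from math import prod

# Variants per alternatively spliced cluster. The exon 4 count (12) is stated
# explicitly in this document; the mapping of 48, 33 and 2 to exons 6, 9 and 17
# follows the listed order and is an assumption of this sketch.
variants_per_cluster = {"exon 4": 12, "exon 6": 48, "exon 9": 33, "exon 17": 2}

# One variant is chosen from each cluster, so the counts multiply.
isoforms = prod(variants_per_cluster.values())
total_alternative_exons = sum(variants_per_cluster.values())

print(isoforms)                  # 38016
print(total_alternative_exons)   # 95
```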

Structural basis of Dscam isoform specificity

The Dscam gene gives rise to thousands of diverse cell surface receptors thought to provide homophilic and heterophilic recognition specificity for neuronal wiring and immune responses. Mutually exclusive splicing allows for the generation of sequence variability in three immunoglobulin ecto-domains, D2, D3 and D7. This study reports X-ray structures of the amino-terminal four immunoglobulin domains (D1-D4) of two distinct Dscam isoforms. The structures reveal a horseshoe configuration, with variable residues of D2 and D3 constituting two independent surface epitopes on either side of the receptor. Both isoforms engage in homo-dimerization coupling variable domain D2 with D2, and D3 with D3. These interactions involve symmetric, antiparallel pairing of identical peptide segments from epitope I that are unique to each isoform. Structure-guided mutagenesis and swapping of peptide segments confirm that epitope I, but not epitope II, confers homophilic binding specificity of full-length Dscam receptors. Phylogenetic analysis shows strong selection of matching peptide sequences only for epitope I. It is proposed that peptide complementarity of variable residues in epitope I of Dscam is essential for homophilic binding specificity (Meijers, 2007).

This study has provided a structural analysis of the recognition specificity of two variable immunoglobulin domains of Drosophila Dscam. Although the D1-D4 structures reported here contain only two variable domains, and it remains to be determined how D7 contributes to binding, biochemical analysis in the context of the full-length Dscam receptor is consistent with an essential contribution of the variable peptide segments of epitope I to the homophilic-binding specificity of Dscam. Swapping the peptide segment containing epitope I, but not epitope II, resulted in a full switch in binding specificity between two isoforms. This strongly suggests that in a Dscam dimer the matching epitope I peptides enable binding, and non-matching ones inhibit homophilic binding, thereby functioning as a specificity module. The strong sequence conservation of epitope I residues is consistent with a high evolutionary selection pressure preserving a limited set of homophilic-binding interfaces. Although an involvement of epitope II in binding of non-Dscam ligands has not been tested experimentally, the apparently faster-evolving sequence variability in epitope II would be consistent with immune receptor adaptations to dynamic alterations in host-pathogen interactions. It is therefore hypothesized that this structural separation of homophilic and heterophilic binding (that is, potentially self and non-self recognition) in Dscam may have enabled the parsimonious use of the same gene in creating a large receptor diversity in both the nervous system and immune system (Meijers, 2007).

Protein Interactions

It has been proposed, based on mutational analyses of domain requirements for Dreadlocks (Dock) in axon guidance, that Dock interacts with upstream guidance signals in a redundant fashion through both SH3 and SH2 domains. The Dock SH2 domain interacts directly with Dscam. Binding is disrupted by pretreatment with alkaline phosphatase, indicating that it depends on phosphorylation. The SH3 domains of Dock also directly interact with Dscam. Interactions between different SH3 domains and Dscam were assessed in a yeast two-hybrid assay and in GST pulldown experiments. In yeast, full-length Dock interacts strongly with the cytoplasmic domain of Dscam. Each of the three SH3 domains tested individually in yeast shows a comparatively reduced level of interaction. This suggests that Dock interacts with Dscam through multiple SH3 domains (Schmucker, 2000).

The interaction sites between different SH3 domains and Dscam were mapped. Two putative SH3 binding sites (PXXP1 and PXXP2) separated by 40 amino acids are found in the N-terminal portion of the Dscam cytoplasmic domain; a C-terminal polyproline sequence (PEPPP) is also present. Site-directed mutagenesis of the PXXP sites revealed that the first SH3 domain (SH3-1) binds preferentially to PXXP1 and the third SH3 domain (SH3-3) binds to PXXP2. GST-SH3-1 and GST-SH3-3 interact with the N-terminal half of the cytoplasmic tail of Dscam containing the PXXP sites, but only weakly with the C-terminal half encompassing the polyproline sequence. Conversely, the second SH3 domain (SH3-2) binds preferentially to the C-terminal polyproline motif. That the PXXP1 and PXXP2 sequences are the primary interaction sites between Dock and Dscam is strongly supported by the marked reduction in interaction between Dock and the cytoplasmic domain of Dscam carrying point mutations in both these sites. Residual binding may be due to interaction between SH3-2 and the C-terminal polyproline sequence. In summary, these data indicate that Dscam binds directly to Dock through both SH3 and SH2 domains, consistent with genetic studies arguing for redundancy between these domains (Schmucker, 2000).
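Candidate SH3 binding sites of the PXXP type (proline, any two residues, proline) can be located in a protein sequence with a simple pattern search. The sketch below illustrates the idea only. The function name and the toy sequence are hypothetical, the sequence is not the real Dscam cytoplasmic tail, and a pattern match alone does not show that a site is bound by any SH3 domain.

```python
import re


def find_pxxp(seq: str) -> list[tuple[int, str]]:
    """Return (0-based start, motif) for every PXXP match, overlaps included."""
    # A lookahead is used so that overlapping motifs are all reported.
    return [(m.start(), m.group(1)) for m in re.finditer(r"(?=(P..P))", seq)]


# Hypothetical toy sequence, NOT the real Dscam cytoplasmic domain.
toy_tail = "MKPAAPGSTLPEPPPQ"
for start, motif in find_pxxp(toy_tail):
    print(start, motif)
```

A scan like this would only nominate candidate sites; in the study summarized above, the assignment of SH3-1 to PXXP1 and SH3-3 to PXXP2 rested on site-directed mutagenesis and binding assays.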

Dock, an adaptor protein that functions in Drosophila axonal guidance, consists of three tandem Src homology 3 (SH3) domains preceding an SH2 domain. To develop a better understanding of axonal guidance at the molecular level, the SH2 domain of Dock was used to purify a protein complex from fly S2 cells. Five proteins were obtained in pure form from this protein complex. The largest protein in the complex was identified as Dscam (Down syndrome cell adhesion molecule), which has been shown to play a key role in directing neurons of the fly embryo to correct positions within the nervous system. The smallest protein in this complex (p63) has now been identified. p63 has been named DSH3PX1 because it appears to be the Drosophila ortholog of the human protein known as SH3PX1. DSH3PX1 is composed of an NH2-terminal SH3 domain, an internal PHOX homology (PX) domain, and a carboxyl-terminal coiled-coil region. Because of its PX domain, DSH3PX1 is considered to be a member of a growing family of proteins known collectively as sorting nexins, some of which have been shown to be involved in vesicular trafficking. DSH3PX1 immunoprecipitates with Dock and Dscam from S2 cell extracts. The domains responsible for the in vitro interaction between DSH3PX1 and Dock were also identified. DSH3PX1 interacts with the Drosophila ortholog of Wasp, a protein component of actin polymerization machinery, and DSH3PX1 co-immunoprecipitates with AP-50, the clathrin-coat adapter protein. This evidence places DSH3PX1 in a complex linking cell surface receptors like Dscam to proteins involved in cytoskeletal rearrangements and/or receptor trafficking (Worby, 2001).



Home page: The Interactive Fly © 1997 Thomas B. Brody, Ph.D.

The Interactive Fly resides on the Society for Developmental Biology's Web server.