InteractiveFly: GeneBrief

tarsal-less: Biological Overview | References

Gene name - tarsal-less

Synonyms - polished rice (pri)

Cytological map position - 87F14-87F14

Function - bioactive peptides

Keywords - short encoded peptides that convert Shavenbaby from a transcriptional repressor to an activator via the truncation of its N-terminal region, polycistronic message, epidermal differentiation, spatial pattern of trichomes, legs, trachea

Symbol - tal

FlyBase ID: FBgn0087003

Genetic map position - chr3R:9,638,841-9,640,372

Classification - peptides

Cellular location - unknown

A substantial proportion of eukaryotic transcripts are considered to be noncoding RNAs because they contain only short open reading frames (sORFs). Recent findings suggest, however, that some sORFs encode small bioactive peptides. This study shows that peptides of 11 to 32 amino acids encoded by the polished rice (pri) sORF gene (tarsal-less) control epidermal differentiation in Drosophila by modifying the transcription factor Shavenbaby (Svb). Pri peptides trigger the amino-terminal truncation of the Svb protein, which converts Svb from a repressor to an activator. These results demonstrate that during Drosophila embryogenesis, Pri sORF peptides provide a strict temporal control to the transcriptional program of epidermal morphogenesis (Kondo, 2010).

Studies of eukaryotic genomes have revealed that a large proportion of genomic DNA produces atypical long transcripts, the functions of which are controversial. These transcripts contain only short open reading frames (sORFs, <100 codons) and thus are generally considered to be non-protein-coding RNAs (ncRNAs). However, there is growing evidence that the sORFs present in some ncRNAs are actually translated into small peptides, the abundance of which is probably greatly underestimated. Whereas sORF-encoded peptides may represent an overlooked repertoire of bioactive molecules, their functions and the mechanisms by which they operate are largely unknown (Kondo, 2010).
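The sORF criterion above (fewer than 100 codons) can be illustrated with a minimal forward-strand scanner. This is a sketch under stated assumptions, not the method used in the cited studies: the function name `find_orfs`, the toy sequence, and the choice to count the initiating ATG but not the stop codon are all illustrative.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, max_codons=100):
    """Return (start_index, codon_count) for forward-strand ORFs below max_codons.

    An ORF here runs from an ATG to the first in-frame stop codon.
    codon_count includes the ATG and excludes the stop codon (an assumption;
    the source does not say how the 100-codon cutoff is counted).
    """
    seq = seq.upper()
    orfs = []
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk codon by codon in the reading frame set by this ATG.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                codons = (pos - start) // 3
                if codons < max_codons:
                    orfs.append((start, codons))
                break
    return orfs


# Hypothetical toy transcript with two tiny ORFs; not a real sequence.
toy_transcript = "CCATGGCTGCTTAACCATGAAATAG"
print(find_orfs(toy_transcript))
```

In-frame ATGs nested inside a longer ORF are reported separately by this sketch; a real annotation pipeline would also scan the reverse strand and apply further filters.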

An evolutionarily conserved sORF gene, referred to as polished rice (pri) or tarsal-less (tal) in Drosophila and as mille-pattes (mlpt) in Tribolium, has been identified (Kondo, 2007; Galindo, 2007; Savard, 2006). pri mRNA is a polycistronic transcript that encodes four similar peptides, 11 to 32 amino acids in length, that play a redundant role in Drosophila embryogenesis. Embryos that lack pri display prominent defects, including the absence of trichomes and aberrant tracheal architecture. Reduced pri activity in imaginal development results in abnormal leg morphogenesis (Kondo, 2007; Pueyo, 2008). Similarly, mlpt knockdown in Tribolium leads to appendage defects and the transformation of segmental identity (Savard, 2006; Kondo, 2010 and references therein).

To gain insight into the molecular function of Pri peptides, this study focused on their role in trichome formation during Drosophila embryogenesis. Epidermis differentiation results in a pattern of smooth cells and cells that form apical extensions, called trichomes (ventral denticles and dorsal hairs). Modifications of the trichome pattern that have been examined in insects (resulting from laboratory-induced mutations or evolutionary diversification) are so far all attributable to changes in expression of shavenbaby (svb). Indeed, svb encodes a transcription factor that directly regulates the expression of target effectors, which are collectively responsible for trichome formation. Although the absence of pri results in trichome loss, the expression of svb is not altered in pri mutants. Reciprocally, pri is expressed normally in svb mutants, showing that svb and pri act in parallel in trichome formation. Expression of Svb target genes, such as miniature and shavenoid (Chanut-Delalande, 2006), is lost in pri mutants, whereas the expression of other epidermal genes is unaffected. The activity of isolated Svb-responsive enhancers was also strongly reduced in pri mutants. Therefore, pri is specifically required for the transcription of Svb downstream targets in trichome cells (Kondo, 2010).

How can Pri peptides regulate the expression of Svb target genes without affecting svb expression? The svb locus encodes three overlapping protein isoforms: Svb and the germline-specific proteins OvoA and OvoB. Ovo/Svb proteins all share the same DNA-recognition and transcriptional-activation domains but differ in their N-termini. The shortest isoform, OvoB, is a transcriptional activator and induces trichomes when artificially expressed in the epidermis. OvoA contains an extended N-terminal region, which switches its function toward active transcriptional repression and thus dominantly inhibits trichome formation. Svb contains a further N-terminal extension, compared to OvoA, and promotes the formation of ectopic trichomes like OvoB. To evaluate the specificity of the Pri/Svb interaction, the influence of pri on the different Ovo/Svb isoforms was examined with respect to trichome formation. In wild-type embryos, seven rows of ventral cells per segment express svb and form trichomes. Upon its ectopic expression in smooth cells, Svb [or Svb:green fluorescent protein (GFP)] induced supernumerary trichomes in control embryos but not in pri mutants. In contrast, OvoB (or OvoB:GFP) was insensitive to pri, with ectopic trichomes forming in both control and pri mutant embryos. In the latter case, only OvoB-induced ectopic trichomes were observed and no Svb-dependent endogenous trichomes. These results show that whereas pri has no effect on the shorter OvoB isoform, Pri peptides specifically control the ability of Svb to induce trichomes (Kondo, 2010).

Whether Pri peptides affect the synthesis or trafficking of Ovo/Svb proteins was examined. Using transgenic C-terminal GFP fusions (proven functional as described above), it was observed that pri does not influence the production of Ovo/Svb proteins or their import to the nucleus. However, different patterns of their intranuclear distribution were observed. Regardless of pri activity, throughout embryogenesis OvoA accumulated in discrete foci, and OvoB was distributed diffusely in the nucleoplasm. During stages 11 and 12, before pri is expressed in the epidermis, Svb formed intranuclear foci, like OvoA. At the onset of pri epidermal expression (stage 13 onwards), the nuclear distribution of Svb became diffuse. Therefore, Svb distribution changes from a pattern similar to the OvoA repressor to that of the OvoB activator, and the timing of this conversion correlates with the expression of pri. This redistribution of Svb was abolished in pri mutants, in which Svb remained in nuclear foci throughout embryogenesis. Thus, pri participates in the conversion of the nuclear distribution of Svb from punctate to diffuse (Kondo, 2010).

The nonpunctate, diffuse nuclear distribution of Svb in epidermal cells correlates with its ability to induce trichomes, suggesting that Svb redistribution coincides with active transcription of its targets. This hypothesis was explored using assays in Drosophila Schneider cells (S2 cells), which are of embryonic origin. As observed in embryos, the nuclear pattern of Svb was converted from punctate to diffuse in a pri-dependent manner in S2 cells. The transcriptional activity of Svb/Ovo was quantified using the Enh-m enhancer, which is directly activated by Svb in vivo. OvoB strongly stimulated the transcription driven by Enh-m, and OvoA repressed transcription, both with and without pri. In contrast, Svb behaved like OvoA in the absence of pri but, like OvoB, activated Enh-m in the presence of Pri peptides. Inactivation of the Svb-binding site of Enh-m suppressed this activation, indicating that pri is required for the direct activation of Enh-m by Svb. These results demonstrate that Pri peptides switch the transcriptional activity of Svb from that of a repressor accumulated in nuclear foci to a nucleoplasmic activator (Kondo, 2010).
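The qualitative outcome of the Enh-m reporter assays can be summarized as a small lookup. This is only a sketch of the logic stated in the paragraph above, not a quantitative model: the function name `enh_m_effect` and the labels "activation" and "repression" are illustrative choices.

```python
def enh_m_effect(isoform, pri_present):
    """Qualitative effect of an Ovo/Svb isoform on the Enh-m enhancer.

    Encodes only what the text reports: OvoB activates and OvoA represses
    regardless of pri, whereas Svb represses without pri and activates with it.
    """
    if isoform == "OvoB":
        return "activation"
    if isoform == "OvoA":
        return "repression"
    if isoform == "Svb":
        return "activation" if pri_present else "repression"
    raise ValueError(f"unknown isoform: {isoform}")


for isoform in ("OvoA", "OvoB", "Svb"):
    for pri_present in (False, True):
        print(isoform, pri_present, enh_m_effect(isoform, pri_present))
```

Only the Svb rows change with pri, which is the isoform specificity the assays were designed to test.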

To explore the mechanisms by which Pri peptides trigger this switch in Svb intranuclear distribution, whether Pri requires de novo synthesis of the Svb protein was examined. Using a photoactivatable GFP (PA-GFP), it was observed that photoactivated Svb:PA-GFP switched from foci to a diffuse distribution after the induction of pri expression. Therefore, the same Svb molecules are relocated within the nucleus, suggesting that the action of Pri peptides relies on posttranslational modifications of Svb. Accordingly, Western blot analysis showed that whereas the sizes of OvoA and OvoB proteins (including those of their minor species) were not affected by pri, Svb exhibited a differential electrophoretic mobility in a pri-dependent manner. In the absence of pri, Svb appeared slightly larger than OvoA, as deduced from the cDNA sequences. Upon pri expression, the Svb protein displayed a faster mobility, corresponding to a truncation of approximately 50 kD, without apparent modification in the size of svb mRNA. An antibody raised against the N-terminal Svb-specific region (anti-Svb1s) recognized only the larger Svb protein but not the truncated product formed upon pri expression. This truncated Svb protein was detected by antibodies to Ovo and GFP, showing that it lacks the N-terminal region but retains an intact C terminus. To further characterize Svb truncation, the truncated Svb protein was purified, and its N-terminal end was microsequenced. The N terminus of truncated Svb matches the sequence AAGHGR, which is located 56 amino acids upstream of the OvoB-initiating methionine and within a protein region that shows strong evolutionary conservation in insects. The corresponding DNA sequence displays synonymous nucleotide substitutions across species and lacks canonical or alternative initiation codons, further supporting the view that Svb truncation results from a posttranslational cleavage. Hence, the pri-induced truncated form of Svb contains the DNA-binding and activation domains but not the repression domain, explaining why it acts as a transcriptional activator (Kondo, 2010).

Consistent with this idea, a pri-dependent truncation of the endogenous Svb protein was observed during embryogenesis. In wild-type embryos, anti-Svb1s detected a transient nuclear signal in trichome cells, at stages 11 and 12, that disappeared at later stages. The loss of the Svb N-terminal region coincided with the onset of pri expression in the epidermis. Indeed, pri is required for Svb truncation in vivo, as revealed by the persistence of the anti-Svb1s signal throughout embryogenesis in pri mutants. It is concluded that Pri peptides convert Svb from a transcriptional repressor to an activator via the truncation of its N-terminal region (Kondo, 2010).

This study has demonstrated that 11- to 32-amino acid peptides encoded by sORFs orchestrate epidermal differentiation through the control of Svb transcriptional activity. At stages 11 and 12, svb is already expressed and restricted to presumptive trichome cells, in which the full-length Svb repressor probably prevents the premature expression of cellular effectors. At stages 13 and 14, the expression of pri in epidermal cells then triggers N-terminal truncation of the Svb protein, probably through a proteolytic release of the repressor domain, causing a rapid conversion of Svb function toward activation. Thus, although svb expression defines the spatial pattern of trichomes, the action of Pri peptides defines the temporality of trichome formation (Kondo, 2010).

Besides the mechanisms of epidermal differentiation, these studies suggest broader functions for Pri peptides. Although pri is also required for tracheal morphogenesis (Kondo, 2007), normal tracheae were observed in svb mutant embryos, indicating that Pri peptides probably regulate additional developmental factors. Recent large-scale analyses indicate that thousands of unexplored transcripts probably also encode polypeptides of fewer than 100 amino acids in mice and humans. Future functional analyses should elucidate how small peptides encoded by transcripts improperly termed ncRNAs contribute to various biological processes, including development and differentiation (Kondo, 2010).

The 11-aminoacid long Tarsal-less peptides trigger a cell signal in Drosophila leg development

The polycistronic and non-canonical gene tarsal-less encodes several short peptides 11 to 32 amino acids long. tarsal-less is required for embryonic and imaginal development in Drosophila, but the molecular and cellular bases of its function are not known. This study shows that tarsal-less function triggers a cell signal. This signal has a range of 2-3 cells in Drosophila legs and may be provided directly by the Tarsal-less peptides. During leg development, this Tarsal-less signal implements the patterning activity of a tarsal boundary and regulates the transcription of several genes in a specific manner. Thus tarsal-less is necessary for the intercalation of tarsal segments two to four, for the activation of the homeobox gene apterous, the zinc-finger gene rotund and the bHLH-PAS gene spineless, and for the repression of the homeobox gene Bar and the putative transcription factor dachshund. These regulatory effects complement the known genetic scenario required for distal leg development and explain the requirements for tarsal-less in this process (Pueyo, 2008).

Proximo-distal (PD) limb patterning is a stepwise process in which the growing limb is progressively subdivided into smaller domains of gene expression until all the appendage parts are specified. In Drosophila, the first molecularly based model for this pattern formation relied on long-range morphogens (Wg and Dpp). However, the findings about the role of EGFR in distal leg development, together with the current results, show that a sequence of short-range patterning signals instead relays and elaborates the Wg and Dpp signal. Initially, Wg and Dpp activate the expression of dac and Dll. By the end of the second instar, continued signalling by Wg and Dpp activates the expression of EGFR ligands in the distal-most part of the disc. Consequently, a gradient of EGFR activation is generated from this region, whereby high levels of EGFR signalling activate the expression of pretarsal genes, whereas low levels of EGFR signalling activate the expression of Bar and later on restrict bab and rn to the presumptive tarsus. Activation of Bar leads to the expression of tal at the interface between dac- and Bar-expressing cells. Tal expression in turn relays and elaborates the effects of EGFR signalling by activating rn and ss and repressing dac and Bar. Thus two sequential and interacting cascades control tarsal development: the EGFR → Bar, bab cascade and the Bar → tal → rn, ss ⊣ Bar cascade. Judging from the mutant phenotypes, rn and ss mostly promote growth and organisation of the tarsus, whereas bab mostly plays a role in its patterning. The dependence of PD fates and patterning on dynamic signalling rather than on cell lineage restrictions allows cells to change fates by switching target gene activation. This plasticity may underlie the well-known regulatory properties of regenerating insect legs (Pueyo, 2008).

Generation of a new fate in a developing primordium involves the activation of a new, specific gene in a particular domain. Often, the new gene represses previously activated genes in this region. tarsal-less leg function illustrates these two principles, highlighting the role of patterning boundaries in the intercalation of new fates between pre-existing ones. By early third instar, the medial disc is divided into two regions by the expression of Dac proximally and Bar distally. Mutual repression between these two factors keeps these two populations of cells apart, generating a boundary. This boundary subsequently acts as an organising center. A Bar-dependent signal activates tal in the Bar domain and also in adjacent Dac cells, possibly by overcoming Dac repression of tal. This local signal is not able to activate tal non-autonomously in the more distal pretarsal cells due to repression of tal by Al. Slightly later, tal expression is repressed by the Bar homeodomain protein, while tal signalling activates the downstream genes ss and rn. The putative transcription factors Rn and Ss inhibit the expression of dac and Bar, generating a new territory of gene expression ready to grow into the tarsus. The medial disc region is now composed of three distinct tarsal territories: proximal (expressing Dac), medial, or tarsus T2-T3 (expressing Tal, Rn and Ss), and distal (expressing Bar) (Pueyo, 2008).

Later interactions further subdivide these territories into smaller domains. For instance, ap expression in the fourth tarsal segment is activated by the combination of Tal signalling and Bar activation. Again, the tarsal boundary between Tal and Bar is important, since Tal signalling exerts its non-autonomous effect on the adjacent Bar domain, where ap will be activated. Distal spread of ap expression is restricted by bowl (Pueyo, 2008).

Functional analysis of tal in leg development has shown a feature, non-autonomy, that indicates a possible molecular function for the Tal peptides in cell signalling. Thus, Tal activates the expression of downstream genes such as rn, ss and ap at a distance. In addition, analyses of tal mutant clones in both differentiated legs and imaginal discs indicate a range of non-autonomy of 2-3 cells. These results indicate that tal non-autonomy is a general feature of Tal function, since during embryogenesis tal transgenes can rescue the tal null phenotype even when expressed away from the tal-expressing cells (Kondo, 2007). In addition, morphogenetic effects such as tissue folding have been observed that are also induced non-autonomously by ectopic tal expression. Thus tal function appears to be mediated by cell signalling. The lack of homology of tal to any known genes has not allowed it to be related to any known signalling mechanisms. In addition, genetic results suggest that tal is not a new member of the two signalling pathways involved in PD patterning, the EGFR and the Notch pathways. Therefore, it is possible that tal is itself part of a novel cell signalling mechanism (Pueyo, 2008).

It is conceivable that tal triggers a cell signalling event indirectly by the activation or the release of a secondary cell signal. Tal peptides could bind an intracellular domain or protein complex, and trigger the release of a secondary signalling molecule. Alternatively, it can be postulated that the 11aa-long peptides might be acting directly as a cell signal. The Tal-dependent signal is local, not systemic, since the range of diffusion is limited and, in the current experiments, its source is observed at the tal-expressing leg cells. The developmental effect of the tal gene is instructive, in the sense that expression of tal is both necessary and sufficient to activate expression of rn and ss almost immediately. Thus the expression of Tal maps functionally very close to the tal-dependent signal itself (Pueyo, 2008).

There exist numerous examples of short peptides involved in cell communication during development. For instance, neuropeptides and peptide hormones similar in size to Tal exist. However, neuropeptides and hormones are the product of a larger polypeptide precursor that has been cleaved, whereas tal does not encode a larger precursor, and it is not known if it is modified posttranslationally. In addition, neuropeptides and hormones contain an N-terminal signal peptide, which is not present in Tal. Thus, although by size Tal could signal as a neuropeptide, its release mechanism must be different. Steroid hormones and similar polycyclic aromatic compounds can diffuse freely between cells across cell membranes to bind intracellular nuclear receptors. However, these molecules are very different in nature from Tal peptides. Finally, smaller molecules (ions, Lucifer yellow) can be transported between cells through channels (innexins), but Tal peptides are too large for this transport. Thus, if Tal is itself a new cell signal, it would need to rely on a non-canonical transport mechanism. Interestingly, such a possibility is presented by cell-penetrating peptides. These fragments of larger proteins can diffuse across cell membranes. This type of transport does not require canonical elements such as signal peptides, dynein or clathrin, but no directly translated natural examples have been discovered (Pueyo, 2008).

Biochemical and cellular studies are needed to discriminate between these hypotheses. Such studies could potentially characterise a new type of cell signalling, of which there may be other examples, since many other putative peptide-encoding genes with small open reading frames similar to tal exist (Pueyo, 2008).

Pri peptides are mediators of ecdysone for the temporal control of development

Animal development fundamentally relies on the precise control, in space and time, of genome expression. Whereas a wealth of information is available about spatial patterning, the mechanisms underlying temporal control remain poorly understood. This study shows that Pri peptides (see Tarsal-less), encoded by small open reading frames, are direct mediators of the steroid hormone ecdysone for the timing of developmental programs in Drosophila. A previously uncharacterized enzyme of ecdysone biosynthesis, Glutathione S transferase E14 (GstE14), was identified, and ecdysone was found to trigger pri expression to define the onset of epidermal trichome development, through post-translational control of the Shavenbaby transcription factor. Manipulating pri expression is sufficient to either put on hold or induce premature differentiation of trichomes. Furthermore, it was found that ecdysone-dependent regulation of pri is not restricted to epidermis and occurs over various tissues and times. Together, these findings provide a molecular framework to explain how systemic hormonal control coordinates specific programs of differentiation with developmental timing (Chanut-Delalande, 2014: 25344753).

Small peptide regulators of actin-based cell morphogenesis encoded by a polycistronic mRNA

Transcriptome analyses in eukaryotes, including mice and humans, have identified polyA-containing transcripts that lack long open reading frames (ORFs; >100 amino acids). These transcripts are believed most likely to function as non-coding RNAs, but their translational capacities and biological activities have not been characterized in detail. This study reports that polished rice (pri), which was previously identified as a gene for a non-coding RNA in Drosophila, is in fact transcribed into a polycistronic mRNA that contains evolutionarily conserved short ORFs that encode 11 or 32 amino acid-long peptides. pri was expressed in all epithelial tissues during embryogenesis. The loss of pri function completely eliminates apical cuticular structures, including the epidermal denticles and tracheal taenidia, and also causes defective tracheal-tube expansion. pri was found to be essential for the formation of specific F-actin bundles that prefigure the formation of the denticles and taenidium. Evidence is provided that pri acts non-cell autonomously and that four of the conserved pri ORFs are functionally redundant. These results demonstrate that pri has essential roles in epithelial morphogenesis by regulating F-actin organization (Kondo, 2007).

Previous studies of the Drosophila genome identified 45 putative polyA-containing RNAs lacking ORFs longer than 100 amino acids. One of the non-coding RNA candidates, MRE29, showed a unique expression pattern during embryogenesis, where it was expressed in seven stripes during early development and then in metameric and bilateral clusters during mid-embryogenesis. After a transient intermission at stage 11, the expression of MRE29 resumed in epithelial cells, with intense expression in the presumptive denticle cells, developing tracheal system, and foregut and hindgut primordia. These expression patterns suggest that MRE29 may function in epithelial cells that secrete cuticle for the exoskeleton (Kondo, 2007).

To determine the function of the gene, deletions were generated that completely removed the MRE29 transcription unit. These deficiencies were embryonic lethal, whether homozygous or transheterozygous, and the embryos showed characteristic cuticular phenotypes: the denticle belts and dorsal hairs were completely lost, and the head skeleton and posterior spiracles were deformed. Because of this smooth-cuticle phenotype, this gene was named polished rice (pri) (Kondo, 2007).

Because the assembly of filamentous actin (F-actin) is important for denticle formation, the F-actin organization was examined in pri loss-of-function mutant embryos. In wild-type embryos (Oregon-R), F-actin accumulates at the growing cell protrusions of denticle cells, with a concomitant loss of F-actin from the cytoplasm and cell boundaries. In contrast, in pri mutant embryos, the specific F-actin accumulation at the site of denticle formation was lost, but the signal at the cell boundaries remained intense in the presumptive denticle cells, suggesting that pri is required for the assembly of F-actin in the denticles. Time-lapse recording of the F-actin marker GFP-moesin revealed that the prospective denticle-secreting cells become tightly packed at the end of stage 13 and start accumulating patches of F-actin at their cell boundaries, which form a triangular outgrowth by stage 16. In the pri mutants, the cell packing was normal, but the assembly of the F-actin patches was not observed at any stage. These results indicate that pri is essential for F-actin assembly at the site of denticle outgrowth (Kondo, 2007).

The fate of Drosophila denticle cells is specified by segment-polarity genes and shavenbaby (svb), which encodes a zinc finger-type transcription factor that is a master regulator of denticle formation; it functions by modulating the expression of a number of F-actin-regulating genes. To determine whether the pri gene affects the fate of denticle cells, the expression of these genes was examined in pri mutants. pri deficiency did not affect the expression of svb, wingless or engrailed. Similarly, the expression of pri was not affected in svb mutant embryos, in which most denticles were lost. These genetic results suggest that pri is not essential for the normal fate determination of denticle cells, but instead regulates F-actin organization during denticle formation cooperatively with, or in parallel to, svb (Kondo, 2007).

The pri mutant embryos also showed abnormalities of the tracheal system. Cell labelling with a UAS-lacZ transgene driven by the trachea-specific breathless-GAL4 and luminal staining with a chitin-binding probe (CBP) showed a loss of tracheal network integrity and an irregular tube diameter. In addition, the mutant embryos lacked the taenidial folds (tracheal cuticle in a corrugated structure that increases the mechanical strength of the tracheal tubing) (Kondo, 2007).

Because F-actin organization is also important for taenidium formation, F-actin organization during tracheal development was examined in control and pri mutant embryos. In control animals, GFP-moesin driven by btl-GAL4 revealed a series of parallel F-actin bundles that were aligned perpendicular to the long axis of tracheal tubules — a pattern that was reminiscent of the taenidial folds, which form later. In contrast, in pri mutants, these characteristic F-actin structures were completely lost and the F-actin was disorganized, suggesting that pri is required for the F-actin bundling that precedes the formation of the taenidial folds (Kondo, 2007).

The dorsal trunk of the trachea undergoes a dramatic increase in the diameter of the apical lumen between stages 14 and 16. Time-lapse recordings of control embryos revealed rapidly changing F-actin spots at the apical surface of tracheal cells before and during the early stages of this tube expansion. At stage 15, parallel F-actin bundles appeared; these bundles were subsequently replaced by short-lived spotted signals when tracheal tube dilation was completed at stage 16. In pri mutants, the F-actin spots never reorganized to form parallel F-actin bundles. Furthermore, the diameter of the pri mutant dorsal trunk was normal at stage 14 but irregular at stage 16 (Kondo, 2007).

Unlike the F-actin organization, markers for adherens junction (E-cadherin) and septate junction (Discs large and Fasciclin 3) were normal in both the controls and pri mutants. The pri mutants secreted chitin and formed the cuticular exoskeleton, but they completely lacked denticles and hairs. These phenotypes are distinct from those observed in tracheal-tube expansion mutants that are defective in chitin synthesis or modification, suggesting that pri represents a new class of genes that are essential for the proper dilation of the dorsal trunk and that function by regulating the organization of F-actin. Despite their striking phenotypes, pri mutants did not show defects in any other F-actin-based phenomena, such as the filopodia formation of tracheal tip cells, dorsal closure, mitosis or tight packing of denticle cells. It is concluded that pri is essential for specific F-actin organization at the apical membrane of epithelial cells (Kondo, 2007).

pri was originally identified as a putative non-coding RNA gene. Nevertheless, it was possible that some of its short ORFs encoded functional peptides. To assess this, and to determine whether the physiological role of the pri gene in F-actin organization was due to the activity of the RNA or of the encoded peptides, the deduced amino-acid sequences of ten ORFs in the pri transcript of Drosophila melanogaster, which ranged from 7-49 amino acids, were compared with those in the syntenic regions of pri genes in related Drosophila species. The deduced amino-acid sequences of ORFs 1 to 4 and the 5' half of ORF5 were highly conserved, although their nucleotide sequences diverged. In addition, the deduced amino-acid sequences of ORFs 1-4 contained a common septamer motif, LDPTGQY or LDPTGTY. These observations led to a hypothesis that the pri transcript functions as an mRNA for small peptides (Kondo, 2007).
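The shared motif described above (LDPTGQY or LDPTGTY) can be expressed as a single pattern, `LDPTG[QT]Y`. The sketch below shows one way to test peptide sequences for it. The peptide strings and the helper name `motif_hits` are made up for illustration and are not taken from the pri sequence or the source.

```python
import re

# One pattern covering both reported variants: LDPTGQY and LDPTGTY.
MOTIF = re.compile(r"LDPTG[QT]Y")


def motif_hits(peptides):
    """Map each peptide name to the 0-based motif start, or None if absent."""
    hits = {}
    for name, seq in peptides.items():
        match = MOTIF.search(seq)
        hits[name] = match.start() if match else None
    return hits


# Invented placeholder peptides for illustration only.
toy_peptides = {
    "toy_a": "MGGWLDPTGQY",   # carries the Q variant
    "toy_b": "MSGLDPTGTYRR",  # carries the T variant
    "toy_c": "MKLDPSGQY",     # near miss: S instead of T
}
print(motif_hits(toy_peptides))
```

A fixed pattern of this kind only finds exact matches to the two variants; detecting more divergent homologues would need an alignment-based or profile-based search.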

To determine whether the conserved ORFs 1-5 could be translated, a series of constructs were made in which the GFP ORF was fused to each conserved ORF in the full-length pri transcript. These expression constructs were introduced into Drosophila S2 cells. Using fluorescence microscopy, substantial expression of the fusion proteins of GFP was detected with ORFs 1-4, but the GFP fusion with ORF5 was not expressed. Western blot analysis confirmed that the GFP-ORF1-4 fusion proteins had the predicted relative molecular mass. These results suggest that the first methionines of the pri ORFs 1-4 are functional initiator codons, and therefore the pri transcript can function as a polycistronic mRNA (Kondo, 2007).

To determine whether the conserved pri ORFs 1-4 encoded functional gene products, attempts were made to rescue the pri phenotypes by ectopically expressing the pri ORFs. In addition to full-length pri cDNA, pri ORF1 (its peptide is identical to that encoded by ORF2), ORF3 and ORF4 were individually expressed in the epithelial cells of pri mutant embryos, and the denticle and tracheal phenotypes were examined. In each case, both denticle formation and tracheal integrity were completely rescued. When frame-shift mutations were introduced into ORFs 1-4 in full-length pri transcripts, these phenotypes were not rescued. These results suggest that the major function of pri can be carried out by any one of these four 11- and 32-amino acid peptides. Furthermore, when pri was expressed in a subset of epithelial cells using patched-GAL4 (ptc-GAL4), denticle formation was completely rescued, not only in pri-expressing cells but also in their neighbours, suggesting that pri functions in a non-cell autonomous manner (Kondo, 2007).

Recently, Savard (2006) reported that the mille-pattes (mlpt) gene in Tribolium, which shows extensive similarity to the pri gene, functions as a gap gene in Tribolium embryos. Although it is unclear whether the short ORFs in mlpt are translated, the similarity of the deduced amino-acid sequences and of the gene structure between the species strongly suggests that the small peptides encoded by mlpt are functional gene products. Therefore, these similar peptides seem to have different roles in Drosophila and Tribolium, which may reflect rapid evolutionary diversification of their roles (Kondo, 2007).

This study has presented evidence that pri encodes small peptides that are required for specific F-actin assembly at the sites of denticle and taenidia formation. Mis-coordination of tracheal-tube dilation in pri mutants and non-cell autonomy of pri activity in denticle formation suggest a role for PRI peptides in cell communication. Although this study does not reveal the molecular mechanisms of the actin-bundle formation promoted by the pri-encoded small peptides, their small size may facilitate movement across the cell membrane and affect known biochemical pathways of F-actin regulation, which are under control of svb in denticle formation. Further analysis should determine the activity of these unique molecules and shed light on the functions of other small peptides in animal development (Kondo, 2007).

Peptides encoded by short ORFs control development and define a new eukaryotic gene family

Despite recent advances in developmental biology, and the sequencing and annotation of genomes, key questions regarding the organisation of cells into embryos remain. One possibility is that uncharacterised genes having nonstandard coding arrangements and functions could provide some of the answers. This study characterizes tarsal-less (tal), a new type of noncanonical gene that had been previously classified as a putative noncoding RNA. tal controls gene expression and tissue folding in Drosophila, thus acting as a link between patterning and morphogenesis. tal function is mediated by several 33-nucleotide-long open reading frames (ORFs), which are translated into 11-amino-acid-long peptides. These are the shortest functional ORFs described to date, and therefore tal defines two novel paradigms in eukaryotic coding genes: the existence of short, unprocessed peptides with key biological functions, and their arrangement in polycistronic messengers. The discovery of tal-related short ORFs in other species defines an ancient and noncanonical gene family in metazoans that represents a new class of eukaryotic genes. The results open a new avenue for the annotation and functional analysis of genes and sequenced genomes, in which thousands of short ORFs are still uncharacterised (Galindo, 2007).

A number of candidate smORFs are present in the tal transcript. These are referred to according to their sequence and position from 5' to 3' as 1A, 2A, 3A, AA, and B. The type-A ORFs (1A, 2A, 3A, and AA) include a conserved LDPTGXY motif of 7 aa, and this motif is very strongly conserved in the cDNAs of homologous genes that were identified in other arthropods. ORFs 1A and 2A encode an identical 11-aa peptide. ORF 3A encodes another 11-aa peptide very similar to 1A. ORF AA encodes a 32-aa peptide whose N- and C-termini each contain an LDPTGXY motif. ORF-B encodes a 49-aa peptide without known domains other than a poly-Arg stretch and is only weakly conserved in other insects. The results show that translation of an RNA containing small ORFs (smORFs) of just 11 aa is required for several important processes during development. Although the tal cDNA contains several copies of the type-A ORFs related by a common LDPTGXY domain, a construct containing just one of them is fully functional. Small peptides are known to have important biological functions, most clearly in endocrine and neural communication, but in all described cases, these peptides are mature, cleaved products of a longer ORF. The originality of the tal gene is thus twofold. First, smORFs of just 33 nucleotides are fully functional and capable of translation. Second, the carefully regulated local expression of these peptides in complex patterns (as opposed to a systemic release) has important developmental functions. Genetic and molecular analyses show that the tal genomic region contains specific regulatory sequences spread out over a minimum of 25 kb (Galindo, 2007).
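The distinction between type-A ORFs and ORF-B rests on the LDPTGXY motif, which maps directly onto a regular expression. The following is a minimal sketch under stated assumptions: the helper `classify_peptide` and the three example peptides are invented for illustration and are not the actual Tal peptide sequences.

```python
import re

# LDPTGXY, where X is any single residue (as written in the text above).
TYPE_A_MOTIF = re.compile(r"LDPTG.Y")


def classify_peptide(peptide):
    """Label a peptide 'type-A' if it carries at least one LDPTGXY motif.

    Returns (label, motif_start_positions), with 0-based positions.
    """
    hits = [m.start() for m in TYPE_A_MOTIF.finditer(peptide.upper())]
    return ("type-A" if hits else "other", hits)


# Hypothetical peptides for illustration only (not real Tal sequences):
examples = {
    "single_motif": "MKLDPTGQYAS",            # one motif, 11 residues
    "double_motif": "MSLDPTGTYRRRAALDPTGQYK",  # motifs near both termini
    "no_motif": "MRRRRRRSA",                   # Arg-rich, no motif
}
for name, pep in examples.items():
    print(name, classify_peptide(pep))
```

A pattern-based label of this kind only flags candidates. As the summary stresses, sequence conservation alone is not conclusive, and translation and function still have to be shown experimentally.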

It was noticed that tal expression and function are often associated with tissues undergoing changes of shape such as folding and invagination. The development of the fly leg is directed by a regulatory cascade involving cell signals and region-specific transcription factors. Regulatory interactions between these identity-conferring transcription factors refine and stabilise the final pattern. This pattern is then translated into morphogenetic movements and position-specific cell differentiation programs. tal seems to be an important part of the leg developmental process and to act as a link between patterning and morphogenesis. On the one hand, the transient ring of tal expression appears at the precise time and place to control tarsal patterning, by promoting rn expression and by being involved in further regulatory interactions with other leg-patterning genes. On the other hand, tal also controls folding of the leg tissue independently of these effects. In the wild-type leg imaginal discs, a complex morphogenetic process involving the appearance of extra folds within the tarsal furrow, in correlation with leg growth, is apparent. In tal mutants, this morphogenetic process is compromised, whereas in excess-of-function experiments, ectopic expression of tal induces the appearance of ectopic folds in legs. In the mutant discs, cells undergo an apico-basal constriction, but the tarsal furrow never widens into a fold; the appearance of further tarsal sub-folds is precluded, and the presumptive tarsal region does not grow. In the embryo, tal expression is found in tissues of ectodermal origin that undergo an invagination without compromising their epithelial organisation, such as the foregut (and later on in its derivatives, the proventriculus and the pharynx), the hindgut, the developing trachea, and the spiracles. In mutant embryos, head involution is slow, the pharynx is short and misplaced, and tracheal fusion is incomplete. The loss of denticles in the epidermis does not seem to be based on alterations of the segmental patterning cascade, but on cell morphology defects that do not involve defects in apico-basal cell polarity or epidermal integrity. Altogether, these results suggest that tal is required for the control of cell movements during tissue morphogenesis. Further research beyond the scope of this initial study should identify the cellular and molecular targets of this function (Galindo, 2007).

The results provide experimental evidence for function and translation of the type-A ORFs. These include the in vitro and in vivo translation assays, functional rescues, and sequence analysis. The results therefore imply that tal is polycistronic, because several ORFs can be translated from a single RNA molecule. The question arises of how this can be accomplished in a eukaryotic gene, but the literature provides a possible mechanism. Polycistronic genes are known in eukaryotes including Drosophila, and so in principle, all tal ORFs could be translated simultaneously. Experimental evidence supports three models for translation of polycistronic messengers in eukaryotes, namely 'internal ribosomal entry sites (IRES)', 'leaky scanning,' and 'reinitiation'. There are clear rules backed by experimental data concerning the DNA sequences and transcript structure involved in each of these models. The tal RNA sequence seems to exclude both the IRES and the leaky scanning possibilities. There is not enough space for IRES between the tal ORFs, and the initiation consensus sequences are stronger in the 5' ORFs than in the 3' ones, the opposite of conditions favourable for leaky scanning. However, polycistronic translation of type-A ORFs in the tal transcript is possible under the reinitiation model because their spacing is between 40 and 200 bp, and the short type-A ORFs (1A to 3A) are much shorter than 35 aa. In all cases studied, the presence of 5' ORFs has a dramatic impact on the rate of translation of the 3' ones, leading, in certain conditions, to total blockage of 3' translation. Accordingly, an in vitro translation experiment shows a diminishing amount of protein arising from each ORF, with highest levels produced by 1A, and lowest by AA. It would be expected, by virtue of their conserved common domain, that these translated type-A peptides share the same functions. The presence of repeated or similar ORFs is perhaps a device to ensure enough translation of LDPTGXY-containing peptides. This hypothesis is consistent with the results of structure/function analysis, which show that a single artificial type-A ORF suffices to provide tal function (Galindo, 2007).
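The two reinitiation criteria quoted above (an upstream ORF shorter than 35 aa, and spacing between consecutive ORFs of 40 to 200 bp) can be expressed as a simple check. This is a rough sketch only: treating the figures as hard cutoffs is a simplification, and the class `Orf`, the function `reinitiation_report` and all coordinates below are hypothetical and do not correspond to the real tal transcript.

```python
from dataclasses import dataclass

# Thresholds as quoted in the text; using them as strict cutoffs is an assumption.
MAX_UPSTREAM_AA = 35
MIN_SPACING_NT, MAX_SPACING_NT = 40, 200


@dataclass
class Orf:
    name: str
    start: int  # 0-based index of the first nucleotide of the start codon
    end: int    # index one past the last nucleotide of the stop codon

    @property
    def length_aa(self):
        # Codons in the ORF minus the stop codon.
        return (self.end - self.start) // 3 - 1


def reinitiation_report(orfs):
    """For each consecutive ORF pair (5' to 3'), report the spacing and whether
    the pair satisfies both reinitiation criteria."""
    ordered = sorted(orfs, key=lambda o: o.start)
    report = []
    for up, down in zip(ordered, ordered[1:]):
        spacing = down.start - up.end
        compatible = (up.length_aa < MAX_UPSTREAM_AA
                      and MIN_SPACING_NT <= spacing <= MAX_SPACING_NT)
        report.append((up.name, down.name, spacing, compatible))
    return report


# Hypothetical coordinates for three 11-aa ORFs (illustration only):
toy_orfs = [Orf("orf_x", 100, 136), Orf("orf_y", 200, 236), Orf("orf_z", 250, 286)]
report = reinitiation_report(toy_orfs)
for row in report:
    print(row)
```

In this toy layout the first pair passes (64 nt apart) and the second fails (14 nt apart), which shows how the spacing rule alone can exclude a downstream ORF even when the upstream ORF is short.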

These conclusions are further corroborated by the discovery of tal homologues in other arthropods. These genes contain repeated copies of type-A ORFs, varying in number from two (crustaceans and primitive insects) to 11 (Bombyx mori), and an evolutionary trend towards accumulation of more type-A ORFs, including duplications of the entire gene, is apparent. The aa sequence of these type-A ORFs is very strongly conserved in their core domain LDPTGXY. The spacing between ORFs is most compatible with the reinitiation model described above. Not only sequence, but also functionality is conserved, as indicated by the rescue of Drosophila mutants with a Bombyx cDNA. The persistence and long evolutionary history of this gene family suggest that it is not a recently evolved curiosity of some insects, but encodes a peptide of ancestral and current importance (Galindo, 2007).

All available data suggest that the weakly conserved ORF-B is spurious or nonfunctional. In Drosophila, functional analysis fails to identify any essential function for ORF-B, and both in vitro and in vivo studies fail to detect its translation. This is in agreement with the fact that the 5' presence of several type-A ORFs with strong initiation contexts, allied to the weakness of the context for ORF-B, does not favour the translation of ORF-B. Furthermore, the size of the ORF AA is 32 aa, near the limit of 35 aa required for continued downstream reinitiation at ORF-B. In agreement with this sequence analysis, ectopic expression of the Bombyx Bm-wds construct containing an ORF-B in Drosophila does not produce any additional phenotypes when compared to those produced by the Drosophila constructs, indicating that the Bombyx ORF-B is not functional either. It is surmised that the weak conservation of ORF-B sequences is either related to some functional requirement (other than translation) for the nucleotide sequence in the region of the transcript, or pure chance (Galindo, 2007).

The conservation of aa sequences has been suggested as evidence for the translation of three type-A ORFs and one ORF-B in a homologous gene called mille-pattes (mlpt) found in the flour beetle Tribolium castaneum (Savard, 2006). These ORFs are similar in size to those in Drosophila, but again such aa conservation is not conclusive evidence. In the absence of a biochemical and functional analysis of these different ORFs like the one presented in this study, it is difficult to guess which ORFs are translated and mediate the function of mlpt. The ORF-B of mlpt has been deemed the main functional element of the gene due to its greater length, but in fact, the available data belie this interpretation and favour the conclusion that ORF-B is nonfunctional. The ORF-B of mlpt has no Kozak consensus at all, and its start codon overlaps with the stop codon of the previous 5' type-A ORF, a situation that seems most unlikely to lead to ORF-B translation, even by a mechanism of readthrough as postulated. Readthrough and ribosome codon slippage always proceed by skipping bases forward, rather than backwards as would be needed here. Further, ORF-B aa conservation is rather weak. Although Savard identified a 'poly-Arg' conserved domain in alignments of selected sequences from species of only three insect orders, this conservation disappears when the comparisons are extended to further orders, as in the present sequence analysis. It is noted (1) that 'orphan' AUG codons are not a rare occurrence, and (2) that the nucleotide sequence in the ORF-B region is thymidine-poor, which produces a bias in its conceptual translation towards certain amino acids, including Arg. In addition, this analysis shows that tal genes without ORF-B exist, and in fact, an ORF-B is only present in some genes from holometabolous insects (Galindo, 2007).

RNA interference (RNAi) analysis of the function of the whole mlpt transcript identifies several functions that seem homologous to those identified in Drosophila, in particular the tarsal-promoting function, and a requirement in the tracheal system. However, Savard also identified 'gap' and homeotic segmentation phenotypes that the current expression and functional data show to be absent in Drosophila. This functional difference might be due to the different modes of early embryonic development in Drosophila and Tribolium, which also involve a different complement of gap and maternal genes. Clarifying whether this segmentation function is ancestral but has been lost in Drosophila, or whether it is a recently arisen specialization of Tribolium, will require the functional characterisation of tal in other insects (Galindo, 2007).

All sequenced and annotated genomes contain genes and transcripts without known function, sequence homologies, or even known protein domains. In particular, an increasing number of RNA transcripts are being classified as 'noncoding' on the basis of not having ORFs longer than 50-100 aa. Furthermore, genomes contain hundreds of thousands of similar smORFs that are systematically eliminated from gene annotations for statistical reasons. cDNA libraries and expressed sequence tag (EST) collections also discriminate against small cDNAs, perhaps losing many potential transcripts as well. In the rare cases in which smORFs have been identified in longer, polycistronic messengers, studies have centred on the regulatory effect of the 5' smORFs and resulting peptides on a standard, longer 3' ORF. Thus, the possibility of smORFs producing peptides with important, independent functions has been largely overlooked outside of yeast, in which there is firm evidence for their existence. This study identified tal as a functional gene encoding only smORFs, which are translated. The tal type-A peptides define an ancient gene family that has at least one crustacean representative (in Daphnia), and is thus not restricted to insects and is older than 440 million years (the estimated time for the origin of insects). It is suspected that this new gene family may in fact be a representative of a new and widespread class of genes and that more genes encoding smORFs, either alone or in polycistronic messengers, await isolation and characterisation. This analysis shows that a good cross-species sample of sequences is required to predict noncanonical peptide-coding genes, but also that these predictions must be validated by functional data, because in their absence, wrong predictions can be made. It is expected that a combination of bioinformatic and functional methods tailored to the search for peptides and smORFs will identify and characterize more new gene products and eukaryotic coding genes. Preliminary results in Drosophila, yeast, and Hydra suggest that hundreds of such genes may exist (Galindo, 2007).


Chanut-Delalande, H., Fernandes, I., Roch, F., Payre, F. and Plaza, S. (2006). Shavenbaby couples patterning to epidermal cell shape control. PLoS Biol. 4: e290. PubMed ID: 16933974

Chanut-Delalande, H., Hashimoto, Y., Pelissier-Monier, A., Spokony, R., Dib, A., Kondo, T., Bohere, J., Niimi, K., Latapie, Y., Inagaki, S., Dubois, L., Valenti, P., Polesello, C., Kobayashi, S., Moussian, B., White, K. P., Plaza, S., Kageyama, Y. and Payre, F. (2014). Pri peptides are mediators of ecdysone for the temporal control of development. Nat. Cell Biol. 16: 1035-1044. PubMed ID: 25344753

Galindo, M. I., et al. (2007). Peptides encoded by short ORFs control development and define a new eukaryotic gene family. PLoS Biol. 5: e106. PubMed ID: 17439302

Kondo, T., et al. (2007). Small peptide regulators of actin-based cell morphogenesis encoded by a polycistronic mRNA. Nat. Cell Biol. 9: 660-665. PubMed ID: 17486114

Kondo, T., et al. (2010). Small peptides switch the transcriptional activity of Shavenbaby during Drosophila embryogenesis. Science 329: 336-339. PubMed ID: 20647469

Pueyo, J. I. and Couso, J. P. (2008). The 11-aminoacid long Tarsal-less peptides trigger a cell signal in Drosophila leg development. Dev. Biol. 324: 192-201. PubMed ID: 18801356

Savard, J., Marques-Souza, H., Aranda, M. and Tautz, D. (2006). A segmentation gene in Tribolium produces a polycistronic mRNA that codes for multiple conserved peptides. Cell 126: 559-569. PubMed ID: 16901788

date revised: 15 February 2015

Home page: The Interactive Fly © 2011 Thomas Brody, Ph.D.

The Interactive Fly resides on the
Society for Developmental Biology's Web server.