araucan and caupolican: Biological Overview | Evolutionary Homologs | Regulation | Developmental Biology | Effects of Mutation | References

Gene name - araucan and caupolican

Synonyms -

Cytological map position - 69D1-D6

Function - Transcription factor

Keyword(s) - wing pattern formation

Symbol - ara and caup

FlyBase ID:FBgn0015904 and FBgn0015919

Genetic map position - 3-

Classification - Homeodomain Pbx class

Cellular location - nuclear

Araucan NCBI links: Entrez Gene | HomoloGene

Caupolican links: Entrez Gene


Drosophila is the organism of choice for uncovering and analyzing the genetic basis of development. Analysis of the iroquois mutation, and the discovery of three homeodomain proteins associated with what has become known as the Iroquois group, illustrates the power of Drosophila genetics. The iroquois mutation was isolated in a search for mutations that alter the pattern of macrochaetae, using the gene-dose titration method. This method is based on the idea that changing the gene dosage of two interacting genes may sometimes result in an abnormal phenotype, even when changing the gene dosage of either gene individually has no detectable effect. In the isolation of iroquois, mutations were selected that alter the pattern of bristles when doubly heterozygous with the deletion Df(4)M62f. This deletion removes the cubitus interruptus gene, which is involved in the patterning of adult sense organs in addition to its role in the segmentation of the embryo. Two mutations were isolated that proved to be allelic, and these were termed iro1 and iro2 (Leyns, 1995).
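The logic of gene-dose titration can be illustrated with a toy calculation. This is only a sketch under loudly stated assumptions: the multiplicative "combined activity" rule, the threshold value, and the function name are hypothetical and are not taken from Leyns (1995). The sketch shows only why a phenotype might appear in the double heterozygote and in neither single heterozygote.

```python
# Toy threshold model (an assumption for illustration only): the combined
# activity of two interacting genes is taken as the product of their doses,
# and an abnormal phenotype appears when that product falls below a threshold.
THRESHOLD = 2  # hypothetical minimum combined activity for a normal bristle pattern


def abnormal_phenotype(dose_a: int, dose_b: int) -> bool:
    """Return True if the hypothetical combined activity is below threshold."""
    return dose_a * dose_b < THRESHOLD


# Doses are counts of functional gene copies in a diploid (2 = wild type).
genotypes = {
    "wild type": (2, 2),
    "heterozygous for gene A only": (1, 2),
    "heterozygous for gene B only": (2, 1),
    "doubly heterozygous": (1, 1),
}

for name, (dose_a, dose_b) in genotypes.items():
    outcome = "abnormal" if abnormal_phenotype(dose_a, dose_b) else "normal"
    print(f"{name}: {outcome}")
```

Under these assumed numbers, only the doubly heterozygous genotype crosses the threshold, which is the kind of interaction the screen was designed to detect.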

Initially, two transcription units were detected within the iroquois locus. Following the practice of honoring Amerindian tribes at this locus, the genes have been named araucan and caupolican. Caupolicán was an Araucanian chief and a leader of the Indian resistance to the Spanish invaders of Chile. From 1553 to 1558 Caupolicán fought the Spanish, ultimately taking sole command of the Indian resistance. After several victories, Caupolicán suffered three disastrous routs at the hands of forces led by Don García Hurtado de Mendoza, losing more than 6,000 men in one of the defeats. Caupolicán was barbarously murdered in 1558. Although Caupolicán was a man of great skill and valor, his fame rests primarily on the verses dedicated to him by the poet Alonso de Ercilla y Zúñiga in his long poem La Araucana. The poet was with the army of Mendoza and apparently a witness to the deeds of Caupolicán.

Because the iroquois locus codes for more than one protein, the name iroquois, as in the iro complex (IROC), is reserved for the locus (Gómez-Skarmeta, 1996a). The proteins turn out to possess homeodomains similar to those of Drosophila Extradenticle and mammalian Pbx1. Subsequently the gene mirror, closely linked to ara and caup, was identified. Mirror also proves to be a homeoprotein, and is the third PBX-class homeoprotein of the IROC complex (McNeill, 1997).

The expression pattern of ara and caup in the wing confirms, amplifies, and clarifies an idea that is deeply rooted in the developmental biology of Drosophila: gene expression establishes a prepattern that determines the size, shape and number of sensory organ mother cells (SMC) in imaginal discs, the larval structures that give rise to most of the adult epidermis (Ghysen, 1988). The epidermis of the wings and most of the mesothorax is derived from a pair of imaginal wing discs. In mature larvae, each wing disc displays approximately 20 distinct proneural clusters that develop into precisely positioned SMCs. The pattern of proneural clusters in the wing is constructed piecemeal by enhancer-like, cis-regulatory elements that drive achaete and scute expression in complex spatio-temporal patterns (Gómez-Skarmeta, 1995).

But what establishes the domains of transcription for achaete and scute? Part of this pattern is established by the action of ara and caup on achaete and scute enhancers. But this merely raises the question of what establishes the pattern for ara and caup transcription. ara-caup expression in the wing is restricted to two symmetrical patches, one at each side of the dorsoventral compartment border. ara-caup expression in these patches is necessary for the specification of the prospective vein L3 and associated sensory organs. Here, ara-caup expression is mediated by the Hedgehog signal through its induction of high levels of Cubitus interruptus in anterior cells near the AP compartment border. The high levels of CI activate decapentaplegic expression, and together, CI and DPP positively control ara-caup. It is unlikely that Optomotor blind and Spalt mediate the dpp requirement because mutant cells lacking either omb or spalt can still differentiate vein L3. Another candidate is the zinc finger protein encoded by the schnurri locus, which acts in the Dpp signaling pathway and whose removal suppresses vein formation. dpp by itself is insufficient to account for ara-caup expression. wingless is expressed in a narrow strip of cells straddling the DV compartment boundary of the wing disc, corresponding to the prospective wing margin. The dorsal and ventral ara-caup L3 patches are separated by a gap that corresponds to the cells that accumulate detectable amounts of WG. Clones of mutant wg expressing cells spanning the gap between the L3 patches extend these patches toward the DV border, leaving a narrow gap of only one or two cell diameters. Thus WG represses ara-caup expression at the prospective wing margin domain. Therefore, one aspect of the prepattern in wing discs is established by segment polarity genes, which in turn target homeodomain proteins (ara and caup), which regulate the proneural genes achaete and scute (Gómez-Skarmeta, 1996b).
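The regulatory logic described above (CI and DPP together activate ara-caup, while WG represses it at the prospective wing margin) can be summarized as a small Boolean sketch. This is a deliberate simplification and not a model from Gómez-Skarmeta (1996b): treating each input as on/off, the function name, and the example cell positions are all assumptions made for illustration. Real regulation is graded and involves additional factors.

```python
def ara_caup_expressed(high_ci: bool, dpp: bool, wg: bool) -> bool:
    """Boolean caricature: high CI and DPP together activate; WG represses."""
    return high_ci and dpp and not wg


# Hypothetical cell positions in the wing disc, given as (high_ci, dpp, wg).
cells = {
    "L3 patch (anterior, near AP border, away from margin)": (True, True, False),
    "prospective wing margin (WG present)": (True, True, True),
    "DPP signal without high CI": (False, True, False),
}

for name, inputs in cells.items():
    state = "on" if ara_caup_expressed(*inputs) else "off"
    print(f"{name}: ara-caup {state}")
```

In this sketch only the L3 patch cells express ara-caup, matching the two patches separated by a WG-expressing gap described in the text.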

The Iroquois Complex Is Required in the Dorsal Mesoderm to Ensure Normal Heart Development in Drosophila

Drosophila heart development is an invaluable system to study the orchestrated action of numerous factors that govern cardiogenesis. Cardiac progenitors arise within specific dorsal mesodermal regions that are under the influence of temporally coordinated actions of multiple signaling pathways. The Drosophila Iroquois complex (Iro-C) consists of the three homeobox transcription factors araucan (ara), caupolican (caup) and mirror (mirr). The Iro-C has been shown to be involved in tissue patterning leading to the differentiation of specific structures, such as the lateral notum and dorsal head structures and in establishing the dorsal-ventral border of the eye. A function for Iro-C in cardiogenesis has not been investigated yet. Loss of the whole Iroquois complex, as well as loss of either ara/caup or mirr only, affects heart development in Drosophila. The data indicate that the GATA factor Pannier requires the presence of Iro-C to function in cardiogenesis. A detailed expression pattern analysis of the members of the Iro-C revealed the presence of a possibly novel subpopulation of Even-skipped expressing pericardial cells and seven pairs of heart-associated cells that have not been described before. Taken together, this work introduces Iro-C as a new set of transcription factors that are required for normal development of the heart. As the members of the Iro-C may function, at least partly, as competence factors in the dorsal mesoderm, these results are fundamental for future studies aiming to decipher the regulatory interactions between factors that determine different cell fates in the dorsal mesoderm (Mirzoyan, 2013).

Tissue patterning requires the spatial and temporal coordinated action of signals providing instructive or permissive cues that result in the specification of different cell types and their subsequent differentiation into different lineages. This analysis of Iro-C deficient embryos demonstrates that ara/caup and mirr are required in the dorsal mesoderm for normal heart development. The heart phenotypes could be caused by alterations of the fine balance of the interactions between factors of the cardiac signaling network. In early stage Drosophila embryos the mesoderm is patterned along the anterior-posterior (AP) axis with cardiac and somatic mesodermal domains alternating with visceral mesodermal domains. The tin-positive mesoderm is specified as cardiac and somatic mesoderm under the influence of combined Dpp and Wg signaling. Subsequently, the cardiac and somatic mesodermal domains are further subdivided by the action of the Notch pathway and MAPK signaling activated by EGFR and FGFR. The Eve-expressing cell clusters that give rise to pericardial and DA1 somatic muscle cells, as well as the Doc expression pattern, distinguish the cardiac and somatic mesodermal domain from the visceral mesodermal domain. The early expression pattern of Ara/Caup and Mirr at stages 10/11 suggests a role for Iro-C in patterning the dorsal mesoderm along the AP axis. Consistent with their previously described functions in other developmental contexts, members of the Iro-C may integrate signaling inputs and interact with other transcription factors to specify different dorsal mesodermal derivatives. Activation of the Iro-C by the EGFR pathway is required for the specification of the notum. Mirr was shown to interpret EGFR signaling by eliciting a specific cellular response required for patterning the follicular epithelium. During Drosophila eye development, mirr expression can be regulated by Unpaired, a ligand that activates JAK/Stat signaling.
In fact, the JAK/Stat signaling pathway has only recently been added to the signaling pathways that function in Drosophila cardiogenesis. In chromatin immunoprecipitation experiments caup was identified as a target of Stat92E, which is the sole transcriptional effector of the JAK/Stat signaling pathway in Drosophila. Interestingly, the increase of Odd-pericardial cells and the additional Tin-expressing cells that were the characteristic phenotypes in ara/caup (iroDFM1) and in mirr (mirre48) mutants are highly similar to the phenotypes in stat92E mutants described by Johnson (2011). Also, as described for stat92E mutants, cell adhesion defects were noticed in a number of embryos, as determined by the distant location of some Tin-expressing cells from the forming heart tube. To establish a possible link between JAK/Stat and Iro-C in the dorsal mesoderm, and specifically in cardiogenesis, it would be necessary to determine, for example, whether caup and mirr can rescue the heart phenotype of stat92E mutants. Also, it would be interesting to compare the expression of the other crucial heart marker genes, Tup, Doc and Pnr, in stat92E mutants at early stages to determine to what extent the phenotypes of embryos mutant for Iro-C and for JAK/Stat signaling are similar (Mirzoyan, 2013).

Members of the Iro-C were shown to be positively or negatively regulated by signaling pathways that play crucial roles in heart development. Conversely, the Iro-C factors can also regulate the activity of at least one of these pathways. Specifically, Ara/Caup, as well as Mirr were shown to regulate the expression of the glycosyltransferase fringe and as a consequence modulate Notch signaling activity in the eye. In the dorsal mesoderm, the lateral inhibitory function of Notch signaling establishes the proper number of heart and muscle progenitors. Given the fact that Iro-C can regulate Notch activity it may be that the loss of Iro-C leads to an imbalance of progenitor cell specification resulting in an abnormal number of heart cells. Further studies are required to decipher the molecular mechanism by which Iro-C could integrate diverse signaling inputs and thereby function in the specification and differentiation of the different dorsal mesodermal derivatives (Mirzoyan, 2013).

To determine whether Iro-C can be positioned into the early transcriptional network that determines a cardiac lineage, this study investigated the interdependency between crucial cardiac factors and Iro-C during cardiogenesis. Analyses of the expression of Ara/Caup and Mirr in tin346, Df(3L)DocA, pnrVX6 and tupisl-1 embryos demonstrated the dependency of Ara/Caup and Mirr on all four factors. The strongest loss of Ara/Caup and Mirr expression was observed in tin346 and Df(3L)DocA mutants, which clearly places tin and Doc upstream of Ara/Caup and Mirr. In tupisl-1 and in pnrVX6 mutant embryos, Ara/Caup and Mirr were strongly downregulated; for Ara/Caup, however, some expression remained in segmental patches, suggesting a different level of regulation. The currently available data indicate both a positive and a negative regulatory effect of pnr on Iro-C. Whereas pnr restricts Iro-C expression in the dorsal ectoderm and in the wing disc, there is also evidence that pnr can positively regulate Iro-C in the wing disc. Whether Pnr activates or represses Iro-C appears to depend on the presence of U-shaped (Ush), a protein that modulates the transcriptional activity of Pnr. In the wing disc it was shown that an Iro-C-lacZ (IroRE2-lacZ) construct was activated in cells that contained Pnr but were devoid of Ush. The data demonstrate that in the dorsal mesoderm, the expression of Ara/Caup and Mirr depends on pnr. Additionally, this analysis shows that pnr expression is independent of Iro-C. This finding is intriguing with respect to the downregulation of Tup and Doc in Iro-C mutants. Pnr is required for the maintenance of Doc and for the initiation and/or maintenance of Tup. Since Iro-C mutants exhibit a reduction in Doc-positive cells despite the presence of pnr, members of the Iro-C appear to be required independently of, or in addition to, pnr to maintain expression of Doc.
This could be investigated by expressing ara, caup and/or mirr in the mesoderm of pnr mutants to determine whether these factors are able to restore Doc expression. Alternatively, it may be that Iro-C is required indirectly, meaning that its main function is to provide a molecular context in which Pnr can be active. For example, it is known that Ush can bind to Pnr, thereby inactivating Pnr function. It is conceivable that the absence of Iro-C affects the spatial expression of Ush. If, in the absence of Iro-C, the expression domain of Ush shifts into the Pnr expression domain, Ush could bind to Pnr and inactivate it in the region where Pnr is required to maintain the expression of Tup and Doc. Adding to the complexity of the interpretation of the observed phenotypes is the finding that the majority of embryos that are mutant for ara/caup or for mirr were characterized by supernumerary Tin-positive cells in the cardiac region by stage 11/12. This phenotype could still be observed at later stages when the heart tube forms. The additional Tin-positive cells are pericardial cells, as determined by the expression of Prc around the Tin-expressing cells. Also, no increase in Dmef2-positive myocardial cells was observed. Hence, the data suggest a different level of regulation of Tin by the Iro-C. Similar to the findings of Johnson (2011), it may be that Iro-C is normally required to restrict Tin expression at an early stage. The regulation of Tin expression can be divided into four phases. The phenotype this study observed occurs when Tin expression becomes restricted to the myo- and pericardial cells in the cardiac region. In summary, the data add Iro-C to tin, pnr, Doc and tup, whose concerted actions establish the cardiac domains in the dorsal mesoderm. Further studies are required to re-evaluate the current understanding of the interactions between factors of the cardiac transcriptional network (Mirzoyan, 2013).

According to the expression pattern of Ara/Caup and Mirr it was possible to distinguish between an early and late role for these factors, the latter being a role in the differentiation of heart cells (Mirzoyan, 2013).

This analysis of the expression of Ara/Caup and Mirr during embryogenesis led to the identification of hitherto unknown heart-associated cells. Seven pairs of Ara/Caup and Mirr expressing cells and seven pairs of Mirr only expressing cells were detected that were located along the dorsal vessel. No co-expression was detected with any of the known pericardial cell markers. Because there are seven pairs of these cells segmentally arranged, it was tempting to speculate that these cells may function, for example, as additional attachment sites for the seven pairs of alary muscles. The alary muscles attach the heart to the dorsal epidermis and their extensions can be visualized by Prc. Due to the lack of markers little is known about the development of the alary muscles. Previous work demonstrated that the alary muscles attach to the dorsal vessel in the vicinity of the Svp pericardial cells and, in addition, more laterally to one of two distinct locations on the body wall. The Mirr-positive cells may identify the more lateral locations. Clearly, a detailed analysis is needed to identify the function of the cells expressing both Ara/Caup and Mirr, as well as the cells expressing Mirr only, that are positioned along the heart tube and whose existence has now been revealed. Additionally, on each side at the anterior end of the dorsal vessel four pericardial cells were identified that co-express Ara/Caup and Eve. Their location at the anterior tip of the heart is intriguing. Further analysis is required to unambiguously determine whether these cells are, for example, the wing heart progenitor cells or the newly identified heart anchoring cells. It is also possible that they represent a yet undefined, novel subpopulation of pericardial cells. In any case, this finding suggests that Ara/Caup plays a role in the diversification of pericardial heart cell types. Future experiments aim to determine the developmental fate of these cells (Mirzoyan, 2013).

Taken together, this investigation of a role for Iro-C in heart development introduces ara/caup and mirr as additional components of the transcriptional network that acts in the dorsal mesoderm and as novel factors that function in the diversification of heart cell types (Mirzoyan, 2013).

The results show that the role of the Iro-C and its individual members, respectively, appears to be rather complex and awaits in-depth analyses. Nevertheless, this work raises important questions regarding the current understanding of interactions between the well-characterized transcription factors that will be addressed in future studies (Mirzoyan, 2013).


The two genes are transcribed in the same direction on the chromosome, separated by 12 kb. ara transcripts consist of a principal 3.0 kb form and two other less abundant forms (1.8 and 0.6 kb). caup is present in two equally abundant transcripts of 4.7 and 3.4 kb, present throughout development, and a 1.3 kb mRNA, characteristic of adults. The differently sized transcripts open the possibility of alternative processing or transcription initiation/termination sites (or both) for both genes (Gómez-Skarmeta, 1996a).

Exons - 5 for ara and 4 for caup


Amino Acids - 716 for Ara and 693 for Caup

Structural Domains

Six of the seven amino acids most conserved in a sample of 246 homeodomain proteins are found in the Ara and Caup sequences. Predictions of secondary structure indicate a conservation of the three alpha helices characteristic of most homeodomains. The homeodomains most similar to those of Ara and Caup (80% identity) have been found in the human sequence R46202 and in several mouse proteins (up to 100% identity). The next most similar homeodomains, those of the human Pbx and Drosophila Extradenticle proteins, have only 37% identity. Ara and Caup homeodomains are nearly identical. The homeodomain of Drosophila Mirror (McNeill, 1997) has 57 out of 60 amino acids identical to Caup and Ara. In the amino-terminal domain, these proteins possess a highly conserved Notch interaction domain, which has been proposed to be involved in protein-protein interactions. The Notch interaction domain of the homeodomain proteins is homologous to a stretch of amino acids in Xenopus, rat and Drosophila Notch. This putative protein interaction domain is similar to the central part of the epidermal growth factor repeats of the Notch protein. Mirror, Caup and Ara also share a novel homology region that is found halfway between the homeodomains and C-terminus of the proteins. C-terminal to this novel homology region are two potential phosphorylation sites for mitogen-activated protein kinase (Rolled in Drosophila) (Gómez-Skarmeta, 1996a and McNeill, 1997).
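The identity figures quoted above are simple ratios of matching positions to aligned length. As a minimal sketch (the function names and the toy aligned strings are hypothetical, and the real homeodomain sequences are not reproduced here), the 57-of-60 comparison between Mirror and Ara/Caup works out as follows:

```python
def percent_identity(identical: int, length: int) -> float:
    """Percent identity from a count of identical positions and aligned length."""
    if length <= 0:
        raise ValueError("aligned length must be positive")
    return 100.0 * identical / length


def percent_identity_aligned(seq_a: str, seq_b: str) -> float:
    """Percent identity for two pre-aligned sequences of equal length."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return percent_identity(matches, len(seq_a))


# Mirror vs. Ara/Caup homeodomain: 57 of 60 residues identical (from the text).
print(f"{percent_identity(57, 60):.1f}%")

# Toy aligned strings, NOT real sequences, to show the position-wise count.
print(f"{percent_identity_aligned('ABCDE', 'ABCXE'):.1f}%")
```

By the same arithmetic, 57 of 60 identical residues corresponds to 95% identity, far above the 37% reported for the Pbx and Extradenticle homeodomains.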

A new Caenorhabditis elegans homeobox gene, ceh-25, is described that belongs to the TALE superclass of atypical homeodomains, which are characterized by three extra residues between helix 1 and helix 2. ORF and PCR analysis reveals a novel type of alternative splicing within the homeobox. The alternative splicing occurs such that two different homeodomains can be generated, which differ in their first 25 amino acids. ceh-25 is an ortholog of the vertebrate Meis genes and it shares a new conserved domain of 130 amino acids with them. A thorough analysis of all TALE homeobox genes was performed and a new classification is presented. Four TALE classes are identified in animals: PBC, MEIS, TGIF and IRO (Iroquois); two types in fungi: the mating type genes (M-ATYP) and the CUP genes; and two types in plants: KNOX and BEL. The IRO class has a new conserved motif downstream of the homeodomain. For the KNOX class, a conserved domain, the KNOX domain, was defined upstream of the homeodomain. Comparison of the KNOX domain and the MEIS domain shows significant sequence similarity revealing the existence of an archetypal group of homeobox genes that encode two associated conserved domains. Thus TALE homeobox genes were already present in the common ancestor of plants, fungi and animals and represent a branch distinct from the typical homeobox genes (Burglin, 1999).


date revised: 26 May 97 

Home page: The Interactive Fly © 1995, 1996 Thomas B. Brody, Ph.D.

The Interactive Fly resides on the
Society for Developmental Biology's Web server.