knot/collier: Biological Overview | Evolutionary Homologs | Regulation | Developmental Biology | Effects of Mutation | References

Gene name - knot

Synonyms - collier (col)

Cytological map position - 51C1--51C1

Function - transcription factor

Keywords - head, gap gene

Symbol - kn

FlyBase ID: FBgn0014142

Genetic map position -

Classification - EBF/Olf-1 homolog, HLH protein

Cellular location - nuclear



BIOLOGICAL OVERVIEW

Segmentation of the Drosophila embryo is based on a cascade of hierarchical gene interactions initiated by maternal morphogens. These interactions define spatially restricted domains of zygotic gene expression within the blastoderm. Although the hierarchy of the segmentation genes that subdivide the trunk is well established, patterning in the head is less well understood. Seven head segments can be assigned on the basis of metameric patterns of segment-polarity gene expression and internal sensory organs. The domains of expression for head gap-like genes broadly overlap; their posterior margins are out of phase by one segment. These observations, taken together with the lack of pair-rule gene expression in the head, have led to the suggestion that head gap genes act in a combinatorial manner, simultaneously determining segmental borders in the head and segmental identities (Crozatier, 1996 and references).

collier (preferentially called knot because the gene was discovered on the basis of a mutant phenotype prior to being named collier) expression at the blastoderm stage is restricted to a single stripe of cells corresponding to part of the intercalary and mandibular segment primordia, possibly parasegment 0. There is a striking similarity between the early stripe of collier expression and the position of a specific mitotic domain at cycle 14, mitotic domain 2 (MD2). Mitotic domains are defined as groups of cells that enter mitosis 14 both synchronously and out of synchrony with other groups of cells (Foe, 1989). The pattern of string (stg) transcription anticipates the pattern of cycle 14 mitoses. At the onset of gastrulation, string and collier are simultaneously expressed in a group of cells that correspond to MD2, suggesting that these cells not only share a mitotic fate, but also share a specific gene expression program. It is thought that col and stg respond to the same patterning information and act in parallel, with col assigning a specific gene-expression program in cells in MD2. stg and col expression in MD2 is concomitant and both require buttonhead (Crozatier, 1996).

Reduction of col activity in early gastrula embryos by antisense RNA expression results in a specific lack of head structures derived from intercalary and mandibular segments. Almost all antisense-expressing embryos fail to hatch. However, 80% develop to the point of making a cuticle; in these embryos, the only defects consistently observed are in the head (cephalopharyngeal) skeleton. There is an absence or drastic reduction of the lateral gräten, one of the head skeleton elements. All other skeletal structures appear normal. The lateral gräten are thought to originate from the mandibular segment. In antisense-expressing embryos, the Engrailed intercalary spot is either reduced or is sometimes missing, whereas the more posterior mandibular Engrailed stripe appears to be unaffected (Crozatier, 1996).

It is suggested that Col may act as a 'second-level regulator' of head patterning. This study, together with the recent characterization of crocodile (a gene required for the formation of structures derived from the intercalary segment, the posterior wall of the pharynx and the ventral arm of the cephalopharyngeal skeleton), indicates that a complex network of transcription factors acts downstream of head gap genes in controlling morphogenesis in the embryonic head. The structural conservation of Col during evolution raises the questions of its conservation of function in head specification and its interactions with other factors conserved between insects and vertebrates (Crozatier, 1996).

Collier is required for formation of a subset of somatic muscles. During Drosophila embryogenesis, mesodermal cells are recruited to form a stereotyped pattern of about 30 different larval muscles per hemisegment. The formation of this pattern is initiated by the specification of a special class of myoblasts, called founder cells, that are uniquely able to fuse with neighbouring myoblasts. The COE transcription factor Collier plays a role in the formation of a single muscle (muscle DA3[A] in the abdominal segments; DA4[T] in the thoracic segments T2 and T3). Col expression is first observed in two promuscular clusters (in segments A1-A7), corresponding to two progenitors and then their progeny founder cells, but its transcription is maintained in only one of these four founder cells, the founder of muscle DA3[A]. This lineage-specific restriction depends on the asymmetric segregation of Numb during the progenitor cell division and involves the repression of col transcription by Notch signaling. In col mutant embryos, the DA3[A] founder cells form but do not maintain col transcription and are unable to fuse with neighbouring myoblasts, leading to a loss-of-muscle DA3[A] phenotype. In wild-type embryos, each of the DA3[A]-recruited myoblasts turns on col transcription, indicating that the conversion of 'naive' myoblasts by the DA3[A] founder cell, which induces them to express the founder cell's distinctive pattern of gene expression, involves activating col itself. Muscles DA3[A] and DO5[A] (DA4[T] and DO5[T] respectively) derive from a common progenitor cell, the DA3[A]/DO5[A] progenitor. However, ectopic expression of Col is not sufficient to switch the DO5[A] to a DA3[A] fate.
Together these results lead to a proposal that specification of the DA3[A] muscle lineage requires both Col and at least one other transcription factor, supporting the hypothesis of a combinatorial code of muscle-specific gene regulation controlling the formation and diversification of individual somatic muscles (Crozatier, 1999a).

The col-expressing promuscular clusters and progenitor cells have a distinctive position, as defined relative to morphological landmarks and ectodermal Engrailed (En) expression. The DA3[A]/DO5[A] progenitor cell lies underneath the anterior epidermal compartment, whereas the DT1[A]/DO4[A] progenitor cell lies on the anterior edge of the posterior compartment, consistent with mapping of the primordium for the somatic mesoderm. Since Wingless (Wg) and Hedgehog (Hh) signaling have been shown to be required for mesoderm segmentation and formation of a subset of muscle founder cells, col expression was analyzed in wg and hh mutant embryos. At stage 10, both mutant embryos show changes in mesodermal col expression: rather than being restricted to specific clusters in the anterior compartment, it appears almost continuous along the anteroposterior axis. Therefore, both Wg and Hh signaling appear to restrict col transcription to specific clusters. Lack of Wg or Hh activity does not seem, however, to impede specification of the DA3[A]/DO5[A] progenitor, which is singled out in the mutant as well as the wild-type embryos. It was noticed, however, that, while the DA3[A]/DO5[A] progenitor appears to be specified normally, more than one cell is singled out from the DT1[A]/DO4[A] cluster in hh mutant embryos (Crozatier, 1999a).

Following establishment of the promuscular clusters, specification of the progenitors is controlled by lateral inhibition, a cell-cell interaction process mediated by the neurogenic genes Notch (N) and Delta (Dl). In both N and Dl mutant embryos, promuscular Col expression is initiated normally but fails to become restricted to a single cell per cluster, similar to observations previously made for the expression of lethal of scute (l'sc). As a consequence, a hyperplasic expression of Col is observed from stage 11. Since it is expressed in promuscular clusters and segregating muscle progenitors, l'sc has been proposed to play a role in muscle progenitor selection similar to the role of achaete and scute in neuroblast specification. However, in embryos lacking l'sc activity, selection of the Col-expressing progenitors occurs normally at stage 11 and muscle DA3[A] forms as in wild type (Crozatier, 1999a).

A key event in the generation of the muscle diversity is the asymmetric division of progenitor cells. The distinction between sibling muscle founder cells depends on the differential distribution of the membrane-associated protein Numb (Nb), under the control of inscuteable (insc). One proposal for the action of Nb in determining differences in cell fate is that it biases the N-mediated cell-cell interactions by inhibiting Notch activity, so that this interaction becomes, in turn, asymmetric. The formation of muscles DA3[A] and DO5[A] was analyzed in insc, nb and N mutant embryos. In insc mutant embryos, the DA3[A] muscle is duplicated at the expense of DO5[A] in most of the segments. Although not 100% penetrant, the DA3[A] duplication phenotype always correlates with the absence of DO5[A], indicating a transformation of DO5[A] into DA3[A]. The reciprocal phenotype is observed in nb embryos: muscle DA3[A] is missing whereas muscle DO5[A] is duplicated. By analogy with the sensory organ precursor (SOP) lineage, this finding suggests that the DA3[A] founder cell is the cell that inherits Nb. Absence, or duplication, of DA3[A] in nb and insc mutant embryos, respectively, indicates that Nb function is required for specifying the DA3[A] cell fate. This conclusion is supported by the DA3[A] duplication phenotype observed in embryos mutant for sanpodo (spdo), another gene that acts antagonistically to nb in the Notch-mediated determination of alternative cell fates and encodes a tropomodulin-like protein (Crozatier, 1999a).

The question was then raised as to how nb and col functions relate to one another in specifying DA3[A]. col transcription was examined in insc and nb mutant embryos, using the col intronic probe. col transcription is controlled by Notch signaling in the establishment of the DA3[A]/DO5[A] lineage. In wild-type embryos at late stage 12, only one founder cell (DA3[A]) maintains col transcription; in insc mutant embryos, two cells do so. Conversely, no founder cell continues to transcribe col in nb mutant embryos. These data indicate that Nb determines the choice between the DA3[A] and DO5[A] cell fates, by allowing col transcription to persist in the DA3[A] founder cell. In N mutant embryos, a large disruption of the muscle pattern occurs, as a result of the cumulative effects of overproduction of muscle progenitor cells, lack of myoblast fusion, disorganization of the muscle epidermal attachment sites as well as, possibly, lack of a signal to the mesoderm emanating from the epidermis. Despite this cumulative phenotype, N mutant embryos were used to analyse the role of N in establishing the DA3[A] cell fate, taking advantage of the perdurable expression of the col-lacZ transgene. In N embryos, there is a large increase in the number of muscle cells that express high levels of Col and beta-gal at stage 11, resulting from the defective progenitor selection. beta-gal expression persists in these cells up to stage 16, suggesting that they have adopted a DA3[A] fate.
Taken together, and based on the recent finding that Notch is required to maintain progenitor-specific gene expression in one sibling founder cell and repress it in the other, a comparison of the patterns of col expression between wild-type and insc, nb or N mutant embryos indicates that the restriction of col transcription to a single founder cell is under the control of Notch signaling at two successive levels. Notch activity is first required for restricting col expression to a single cell per cluster during the progenitor selection process. Afterward, Notch signaling is necessary to restrict col transcription to only one of two sibling founder cells and distinguish between the DA3[A] and DO5[A] fates (Crozatier, 1999a).
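The sibling-fate logic described above can be summarised as a toy Boolean model. This is only an illustrative sketch, not a method from the cited work. The function name and parameters are hypothetical, and the model rests on three simplifying assumptions: that loss of insc leaves Numb in both siblings (inferred from the DA3[A] duplication phenotype), that spdo is needed for Notch activity, and that phenotypes are fully penetrant (the text notes that the insc phenotype is not).

```python
def founder_fates(nb_functional=True, insc_functional=True, spdo_functional=True):
    """Toy model: fates of the two sibling founder cells of the DA3[A]/DO5[A] progenitor.

    Hypothetical helper for illustration only; all rules are simplifications.
    """
    # Step 1: decide which siblings inherit Numb.
    if not nb_functional:
        numb = (False, False)   # no Numb available to either sibling
    elif not insc_functional:
        numb = (True, True)     # ASSUMPTION: without insc, Numb is not segregated asymmetrically
    else:
        numb = (True, False)    # wild type: one sibling inherits Numb

    fates = []
    for has_numb in numb:
        # Step 2: Numb inhibits Notch activity; ASSUMPTION: spdo is required for Notch activity.
        notch_active = spdo_functional and not has_numb
        # Step 3: Notch signaling represses col; persistent col marks the DA3[A] fate.
        col_maintained = not notch_active
        fates.append("DA3[A]" if col_maintained else "DO5[A]")
    return tuple(fates)


if __name__ == "__main__":
    print("wild type:", founder_fates())
    print("insc mutant:", founder_fates(insc_functional=False))
    print("nb mutant:", founder_fates(nb_functional=False))
    print("spdo mutant:", founder_fates(spdo_functional=False))
```

Under these assumptions the model reproduces the reported phenotypes: one DA3[A] and one DO5[A] founder in wild type, DA3[A] duplication in insc and spdo mutants, and DO5[A] duplication in nb mutants. It does not capture the earlier Notch-dependent progenitor selection step or the requirement for col in myoblast fusion.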

While col activity is absolutely required for the formation of muscle DA3[A], it remained uncertain whether it is sufficient to convert the DO5[A] into a DA3[A] muscle. To address this question, Col was ectopically expressed at different time points during embryonic development, using a heat-shock col construct, and the DA3[A] fate was followed with P[col5-lacZ] expression. The A2-A7 pattern of muscles of heat-shock-treated embryos was visualised by double immunostaining for beta-gal and myosin heavy chain at stage 16. Ectopic Col expression induced at 4-5 hours AEL (stage 7-9), i.e., before singling out of the progenitor cell has occurred, does not substantially alter the final muscle pattern. These data indicate that ectopic col expression is not sufficient by itself to either switch cell fate between DO5[A] and DA3[A] or change the cell fate of other muscle precursors (Crozatier, 1999a).

The vestigial (vg) muscle phenotype has not been reported so far, precluding a comparison with the col mutant phenotype. Nevertheless, suggestive evidence that vg might be involved in regulation of col expression is provided by the ectopic col-lacZ expression observed in conditions of heat-shock-induced ubiquitous col expression. It is interesting to note that all three muscles in which col-lacZ is ectopically activated (muscles DA2[A], VL1[A], VL2[A]) also express vg. Altogether, these results support the involvement of a combinatorial code of muscle-identity genes expressed in muscle progenitors and controlling the diversification of the somatic muscles. How Col interacts with the myogenic pathway in controlling formation of the DA3[A] muscle, or, put another way, what the specific targets of Col are in this process, remains a challenging question. Col belongs to a small family of non-basic HLH transcription factors, the COE proteins, which are highly conserved during evolution. One Xenopus member of this family, Xcoe2, is involved in the specification of primary neurons, and Xcoe2 activity is subject to feedback regulation by lateral inhibition. The present report raises the interesting possibility that, beyond an apparent diversity of function, regulation of Xcoe2 expression during primary neuron formation and col during embryonic muscle formation reflect the existence of an evolutionarily conserved pathway linking Notch signaling and col/Xcoe2 function in binary cell decisions in vertebrates and invertebrates (Crozatier, 1999a).


GENE STRUCTURE

The 3.4 kb and 3.9 kb cDNAs differ from one another by 465 nucleotides (between positions 2098 and 2563), which are removed by a developmentally regulated alternative splicing event. The two Col isoforms have the same 528 amino-terminal amino acids but their sequences differ at the carboxy-terminal ends. In Col isoform 1, the C-terminal region of sequence divergence is 47 amino acids long, and in Col isoform 2 this region is 29 amino acids long (Crozatier, 1996).
cDNA clone length - 3.9 kb and 3.4 kb

Bases in 5' UTR - 512

Bases in 3' UTR - 1268


PROTEIN STRUCTURE

Amino Acids - 575 and 557
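The protein lengths listed here follow from the figures quoted under GENE STRUCTURE, and the arithmetic can be cross-checked with a short sketch. The constant and function names below are illustrative and do not come from the source; only the numbers are taken from the text (Crozatier, 1996).

```python
# Figures quoted in the text; names are illustrative, not from the source.
SPLICE_START = 2098            # first quoted position of the alternatively spliced segment
SPLICE_END = 2563              # second quoted position
SHARED_N_TERMINAL_AA = 528     # residues common to both Col isoforms
DIVERGENT_C_TERMINAL_AA = {"isoform 1": 47, "isoform 2": 29}


def spliced_segment_length(start, end):
    """Length of the segment removed by alternative splicing, as quoted (end minus start)."""
    return end - start


def isoform_lengths(shared, divergent):
    """Total length of each isoform: shared N-terminal part plus its divergent C-terminal tail."""
    return {name: shared + tail for name, tail in divergent.items()}


if __name__ == "__main__":
    print(spliced_segment_length(SPLICE_START, SPLICE_END))
    print(isoform_lengths(SHARED_N_TERMINAL_AA, DIVERGENT_C_TERMINAL_AA))
```

The splice positions give the quoted 465-nucleotide difference, which is consistent with the roughly 0.5 kb gap between the 3.4 kb and 3.9 kb cDNAs. The shared and divergent residue counts sum to the 575 and 557 amino acids listed for the two isoforms.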

Structural Domains

Two protein regions are particularly well conserved between Col and EBF/Olf-1 (Early B-cell factor). The first one, which is 210 amino acids long and lies between residues 59 and 269 of Col, shows 86% identity and corresponds to the DNA-binding domain of EBF. The second region (Col residues 297 to 431) shows 89% identity and partially overlaps a region of EBF sufficient for homodimerization in vitro. There is a consensus helix-1-loop-helix-2 motif in this region, which is also conserved in the rodent proteins. A second potential helix-2 reported in EBF/Olf-1 is absent in Col. The HLH dimerization motif is not preceded by a basic region, consistent with the presence of the independent DNA-binding domain (between residues 59 and 269). The C-terminal region is rich in alanine, serine and threonine residues and probably represents a transcription activation domain (Crozatier, 1996).

The col gene maps to the chromosomal region 51C1,2. In order to establish its molecular organization, approx. 45 kb of overlapping genomic DNA were isolated covering the col transcription unit and the relevant regions were sequenced. The col transcription unit consists of 12 exons and 11 introns spanning a genomic region of about 30 kb. Introns separate the coding regions for each Col functional domain, defined by biochemical dissection of EBF and sequence conservation during evolution. These are the DNA binding domain (aa 59 to 288), the homodimerization domain (aa 289 to 429), and a putative transactivation domain at its carboxy-terminal end. However, additional introns split the Col DNA binding and homodimerization domains, despite their extensive primary sequence conservation in all COE proteins identified so far, from nematode to vertebrates. Within the homodimerization domain, the helix-loop-helix (HLH) motif is encoded by a single exon, exon 9. Finally, the genomic structure of col indicates that the two predicted Col embryonic protein isoforms, which differ in their carboxy-terminal protein coding region, result from alternative splicing of exon 11 (Crozatier, 1999b).



date revised: 15 March 99

Home page: The Interactive Fly © 1995, 1996 Thomas B. Brody, Ph.D.

The Interactive Fly resides on the
Society for Developmental Biology's Web server.