Targets of Activity

A major challenge in interpreting genome sequences is understanding how the genome encodes the information that specifies when and where a gene will be expressed. The first step in this process is the identification of regions of the genome that contain regulatory information. In higher eukaryotes, this cis-regulatory information is organized into modular units [cis-regulatory modules (CRMs)] of a few hundred base pairs. A common feature of these cis-regulatory modules is the presence of multiple binding sites for multiple transcription factors. Transcription factor binding sites have a tendency to cluster; the extent to which they do can be used as the basis for the computational identification of cis-regulatory modules. Published DNA binding specificity data for five transcription factors active in the early Drosophila embryo were used to identify genomic regions containing unusually high concentrations of predicted binding sites for these factors. A significant fraction of these binding site clusters overlap known CRMs that are regulated by these factors. In addition, many of the remaining clusters are adjacent to genes expressed in a pattern characteristic of genes regulated by these factors. One of the newly identified clusters, mapping upstream of the gap gene giant (gt), was tested; it acts as an enhancer that recapitulates the posterior expression pattern of gt (Berman, 2002).

The transcription factors Bicoid (Bcd), Caudal (Cad), Hunchback (Hb), Krüppel (Kr), and Knirps (Kni) act at very early stages of Drosophila development to define the anterior-posterior axis of the embryo. Bcd and Cad are maternal activators broadly distributed in the anterior and posterior portions of the embryo, respectively. Hb, Kr, and Kni are zinc-finger gap proteins that act primarily as repressors in specific embryonic domains. Sequences of previously described binding sites were collected for these five factors present in the cis-regulatory regions of known target genes. The binding sequences for each factor were aligned by using the motif-assembly program, and the binding specificities of each factor were modeled with position weight matrices (PWMs). PWMs are a useful way to represent binding specificities and provide a statistical framework for searching for novel instances of the motif in genome sequences (Berman, 2002).
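The PWM modeling step described above can be sketched as follows. This is a minimal illustration, not the procedure used by Berman (2002): the function name `build_pwm`, the uniform background frequencies, the pseudocount of 1, and the toy aligned sites are all assumptions introduced here.

```python
import math

def build_pwm(sites, background=None, pseudocount=1.0):
    """Build a log-odds position weight matrix from equal-length aligned sites.

    Assumptions (not from the source): uniform background base frequencies
    and a pseudocount of 1 per base to avoid log(0).
    """
    if background is None:
        background = {b: 0.25 for b in "ACGT"}
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all aligned sites must have the same length")
    pwm = []
    for pos in range(length):
        # Count each base in this alignment column, starting from the pseudocount.
        counts = {b: pseudocount for b in "ACGT"}
        for site in sites:
            counts[site[pos].upper()] += 1
        total = sum(counts.values())
        # Log2 ratio of observed column frequency to background frequency.
        pwm.append({b: math.log2((counts[b] / total) / background[b])
                    for b in "ACGT"})
    return pwm

# Made-up toy alignment for illustration only; not the collected binding sites.
toy_sites = ["TAATCC", "TAATCC", "TAAGCT", "TAATCG"]
pwm = build_pwm(toy_sites)
```

Each column of the resulting matrix gives a positive weight to bases that are over-represented among the aligned sites and a negative weight to bases that are under-represented, which is what makes the matrix usable for scoring new sequences.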

A freely available program, PATSER, was used to search the genome for sequences that match these PWMs, and a web-based visualization tool, CIS-ANALYST, was devised to display the location of predicted binding sites along with genome annotations in selected genomic regions. PATSER assigns a score to each potential site that reflects the agreement between the site and the corresponding PWM. These scores approximate the free energy of binding between the factor and site, and CIS-ANALYST uses a user-defined cutoff parameter to eliminate predicted low-affinity sites (Berman, 2002).
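The scoring-and-cutoff idea can be illustrated with a short sketch. It does not reproduce PATSER or CIS-ANALYST: the function names, the toy three-column matrix, and the use of a raw score cutoff (rather than a p-value such as site_p) are assumptions made for this example.

```python
def reverse_complement(seq):
    """Return the reverse complement of an upper-case A/C/G/T string."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[b] for b in reversed(seq))

def scan(seq, pwm, cutoff):
    """Return (start, strand, score) for every window scoring >= cutoff.

    Both strands are scored. The sequence is assumed to contain only
    A, C, G and T; a raw score cutoff stands in for a p-value cutoff.
    """
    hits = []
    width = len(pwm)
    seq = seq.upper()
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        for strand, site in (("+", window), ("-", reverse_complement(window))):
            # Sum the per-position weights of the bases observed in the window.
            score = sum(col[base] for col, base in zip(pwm, site))
            if score >= cutoff:
                hits.append((start, strand, score))
    return hits

# Hypothetical log-odds matrix whose best match is "ACG".
toy_pwm = [
    {"A": 1.0, "C": -1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": 1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": 1.0, "T": -1.0},
]
hits = scan("TTACGTT", toy_pwm, cutoff=3.0)
```

Raising the cutoff keeps only the best-matching windows, which corresponds to discarding predicted low-affinity sites.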

Using CIS-ANALYST, the distribution of Bcd, Cad, Hb, Kr, and Kni binding sites was examined in a 1-Mb genomic region surrounding the well-characterized eve locus at a site_p value of 0.0003. At this relatively high-stringency value, most experimentally verified binding sites are retained; at more restrictive values, many of these sites would be lost (Berman, 2002).

To investigate whether binding site clustering could help to explain the specificity of these factors for eve, a simple notion of binding site clustering was incorporated into CIS-ANALYST, allowing searches for segments of a specified length containing a minimum number of predicted binding sites. When the 1-Mb region surrounding eve was searched for dense clusters of predicted high-affinity sites (at least 13 Bcd, Cad, Hb, Kr, or Kni sites in a 700-bp window), three discrete regions were identified. Strikingly, these three clusters are all adjacent to eve, and overlap the previously characterized stripe 2, stripe 3 + 7, and stripe 4 + 6 enhancers (Berman, 2002).
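The clustering criterion (a minimum number of predicted sites within a window of fixed length) can be sketched with a sliding window over sorted site positions. How CIS-ANALYST counts and merges sites is not described here, so this sketch assumes that sites for all factors are pooled, that each site is represented by its start coordinate, and that overlapping qualifying windows are merged; `find_clusters` is a hypothetical name.

```python
def find_clusters(positions, window=700, min_sites=13):
    """Return merged (start, end) intervals in which at least `min_sites`
    site start positions fall within `window` bp of each other."""
    pos = sorted(positions)
    clusters = []
    left = 0
    for right in range(len(pos)):
        # Shrink the window from the left until it spans less than `window` bp.
        while pos[right] - pos[left] >= window:
            left += 1
        if right - left + 1 >= min_sites:
            start, end = pos[left], pos[right]
            if clusters and start <= clusters[-1][1]:
                # Overlaps the previous qualifying window: extend that cluster.
                clusters[-1] = (clusters[-1][0], max(clusters[-1][1], end))
            else:
                clusters.append((start, end))
    return clusters

# Made-up coordinates: 13 sites packed into 600 bp, plus three isolated sites.
dense = list(range(1000, 1650, 50))
sparse = [5000, 9000, 20000]
clusters = find_clusters(dense + sparse)
```

With these toy coordinates only the dense group qualifies; requiring 14 sites instead of 13 would return no cluster, mirroring how stricter criteria recover fewer regions.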

To generalize and quantify these promising results, a broader collection of 19 well-defined CRMs from 9 Drosophila genes known to be required for proper embryonic development was compiled. Each of these CRMs is sufficient to direct the expression of a distinct anterior-posterior pattern in early embryos; genetic evidence suggests that each CRM is regulated by at least one of the following: Bcd, Cad, Hb, Kr, and Kni. Mutation and in vitro DNA binding studies completed on a subset of the CRMs provide evidence for a direct regulatory relationship. The same clustering criteria that were successful for identifying CRMs in eve (700-bp regions with at least 13 predicted binding sites) identified clusters overlapping 14 of these 19 known CRMs (Berman, 2002).

A search of the entire genome for 700-bp windows containing at least 13 predicted binding sites identified 133 clusters in addition to the 19 described above, or ~1 per 700 kb of noncoding sequence. As expected, when more stringent clustering criteria are used, both the number of known CRMs recovered and the number of novel clusters identified decrease. The novel clusters identified with a density of at least 15 binding sites per 700 bp, a level at which half of the known CRMs are still recovered, were further examined. Binding site plots for the 22 novel clusters identified at this high-stringency condition, and for 6 additional novel clusters identified with an equally stringent search by using only Bcd, Hb, Kr, and Kni, have been published as supporting information on the PNAS web site. Twenty-three of these 28 clusters fall in regions between genes, whereas the remaining 5 fall in introns. There are therefore 49 genes that either contain a novel cluster of binding sites or flank an intergenic region that does. The expression patterns of these 49 genes in early embryos were examined by whole-mount RNA in situ hybridization and DNA microarray hybridization. At least 10 of the 28 clusters were adjacent to a gene that showed localized anterior-posterior expression in the syncytial or cellular blastoderm stages, consistent with early regulation by maternal effect or gap transcription factors. Although the numbers are small, this is significantly more than the 1 or 2 expected if the positions of clusters had been chosen at random (Berman, 2002).
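The comparison of 10 observed clusters against the 1 or 2 expected by chance can be made concrete with a simple tail probability. The statistical test actually used is not stated here, so this sketch assumes a binomial model in which each of the 28 clusters independently lands next to a patterned gene with probability 2/28 (the upper end of the stated expectation); both the model and the function name `binomial_tail` are assumptions.

```python
from math import comb

def binomial_tail(k, n, p):
    """Probability of observing at least k successes in n independent
    trials, each succeeding with probability p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Assumed chance rate: 2 expected hits out of 28 clusters.
p_value = binomial_tail(10, 28, 2 / 28)
print(f"P(at least 10 of 28 by chance) = {p_value:.2e}")
```

Under these assumptions the tail probability is very small, which is consistent with the statement that the observed number is significantly higher than expected at random.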

One of these clusters is located ~2 kb upstream of the gap gene giant (gt). During cellularization, gt is expressed in two broad domains, one in the anterior and one in the posterior portion of the embryo. The pattern of expression of the posterior expression domain is determined by the activities of Cad, Hb, and Kr. However, the cis-regulatory sequence controlling this posterior expression pattern has not been precisely identified. Whether this cluster of binding sites might be the gt posterior enhancer was evaluated. A 1.1-kb fragment containing this cluster was placed in a reporter construct containing the eve minimal promoter fused to a lacZ reporter gene. The expression pattern of this construct largely recapitulates the early expression pattern of the gt posterior expression domain. In the absence of Kr function, the anterior border of the gt posterior domain shifts anteriorly, indicating repression by Kr. The construct containing the gt posterior enhancer exhibits a similar shift in the absence of Kr (Berman, 2002).

Posterior hairy stripe boundaries are established by gap protein repressors unique to each stripe. Krüppel sets the anterior expression limits of both stripes 5 and 6 and is the only gap gene to do so, indicating that stripes 5 and 6 may be coordinately positioned by the KR repressor. Binding sites for the KR protein were located in both stripe enhancers. The stripe 6 enhancer contains higher affinity KR-binding sites than the stripe 5 enhancer, which may allow the two stripes to be repressed at different KR protein concentration thresholds. The Knirps activator also binds to the stripe 6 enhancer, and KR appears to repress stripe 6 through competition with KNI (Langeland, 1994).

Gap genes Kruppel (Kr), knirps (kni), and tailless (tll) control the expression of the pair-rule gene hairy (h) by activating or repressing independent cis-acting units that generate individual stripes. KR activates stripe 5 and represses stripe 6, KNI activates stripe 6 and represses stripe 7, and TLL activates stripe 7. KR and KNI proteins bind strongly to h control units that generate stripes in areas of low concentration of the respective gap gene products and weakly to those that generate stripes in areas of high gap gene expression. These results indicate that KR and KNI proteins form overlapping concentration gradients that generate the periodic pair-rule expression pattern (Pankratz, 1990).

The expression of the pair-rule gene hairy in seven evenly spaced stripes along the longitudinal axis of the Drosophila blastoderm embryo is mediated by a modular array of separate stripe enhancer elements. The minimal enhancer element, which generates reporter gene expression in place of the most posterior h stripe 7 (h7-element), contains a dense array of binding sites for factors providing the trans-acting control of h stripe 7 expression as revealed by genetic analyses. The stripe seven enhancer is found in a minimal 932 bp region from a 1.5 kb DNA fragment of the h upstream region. The h7-element mediates position-dependent gene expression by sensing region-specific combinations and concentrations of both the maternal homeodomain transcriptional activators, Caudal and Bicoid, and of transcriptional repressors encoded by locally expressed zygotic gap genes. Zygotic caudal expression is not required for activation. Caudal and Bicoid, which form complementing concentration gradients along the longitudinal axis of the embryo, function as redundant activators, indicating that the anterior determinant Bicoid is able to activate gene expression in the most posterior region of the embryo. The spatial limits of the h stripe-7 domain are brought about by the local activities of repressors that prevent activation. The spatial limit of h7 is significantly altered in the gap mutants tailless, knirps and kruppel, but not in embryos lacking either hunchback, giant or huckebein. There are seven binding sites for Bcd, twenty-three for caudal, five for Kruppel, fourteen for Knirps, eight for Hunchback and five for Tailless. In the absence of both cad and bcd, activation still occurs. Thus, a third activator, likely to be Kr, must function in such embryos. It is thought that Kr acts as both a repressor and an activator within the h7 element depending on its concentration. 
The posterior border is set in response to Tll activity under the control of the terminal maternal organizer system. The anterior border of the expression domain is due to repression in response to Kni. The results suggest that the gradients of Bicoid and Caudal combine their activities to activate segmentation genes along the entire axis of the embryo (La Rosee, 1997).

Drosophila pair-rule gene expression, in an array of seven evenly spaced stripes along the anterior-posterior axis of the blastoderm embryo, is controlled by distinct cis-acting stripe elements. In the anterior region, such elements mediate transcriptional activation in response to (1) the maternal concentration gradient of the anterior determinant Bicoid and (2) repression by spatially distinct activities of zygotic gap genes. In the posterior region, activation of hairy stripe 6 has been shown to depend on the activity of the gap gene knirps, suggesting that posterior stripe expression is exclusively controlled by zygotic regulators. The zygotic activation of hairy stripe 6 expression is preceded by activation in response to maternal caudal activity. Thus, transcriptional activation of posterior stripe expression is likely to be controlled by maternal and zygotic factors as has been observed for anterior stripes. To establish the potential of Cad and Kni to interact with the cis-acting DNA that mediates hairy stripe 6-like expression in the embryo, in vitro footprinting experiments were performed with the 532 bp hairy stripe 6-element DNA. Cad and Kni bind to thirty six in vitro binding sites, some of which overlap, throughout the element. The sequence of the Cad and Kni binding sites matches the consensus described for each of the two proteins. Most of the potential Cad and Kni binding sites are close to or overlapped by binding sites for Kruppel (eight sites), Hunchback (eight sites), and Tailless (five sites). Tests using fragments of the 532 bp enhancer and of another element, 284-HT, show that sequences mediating activation of reporter expression are not maintained within a minimal activation element but instead are dispersed throughout the enhancer (Hader, 1998).

Krüppel represses giant in the central domain thus assuring separated anterior and posterior expression (Kraut, 1991).

knirps expression is repressed by tailless activity, whereas it is directly enhanced by Kr activity. Thus, Kr activity is present throughout the domain of kni expression and forms a long-range protein gradient, which in combination with kni activity is required for abdominal segmentation of the embryo. A construct containing 4.4 kb of kni upstream sequence located at -0.9 kb from the start of transcription gives the correct spatial pattern of expression in the anterior and posterior domains. Kr responsive elements in kni reside in this 4.4-kb fragment, and more precisely in an 800-bp fragment located at the 3' end of the 4.4 kb. Two adjacent Kr protein binding regions are present approximately in the middle of this 0.8-kb fragment. More sites become protected in footprint experiments when higher concentrations of the Kr protein are used (Pankratz, 1989).

Krüppel binds to the sequence AAGGGGTTAA. KR binding sites are present upstream of the two hunchback promoters. These could mediate the repression of hb by KR and perhaps allow hb to influence its own expression. A 10-kb genomic DNA fragment contains the hb coding sequence and both promoters. The proximal promoter directs early zygotic expression of hb in the anterior part of the embryo. The distal hb promoter is transcribed maternally and also directs later zygotic expression. This latter fragment contains the KR binding sites. 300 bp upstream of the transcription start of the 2.9 kb transcript are sufficient for normal regulation of the expression of this transcript. The two KR binding sites are located at -676 and -359 bp from the proximal hb promoter (Treisman, 1989).

Analysis of the initial paired expression suggests that the gap genes hunchback, Krüppel, knirps and giant activate paired expression in stripes. Specifically, in Krüppel mutants stripes 2 and 3 are replaced by a broader stripe posterior to the wild-type stripe 2. Stripes 5 and 6 are replaced by a stripe that is located posterior to wild-type stripe 5 (Gutjahr, 1993).

A 480 bp region of the even-skipped promoter is both necessary and sufficient to direct a stripe of LacZ expression within the limits of the endogenous eve stripe 2. The maternal morphogen Bicoid and the gap proteins Hunchback, Krüppel and Giant all bind with high affinity to closely linked sites within this small promoter element. Forming the posterior border of the stripe involves a delicate balance between limiting amounts of the BCD activator and the KR repressor (Small, 1992).

The entire functional even-skipped locus of Drosophila is contained within a 16 kilobase region. As a transgene, this region is capable of rescuing even-skipped mutant flies to fertile adulthood. Detailed analysis of the 7.7 kb of regulatory DNA 3' of the transcription unit reveals ten novel, independently regulated patterns. Most of these patterns are driven by non-overlapping regulatory elements, including ones for syncytial blastoderm stage stripes 1 and 5, while a single element specifies both stripes 4 and 6. Expression analysis in gap gene mutants shows that stripe 5 is restricted anteriorly by Krüppel and posteriorly by giant, the same repressors that regulate stripe 2. Consistent with the coregulation of stripes 4 and 6 by a single cis-element, both the anterior border of stripe 4 and the posterior border of stripe 6 are set by zygotic hunchback, and the region between the two stripes is ‘carved out’ by knirps. Thus the boundaries of stripes 4 and 6 are set through negative regulation by the same gap gene domains that regulate stripes 3 and 7, but at different concentrations (Fujioka, 1999).

The gap proteins Krüppel and Hunchback function as transcriptional regulators in cultured cells. Both proteins bind to specific sites in a 100-bp DNA fragment located upstream of the segment polarity gene engrailed, which also contains functional binding sites for a number of homeo box proteins. The Hunchback protein is a strikingly concentration-dependent activator of transcription, capable of functioning both by itself and also synergistically with the pair-rule proteins Fushi tarazu and Paired. In contrast, Krüppel is a transcriptional repressor that can block transcription induced either by Hunchback or by several different homeo box proteins (Zuo, 1991).

To understand the nature of the regulatory signals impinging on the second promoter of the Antennapedia gene (Antp P2), analysis of its expression in mutants and in inhibitory drug injected embryos has been carried out. Products of the zygotically-active segmentation genes ftz, hb, Kr, gt and kni then act as activators or repressors of Antp P2 in a combinatorial fashion. The timing of these events, and their positive versus negative nature, is critical for generating the expression patterns normal for Antp (Riley, 1991).

Expression of the abdominal-A and Abdominal-B genes of the BX-C of Drosophila is controlled by a cis-regulatory promoter and by distal enhancers called infraabdominal regions. The activation of these regions along the anteroposterior axis of the embryo determines where abdominal-A and Abdominal-B are transcribed. There is spatially restricted transcription of the infraabdominal regions (infraabdominal transcripts) that may reflect this specific activation. These iab regions are named after the abdominal segments they control, iab-2 through iab-7 regulating abdominal segments 2 through 7 corresponding to parasegments 7 through 12 respectively. The Abdominal-B (Abd-B) gene of the bithorax complex (BX-C) of Drosophila controls the identities of the fifth through seventh abdominal segments and segments in the genitalia (more precisely, parasegments 10-14). The gap genes hunchback, Krüppel, tailless and knirps control abdominal-A and Abdominal-B expression early in development. The gradients of the Hunchback and Krüppel products seem to be key elements in this restricted activation (Casares, 1995).

Krüppel and Knirps act through the infraabdominal 5 fragment of Abd-B to limit anterior Abd-B expression and regulate the graded Abd-B domain respectively. Both hunchback and Polycomb are required for Abd-B silencing (Busturia, 1993).

The gap genes hunchback, Krüppel, tailless and knirps control abdominal-A and Abdominal-B expression early in development. The restriction of abdominal-A and Abdominal-B transcription is preceded by (and requires) the spatially localized activation of regulatory regions, which can be detected by the distribution of infraabdominal transcripts. The activation of these regions requires no specific gap gene. Instead, a general mechanism of activation, combined with repression by gap genes in the anteroposterior axis, seems to be responsible for delimiting infraabdominal active domains. The gradients of the hunchback and Krüppel products seem to be key elements in this restricted activation (Casares, 1995).

The anterior Abdominal-B expression limit is apparently determined by Krüppel repression, whereas the Knirps repressor may be responsible for the graded Abd-B expression within the Abd-B domain. iab-5 and two other fragments called MCP and FAB show region-specific silencing activity: they suppress at a distance beta-gal expression mediated by a linked heterologous enhancer. Silencing requires hunchback as well as Polycomb function and evidently provides maintenance of Abd-B expression limits throughout embryogenesis (Busturia, 1993).

Spatial boundaries of homeotic gene expression are initiated and maintained by two sets of transcriptional repressors: the gap gene products and the Polycomb group proteins. DNA elements and trans-acting repressors that control spatial expression of the Abdominal-A (ABD-A) homeotic protein have been investigated. Analysis of a 1.7-kb enhancer element [iab-2(1.7)] from the iab-2 regulatory region shows that both Hb and Kruppel (Kr) are required to set the Abd-A anterior boundary in parasegment 7. DNase I footprinting and site-directed mutagenesis show that Hb and Kr are direct regulators of this iab-2 enhancer. The single Kr site can be moved to a new location 100 bp away and still maintain repressive activity, whereas relocation by 300 bp abolishes activity. These results suggest that Kr repression occurs through a local quenching mechanism. The gap repressor Giant (Gt) initially establishes a posterior expression limit at PS9, which shifts posteriorly after the blastoderm stage. This iab-2 enhancer contains multiple binding sites for the Polycomb group protein Pleiohomeotic (Pho). These iab-2 Pho sites are required in vivo for chromosome pairing-dependent repression of a mini-white reporter. However, the Pho sites are not sufficient to maintain repression of a homeotic reporter gene anterior to PS7. Full maintenance at late embryonic stages requires additional sequences adjacent to the iab-2(1.7) enhancer (Shimell, 2000).

The gap gene product Kr is required to set the iab-2(1.7) anterior expression border. However, since Kr is not expressed anterior to PS5, some other factor must also be required to repress the iab-2(1.7) enhancer in anterior regions. A likely candidate is the Hb protein, which has been shown to be important for repressing the bx and pbx enhancers anterior to PS6. To examine whether Hb plays a role in setting the iab-2(1.7) anterior expression boundary, this construct was crossed into both hb and osk mutant backgrounds. Loss of zygotic hb caused a slight broadening of the initial expression band, indicating an anterior shift in the expression pattern of this enhancer. The presence of maternal Hb likely minimizes the anterior shift in these zygotic hb mutant embryos. Consistent with this view, it has been found that, in an osk mutant background, in which the maternal level of Hb is uniform throughout the embryo, expression from the iab-2(1.7) enhancer is completely abolished. These findings suggest that, as with the bx and pbx enhancers, Hb is important for setting the initial anterior limit of iab-2 enhancer function (Shimell, 2000).

Hb, Kr, and Gt have been classified as short-range repressors whose range of action is limited to approximately 50 to 150 bps. Two major mechanisms of short-range repression are: competitive binding to an overlapping activator binding site, and quenching, which entails interference with function of locally bound activators. Since studies on Hb, Kr, and Gt action have focused primarily on their control of pair-rule genes such as eve, it was of interest to address mechanisms used by these repressors in the alternative context of a homeotic gene regulatory region. The in vitro binding analysis identified five discrete Hb sites on the iab-2(1.7) fragment. One of these sites, Hb2, overlaps extensively with one of the Eve binding sites. Since Eve acts as an activator of iab-2(1.7) expression, Hb may repress by competing with Eve for direct binding to this site. Evidence for a direct competition mechanism has been described for Hb repression through the bx and bxd/pbx control regions of the Ubx homeotic gene. In these cases, the anterior boundary is in PS6 rather than PS7, and Hb competes with Ftz rather than Eve. However, mutational analysis shows that Hb sites other than Hb2 also contribute to iab-2 repression. These additional sites could promote Hb competition with Eve by assisting Hb binding at Hb2 through cooperative interactions. Similarly, the single Gt binding site in iab-2(1.7) overlaps another Eve binding site, suggesting that Gt may also repress by direct competition with Eve in posterior parasegments. In contrast, the single Kr binding site (Kr1) does not overlap Eve sites. A distinct Kr mechanism is also supported by the ability of Kr1 to repress even when relocated 100 bp away from its normal position in the iab-2(1.7) fragment. This flexibility, together with failure of Kr repression when Kr1 is further relocated by 300 bp, is consistent with a short-range quenching mechanism. 
These results argue against Kr repression by direct interference with basal transcription factors, since 300 bp is small compared to the 20-kb distance between the iab-2 enhancer and the abd-A promoter. Previous studies using a synthetic regulatory region have shown that Kr can repress by a quenching mechanism in vivo (Shimell, 2000).

Any proposed mechanism for Kr action through iab-2, however, must account for the variability of Kr repression within its own expression domain. Specifically, Kr represses the iab-2 enhancer in PS3 and PS5 where Kr concentrations are low, but it does not repress in PS7 where Kr concentrations are high. This observation suggests that simple occupancy of the Kr1 site is not sufficient for iab-2 repression and that another factor acts in concert with Kr. The likely partner is Hb since Kr repression of iab-2 is limited to parasegments that accumulate significant levels of both Kr and Hb. In this view, repression just anterior to PS7 requires both Kr and Hb, whereas repression in more anterior parasegments, where Hb levels are highest, is mediated by Hb alone. Kr-Hb synergy could involve direct contact since the two proteins have been shown to interact when bound to DNA. Whether Kr synergizes with Hb by augmenting Hb binding to DNA in a cooperative manner or by recruiting additional corepressors is not clear. Kr, but not Hb, functions together with the corepressor dCtBP (Shimell, 2000 and references therein).

After Hb and Kr decay during early gastrulation, the repressed state is propagated through later stages of development by the PcG proteins. How the transition from early gap repressors to long-term PcG repressors occurs at the molecular level is not known. Two basic models have been proposed: (1) direct recruitment, and (2) chromatin recognition. Model (1): The gap gene products, especially Hb, have been proposed to help recruit PcG proteins directly to specific DNA sites. Based upon its early time of action, a role for the PcG protein Extra sex combs (Esc) as a molecular bridge between the two sets of repressors has been suggested. However, direct interactions between Esc and gap repressors have not been reported. A better candidate for such a molecular link is dMi-2, which binds directly to Hb and behaves genetically as an enhancer of PcG repression. In its simplest form the direct recruitment model is unlikely because the iab-2, bx, and pbx enhancers all contain Hb sites but do not effectively recruit PcG proteins. These elements fail to maintain A-P boundaries of expression and are unable to attract PcG proteins to sites on chromosomes. Furthermore, the continuous requirement for PRE sequences during development shows that DNA site recognition by PcG proteins can occur long after Hb and Kr have decayed. Model (2): The second model proposes that PcG proteins recognize some feature of silenced chromatin, rather than particular gap repressors. This model is supported by patterns of PcG-dependent silencing that reflect patterns of early gene activity rather than the distributions of gap proteins. In this view, PcG proteins sense the transcriptional off state and then assemble locally to imprint this state through later stages. These two models are not mutually exclusive. Both the Hb-interacting protein dMi-2 and the Kr-interacting protein dCtBP have mammalian homologs that interact with histone deacetylases.
Perhaps the gap repressors work by targeting these deacetylases, whose action alters the local acetylation state of the histone tails. This could provide a feature of silenced chromatin that is recognized by PcG proteins and that promotes their association at nearby PREs (Shimell, 2000 and references therein).

Kr activity is required for the establishment of Antennapedia expression in the third thoracic segment and is involved in restricting Abd-B products within abdominal segments eight and nine (Harding, 1987).

Krüppel, caudal and cut are expressed in the Malpighian tubules before and during differentiation. Two of the genes, Krüppel and cut, are known to be required for development of the tubules. The absence of maternal and zygotic caudal function reduces their normal growth and elongation. Normal Krüppel function, which is known to be required for caudal expression, is also required for cut expression, while cut and caudal are expressed independently of each other. Cell type transformations of Malpighian tubules were studied by examining the effects of mutations on the expression of markers specific to Malpighian tubules, hindgut, or midgut of normal embryos. Loss of Krüppel activity confers hindgut characteristics on those cells that normally form the Malpighian tubules with all markers tested. Loss of cut function alters the expression of some markers but not others. The pathway of tissue-specific gene regulation apparently branches beyond Krüppel to form at least a cut and a caudal branch (Liu, 1992).

A target gene of Kr, termed knockout (ko), was isolated by virtue of Kruppel in vitro binding sites in the ko promoter. Loss and gain of function experiments show that Kr activity maintains ko expression in a subset of muscles. ko encodes a novel protein expressed in several embryonic tissues, including Kr-expressing muscles. knockout is initially expressed in stage 10 embryos in the pharynx and the esophageal region, in groups of cells in the developing CNS, in the distal part of the Malpighian tubule and in the dorsal vessel. It is expressed in specific muscle precursor cells, both transiently and persistently depending on the location, until mature muscle fibers have been formed. Cis-acting elements controlling muscle expression of ko are spread over 15 kb of DNA. While Kr is required for a distinct subset of muscles, it is also necessary in others where its role is not decisive. That is, ko expression is variably affected in certain muscle precursors of Kr mutants. ko is not ectopically expressed in Kr gain of function experiments. Movements of embryos deficient for ko activity are uncoordinated. Their muscle pattern is normal, but the patterns of neuromuscular innervation are specifically disarranged. The results suggest that the Kr target gene ko is required for proper innervation of specific muscles by RP motoneurons (Hartmann, 1997a).

After blastoderm formation, Kruppel is expressed in various spatially and temporally restricted patterns in the developing embryo, including a subset of muscle precursors. By virtue of Kruppel in vitro binding sites, a putative Kr target gene, termed KrT95D, has been identified. It encodes a novel protein that contains evolutionarily conserved regions. KrT95D is expressed in spatially restricted patterns throughout embryogenesis. Kr and KrT95D expression overlap in several locations, including muscle precursor cells, the tip cell of the Malpighian tubules and the ventral midline cells of the central nervous system. Results from the analysis of the KrT95D expression pattern in Kr loss-of-function and Kr gain-of-function embryos suggest that Kr activity is not essential for KrT95D expression in most locations of the embryo, except in the muscle precursors VO5 (Hartmann, 1997b).

By examining expression of arc in different mutant embryos, it was determined that transcription factors known to be required for patterning and maintenance of various developing epithelia control arc expression in those domains. tll and hkb, which are required to pattern the posterior 15% of the embryo, control arc expression in the posterior midgut primordium. fkh, which appears to act as a maintenance, or permissive, transcription factor, is required for expression of arc throughout the gut. byn, which is required for hindgut development and specifies its central domain (the large intestine), controls expression of arc in the elongating hindgut. Kr and cut, required for evagination and extension of the Malpighian tubule buds, control expression of arc in the tubule primordia (Liu, 2000).

Genome-wide mapping of in vivo targets of the Drosophila transcription factor Krüppel

Krüppel (Kr), a member of the gap class of Drosophila segmentation genes, encodes a DNA binding zinc finger-type transcription factor. In addition to its segmentation function at the blastoderm stage, Krüppel also plays a critical role in organ formation during later stages of embryogenesis. To systematically identify in vivo target genes of Krüppel, DNA fragments were isolated from the Krüppel-associated portion of chromatin and used to find and map Krüppel-dependent cis-acting regulatory sites in the Drosophila genome. Krüppel binding sites are not enriched in Krüppel-associated chromatin, and the clustering of Krüppel binding sites, as found in the cis-acting elements of Krüppel-dependent segmentation genes used for in silico searches of Krüppel target genes, is not a prerequisite for the in vivo binding of Krüppel to its regulatory elements. Results obtained with the newly identified target gene(s) ken and barbie, together referred to as ken, indicate that Krüppel represses transcription and thereby restricts the spatial expression pattern of ken during blastoderm and gastrulation (Matyash, 2004).

To establish whether the newly identified candidate genes are indeed regulated in a Krüppel-dependent fashion, focus was placed on ken. The reason for this choice was that ken, which encodes a DNA binding zinc finger-type transcription factor, appears at first glance unlikely to be a Kr target gene. This is because (1) Kr activity is not required for male genitalia formation and adult eye development, the two processes in which ken is involved, and (2) ken is expressed early in two stripes that do not overlap with the Kr expression domain during blastoderm stage and gastrulation. Nevertheless, it was found that the isolated 749-bp DNA fragment is highly enriched in the DNA of Krüppel-associated chromatin and that it contains five Krüppel binding sites confirmed by gel mobility shift assays (Matyash, 2004).

To solve this apparent dilemma and to thereby demonstrate that the screen has indeed led to Krüppel target genes, it was asked whether Krüppel does regulate ken expression in vivo by performing in situ hybridizations of ken probes to whole mount preparations of wild type and homozygous Kr1 lack-of-function mutant embryos. In wild type, Krüppel is initially expressed in a broad band in the central region of the blastoderm. In contrast, ken is expressed in two distinct stripes that are anteriorly adjacent and posterior to the Kr central domain. In Kr mutant embryos, the two stripes of ken expression are not altered, but an additional expression domain was observed where Kr is normally expressed at syncytial blastoderm stage. This expression domain appears earlier than the normal stripes of ken expression, and it subsequently fades in a posterior to anterior direction, resulting in a third narrow stripe that remains separated from the anterior ken stripe. These observations establish that in the absence of Kr activity, ken is activated in the central region of the embryo and that this aspect of ken activity is normally repressed in a Krüppel-dependent manner (Matyash, 2004).

Previous results have shown that the expression of the anterior stripe of ken is activated in response to the transcription factors encoded by bicoid and hunchback, whereas the posterior stripe is activated by the transcription factor of tailless, and its shape and size are due to repression by Huckebein. To establish whether ectopic expression of Krüppel also causes the repression of ken, a heat shock-driven Kr transgene was used to misexpress Kr uniformly in the blastoderm embryo. The posterior stripe of ken expression is not affected by ectopic Kr activity, whereas the anterior ken stripe is lacking. Collectively, the results demonstrate that Krüppel participates in early ken regulation by acting as a local repressor of the gene in wild type embryos (Matyash, 2004).

This study was directed at the identification of Krüppel-dependent genes involved in neurogenesis, muscle, and Bolwig organ development. Genes identified that are involved in neurogenesis include tup, cut and short stop. In fact, 55 of the 82 isolated genes are known to participate in these developmental processes. Thus, it is expected that Krüppel regulates possibly several hundreds of genes during the entire life cycle of the fly (Matyash, 2004).

Two of the Kr target genes (emc and osa) have been identified in a genetic modifier screen for gene products that mediate Kr activity. In addition, a DNA fragment corresponded to the intron of the gene CG7097, a putative regulatory target of segmentation genes expressed during blastoderm formation. Microarray-based expression data and whole mount in situ hybridization of early embryos show that this gene as well as an additional 29 of the 43 candidate genes are expressed during the first 14 h of embryonic development. These observations and the results of the genetic studies with ken indicate that the DNA isolated from Krüppel-associated chromatin revealed in vivo target sites of the transcription factor (Matyash, 2004).

Previous analysis has shown that during segmentation Krüppel controls the activity of other transcription factors that are part of a cell fate-determining gene network. The results suggest that this earlier finding is not restricted to Kr segmentation function since the majority of the Krüppel target genes identified in this study (18% of the total isolates) encode transcription factors as well. The more important notion is, however, that Krüppel not only participates in the regulation of transcription factor networks at the different levels of the segmentation gene cascade but also assists signaling events by regulating various pathway components, as exemplified by target genes coding for components of the JAK/STAT-signaling pathway. Krüppel target DNA includes portions of the genes ken, STAT92E, and stc, which code for JAK/STAT-mediating transcription factors as well as factors known to participate in signaling by the epidermal growth factor receptor (Asteroid) and Rho GTPases (Gef64C). Moreover, the isolation of genes encoding lipid metabolism-related enzymes and the lipid carrier Neural Lazarillo (NLaz) suggests that Krüppel not only takes part in embryonic fat body development but also participates in metabolic functions (fat storage or fat consumption) of the organ (Matyash, 2004).

The majority of the newly isolated Krüppel target sites lack Krüppel binding site clusters as revealed in cis-acting elements of the Krüppel-dependent segmentation genes. However, the isolated and subsequently tested set of DNA fragments is enriched in Krüppel-associated chromatin, as has been found with the eve stripe 2 element, which contains clustered Krüppel target sites. This finding suggests that the clustering of binding sites is not the sole biologically relevant marker for Krüppel-dependent cis-acting control elements. Furthermore, the algorithm applied to detect Krüppel binding sites only counted matches of sequences to a weighted matrix that were arbitrarily set above a certain threshold. In consequence, functional low affinity binding sites or Krüppel-dependent DNA segments that contain only few and unclustered high affinity binding sites were left undetected (Matyash, 2004).
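The threshold effect described above can be illustrated with a minimal sketch of a weight-matrix scan. Everything specific here is an assumption made for the example: the four-position count matrix is invented and is not the Krüppel matrix, the uniform background and pseudocount are arbitrary choices, and the function names are hypothetical. The sketch only shows how a fixed score cutoff reports a strong match while silently dropping a weaker one.

```python
import math

BASES = "ACGT"

# Hypothetical count matrix (rows: motif positions; columns: A, C, G, T).
# Illustrative values only; this is NOT the Krüppel binding matrix.
COUNTS = [
    [8, 1, 1, 0],  # position 1 favors A
    [0, 9, 0, 1],  # position 2 favors C
    [1, 0, 8, 1],  # position 3 favors G
    [0, 1, 1, 8],  # position 4 favors T
]

def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Convert counts to log2-odds scores against a uniform background."""
    pwm = []
    for row in counts:
        total = sum(row) + 4 * pseudocount
        pwm.append([math.log2(((c + pseudocount) / total) / background) for c in row])
    return pwm

def scan(sequence, pwm, threshold):
    """Return (start, score) for every window scoring at or above threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pwm[i][BASES.index(base)] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, round(score, 2)))
    return hits

pwm = log_odds_matrix(COUNTS)
seq = "TTACGTAAACTTCC"  # contains a perfect match (ACGT) and a weaker one (ACTT)

strict = scan(seq, pwm, threshold=4.0)   # keeps only the perfect match
relaxed = scan(seq, pwm, threshold=1.0)  # also keeps weaker candidate sites
```

With the strict cutoff only the perfect match is reported. The one-mismatch site is recovered only when the cutoff is lowered, at the cost of admitting more candidates overall.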

Interestingly, more than half of the Krüppel target DNA fragments (68%) were located in introns and exon/intron overlap sequences or in exons and not at the canonical 5' termini of protein-coding genes. The location of these fragments downstream of the transcription start sites suggests that they may represent distal regulatory elements (e.g., enhancers or silencers) or promoters for non-coding RNAs, as implied by a recent study on transcription factor binding along human chromosomes 21 and 22. Because noncoding transcripts within the Drosophila genome are not systematically annotated, it cannot be decided whether Krüppel participates in the transcription of such transcripts (Matyash, 2004).

A surprising result of this study was that ken, which is not expressed in the Krüppel domain of wild type blastoderm embryos, is in fact a target of Krüppel. In the absence of Kr activity, ken is activated in the central region of the blastoderm. Thus, in addition to the regulation of ken expression in the anterior and posterior stripe domains, which involves the activities of bicoid in cooperation with the gap genes hunchback, tailless, and huckebein, Krüppel is needed to prevent ectopic ken activation in the blastoderm embryo. This finding and the notion that ubiquitous Krüppel expression abolishes ken activity in the anterior but not in the posterior stripe domain suggest that the two stripes of ken expression are under the control of separate cis-acting elements, of which only one mediates repression by Krüppel (Matyash, 2004).

Dynamical analysis of regulatory interactions in the gap gene system of Drosophila

Genetic studies have revealed that segment determination in Drosophila melanogaster is based on hierarchical regulatory interactions among maternal coordinate and zygotic segmentation genes. The gap gene system constitutes the most upstream zygotic layer of this regulatory hierarchy, responsible for the initial interpretation of positional information encoded by maternal gradients. A detailed analysis of regulatory interactions involved in gap gene regulation is presented based on gap gene circuits, which are mathematical gene network models used to infer regulatory interactions from quantitative gene expression data. The models reproduce gap gene expression at high accuracy and temporal resolution. Regulatory interactions found in gap gene circuits provide consistent and sufficient mechanisms for gap gene expression, which largely agree with mechanisms previously inferred from qualitative studies of mutant gene expression patterns. These models predict activation of Kr by Cad and clarify several other regulatory interactions. This analysis suggests a central role for repressive feedback loops between complementary gap genes. Repressive interactions among overlapping gap genes show anteroposterior asymmetry with posterior dominance. Finally, these models suggest a correlation between timing of gap domain boundary formation and regulatory contributions from the terminal maternal system (Jaeger, 2004b).

Although activating contributions from Bcd and Cad show some degree of localization, positioning of gap gene boundaries during cycle 14A is largely under the control of repressive gap-gap cross-regulatory interactions. Thereby, activation is a prerequisite for repressive boundary control, which counteracts broad activation of gap genes in a spatially specific manner. In addition, gap genes show a tendency toward autoactivation, which increasingly potentiates activation by Bcd and Cad during cycle 14A. Autoactivation is involved in maintenance of gap gene expression within given domains and sharpening of gap domain boundaries during cycle 14A (Jaeger, 2004b).

Regulatory loops of mutual repression create positive regulatory feedback between complementary gap genes, providing a straightforward mechanism for their mutually exclusive expression patterns. Such a mechanism of 'alternating cushions' of gap domains has been proposed previously. The results suggest that this mechanism is complemented by repression among overlapping gap genes. Overlap in expression patterns of two repressors imposes a limit on the strength of repressive interactions between them. Accordingly, repression between neighboring gap genes is generally weaker than that between complementary ones. Moreover, repression among overlapping gap genes is asymmetric, centered on the Kr domain. Posterior to this domain, only posterior neighbors contribute functional repressive inputs to gap gene expression, while anterior neighbors do not. This asymmetry is responsible for anterior shifts of posterior gap gene domains during cycle 14A (Jaeger, 2004b).

Repression by Tll mediates regulatory input to gap gene expression by the terminal maternal system. Tll provides the main repressive input to early regulation of the posterior boundary of posterior gt, and activation by Tll is required for posterior hb expression. Note that these two features form only during cycle 13 and early cycle 14A, while other gap domain boundaries are already present at the transcript level during cycles 10-12 and largely depend on the anterior and posterior maternal systems for their initial establishment. The delayed formation of posterior patterning features and their distinct mode of regulation are reminiscent of segment determination in primitive dipterans and intermediate germ-band insects, supporting a conserved dynamical mechanism across different insect taxa (Jaeger, 2004b).

The set of regulatory interactions presented here provides a consistent and sufficient dynamical mechanism for gap gene expression. In summary, this set of interactions consists of the following five basic regulatory mechanisms: (1) broad activation by Bcd and/or Cad, (2) autoactivation, (3) strong repressive feedback between mutually exclusive gap genes, (4) asymmetric repression between overlapping gap genes, and (5) feed-forward repression of posterior domain boundaries by the terminal gap gene tll. In the following subsections, evidence is discussed concerning specific regulatory interactions involved in each of these basic mechanisms in some detail (Jaeger, 2004b).
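A gene circuit of the kind referred to above can be sketched as a rate equation with three terms: regulated production, diffusion between neighboring nuclei, and decay. The text here does not give the equations, so the form below follows the commonly used gene circuit formalism and should be read as an assumption. All parameter values and the function name `gene_circuit_rhs` are hypothetical.

```python
import math

def sigmoid(u):
    """Smooth regulation-expression function, ranging from 0 to 1."""
    return 0.5 * (u / math.sqrt(u * u + 1.0) + 1.0)

def gene_circuit_rhs(conc, T, h, R, D, lam):
    """
    Time derivatives of the modeled genes at every nucleus.

    conc: dict factor -> list of concentrations along the A-P axis
          (includes external inputs such as Bcd that are not modeled genes)
    T:    dict (target, regulator) -> signed weight (+ activates, - represses)
    h, R, D, lam: dicts gene -> bias, max production rate, diffusion, decay
    """
    n = len(next(iter(conc.values())))
    out = {}
    for a in h:  # only genes with parameters are modeled
        rates = []
        for i in range(n):
            # Total regulatory input: bias plus weighted regulator levels.
            u = h[a] + sum(T.get((a, b), 0.0) * conc[b][i] for b in conc)
            # No-flux boundaries: a missing neighbor mirrors the nucleus itself.
            left = conc[a][i - 1] if i > 0 else conc[a][i]
            right = conc[a][i + 1] if i < n - 1 else conc[a][i]
            diffusion = D[a] * ((left - conc[a][i]) + (right - conc[a][i]))
            rates.append(R[a] * sigmoid(u) + diffusion - lam[a] * conc[a][i])
        out[a] = rates
    return out

# Hypothetical three-nucleus example: Kr is broadly activated by Bcd,
# weakly autoactivated, and strongly repressed by Gt (mechanisms 1 to 3).
conc = {"Bcd": [1.0, 0.5, 0.1], "gt": [2.0, 0.0, 0.0], "Kr": [0.0, 0.0, 0.0]}
T = {("Kr", "Bcd"): 2.0, ("Kr", "Kr"): 0.5, ("Kr", "gt"): -5.0}
rates = gene_circuit_rhs(conc, T, h={"Kr": -1.0}, R={"Kr": 1.0},
                         D={"Kr": 0.0}, lam={"Kr": 0.1})["Kr"]
```

In this toy setup Kr production is nearly shut off in the nucleus where Gt is present and peaks in the middle nucleus. There the activating input is still sufficient and no repressor is present.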

Activation by Bcd and Cad: Activation of gap gene expression by Bcd and Cad is supported by the following. Bcd binds to the regulatory regions of hb, Kr, and kni. The kni regulatory region also contains binding sites for Cad. The anterior domains of gt and hb are absent in embryos from bcd mothers. The posterior domain of gt is missing in embryos mutant for both maternal and zygotic cad, while the posterior domain of kni is absent in embryos mutant for maternal bcd plus maternal and zygotic cad. These results suggest partial redundancy of activation of kni by Bcd, consistent with evidence from zygotic cad embryos from bcd mothers, where maternally provided Cad is sufficient to activate kni (Jaeger, 2004b).

Kr expression expands anteriorly in embryos from bcd mothers, which is due to the absence of the anterior gt and hb domains. Bcd has been shown to activate expression of Kr reporter constructs. The fact that Kr is still expressed in embryos from bcd mutant mothers has been attributed to activation by general transcription factors or low levels of Hb. In contrast, the models predict that this activation is provided by Cad. Although Kr expression is normal in embryos overexpressing cad, repressive control of Kr boundaries could account for the lack of expansion of the Kr domain in such embryos (Jaeger, 2004b).

The activating effect of Cad on hb found in gap gene circuits is likely to be spurious. The anterior hb domain is absent in embryos from bcd mutant mothers, which show uniformly high levels of Cad. Moreover, the complete absence of the posterior hb domain in tll mutants suggests activation of posterior hb by Tll rather than by Cad. It is believed that this spurious activation of hb by Cad is due to the absence of hkb in gap gene circuits. The posterior hb domain fails to retract from the posterior pole in hkb mutants, suggesting a repressive role of Hkb in regulation of the posterior hb border. Consistent with this, the posterior boundary of the posterior hb domain never fully forms in any of the circuits. Moreover, Tll is constrained to a very small or no interaction with hb due to the absence of the posterior repressor Hkb, since activation of hb by Tll would lead to increasing hb expression extending to the posterior pole (Jaeger, 2004b).

Autoactivation: A role for autoactivation in the late phase of hb regulation is supported by the fact that the posterior border of anterior hb is shifted anteriorly in a concentration-dependent manner in embryos with decreasing doses of zygotic Hb. Weakened and narrowed expression of Kr in mutants encoding a functionally defective Kr protein suggests Kr autoactivation. Similarly, a delay in the expression of gt in mutants encoding a defective Gt protein indicates gt autoactivation. However, the results suggest that gt autoactivation is not essential. It is generally weaker than autoactivation of other gap genes, and circuits lacking gt autoactivation show no specific defects in gt expression. Finally, in the case of kni, there is no experimental evidence for autoactivation, while some authors have even suggested kni autorepression. No such autorepression has been detected in any gap gene circuit (Jaeger, 2004b).

Repression between complementary gap genes: Mutual repression of gt and Kr is supported by the following. gt expression expands into the region of the central Kr domain in Kr embryos. In contrast, Kr expression is not altered in gt mutants before germ-band extension. However, Gt binds to the Kr regulatory region, and the central domain of Kr is absent in embryos overexpressing gt. Moreover, Kr expression extends further anterior in hb gt double mutants than in hb mutants alone. The above is consistent with this analysis, which shows no significant derepression of Kr in the absence of Gt even though repression of Kr by Gt is quite strong (Jaeger, 2004b).

Hb binds to the kni regulatory region, and the posterior kni domain expands anteriorly in hb mutants. Embryos overexpressing hb show no kni expression at all, and embryos misexpressing hb show spatially specific repression of kni expression. There is no clear posterior expansion of kni in hb mutants. This could be due to the relatively weak and late repressive contribution of Hb on the posterior kni boundary or due to partial redundancy with repression by Gt and Tll. The posterior hb domain expands anteriorly in kni mutants, but anterior hb expression is not altered in these embryos. Nevertheless, a role of Kni in positioning the anterior hb domain is suggested by the fact that misexpression of kni leads to spatially specific repression of both anterior and posterior hb domains. Moreover, only slight posterior expansion of anterior hb is observed in Kr mutants, while hb is completely derepressed between its anterior and posterior domains in Kr kni double mutants (Jaeger, 2004b).

Repression between overlapping gap genes: gt, kni, and Kr show repression by their immediate posterior neighbors hb, gt, and kni, respectively. Retraction of posterior Gt from the posterior pole during midcycle 14A fails to occur in hb mutants, and no gt expression is observed in embryos overexpressing hb. The posterior kni boundary is shifted posteriorly in gt mutant embryos, and kni expression is reduced in embryos overexpressing gt. Note that these effects are very subtle and were not reported in similar studies by different authors. A weak but functional interaction of Gt with kni is consistent with these results. This interaction was found to be essential even in a circuit where it was deemed below significance level. Finally, Kni has been shown to bind to the Kr regulatory region, and the central Kr domain expands posteriorly in kni mutants (Jaeger, 2004b).

In contrast, no effect of Kr on hb was detected. However, hb expression expands posteriorly in Kr mutants. This effect is likely to involve repression of hb by Kni. Kni levels are reduced in Kr embryos. hb is completely derepressed between its anterior and posterior domains in Kr kni double mutants, whereas anterior hb does not expand at all in kni mutants alone. Taken together, these results suggest that there is direct repression of hb by Kr in the embryo, but it is at least partially redundant with repression of hb by Kni (Jaeger, 2004b).

Unlike repression by posterior neighbors, no or only weak repression was found of posterior kni, gt, and hb by their anterior neighbors Kr, kni, and gt, respectively. Most gap gene circuits show weak activation of hb by Gt. Graphical analysis failed to reveal any functional role for such activation. Moreover, no functional interaction was found between gt and Kni. Although relatively weak repression of kni by Kr was found in 6 out of 10 circuits, no specific patterning defects could be detected in the other 4. Consistent with the above, expression of posterior hb is normal in gt mutants, and both the anterior boundaries of posterior gt and kni are positioned correctly in kni and Kr mutant embryos, respectively (Jaeger, 2004b).

Note that activation of kni by Kr, which has been proposed to explain decreased expression levels of kni in Kr mutants, was never found. The results strongly support the view that this interaction is indirect through Gt, which is further corroborated by the fact that kni expression is completely restored in Kr gt double mutants compared to that in Kr mutants alone (Jaeger, 2004b).

A significant repressive effect of Hb on Kr was found. Consistent with this, Hb has been shown to bind to the Kr regulatory region, and the central Kr domain expands anteriorly in hb mutants. However, partial redundancy of this interaction is suggested by correct positioning and shape of the anterior Kr domain in a circuit that does not show repression of Kr by Hb (Jaeger, 2004b).

It has been proposed that Hb plays a dual role as both activator and repressor of Kr. In the framework of the gene circuit model, concentration-dependent switching of regulative action could be implemented by allowing genetic interconnection parameters to switch sign at certain regulator concentration thresholds. The current model explicitly does not include such a possibility. Nevertheless, circuits have been obtained that reproduce Kr expression faithfully, suggesting that a dual role of Hb is not required for proper Kr expression. Moreover, activation of Kr by Hb was never observed in any of the circuits. Therefore, the results support a mechanism in which the activation of Kr by Hb is indirect through derepression of kni (Jaeger, 2004b).

Repression by Tll: Only a few earlier theoretical approaches have considered terminal gap genes. Gap gene circuits accurately reproduce tll expression. However, in gene circuits, tll is subject to regulation by other gap genes, which is inconsistent with experimental evidence. In contrast, the correct expression pattern of tll in gap gene circuits allows its effect on other gap genes to be studied in great detail. Strong repressive effects of Tll on Kr, kni, and gt have been found. Tll binding sites have been found in the regulatory regions of Kr and kni. In tll mutants, Kr expression is normal, whereas expression of kni expands posteriorly, and the posterior gt domain fails to retract from the posterior pole. No expression of Kr, kni, or gt can be detected in embryos overexpressing tll under a heat-shock promoter (Jaeger, 2004b).

Reverse engineering the gap gene network of Drosophila

A fundamental problem in functional genomics is to determine the structure and dynamics of genetic networks based on expression data. A new strategy is described for solving this problem and it is applied to recently published data on early Drosophila development. The method is orders of magnitude faster than current fitting methods and allows fitting of different types of rules for expressing regulatory relationships. Specifically, this approach is used to fit models using a smooth nonlinear formalism for modeling gene regulation (gene circuits) as well as models using logical rules based on activation and repression thresholds for transcription factors. The technique also allows inference of regulatory relationships de novo or testing network structures suggested by the literature. A series of models is fitted to test several outstanding questions about gap gene regulation, including regulation of and by hunchback and the role of autoactivation. Based on the modeling results and validation against the experimental literature, a revised network structure is proposed for the gap gene system. Interestingly, some relationships in standard textbook models of gap gene regulation appear to be unnecessary for, or even inconsistent with, the details of gap gene expression during wild-type development (Perkins, 2006).
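The contrast between the two kinds of production rule can be sketched in a few lines. This is an illustration rather than the published implementation. The exact form of the logical rule (production is on when any activator reaches its threshold and no repressor reaches its threshold), the function names, and every numeric value are assumptions made for the example.

```python
import math

def smooth_production(levels, weights, bias, max_rate=1.0):
    """Gene-circuit style rate: a sigmoid of the weighted regulator sum."""
    u = bias + sum(w * levels[f] for f, w in weights.items())
    return max_rate * 0.5 * (u / math.sqrt(u * u + 1.0) + 1.0)

def logical_production(levels, activators, repressors, max_rate=1.0):
    """Threshold rule: full rate if activated and not repressed, else zero."""
    activated = any(levels[f] >= thr for f, thr in activators.items())
    repressed = any(levels[f] >= thr for f, thr in repressors.items())
    return max_rate if (activated and not repressed) else 0.0

# Hypothetical regulator levels in a single nucleus.
nucleus = {"Bcd": 0.6, "Hb": 0.1}

# Smooth rule: the rate varies continuously with regulator levels.
smooth = smooth_production(nucleus, weights={"Bcd": 3.0, "Hb": -4.0}, bias=-1.0)

# Logical rule: the rate is all-or-nothing.
logic_on = logical_production(nucleus, activators={"Bcd": 0.5},
                              repressors={"Hb": 0.3})
logic_off = logical_production({"Bcd": 0.6, "Hb": 0.4}, activators={"Bcd": 0.5},
                               repressors={"Hb": 0.3})
```

The smooth rule returns an intermediate rate that shifts gradually as the inputs change. The logical rule flips from full production to none once the hypothetical repressor crosses its threshold.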

The regulatory structure of the Combined model is itself sufficient to reproduce all six gap gene domains using either the gene circuit or logical formalisms for production rate functions. Support for the Combined model is cited first, and the results of the individual models are then discussed in light of several outstanding questions about gap gene regulation (Perkins, 2006).

The maternal proteins Bcd and Cad are largely responsible for activating the trunk gap genes, with Bcd being more important for the anterior domains and Cad more important for the posterior domains. Bcd is a primary activator of the anterior hb domain, the anterior gt domain, and the Kr domain. Cad activates posterior gt. The kni domain is present in bcd mutants and in cad mutants, but not in bcd;cad double mutants. This suggests redundant activation by the two maternal factors. Such redundant activation of kni is present in the Unc-GC model. For the other models, the optimization selected one or the other as activators, but not both. Tll is crucial for activating the posterior hb domain, while it represses Kr, kni, and gt, preventing their expression in the extreme posterior. All the regulatory relationships between the gap genes in the Combined model are repressive. The complementary gap gene pairs, hb-kni and Kr-gt, are known to be strongly mutually repressive, as was found in nearly all the models. [Repression of hb by Kni is not part of the Rivera-Pomar and Jäckle (RPJ) regulatory relationships (Rivera-Pomar, 1996), but the unconstrained gene circuit (Unc-GC) model and Unc-Logic model (that employs the regulatory structure discovered by the unconstrained gene circuit fit, except that Gt activation of hb and Kni activation of gt were removed) included the link.] The models also suggest that mutual repression between hb and Kr helps to set the boundary between those two domains. A chain of repressive relationships, hb-gt-kni-Kr, causes the shifts in the Kr, kni, and posterior gt domains. Autoactivation by hb is well-established, and there is also some evidence for autoactivation by Kr and gt (Perkins, 2006).
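The relationships listed in this paragraph can be written down as a small signed graph, which makes it easy to query the regulators of a gene. The sketch below transcribes only the statements made above and is not the published model or its parameters. Two choices are assumptions: both Bcd and Cad are listed as kni activators because the mutant data suggest redundancy (the text does not say which the Combined model retains), and the Kr and gt autoactivation links are left out because the text describes only "some evidence" for them. The helper names are hypothetical.

```python
# Signed edges (regulator, target, sign): +1 activation, -1 repression.
# Transcribed only from the statements in the paragraph above.
EDGES = [
    # Maternal activation
    ("Bcd", "hb", +1), ("Bcd", "gt", +1), ("Bcd", "Kr", +1),
    ("Cad", "gt", +1),
    ("Bcd", "kni", +1), ("Cad", "kni", +1),  # assumed: both redundant inputs kept
    # Terminal system
    ("Tll", "hb", +1),
    ("Tll", "Kr", -1), ("Tll", "kni", -1), ("Tll", "gt", -1),
    # Mutual repression between complementary pairs
    ("hb", "kni", -1), ("kni", "hb", -1),
    ("Kr", "gt", -1), ("gt", "Kr", -1),
    # Mutual repression between hb and Kr
    ("hb", "Kr", -1), ("Kr", "hb", -1),
    # Repressive chain hb-gt-kni-Kr
    ("hb", "gt", -1), ("gt", "kni", -1), ("kni", "Kr", -1),
    # Well-established autoactivation
    ("hb", "hb", +1),
]

TRUNK_GAP = {"hb", "Kr", "kni", "gt"}

def regulators_of(target, sign):
    """Set of factors acting on `target` with the given sign."""
    return {r for r, t, s in EDGES if t == target and s == sign}

def cross_regulation_is_repressive():
    """Check that every link between two different trunk gap genes is repressive."""
    return all(s == -1 for r, t, s in EDGES
               if r in TRUNK_GAP and t in TRUNK_GAP and r != t)

kr_repressors = regulators_of("Kr", -1)
kni_activators = regulators_of("kni", +1)
```

Querying the graph for Kr returns Hb, Gt, Kni and Tll as repressors. The consistency check confirms that no activating link between distinct trunk gap genes was entered, in line with the statement that all gap-gap relationships in the Combined model are repressive.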

Does Hb have a dual regulatory effect on Kr? There is a long-running debate about whether or not low levels of Hb activate Kr. In hb mutants, the Kr domain expands anteriorly, suggesting that Hb represses Kr. However, Kr expression in these mutants is lower than in wild-type and expands posteriorly in embryos overexpressing Hb. Further, in embryos lacking Bcd and Hb, the Kr domain is absent, but can be restored in a dosage-dependent manner by reintroducing Hb. These observations suggest that Hb activates Kr. It has been suggested, therefore, that low levels of Hb activate Kr while high levels repress it. An alternative explanation, however, is that the apparently activating effects of Hb are indirect, via Hb's repression of kni and Kni's repression of Kr. Optimization of the Unc-GC model, which could have resulted in activation or repression of Kr by Hb, but not both, resulted in repression. The RPJ models allow for a dual effect, but activation by Hb was eliminated during optimization of the RPJ-Logic model. The RPJ-GC model retained functional activation and repression of Kr by Hb. However, Kr expression in this model is defective. Kr is not properly repressed in the anterior. Further, Kr is ectopically expressed in a small domain in the posterior of the embryo. Thus, the current models provide no support for activation of Kr by Hb. The only support found, which is crucial in all models except Unc-Logic and also consistent with the mutant and overexpression studies, is for repression of Kr by Hb (Perkins, 2006).

What represses hb between the anterior and posterior domains? Another point of disagreement in the literature is what prevents the expression of hb between its two domains. In the model of Rivera-Pomar and Jäckle (1996), repression by Kr is the explanation. The RPJ models confirm that this mechanism is sufficient. Specifically, in these models Kr repression prevents hb expression just to the posterior of the anterior hb domain. Between the Kr and posterior hb domains, there is no explicit repression of hb. Rather, Hb is not produced simply because of a lack of activating factors. In contrast, the models of Jaeger (2004a and b) detected no effect of Kr and attributed repression solely to Kni. The Unc-GC and Unc-Logic models found repression by Kni, but in addition to repression by Kr, not instead of it. Kr is more responsible for repression near the anterior hb domain and Kni is more responsible for repression near the posterior hb domain. This is consistent with observations of expression in mutant embryos. Embryos mutant for Kr show slight expansion of the anterior hb domain, while kni embryos show expansion of the posterior hb domain. In Kr;kni double mutants, hb is completely derepressed between its two usual domains. This suggests, as seen in the Unc-GC and Unc-Logic models, that Kr and Kni are both repressors of hb, that their activity is redundant in the center of the trunk, and that Kr and Kni are the dominant repressors for setting the boundaries of the anterior and posterior domains, respectively. This interpretation was also favored by Jaeger (2004a and b), on the basis of the mutant data, even though Jaeger's models did not find repression by Kr (Perkins, 2006).

The posterior hb domain. In all of the current models, the posterior hb domain is activated by Tll and sustained by Tll and hb autoactivation. Rivera-Pomar (1996) did not consider the posterior hb domain, and did not include activation by Tll in his model. That link was added to the RPJ network structure because otherwise it was not possible to capture the posterior hb domain. The model of Jaeger (2004a and b) captured the domain without Tll activation by substituting activation from cad. However, there is no confirming evidence for such an interaction. The absence of posterior hb in tll mutants and the inability of the models to explain posterior hb by other means lead to the straightforward hypothesis that Tll activates posterior hb. Posterior hb is unique in that the domain begins to form later than the other five domains modeled. In the RPJ models, this happens simply because high levels of Tll are needed to activate hb -- levels that are reached only at about t = 30 min. The Unc-GC and Unc-Logic models also employ repression by Cad to slightly delay Hb production in the posterior. However, there is no confirming evidence for such repression, and it is omitted from the Combined model (Perkins, 2006).

Shifting of the Kr, kni, and posterior gt domains. Domain shifting was first observed by Jaeger (2004a and b) and attributed to a chain of repressive regulatory relationships, hb-gt-kni-Kr. The current models largely support the importance of this regulatory chain, particularly the final two links. Repression of Kr by Kni was significant in all of the current models. Repression of kni by Gt was present in all models except RPJ-Logic, where it would be of little impact anyway, since RPJ-Logic has a defective posterior gt domain. Consistent with these findings, Kni binds to the regulatory region of Kr, and the Kr domain expands towards the posterior in kni mutants. Similarly, the kni domain expands posteriorly in gt mutants, while embryos overexpressing gt show reduced kni expression (Perkins, 2006).

Repression of gt by Hb is not as well supported by the current models. The Unc-GC model included the link, though the regulatory weight was the smallest of all those in the model. The link was eliminated from Unc-Logic and, of course, not present in the RPJ network structure. Instead, the models utilized decreasing activation by Cad (Unc-GC, Unc-Logic) and repression by Tll (Unc-GC, RPJ-GC) to shift the posterior gt domain. Even with these links, however, shifting of the domain is not well captured. RPJ-GC appears to capture the posterior gt shift best (Figure 3E). However, it relies on its small ectopic Kr domain to repress gt, a completely incorrect mechanism. Interestingly, a gene circuit fit using the network structure of Sanchez and Thieffry (2001) captured the shift of posterior gt better than any of the other current models, and it did so using repression of gt by Hb, providing additional modeling support for the relationship. There is also strong mutant evidence in favor of the relationship. In hb mutants, the posterior gt domain does not retract from the posterior pole. Further, Gt is absent in embryos that have ubiquitous Hb, such as maternal oskar or nanos mutants or embryos expressing Hb ubiquitously under a heat-shock promoter. Thus, sufficient evidence was found to include a repressive link from hb to gt in the Combined model (Perkins, 2006).

Activating or repressing links that oppose the direction of the repressive chain were eliminated by optimization of the Unc-Logic, RPJ-GC, and RPJ-Logic models. In agreement with this result, the boundaries of the kni and posterior gt domains are correctly positioned in Kr and kni mutants, respectively. Thus, the simplest picture supported by the current models and consistent with the mutant studies is that there is no regulation from Kr, kni, or posterior gt to any of their immediate posterior neighbors, and that the repressive chain highlighted by Jaeger (2004a and b) is indeed responsible for domain shifting (Perkins, 2006).

Do gap genes autoregulate? All four of the current models include autoactivation by hb. This is supported by the observation that late anterior hb expression is absent in embryos lacking maternal and early zygotic Hb. The models suggest hb autoactivation also plays a crucial role in sustaining the posterior domain, once it has been initiated by Tll, a role not previously emphasized. Autoactivation for the other genes was found by the Unc-GC model, but is not part of the RPJ network structure. Autoactivation was included only for Kr and gt in the Combined model, on the basis of a weakened and narrowed Kr domain in embryos producing defective Kr protein and a delay in gt expression in embryos producing defective gt protein. Interestingly, the gene circuit models of Jaeger (2004a and b) also found autoactivation for all four gap genes, but they considered autoactivation by gt to be the weakest and least certain. In contrast, the Unc-Logic model retained gt autoactivation while eliminating autoactivation for Kr and kni. The RPJ-Logic model was unable to reproduce the posterior gt domain. However, it was found that by adding gt autoactivation to the model, it was able to create and sustain posterior gt correctly, bringing the error of the model down to 15.34. This suggests that, after hb, gt is the most likely candidate for autoactivation. However, even this is not strictly necessary. The RPJ-GC model is able to reproduce and sustain the posterior gt domain without autoactivation by relying on cooperative activation from Bcd and Cad (Perkins, 2006).
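The sustaining role of autoactivation described above can be illustrated with a toy calculation. The following is a minimal sketch, not the model equations of Perkins (2006): a single gene product g is synthesized at a rate given by a sigmoid of its regulatory input and decays linearly, and a transient activator pulse stands in for an initiating input such as Tll. The function names and all parameter values (weights, threshold, pulse length) are hypothetical and chosen only to make the qualitative behavior visible.

```python
import math

def sigmoid(u):
    """Logistic function mapping total regulatory input to a synthesis rate in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-u))

def simulate(w_auto, w_act=8.0, h=-4.0, decay=1.0,
             pulse_end=5.0, t_end=30.0, dt=0.01):
    """Euler integration of dg/dt = sigmoid(w_act*A(t) + w_auto*g + h) - decay*g.

    A(t) is a transient activator pulse: 1 while t < pulse_end, 0 afterwards.
    All parameter values are illustrative assumptions, not fitted quantities.
    Returns the level of g at t_end.
    """
    g = 0.0
    steps = int(round(t_end / dt))
    for i in range(steps):
        t = i * dt
        activator = 1.0 if t < pulse_end else 0.0
        g += dt * (sigmoid(w_act * activator + w_auto * g + h) - decay * g)
    return g

# Without autoactivation the product decays once the activator pulse ends;
# with autoactivation the expression state is retained after the pulse.
without_auto = simulate(w_auto=0.0)
with_auto = simulate(w_auto=8.0)
# Autoactivation alone, with no initiating pulse, does not switch the gene on.
never_started = simulate(w_auto=8.0, pulse_end=0.0)
```

In this sketch autoactivation makes the system bistable, so an initiating input is needed to start expression but not to maintain it, which mirrors the initiation-then-maintenance logic proposed for posterior hb.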

Comparison of regulatory architectures. The regulatory relationships proposed by Rivera-Pomar and Jäckle (1996) are not fully consistent with the data and require amending. Repression of gt by Kni, which contradicts the mechanism of domain shifts described by Jaeger (2004a and b), was eliminated by the optimization in both of the current models based on the RPJ regulators. Activation of kni by Kr was never observed. No support was found for a dual regulatory effect of Hb on Kr. Activation of Kr at low levels of Hb was eliminated in the RPJ-Logic model. It was retained in the RPJ-GC model, but resulted in serious patterning defects. Inclusion of Tll as an activator of hb was sufficient to produce the posterior hb domain. Based on the current fits and the primary experimental literature, there are likely other regulatory links missing from the model of Rivera-Pomar and Jäckle, though they are not strictly required to reproduce the wild-type gap gene patterns. Foremost is repression of hb by Kni, which appears important for eliminating hb expression anterior of the posterior domain. Fits based on the Sanchez and Thieffry (2001) regulatory relationships also support these conclusions (Perkins, 2006).

In contrast, the regulatory relationships in the Combined model and both the Unc-GC and Unc-Logic models are able to capture the wild-type gap patterns without gross defects. The relationships in the Unc-GC model are very similar to those obtained by Jaeger (2004a and b). For example, the regulation of Kr and kni is qualitatively equivalent in both models, and there is a single minor difference in the regulation of gt. The optimizations correctly identified activation of hb by Tll, which was missed by Jaeger (2004a and b), though the current models did less well at capturing shifting of the posterior gt domain. These regulatory relationships are also similar to those found by Gursky (2004), though that study was based on gap gene expression data with much lower accuracy and temporal resolution than the data used in this study. These similarities show that differences in the mathematical formulations of these models (ordinary versus partial differential equations, how diffusion and nuclei doubling are modeled, and the choice of boundary conditions and other simulation parameters) are not important for the reproduction of the gap gene patterns nor for the inference of regulatory relationships from the data (Perkins, 2006).

Deciphering a transcriptional regulatory code: modeling short-range repression in the Drosophila embryo

Systems biology seeks a genomic-level interpretation of transcriptional regulatory information represented by patterns of protein-binding sites. Obtaining this information without direct experimentation is challenging; minor alterations in binding sites can have profound effects on gene expression, and underlie important aspects of disease and evolution. Quantitative modeling offers an alternative path to develop a global understanding of the transcriptional regulatory code. Recent studies have focused on endogenous regulatory sequences; however, distinct enhancers differ in many features, making it difficult to generalize to other cis-regulatory elements. This study applied a systematic approach to simpler elements and presents the first quantitative analysis of short-range transcriptional repressors, which have central functions in metazoan development. Fractional occupancy-based modeling uncovered unexpected features of these proteins' activity that allow accurate predictions of regulation by the Giant, Knirps, Krüppel, and Snail repressors, including modeling of an endogenous enhancer. This study provides essential elements of a transcriptional regulatory code that will allow extensive analysis of genomic information in Drosophila melanogaster and related organisms (Fakhouri, 2010).

In this study, a reductionist analysis of short-range repression was used to explore a relatively untouched, yet central, aspect of gene regulation in Drosophila. Earlier qualitative studies highlighted the extreme distance dependence of short-range repressors, and comparative analysis has shown many instances of evolutionary plasticity of regulatory regions controlled by these proteins. Knowing that transcription factors influence each other in a local manner permitted the identification of novel enhancers, based on the clustering of binding sites. Yet, clustering studies alone do not provide the basis for predicting evolutionary changes that reshape transcriptional output, or predicting activity of coregulated enhancers. For example, the original hypothesis that the affinity and/or number of Bicoid-binding sites dictates the output of regulated genes has been replaced by an understanding that other, as-yet-unknown features seem to have more decisive functions (Fakhouri, 2010).

Earlier modeling studies focused on endogenous enhancers, which have complex arrangements of transcription factor-binding sites. The current studies focused on detecting quantitative differences resulting from subtle differences in binding sites, allowing modeling with a tractable number of parameters. A common block of Dorsal and Twist activator sites was used, allowing a focus on changes made in the number and arrangement of repressor sites. Clearly, differences in affinity, number, and arrangement of activator sites also have decisive functions in dictating transcriptional output; thus, future modeling efforts will need to integrate these elements as well. The tight focus on short-range repressors with the analysis of a relatively small number of reporter genes provided sufficient data for robust estimation of important parameters. From the comparison of repression by other short-range repressors, it is likely that the analysis of Giant can guide studies of other similarly acting repressors, including Krüppel, Knirps, and Snail (Fakhouri, 2010).

With respect to the transcriptional regulatory code, this study uncovered specific quantitative features that seem to apply to short-range repressors in a general context. A complex non-linear quenching relationship was found, suggesting that within their range of activity, Giant, and probably other short-range repressors, have an optimum distance of action that may reflect steric constraints. Multiple formulations of the model generated very similar predictions, suggesting that this non-linear distance function is a real feature of the system. Consistent with this notion, an earlier study of transcription factor-binding sites in Drosophila enhancers discovered an overall preference of Krüppel sites to be found 17 bp from Bicoid activator sites, which may be an indication that other short-range repressors also have preferred distances for optimal activity (Fakhouri, 2010).
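The idea of fractional occupancy combined with a distance-dependent quenching term can be sketched as follows. This is an illustrative toy, not the formulation fitted by Fakhouri (2010): the single-site occupancy expression is the standard equilibrium binding form, while the multiplicative quenching rule, the distance bins, and every numerical value in the table are assumptions made here only to show how a non-monotonic distance function with an optimum would enter such a calculation.

```python
def occupancy(k_conc):
    """Fractional occupancy of a single site, with k_conc = K * [TF] (dimensionless)."""
    return k_conc / (1.0 + k_conc)

# Hypothetical quenching efficiencies per distance bin (upper bound in bp, efficiency).
# The non-monotonic shape mimics an "optimum distance"; the values are NOT fitted data.
QUENCH_BY_DISTANCE = [(25, 0.4), (50, 0.8), (75, 0.5), (100, 0.2)]

def quenching(distance_bp):
    """Look up the assumed quenching efficiency for a repressor-activator distance."""
    for upper_bound, efficiency in QUENCH_BY_DISTANCE:
        if distance_bp <= upper_bound:
            return efficiency
    return 0.0  # beyond the last bin a short-range repressor has no effect

def expression(act_kc, rep_kc, repressor_distances_bp):
    """Activator occupancy, reduced multiplicatively by each bound repressor site."""
    output = occupancy(act_kc)
    rep_occ = occupancy(rep_kc)
    for distance in repressor_distances_bp:
        output *= 1.0 - quenching(distance) * rep_occ
    return output

# One repressor site at the assumed optimum (40 bp) represses more strongly
# than the same site placed closer (10 bp) or out of range (200 bp).
at_optimum = expression(1.0, 1.0, [40])
too_close = expression(1.0, 1.0, [10])
out_of_range = expression(1.0, 1.0, [200])
```

Replacing the lookup table with a continuous function of distance would not change the structure of the calculation; the binned form is used here only because it keeps the assumed optimum explicit.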

The similar quenching efficiencies for repressors acting adjacent to Dorsal or Twist activator sites were an additional significant finding. The similar effect on disparate activator proteins indicates that the effects of short-range repression are general, and are likely to be translatable to distinct contexts. Earlier empirical tests had already pointed in this direction; for example, insertion of ectopic binding sites for Knirps and Krüppel into rho NEE sequences is sufficient to induce repression, although these proteins do not usually cross-regulate. In addition, short-range repressors can counteract a variety of transcriptional activation domains with similar efficiency, suggesting that specific protein-protein contacts are not essential. In one area, quantitative differences were found between parameters derived from the synthetic gene modules and the endogenous regulatory regions. The importance of homotypic cooperativity predicted for Snail sites in the context of the rho NEE was overall much higher than that found for Giant, Krüppel, and Knirps sites acting on the synthetic gene constructs; this might be an example in which the individual proteins do exhibit different context dependencies, perhaps because the proteins differ in level of stickiness. Alternatively, the distance between the Snail sites in question, 23 bp, might facilitate cooperative interactions much more than the closely apposed spacing used in the genes in this study, in which steric interference may have an opposing function (Fakhouri, 2010).

In modeling mutant forms of the endogenous rho NEE, several important features of the architecture of this regulatory region were uncovered. This enhancer seems to use redundancy in use of Snail to mediate repression; based on earlier experiments, it seems that even a single Snail site is sufficient to mediate repression. Such redundancy may provide the correct dynamical response, with a swift repression of rho at an early enough time in which Snail levels are still low, or it may ensure that gene output is robust to environmental and genetic noise (Fakhouri, 2010).

The rho NEE modeling also highlighted features of transcriptional activators. Activator-scaling factors for Dorsal were reproducibly lower than those of Twist, and this was apparent for several different assumptions of expression level. The relative differences in contribution to activation can be explained by examination of the structure of the enhancer; contribution by the low intrinsic values of Dorsal is amplified by strong cooperativity with Twist, setting up a chain of interacting weak sites that together are highly active. Experimental evidence bears out these conclusions: isolated Dorsal sites tested on reporter genes mediate relatively weak activation, and a rho NEE lacking Twist sites, but containing four Dorsal sites, is similarly compromised (Fakhouri, 2010).

Earlier studies suggested that many developmental enhancers, including those regulated by short-range repressors, may possess a flexible 'billboard' design, in which individual factors or small groups of proteins would independently communicate with the promoter region, so that the net output of an enhancer would reflect the cumulative set of contacts over a short time period. Such a view of enhancers would account for the evolutionary plasticity observed in regulatory sequences. No DNA-scaffolded superstructure, reflecting the formation of a unique three-dimensional complex, would be necessary in this scenario. Yet, the modeling suggests that the rho NEE might involve communication between relatively distant binding sites, through sets of cooperative interactions. In this case, it is possible that such distant interactions might be compatible with a flexible structure, if many distinct configurations of binding sites provide such a cooperative network. Current studies have indeed highlighted potential frameworks involving Dorsal and interacting factors on the same classes of enhancers. Application of a transcriptional regulatory code integrating activities of activators and repressors is a critical next step to illuminate enhancer design and evolution (Fakhouri, 2010).

Molecular dissection of cis-regulatory modules at the Drosophila bithorax complex reveals critical transcription factor signature motifs. Dev. Biol

At the Drosophila bithorax complex (BX-C) over 330kb of intergenic DNA is responsible for directing the transcription of just three homeotic (Hox) genes during embryonic development. A number of distinct enhancer cis-regulatory modules (CRMs) are responsible for controlling the specific expression patterns of the Hox genes in the BX-C. While it has proven possible to identify orthologs of known BX-C CRMs in different Drosophila species using overall sequence conservation, this approach has not proven sufficiently effective for identifying novel CRMs or defining the key functional sequences within enhancer CRMs. This study demonstrates that the specific spatial clustering of transcription factor (TF) binding sites is important for BX-C enhancer activity. A bioinformatic search for combinations of putative TF binding sites in the BX-C suggests that simple clustering of binding sites is frequently not indicative of enhancer activity. However, through molecular dissection and evolutionary comparison across the Drosophila genus it was discovered that specific TF binding site clustering patterns are an important feature of three known BX-C enhancers. Sub-regions of the defined IAB5 and IAB7b enhancers were both found to contain an evolutionarily conserved signature motif of clustered TF binding sites which is critical for the functional activity of the enhancers. Together, these results indicate that the spatial organization of specific activator and repressor binding sites within BX-C enhancers is of greater importance than overall sequence conservation and is indicative of enhancer functional activity (Starr, 2011).

The clustered organization of TF binding sites has been shown to be crucially important to the functional activity of enhancers. However, despite detailed studies of a small set of enhancers in Drosophila, including the eve stripe 2 (S2E) enhancer, the precise rules of cis-regulatory grammar have yet to be fully elucidated. In an effort to investigate the role of clustering of predicted TF binding sites for the identification of enhancers in the 330 kb Drosophila BX-C, a search for simple clusters of HB and KR binding sites was performed. The search algorithm returned 26 putative enhancers (PCRMs), of which 6 (23%) overlapped previously identified enhancers. The overlapping regions for four of these confirmed enhancers (BRE, IAB2, IAB5 and IAB8) were tested in a transgenic reporter gene assay and recapitulated the known domains of regulatory activity in the embryo. Furthermore, the 1037 bp R10 region that was tested, which is able to recapitulate IAB2 enhancer functional activity, refines the boundaries of the previously characterized 1970 bp IAB2 sequence. The search also identified 20 additional PCRM sequences. Twelve of these previously uncharacterized genomic regions were analyzed for enhancer activity and only one (R8 from the bxd/pbx region) was found to be a novel embryonic enhancer capable of driving expression in a pattern indicative of Ubx gene expression. This result indicates that the approach of searching for novel enhancers in the BX-C using simple clustering may have significant limitations (Starr, 2011).
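A simple cluster search of the kind described above can be sketched as a sliding window over predicted site positions. This is a minimal illustration under stated assumptions, not the search algorithm used by Starr (2011): the function name, the 500 bp window, and the requirement of at least two sites per factor are hypothetical choices, and the input is assumed to be a list of already-predicted sites.

```python
from collections import Counter

def find_clusters(sites, window=500, min_per_factor=None):
    """Return merged (start, end) intervals that hold enough sites for every factor.

    sites: iterable of (position_bp, factor_name) for predicted binding sites.
    window: window length in bp (hypothetical default).
    min_per_factor: minimum site count required per factor (hypothetical default).
    """
    required = min_per_factor or {"HB": 2, "KR": 2}
    ordered = sorted(sites)
    hits = []
    for i, (start, _) in enumerate(ordered):
        # Collect every site that falls in the window opened at this site.
        in_window = [(pos, fac) for pos, fac in ordered[i:] if pos < start + window]
        counts = Counter(fac for _, fac in in_window)
        if all(counts[fac] >= n for fac, n in required.items()):
            hits.append((start, in_window[-1][0]))
    # Merge overlapping qualifying windows into single candidate regions.
    merged = []
    for start, end in hits:
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

# Made-up example coordinates: one dense HB/KR region and some isolated sites.
example_sites = [(100, "HB"), (180, "KR"), (250, "HB"), (400, "KR"),
                 (5000, "HB"), (5100, "HB"), (9000, "KR")]
clusters = find_clusters(example_sites)
```

As the results above indicate, a window passing such a count threshold is only a candidate; most regions recovered this way did not act as embryonic enhancers when tested.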

A key question is why 11 of the 16 PCRMs tested (69%) are false positives. Two possibilities are (a) that the PCRMs may in fact be actively regulating expression of the BX-C genes at later stages of development or in very specific patterns in post-embryonic tissues, and (b) that in testing a specific ~1 kb genomic fragment from each PCRM, critical regulatory sequences in neighboring regions may have been removed. However, the recent availability of in vivo TF binding data may also offer some potential answers. The binding of anterior-posterior restricted gap/terminal and pair-rule transcription factors in stage 4-5 embryos appears to correlate strongly with the functional activity of the PCRMs. When scored for ten specific TFs which are potential regulators of the BX-C enhancers, all the PCRMs tested in the transgenic assay that had chromatin immunoprecipitation (ChIP) binding peaks for 6 or more of the protein factors function as embryonic enhancers. For each of these confirmed enhancers, both KR and HB demonstrate in vivo binding at the endogenous genomic region corresponding to the enhancer. In contrast, none of the false positive PCRMs has binding peaks for more than 5 of the TFs and most have fewer than 3, often reflecting an absence of binding for KR or HB (Starr, 2011).

One interpretation of this data is that the predicted TF binding sites in many of the false positive PCRMs do not represent actual in vivo embryonic binding sites and, as a result, the PCRM is not functional. In addition to KR and HB repressor binding sites, it is also important to consider the presence of potential binding sites for an appropriate activator (FTZ or EVE) necessary for the functional activity of the enhancer. Analysis of the 5 PCRMs that demonstrate in vivo activity reveals that each contains at least 3 strong predicted binding sites for the appropriate pair-rule activator. However, in many cases the false positive PCRMs tested also appear to contain putative activator binding sites. In these cases it is possible that additional architectural requirements (for example, close spacing between multiple activator and/or repressor binding sites) may be necessary for in vivo embryonic activity to occur. In support of this idea, analysis of the genomic fragments that were tested from the iab-2 to iab-8 genomic regions (R10, 11, 12, 13, 14, 15, 17, 20, and 21), predicts that R15 (overlapping IAB5) has a closely-spaced cluster of FTZ-KR sites and that R10 (overlapping IAB2) and R20 (overlapping IAB8) possess a closely spaced cluster of EVE-KR sites within 150 bp of one another, whereas the other regions do not appear to harbor pair-rule activator (FTZ or EVE) and repressor (KR) clusters in such close proximity. A third possibility is that additional protein factors may be involved which may affect the ability of TFs to access the binding sites within the predicted enhancer sequence. Such proteins, which control the recruitment of chromatin components and nucleosome positioning, are thought to be critical to the regulation of embryonic gene expression through the modulation of TF binding affinity at enhancers (Starr, 2011).

The presence of a simple cluster of KR and HB binding sites in many of the enhancers of the BX-C argues that certain precise patterns of TF binding site clusters may be responsible for functional activity among similarly-regulated enhancers. In the IAB8 enhancer, a distinct cluster of EVE-KR binding sites (one KR, two EVE sites) is highly conserved across different Drosophila species. The 3' third of IAB8 harboring the EVE-KR motif (minIAB8) is able to drive reporter gene expression in the characteristic IAB8-pattern in the presumptive A8 segment of transgenic Drosophila. Deletion of the pair of EVE binding sites (∆EVE) significantly weakens enhancer activity in A8, suggesting that while these two EVE sites are important, cryptic weak EVE binding sites in the remaining sequence of the enhancer (which are sufficiently low scoring to escape computational prediction at the ln(p-value) cutoff of -6.0) are capable of partially compensating for the loss of the two strong predicted EVE sites. In support of this idea is the recent discovery that even weak affinity binding sites contribute to TF occupancy at regulatory regions in Drosophila embryos. In that study it was found that the level of factor occupancy in vivo correlates more strongly with the degree of chromatin accessibility at a given site than with in vitro measurements of the affinity of a factor for a particular DNA sequence (Li, 2011). This observation may be especially relevant in the case of pair-rule factors (such as EVE), where a high localized concentration of the protein in each stripe may also facilitate the increased occupancy of low affinity binding sites (Starr, 2011).

A 141 bp fragment (EK) from within the minIAB8 region containing only the EVE-KR cluster drives strong reporter gene expression in A8, but also ectopic expression immediately posterior of A8 and weaker expression immediately anterior of A8. Ectopic reporter gene expression is also observed in the anterior head domain of the embryo. This result indicates that the EK fragment by itself lacks important binding sites responsible for repression in the anterior head domain of the embryo (such as HB) and for the region immediately anterior of A8 (such as KNI). Several predicted HB and KNI repressor sites capable of performing this role are present within the 602 bp minIAB8 enhancer. Importantly, in the C3-A4 domain of the embryo where the KR repressor protein is expressed, there is a lack of enhancer-driven reporter gene expression from the EK fragment, suggesting that the single KR site within the EVE-KR cluster is sufficient to allow KR-mediated repression in that domain of the embryo. The continued presence of the EVE-KR cluster within the IAB8 enhancer, despite extensive reorganization of TF binding sites across the Drosophila orthologs, is reminiscent of the architectural constraints in the Drosophila and Sepsid eve S2E orthologs, which possess a highly conserved cluster of overlapping BCD activator and KR repressor binding sites necessary for enhancer function (Starr, 2011).

To extend the analysis of the functional role of clustered TF binding sites the IAB5 and IAB7b enhancers from the Drosophila BX-C were also analyzed. Chimeric enhancers assembled from the D. melanogaster and D. pseudoobscura IAB5 orthologs appear to have their functional activity entirely preserved and drive reporter gene expression in presumptive abdominal segments A5, A7 and A9. This result contrasts with an earlier study in which chimeric enhancers assembled from reciprocal halves of D. melanogaster and D. pseudoobscura S2E orthologs did not accurately recapitulate enhancer activity. It is possible that the regulatory output for the chimeric IAB5 enhancers may be subject to very subtle modifications. Such modifications may result in changes to expression patterns that are beyond the detection of the reporter gene assay. However, one explanation for the difference in functional output between these two examples is that in the case of the S2E the organization of TF binding sites within the chimeric enhancer was sufficiently modified so as to destroy the functional activity of the enhancer, whereas for IAB5 this was not the case (Starr, 2011).

To further dissect the organization of TF binding sites in IAB5, the predicted TF binding sites in the sequence were examined. This approach reveals a highly evolutionarily conserved signature TF binding site motif consisting of two strong FTZ activator sites close to two strong KR repressor sites in the center of the defined 1 kb enhancer. The FTZ-KR signature motif is present and intact in both the functional IAB5 chimeric enhancers, cMP and cPM. In the case of the cMP enhancer, the signature motif is present in the IAB5.2 half from D. pseudoobscura, while in the case of the reciprocal cPM enhancer, the signature motif is present in the IAB5.2 half from D. melanogaster. Molecular dissection of IAB5 shows that the individual IAB5.2 halves from D. melanogaster and D. pseudoobscura each show functional enhancer activity, while the corresponding IAB5.1 halves that lack the FTZ-KR signature motif do not. Furthermore, the 424 bp region containing the center peak of sequence conservation of IAB5 (cIAB5) and the FTZ-KR signature motif drives reporter gene expression in the characteristic three-stripe IAB5-pattern in transgenic Drosophila. In support of the critical functional role of this region, previous functional studies showed that the strongest predicted KR binding site within this signature motif is in fact critical to regulate the spatially restricted expression directed by IAB5 to the posterior presumptive A5, A7, and A9 segments in the Drosophila embryo. In the context of the endogenous gene complex a single point mutation in this KR repressor binding site (Superabdominal mutation) causes an anterior expansion of the embryonic domain of Abd-B expression and results in a homeotic transformation of the A3 segment into the more posterior A5 segment. This result confirms that the strong KR binding site in the signature motif is essential for the in vivo functional activity of the IAB5 enhancer (Starr, 2011).

The IAB7b enhancer, which is expressed in the presumptive A7 segment of the Drosophila embryo, is thought to be regulated by many of the same activators and repressors as IAB5. Bioinformatic analysis reveals that a highly conserved FTZ-KR signature motif, very similar to the one identified in IAB5, is also present in the IAB7b enhancer. Molecular dissection of IAB7b to test the role of the signature motif in the activity of the enhancer demonstrates that a 154 bp region containing the motif (2F2K, with two FTZ and two KR sites) from within the Drosophila IAB7b enhancer is able to drive reporter gene expression in the presumptive A5, A7 and A9 segments of transgenic Drosophila, with notably stronger expression in A7. This expression pattern is very similar to that driven by the IAB5 enhancer. A 114 bp region (2F1K, containing two FTZ and one KR site) from within the Drosophila IAB7b enhancer also drives this same pattern of reporter gene expression, suggesting that the 3' KR site is dispensable for repression of enhancer activity in the central domains of the embryo. Despite the fact that the 3' KR site also overlaps predicted BCD and HB repressor binding sites, no ectopic anterior enhancer-driven expression is observed in the 2F1K construct when tested in transgenic embryos, suggesting that the single remaining 5' KR binding site is sufficient for repression. In fact, in more distantly related Drosophila species, the presence of two KR sites positioned near the pair of FTZ sites is lost, and only a single KR site remains (Starr, 2011).

A 110 bp region (1F2K, containing one FTZ and two KR sites) from IAB7b does not drive gene expression, demonstrating that the outer FTZ site is required for activation of the enhancer. One possible molecular explanation for the necessity of the outer FTZ binding site is that FTZ may be acting as a dimer in order to activate IAB5 and IAB7b. In both enhancers a pair of strong FTZ sites is present in the FTZ-KR signature motif. While the ability of FTZ to dimerize has not been reported in the literature, other homeodomain-leucine zipper proteins have been shown to function as dimers. In many such cases the protein factors are also able to bind DNA target sequences as monomers, albeit with comparatively lower affinity. There is also evidence that FTZ is capable of interaction with other proteins, namely the orphan nuclear receptor FTZ-F1 through its LXXLL leucine zipper motif. In this case the heterodimer is capable of co-activation of target genes. Given that the consensus binding sites for the two factors are very different, FTZ (NNYAATTR) versus FTZ-F1 (BSAAGGDKRDD), it is perhaps to be expected that none of the predicted FTZ and FTZ-F1 binding sites in the IAB5 or IAB7b enhancers directly overlap. However, in future studies it will be of interest to explore the role of FTZ homo- and hetero-dimerization in regulating IAB5 and IAB7b activity (Starr, 2011).
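The two consensus sequences quoted above are written in IUPAC nucleotide codes, and the following sketch shows how such a consensus can be expanded and matched against a sequence. It is a simplified illustration only: it performs exact consensus matching on the forward strand, which is not the scored site prediction used in the study, and the function names and the two test sequences are made up for the example.

```python
import re

# Standard IUPAC nucleotide ambiguity codes expressed as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def consensus_to_regex(consensus):
    """Expand an IUPAC consensus string into an equivalent regular expression."""
    return "".join(IUPAC[base] for base in consensus.upper())

def find_sites(sequence, consensus):
    """Return 0-based start positions of forward-strand matches, overlaps included."""
    # A lookahead wrapper lets finditer report overlapping matches.
    pattern = re.compile("(?=(" + consensus_to_regex(consensus) + "))")
    return [match.start() for match in pattern.finditer(sequence.upper())]

FTZ_CONSENSUS = "NNYAATTR"        # as quoted in the text
FTZ_F1_CONSENSUS = "BSAAGGDKRDD"  # as quoted in the text

# Made-up sequences used only to exercise the matcher.
ftz_hits = find_sites("GGCCAATTAGG", FTZ_CONSENSUS)
ftz_f1_hits = find_sites("TCAAGGTTAAG", FTZ_F1_CONSENSUS)
```

Because the fixed cores of the two consensus sequences (AATT and AAGG) differ, it is easy to see why predicted sites for the two factors would rarely coincide.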

The ability of the 2F2K and 2F1K regions to drive reporter gene expression in an IAB5-like manner in the presumptive A5, A7 and A9 segments of transgenic Drosophila suggests that additional inputs into IAB7b are required to spatially restrict endogenous enhancer-driven gene expression to only the A7 segment. A likely candidate for repression of IAB7b activity in the A5 segment of the embryo is KNI, which is expressed in the presumptive A1-A6 segments. Bioinformatic analysis predicts several candidate KNI binding sites in the full length 728 bp IAB7b enhancer, whereas the 2F2K and 2F1K regions lack any such predicted KNI sites. Previous studies revealed that the repression of the IAB7b enhancer in A5 is mediated by sequences in the 728 bp fragment and does not require additional flanking 5' or 3' regions. In addition, while disruption of the two KR sites in the signature motif does result in reporter gene activation by IAB7b in anterior regions of the embryo, repression persists in the A5 segment. This result indicates that a factor other than KR is responsible for repression in A5. In the entire 728 bp enhancer only three strong KNI sites are predicted, all located in the ~ 300 bp region on the abd-A side of the signature motif. These sites all lie within an evolutionarily conserved genomic region and some of the sites are conserved in distantly related Drosophila species. The significance of these KNI sites in restricting the IAB7b mediated-expression pattern is currently under investigation (Starr, 2011).

A key question in understanding cis-regulatory grammar is why certain arrangements of TF binding sites confer functional enhancer activity while others fail to do so. The turnover of binding sites is common during the evolution of enhancers in different species, yet the functional activity of rapidly-evolving enhancer orthologs from different species is often robust, even across several hundred million years of evolutionary divergence. In the case of the BX-C, bioinformatic analysis demonstrates that there is extensive binding site turnover in the IAB5, IAB7b, and IAB8 enhancers across the Drosophila genus, particularly in more distantly related species. Despite this turnover of TF binding sites, the newly identified FTZ-KR signature motif present in both IAB5 and IAB7b and the functionally important EVE-KR cluster within IAB8 are composed of similar patterns of conserved binding site architecture. Specifically, the organization of sites is such that a pair of strong activator (FTZ or EVE) binding sites and at least one strong repressor (KR) site are in close proximity (< 116 bp) to each other. Notably, the spacing between the FTZ and KR sites in the signature motif is largely unchanged across IAB5 and IAB7b enhancer orthologs in distantly related Drosophila species, although in the case of IAB7b there is the loss of the secondary KR binding site in species more distantly related to Drosophila. Conservation of genomic architecture of these TFBSs in the BX-C enhancers does not directly indicate that the specific spacing between sites is essential. However, the functional activity of genomic regions containing these motifs supports previous findings that closely spaced activator and repressor binding sites are critical for enhancer function and suggests that the architecture of binding sites within an enhancer is subject to significant evolutionary constraint (Starr, 2011).
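The architecture described above (a pair of strong activator sites and at least one strong repressor site within 116 bp of each other) can be expressed as a simple positional check. This is a minimal sketch under one possible reading of "close proximity": the span between the outermost of the three sites must be below the limit. The function name and the example coordinates are hypothetical.

```python
from itertools import combinations

def has_signature(activator_positions, repressor_positions, max_span=116):
    """True if two activator sites and one repressor site fit within max_span bp.

    Positions are site coordinates in bp. Treating "proximity" as the span
    between the outermost of the three sites is an assumption of this sketch.
    """
    for first, second in combinations(sorted(activator_positions), 2):
        for repressor in repressor_positions:
            span = max(first, second, repressor) - min(first, second, repressor)
            if span < max_span:
                return True
    return False

# Hypothetical coordinates: a compact FTZ-FTZ-KR arrangement versus a dispersed one.
compact = has_signature([100, 130], [160])
dispersed = has_signature([100, 130], [400])
single_activator = has_signature([100], [110])
```

A check of this kind captures only the spacing criterion; as noted above, conservation of the arrangement does not by itself show that the specific spacing is essential.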

It has recently been suggested through computational synthetic evolution studies that the inherent bias for deletions over insertions in the genome of Drosophila (and many other species) may result in the gradual loss of nucleotide space between TF binding sites. In effect, this deletion bias helps to artificially cluster binding sites together. In this case, although clustering of TF binding sites may not itself be a feature originally selected for in evolution on the basis of its functional significance, once established in the genome it may still play a functional role in enhancer activity. Molecular dissection of IAB5, IAB7b, and IAB8 enhancer function argues that specific clusters of activator and repressor binding sites do play a key role in enhancer activity. As a result, such clusters, once present in enhancers, may well be under positive evolutionary selective pressure, as evidenced by the largely invariant organization of the binding sites in the IAB5 and IAB7b FTZ-KR signature motif. This selection does not preclude the possibility that if binding sites arise nearby in the genome de novo, these new binding sites may also contribute to enhancer functional activity. In this scenario, the original TF binding site cluster may no longer be necessary for enhancer function. Indeed, in the case of the IAB8 enhancer, the ∆EVE region tested in a transgenic assay may be an example of this phenomenon. This fragment is able to exhibit a weak IAB8-like enhancer function even with the deletion of the pair of strong predicted EVE binding sites, potentially through the activity of weaker EVE binding sites that are present in the remaining sequence (Starr, 2011).

Although the precise spatial arrangement of TF binding sites within an enhancer may not exactly mirror the ancestral arrangement, computational predictions suggest that functional clusters of TF binding sites are likely to result from the spatial re-organization of older pre-existing sites during evolution. Such clusters therefore also likely indicate genomic regions with robust enhancer activity. The fact that enhancer activity in the BX-C appears to depend on signature motifs that represent specific spatial arrangements of TF binding sites in minimal modular regions indicates that the physical patterns of binding site clustering are functionally significant in terms of enhancer architecture (Starr, 2011).

A system of repressor gradients spatially organizes the boundaries of Bicoid-dependent target genes

The homeodomain (HD) protein Bicoid (Bcd) is thought to function as a gradient morphogen that positions boundaries of target genes via threshold-dependent activation mechanisms. This study analyzed 66 Bcd-dependent regulatory elements, and their boundaries were shown to be positioned primarily by repressive gradients that antagonize Bcd-mediated activation. A major repressor is the pair-rule protein Runt (Run), which is expressed in an opposing gradient and is necessary and sufficient for limiting Bcd-dependent activation. Evidence is presented that Run functions with the maternal repressor Capicua and the gap protein Kruppel as the principal components of a repression system that correctly orders boundaries throughout the anterior half of the embryo. These results put conceptual limits on the Bcd morphogen hypothesis and demonstrate how the Bcd gradient functions within the gene network that patterns the embryo (Chen, 2012).

This study identified 32 enhancers that respond to Bcd-dependent activation and form expression boundaries at different positions along the AP axis of fly embryos. Adding these elements to the 34 previously known enhancers constitutes the largest data set of in vivo-tested and -confirmed enhancers regulated by a specific transcription factor in all of biology (Chen, 2012).

The 32 confirmed enhancers were identified among 77 tested genomic fragments, which were selected because they showed in vivo-binding activity, or they conformed to a stringent homotypic-clustering model for predicted Bcd-binding sites, or both. All seven previously unknown fragments showing in vivo binding and a predicted site cluster directed Bcd-dependent transcription in the early embryo. Other fragments from the top 50 ChIP-Chip signals (which do not conform to the clustering model) were also very likely (21 of 26) to test positive in the in vivo test, but this likelihood drops significantly (9 of 25) in a set of fragments from lower on the list of ChIP-Chip fragments. Interestingly, of 19 tested fragments that contain clusters of predicted sites, but no in vivo binding activity, not a single one tested positive in vivo. These results suggest that in vivo binding assays are much better predictors of regulatory function than simple site-clustering algorithms alone (Chen, 2012).
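
The comparison between fragment classes is easier to follow as success rates. The short Python sketch below only restates the counts quoted in the paragraph above; the category labels are paraphrases chosen for this example, and no statistical test from Chen (2012) is reproduced.

```python
# (positives, tested) per fragment class, as quoted in the text above.
# The class labels are paraphrases introduced for this illustration.
categories = {
    "in vivo binding + predicted cluster": (7, 7),
    "top-50 ChIP-chip signal, no cluster": (21, 26),
    "lower-ranked ChIP-chip signal": (9, 25),
    "predicted cluster, no in vivo binding": (0, 19),
}


def success_rates(counts):
    """Fraction of tested fragments that drove Bcd-dependent expression."""
    return {name: pos / total for name, (pos, total) in counts.items()}


rates = success_rates(categories)
total_tested = sum(total for _, total in categories.values())

for name, rate in rates.items():
    print(f"{name}: {rate:.0%}")
print(f"fragments tested in total: {total_tested}")
```

The four classes account for all 77 tested fragments, and the success rate falls from 100% where binding and a cluster coincide to 0% where a predicted cluster has no in vivo binding.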

One explanation for the failure of these predicted site clusters to bind Bcd in vivo is that they lie in heterochromatic regions of the genome that prevent site access. However, because they fail to function when taken out of their normal context (in reporter genes), whatever is preventing activation must be a property of the fragment itself and not its location in the genome. Interestingly, a number of Bcd site cluster-containing fragments drive expression later in development. It is proposed that these fragments fail to bind Bcd because they lack sites for cofactors that facilitate Bcd binding. In preliminary experiments it was observed that Bcd-activated fragments contain on average more binding sites for the ubiquitous activator protein Zelda (Zld) than those that fail to activate. Zld has been shown to be critical for timing the zygotic expression of hundreds of genes in the maternal to zygotic transition (Chen, 2012).

These results suggest strongly that a gradient of Run protein plays a major role in limiting Bcd-dependent activation. Run seems to work as part of a repression system that also includes Cic and possibly Kr. Expression boundaries in the region anterior to the presumptive cephalic furrow shift toward the posterior in run and cic mutants, and the double mutant causes boundaries that are normally well separated to collapse into a single position (Chen, 2012).

The use of multiple repressors permits flexibility in binding site architecture within enhancers that establish boundaries at similar positions. For example type I enhancers show overrepresentations of both Run and Cic sites, but 27% lack strong matches to the Cic PWM, and 12% lack strong matches to the Run PWM. Importantly, however, all type I enhancers lacking Cic sites contain Run sites, and those lacking Run sites contain Cic sites. Multiple Kr sites were observed in a large number of Bcd-dependent enhancers, which suggests that Kr is also a major component of the repression system that orders Bcd-dependent expression boundaries. Taken together, these data suggest that antagonistic repression of Bcd-mediated activation is a key design principle of the system that organizes the AP body plan. The repressors identified so far (Run, Cic, and Kr) are expressed in overlapping domains with gradients at different positions, consistent with the formation and ordering of a relatively large number of boundaries throughout the anterior half of the embryo (Chen, 2012).

The close linkage between repressor sites and Bcd sites within discrete enhancers suggests that repression occurs via short-range interactions that interfere directly with Bcd binding or activation. Interestingly, Cic also shows repressive effects that seem to be binding site independent. For example some type I enhancers do not contain recognizable Cic sites, but their expression boundaries expand posteriorly in cic mutants. This could be caused by the reduced expression of run and Kr in cic mutants. However, genetically removing both Kr and run causes a less dramatic expansion than that seen in the absence of cic. This suggests that Cic binds these enhancers via suboptimal sites or that it is required for the correct patterning of another unknown repressor. Another possibility is that these expansions are caused indirectly by changing the balance of MAPK phosphorylation events that control terminal patterning (Chen, 2012).

These results do not strictly falsify the Bcd morphogen hypothesis, but they support the idea that the Bcd gradient can establish only a 'rough framework that is elaborated by the interaction of the zygotic segmentation genes'. What is the nature of this framework, and what role does it play in the network that precisely positions target gene boundaries (Chen, 2012)?

One component of the system, the Cic repression gradient, is maternally produced and formed by downregulation at the poles via the terminal patterning system. This gradient is formed independently of Bcd but is critical for establishing boundaries of Bcd-dependent target genes. In contrast, Bcd is involved in activating the expression patterns of run and Kr and in repressing them in anterior regions. Both run and Kr expand anteriorly in bcd mutants. There is no evidence that Bcd functions directly as a transcriptional repressor, so these repressive activities are probably indirect. Previous work showed that the Bcd target gene gt is involved in setting the anterior Kr boundary, and it is hypothesized that another Bcd target gene, slp1, encodes a forkhead domain (FKH) protein that sets the anterior boundary of the early run pattern. slp1 is expressed in a pattern reciprocal to the run pattern and was previously shown to position the anterior boundaries of several pair-rule gene stripes including run stripe 1 (Chen, 2012).

These results suggest that a major function of the Bcd gradient is the differential positioning of two repressors, Slp1 and Gt, which set the positions of the Run and Kr repression gradients, which then feed back to repress Bcd-dependent target genes. How are slp1 and gt differentially positioned? One possibility is that slp1 and gt enhancers respond to specific concentrations within the Bcd gradient, consistent with the original model for morphogen activity. However, the fact that the slp1 and gt expression domains form boundaries at the same positions in embryos lacking the Cic and Run repressors argues against this model for these genes (Chen, 2012).

It was also shown that Bcd target genes normally expressed in cephalic regions form and correctly position posterior boundaries in embryos containing flattened Bcd gradients. Run is still expressed in these embryos, specifically in a domain that consistently abuts the boundaries of the anterior Bcd target genes, regardless of copy number. This suggests that a mutually repressive interaction between Slp1 and Run is maintained in these embryos but does not explain how these boundaries are consistently oriented perpendicularly to the AP axis. The answer might lie in the fact that the flattened Bcd gradients in these embryos are not completely flat but are present as shallow gradients with slightly higher levels in anterior regions. In these embryos the slight changes in concentration along the AP axis might cause a bias that enables the orientation of the mutual repression interaction. In wild-type embryos, Bcd is much more steeply graded, which makes this bias stronger and the boundary between these mutual repressors more robust (Chen, 2012).

These results suggest that antagonistic repression precisely orders Bcd-dependent expression boundaries. However, repression may not be required for the activity of all morphogens. For example the extracellular signal activin has been shown to activate target genes in a threshold-dependent manner in isolated animal caps from frog embryos. Also, a gradient of the transcription factor Dorsal (Dl) is critical for setting boundaries between different tissue types along the dorsal-ventral (DV) axis of the fly embryo. It is thought that the major mechanism in Dl-specific patterning is threshold-dependent activation, which is quite different from the system described in this paper. One major difference between Bcd and Dl is the number of boundaries specified: three for Dl and more than ten for Bcd. It is proposed that the robust ordering of more boundaries simply requires a more complex system (Chen, 2012).

In general, though, it seems that antagonistic mechanisms are involved in controlling the establishment or interpretation of most morphogen activities. For example in the Drosophila wing disc, the TGF-β signal Dpp forms an activity gradient that is refined by interactions with multiple extracellular factors. Also, in vertebrates the signaling activity of the extracellular morphogen Sonic hedgehog (Shh) is affected by positive and negative interactions with specific molecules on the surfaces of receiving cells (Chen, 2012).

There is some evidence that transcriptional repression is also used for refining the patterning activities of extracellular molecules. Dpp acts as a long-range morphogen that activates two major target genes (optomotor blind [omb] and spalt [sal]) in nested patterns with boundaries at different positions with respect to the source of Dpp. Although these boundaries could in theory be formed by differential responses to the morphogen, it is clear that the transcriptional repressor Brinker (Brk), which is expressed in an oppositely oriented gradient, also plays an important role. The Brk gradient is itself positioned by Dpp activity in a manner analogous to positioning of the Run and Kr repressor gradients by Bcd. Also, a similar transcriptional network functions in Shh-mediated patterning of the vertebrate neural tube, where a series of spatially oriented repressors feeds back to limit the expression boundaries of Shh-mediated cell fate decisions (Chen, 2012).

Conceptually, these more complex systems are reminiscent of the reaction-diffusion model proposed by Turing, in which a localized activator would activate a repressor, which would diffuse more rapidly than the activator, and feed back on its activity. These systems strongly suggest that the patterning activity of a single monotonic gradient is insufficiently robust for establishing precise orders of closely positioned expression boundaries. By integrating gradients with repressive mechanisms that refine gradient shape or influence outputs, systems are generated that ensure consistency in body plan establishment while still maintaining the flexibility required for complex systems to evolve (Chen, 2012).

HOT regions function as patterned developmental enhancers and have a distinct cis-regulatory signature

HOT (highly occupied target) regions bound by many transcription factors are considered to be one of the most intriguing findings of the recent modENCODE reports, yet their functions have remained unclear. This study tested 108 Drosophila melanogaster HOT regions in transgenic embryos with site-specifically integrated transcriptional reporters. In contrast to prior expectations, 102 (94%) were found to be active enhancers during embryogenesis and to display diverse spatial and temporal patterns, reminiscent of expression patterns for important developmental genes. Remarkably, HOT regions strongly activate nearby genes and are required for endogenous gene expression, as was shown using bacterial artificial chromosome (BAC) transgenesis. HOT enhancers have a distinct cis-regulatory signature with enriched sequence motifs for the global activators Vielfaltig, also known as Zelda, and Trithorax-like, also known as GAGA. This signature allows the prediction of HOT versus control regions from the DNA sequence alone (Kvon, 2012).

Taken together, these data show that Drosophila HOT regions function as cell type-specific transcriptional enhancers to up-regulate nearby genes during early embryo development. In contrast to prior expectations, HOT enhancers display diverse spatial and temporal activity patterns, which are reminiscent of expression patterns of important developmental genes. It was further found that the activity of many HOT enhancers appears to be unrelated to the expression of the bound transcriptional activators, suggesting that neutral TF binding to HOT regions is frequent. Interestingly, for Twi, Kr, and five additional TFs, it was found that HOT enhancers with functional footprints of the TFs are significantly enriched in the TFs' motifs compared with HOT enhancers to which the TFs seem to bind neutrally (e.g., 2.2-fold for Twi). This supports previous suggestions that the recruitment of TFs to HOT regions might be independent of the TFs' motifs and mediated by protein-protein interactions or nonspecific DNA binding. This seems to be particularly true for (HOT) regions to which the TFs bind neutrally without impact on the regions' transcriptional enhancer activity (Kvon, 2012).

By uncovering a distinct cis-regulatory signature that is characteristic and predictive of HOT regions, computational analysis establishes a link between HOT regions, early embryonic enhancers (EEEs), and maternal TFs that are ubiquitously present in the early Drosophila embryo. Specifically, the results suggest that ZLD might be more generally important for the establishment of regulatory elements in the early embryo, while GAGA appears to be a distinguishing feature of HOT regions. This is supported by an analysis of genome-wide data on ZLD and GAGA binding in early Drosophila embryos: While 71.4% of HOT regions and 75.0% of EEEs are bound by ZLD (compared with 42.2% and 13.0% of control WARM and COLD regions), GAGA binds to 53.4% of HOT regions but only 20.0% of EEEs (compared with 28.3% and 7.8% for WARM and COLD regions). Even when considering only regions that are functioning as transcriptional enhancers in the early embryo (all EEEs from CAD and this study combined), GAGA binds to significantly more HOT enhancers than to enhancers that are not HOT. An instructive role for ZLD in defining chromatin that is open and accessible to other factors is further supported by its unusual property to bind to the majority (64%) of all occurrences of its sequence motif in the Drosophila genome. ZLD might thus be a prerequisite for both HOT regions and EEEs more generally. Similarly, a role for GAGA in nucleating or promoting the formation of TF complexes is consistent with its ability to self-oligomerize via its BTB/POZ domain and also form heteromeric complexes with the TF Tramtrack and potentially other BTB/POZ domain-containing TFs (e.g., Abrupt, Bric-a-brac, Broad complex, and others). GAGA, with its ability to recruit other TFs by protein-protein interactions, might contribute to HOT regions independent of the specific cellular or developmental context. Interestingly, C. elegans HOT regions are also strongly enriched in the GAGA motifs, and the motif is the most important sequence feature when classifying C. elegans HOT versus control regions. GAGA-like factors or their putative homologs or functional analogs across species might be a conserved feature of metazoan HOT regions (Kvon, 2012).

Precision of hunchback expression in the Drosophila embryo

Activation of the gap gene hunchback (hb) by the maternal Bicoid gradient is one of the most intensively studied gene regulatory interactions in animal development. Most efforts to understand this process have focused on the classical Bicoid target enhancer located immediately upstream of the P2 promoter. However, hb is also regulated by a recently identified distal shadow enhancer as well as a neglected 'stripe' enhancer, which mediates expression in both central and posterior regions of cellularizing embryos. This study employed BAC transgenesis and quantitative imaging methods to investigate the individual contributions of these different enhancers to the dynamic hb expression pattern. These studies reveal that the stripe enhancer is crucial for establishing the definitive border of the anterior Hb expression pattern, just beyond the initial border delineated by Bicoid. Removal of this enhancer impairs dynamic expansion of hb expression and results in variable cuticular defects in the mesothorax (T2) due to abnormal patterns of segmentation gene expression. The stripe enhancer is subject to extensive regulation by gap repressors, including Kruppel, Knirps, and Hb itself. It is proposed that this repression helps ensure precision of the anterior Hb border in response to variations in the Bicoid gradient (Perry, 2012).

hunchback (hb) is the premier gap gene of the segmentation regulatory network. It coordinates the expression of other gap genes, including Kruppel (Kr), knirps (kni), and giant (gt) in central and posterior regions of cellularizing embryos. The gap genes encode transcriptional repressors that delineate the borders of pair-rule stripes of gene expression. hb is activated in the anterior half of the precellular embryo, within 20-30 min after the establishment of the Bicoid gradient during nuclear cleavage cycles 9 and 10 (~90 min following fertilization). This initial hb mRNA transcription pattern exhibits a reasonably sharp on/off border within the presumptive thorax. This border depends on cooperative interactions of Bicoid monomers bound to linked sites in the proximal ('classical') enhancer. However, past studies and recent computational modeling suggest that Bicoid cooperativity is not sufficient to account for this precision in hb expression (Perry, 2012).

The hb locus contains two promoters, P2 and P1, and three enhancers. The 'classical' proximal enhancer and distal shadow enhancer mediate activation in response to the Bicoid gradient. Expression is also regulated by a third enhancer, the 'stripe' enhancer, which is located over 5 kb upstream of P2. Each of these enhancers was separately attached to a lacZ reporter gene and expressed in transgenic embryos. As shown previously, the Bicoid target enhancers mediate expression in anterior regions of nuclear cleavage cycle (cc) 12-13 embryos, whereas the stripe enhancer mediates two stripes of gene expression at later stages, during cc14. The anterior stripe is located immediately posterior to the initial hb border established by the proximal and distal Bicoid target enhancers (Perry, 2012).

BAC transgenesis was used to determine the contribution of the stripe enhancer to the complex hb expression pattern. For some of the experiments, the hb transcription unit was replaced with the yellow (y) reporter gene, which contains a large intron permitting quantitative detection of nascent transcripts. The resulting BAC mimics the endogenous expression pattern, including augmented expression at the Hb border. However, removal of the stripe enhancer from an otherwise intact y-BAC transgene leads to diminished expression at this border and in posterior regions (Perry, 2012).

The functional impact of removing the stripe enhancer was investigated by genetic complementation assays. A BAC transgene containing 44 kb of genomic DNA encompassing the entire hb locus and flanking regulatory DNAs fully complements deficiency homozygotes carrying a newly created deletion that cleanly removes the hb transcription unit. The resulting adults are fully viable, fertile, and indistinguishable from normal strains. Embryos obtained from these adults exhibit a normal Hb protein gradient, including a sharp border located between eve stripes 2 and 3 (Perry, 2012).

The Hb BAC transgene lacking the stripe enhancer fails to complement hb/hb mutant embryos due to the absence of the posterior hb expression pattern, which results in the fusion of the seventh and eighth abdominal segments. In addition, the anterior Hb domain lacks the sharp 'stripe' at its posterior limit, resulting in an anterior expansion of Even-skipped (Eve) stripe 3 because the Hb repressor directly specifies this border. There is also a corresponding shift in the position of Engrailed (En) stripe 5, which is regulated by Eve stripe 3. The narrowing of En stripes 4 and 5, due to the anterior shift of stripe 5, correlates with patterning defects in the mesothorax (Perry, 2012).

Quantitative measurements indicate significant alterations of the anterior Hb expression pattern upon removal of the stripe enhancer. There is an anterior shift at the midpoint of the mature pattern, spanning two to three cell diameters. This boundary normally occurs at 47.2% egg length (EL; measured from the anterior pole). In contrast, removal of the stripe enhancer shifts the boundary to 45.6% EL. The border also exhibits a significant diminishment in slope. Normally, there is a decrease in Hb protein concentration of 20% over 1% EL. Removal of the stripe enhancer diminishes this drop in concentration, with a reduction of just 10% over 1% EL. The most obvious qualitative change in the distribution of Hb protein is seen in regions where there are rapidly diminishing levels of the Bicoid gradient. Normally, the transition from maximum to minimal Hb levels occurs over a region of 10% EL (43%-53% EL). Removal of the stripe enhancer causes a significant expansion of this transition, to 26% EL (27%-53% EL). It is therefore concluded that the stripe enhancer is essential for shaping the definitive Hb border (Perry, 2012).
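
The two quantities compared above, the boundary midpoint and the width of the transition from maximal to minimal Hb levels, can be read off an intensity profile in a few lines. The Python sketch below is a hypothetical illustration and not the quantification used by Perry (2012). The logistic profiles are synthetic, their midpoints are set to the reported 47.2% and 45.6% EL, their steepness values are arbitrary, and the 90%/10% thresholds that define the transition are an assumption.

```python
import math


def crossing(positions, levels, target):
    """Position where a decreasing profile first falls through `target`.

    Uses linear interpolation between neighbouring sample points.
    """
    for i in range(len(positions) - 1):
        x0, x1 = positions[i], positions[i + 1]
        y0, y1 = levels[i], levels[i + 1]
        if y0 >= target > y1:
            return x0 + (y0 - target) * (x1 - x0) / (y0 - y1)
    raise ValueError("profile never falls through the target level")


def boundary_metrics(positions, levels, high_frac=0.9, low_frac=0.1):
    """Return (midpoint, transition width) in the units of `positions`."""
    top, bottom = max(levels), min(levels)
    span = top - bottom
    midpoint = crossing(positions, levels, bottom + 0.5 * span)
    x_high = crossing(positions, levels, bottom + high_frac * span)
    x_low = crossing(positions, levels, bottom + low_frac * span)
    return midpoint, x_low - x_high


def logistic_profile(positions, mid, steepness):
    """Synthetic anterior-high profile; larger steepness = shallower border."""
    return [1.0 / (1.0 + math.exp((x - mid) / steepness)) for x in positions]


positions = [i * 0.5 for i in range(201)]  # 0 to 100% egg length
intact = logistic_profile(positions, mid=47.2, steepness=1.0)    # arbitrary
deleted = logistic_profile(positions, mid=45.6, steepness=2.6)   # arbitrary

mid_intact, width_intact = boundary_metrics(positions, intact)
mid_deleted, width_deleted = boundary_metrics(positions, deleted)
shift = mid_intact - mid_deleted  # anterior shift in % EL
```

With these synthetic inputs the recovered shift is close to the 1.6% EL difference between the two reported midpoints, and the shallower profile yields the wider transition.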

The preceding studies suggest that the proximal and distal Bicoid target enhancers are not sufficient to establish the definitive Hb border at the onset of segmentation during cc14. Instead, the initial border undergoes a dynamic posterior expansion encompassing several cell diameters due to the action of the stripe enhancer. This enhancer is similar to the eve stripe 3+7 enhancer. Both enhancers mediate two stripes, one in central regions and the other in the posterior abdomen, and the two sets of stripes extensively overlap. Previous studies provide a comprehensive model for the specification of eve stripes 3 and 7, whereby the Hb repressor establishes the anterior border of stripe 3 and the posterior border of stripe 7 while the Kni repressor establishes the posterior border of stripe 3 and anterior border of stripe 7. Whole-genome chromatin immunoprecipitation (ChIP) binding assays and binding site analysis identify numerous Hb and Kni binding sites in the hb stripe enhancer, along with several Kr sites (Perry, 2012).

Site-directed mutagenesis was used to examine the function of gap binding sites in the hb stripe enhancer. Since the full-length, 1.4 kb enhancer contains too many binding sites for systematic mutagenesis, a 718 bp DNA fragment was identified that mediates weak but consistent expression of both stripes, particularly the posterior stripe. Mutagenesis of all ten Hb binding sites in this minimal enhancer resulted in a striking anterior expansion of the expression pattern. This observation suggests that the Hb repressor establishes the anterior border of the central stripe, as seen for eve stripe 3. There is no significant change in the posterior border of the central stripe or the anterior border of the posterior stripe, and repression persists in the presumptive abdomen (Perry, 2012).

Mutagenesis of the Kni binding sites resulted in expanded expression in the presumptive abdomen, similar to that seen for the eve 3+7 enhancer. More extensive derepression was observed upon mutagenesis of both the Kni and Kr binding sites. These results suggest that the Kr and Kni repressors establish the posterior border of the central Hb stripe and the anterior border of the posterior stripe. This derepressed pattern is virtually identical to the late hb expression pattern observed in Kr1;kni10 double mutants. The reliance on Kr could explain why the Hb central stripe is shifted anterior of eve stripe 3, which is regulated solely by Kni (Perry, 2012).

The dynamic regulation of the zygotic Hb expression pattern can be explained by the combinatorial action of the proximal, shadow, and stripe enhancers. The proximal and distal shadow enhancers mediate activation of hb transcription in response to the Bicoid gradient in anterior regions of cc10-13 embryos. The initial border of hb transcription is rather sharp, but the protein that is synthesized from this early pattern is distributed in a broad and shallow gradient, extending from 30% to 50% EL. During cc14 the stripe enhancer mediates transcription in a domain that extends just beyond the initial hb border. Gap repressors, including Hb itself, restrict this second wave of zygotic hb transcription to the region where there are rapidly diminishing levels of the Bicoid gradient, in a stripe that encompasses 44%-47% EL. The protein produced from the stripe enhancer is distributed in a sharp and steep gradient in the anterior thorax. It has been previously suggested that the steep Hb protein gradient is a direct readout of the broad Bicoid gradient. However, the current studies indicate that this is not the case. It is the combination of the Bicoid target enhancers and the hb stripe enhancer that produces the definitive pattern (Perry, 2012).

It has been proposed that Hb positive autofeedback is an important feature of the dynamic expression pattern. However, the mutagenesis of the hb stripe enhancer is consistent with past studies suggesting that Hb primarily functions as a repressor. The only clear-cut example of positive regulation is seen for the eve stripe 2 enhancer. Mutagenesis of the lone Hb-3 binding site results in diminished expression from a minimal enhancer. It was suggested that Hb somehow facilitates neighboring Bicoid activator sites, and attempts were made to determine whether a similar mechanism might apply to the proximal Bicoid target enhancer. The two Hb binding sites contained in this enhancer were mutagenized, but the resulting fusion gene mediates an expression pattern that is indistinguishable from the normal enhancer. It is therefore likely that the reduction of the central hb stripe in hb/hb embryos is the indirect consequence of expanded expression of other gap repressors, particularly Kr and Kni (Perry, 2012).

The hb stripe enhancer mediates expression in a central domain spanning 44%-47% EL, which coincides with the region exhibiting population variation in the distribution of the Bicoid gradient. Despite this variability, the definitive Hb border was shown to be relatively constant among different embryos. Previous studies suggest that the Kr and Kni repressors function in a partially redundant fashion to ensure the reliability of this border. This study presents evidence for direct interactions of these repressors with the hb stripe enhancer, and suggests that a major function of the enhancer is to 'dampen' the variable Bicoid gradient. Indeed, removal of this enhancer from an otherwise normal Hb BAC transgene results in variable patterning defects in the mesothorax, possibly reflecting increased noise in the Hb border (Perry, 2012).

Estimating binding properties of transcription factors from genome-wide binding profiles

The binding of transcription factors (TFs) is essential for gene expression. One important characteristic is the actual occupancy of a putative binding site in the genome. In this study, an analytical model is proposed to predict genomic occupancy that incorporates the preferred target sequence of a TF in the form of a position weight matrix (PWM), DNA accessibility data (in the case of eukaryotes), the number of TF molecules expected to be bound specifically to the DNA and a parameter that modulates the specificity of the TF. Given actual occupancy data in the form of ChIP-seq profiles, copy number and specificity are backwards inferred for five Drosophila TFs during early embryonic development: Bicoid, Caudal, Giant, Hunchback and Kruppel. The results suggest that these TFs display thousands of molecules that are specifically bound to the DNA and that whilst Bicoid and Caudal display a higher specificity, the other three TFs (Giant, Hunchback and Kruppel) display lower specificity in their binding (despite having PWMs with higher information content). This study gives further weight to earlier investigations into TF copy numbers that suggest a significant proportion of molecules are not bound specifically to the DNA (Zabet, 2014: 25432957).
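
The ingredients listed above (PWM score, DNA accessibility, number of specifically bound molecules, and a specificity parameter) can be combined in a generic statistical-thermodynamics form. The Python sketch below is a simplified illustration under stated assumptions and is not necessarily the formula used by Zabet (2014). The function names, the toy PWM, the example sequence, and the exact form of the occupancy expression are all hypothetical choices made for this example.

```python
import math


def pwm_score(pwm, seq):
    """Sum of per-position PWM weights for a sequence of the motif's width."""
    return sum(column[base] for column, base in zip(pwm, seq))


def occupancy_profile(sequence, pwm, accessibility, n_bound, lam):
    """Assumed occupancy at each start position on one strand.

    weight_j    = accessibility_j * exp(score_j / lam)
    occupancy_j = n_bound * weight_j / (n_bound * weight_j + sum of weights)

    `n_bound` stands for the number of specifically bound molecules and
    `lam` for the specificity parameter; in this sketch a smaller `lam`
    concentrates binding on the best-scoring sites.
    """
    width = len(pwm)
    n_sites = len(sequence) - width + 1
    weights = [
        accessibility[j] * math.exp(pwm_score(pwm, sequence[j:j + width]) / lam)
        for j in range(n_sites)
    ]
    background = sum(weights)
    return [n_bound * w / (n_bound * w + background) for w in weights]


# Toy three-column PWM preferring "TAA"; weights are invented for illustration.
toy_pwm = [
    {"A": -1, "C": -1, "G": -1, "T": 2},
    {"A": 2, "C": -1, "G": -1, "T": -1},
    {"A": 2, "C": -1, "G": -1, "T": -1},
]
sequence = "GGTAAGG"
open_chromatin = [1.0] * 5  # one accessibility value per start position

profile = occupancy_profile(sequence, toy_pwm, open_chromatin, n_bound=10, lam=1.0)
best_site = max(range(len(profile)), key=profile.__getitem__)
```

Fitting such a model to ChIP-seq data would mean searching for the values of `n_bound` and `lam` whose predicted profile best matches the measured one; that inference step is not shown here.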


Home page: The Interactive Fly © 1997 Thomas B. Brody, Ph.D.

The Interactive Fly resides on the
Society for Developmental Biology's Web server.