Giant

The giant promoter contains a TATA box (Capovilla, 1992).

A major challenge in interpreting genome sequences is understanding how the genome encodes the information that specifies when and where a gene will be expressed. The first step in this process is the identification of regions of the genome that contain regulatory information. In higher eukaryotes, this cis-regulatory information is organized into modular units [cis-regulatory modules (CRMs)] of a few hundred base pairs. A common feature of these cis-regulatory modules is the presence of multiple binding sites for multiple transcription factors. Transcription factor binding sites have a tendency to cluster; the extent to which they do can be used as the basis for the computational identification of cis-regulatory modules. Published DNA binding specificity data for five transcription factors active in the early Drosophila embryo were used to identify genomic regions containing unusually high concentrations of predicted binding sites for these factors. A significant fraction of these binding site clusters overlap known CRMs that are regulated by these factors. In addition, many of the remaining clusters are adjacent to genes expressed in a pattern characteristic of genes regulated by these factors. One of the newly identified clusters, mapping upstream of the gap gene giant (gt), was tested; it acts as an enhancer that recapitulates the posterior expression pattern of gt (Berman, 2002).

The transcription factors Bicoid (Bcd), Caudal (Cad), Hunchback (Hb), Krüppel (Kr), and Knirps (Kni) act at very early stages of Drosophila development to define the anterior-posterior axis of the embryo. Bcd and Cad are maternal activators broadly distributed in the anterior and posterior portions of the embryo, respectively. Hb, Kr, and Kni are zinc-finger gap proteins that act primarily as repressors in specific embryonic domains. Sequences of previously described binding sites were collected for these five factors present in the cis-regulatory regions of known target genes. The binding sequences for each factor were aligned by using the motif-assembly program, and the binding specificities of each factor were modeled with position weight matrices (PWMs). PWMs are a useful way to represent binding specificities and provide a statistical framework for searching for novel instances of the motif in genome sequences (Berman, 2002).

The freely available program PATSER was used to search the genome for sequences that match these PWMs, and a web-based visualization tool, CIS-ANALYST (http://www.fruitfly.org/cis-analyst/), was devised to display the location of predicted binding sites along with genome annotations in selected genomic regions. PATSER assigns a score to each potential site that reflects the agreement between the site and the corresponding PWM. These scores approximate the free energy of binding between the factor and site, and CIS-ANALYST uses a user-defined cutoff parameter to eliminate predicted low-affinity sites (Berman, 2002).
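The scoring step described above can be illustrated with a minimal Python sketch. It is not PATSER itself: the function names, the uniform background, the pseudocount, and the toy aligned sites are all assumptions made for illustration, and only the forward strand is scanned.

```python
import math

BASES = "ACGT"

def build_pwm(sites, background=None, pseudocount=1.0):
    """Build a log-odds PWM from equal-length aligned binding sites."""
    if background is None:
        # Assumption: uniform base composition; a real genome is not uniform.
        background = {b: 0.25 for b in BASES}
    pwm = []
    for pos in range(len(sites[0])):
        # Start each column from background-weighted pseudocounts.
        counts = {b: pseudocount * background[b] for b in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append({b: math.log2((counts[b] / total) / background[b])
                    for b in BASES})
    return pwm

def score_site(pwm, seq):
    """Sum the per-position log-odds weights for one candidate site."""
    return sum(column[base] for column, base in zip(pwm, seq))

def scan(pwm, sequence, cutoff):
    """Return (start, score) for every forward-strand window at or above cutoff."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        s = score_site(pwm, sequence[start:start + width])
        if s >= cutoff:
            hits.append((start, s))
    return hits

# Hypothetical aligned sites, invented for this example only.
toy_sites = ["TAATCC", "TAATCC", "TAATCT", "GAATCC"]
toy_pwm = build_pwm(toy_sites)
toy_hits = scan(toy_pwm, "GGGTAATCCGGG", cutoff=5.0)
```

Raising `cutoff` plays the role of the user-defined parameter that discards predicted low-affinity sites; a log-odds score of this kind is only a proxy for binding free energy.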

Using CIS-ANALYST, the distribution of Bcd, Cad, Hb, Kr, and Kni binding sites was examined in a 1-Mb genomic region surrounding the well-characterized eve locus at a site_p value of 0.0003. At this relatively high-stringency value, most experimentally verified binding sites are retained; at more restrictive values, many of these sites would be lost (Berman, 2002).

To investigate whether binding site clustering could help to explain the specificity of these factors for eve, a simple notion of binding site clustering was incorporated into CIS-ANALYST, allowing searches for segments of a specified length containing a minimum number of predicted binding sites. When the 1-Mb region surrounding eve was searched for dense clusters of predicted high-affinity sites (at least 13 Bcd, Cad, Hb, Kr, or Kni sites in a 700-bp window), three discrete regions were identified. Strikingly, these three clusters are all adjacent to eve, and overlap the previously characterized stripe 2, stripe 3+7, and stripe 4+6 enhancers (Berman, 2002).
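The clustering criterion (a minimum number of predicted sites within a window of fixed length) can be sketched as a sliding-window count over sorted site coordinates. This is an illustrative reconstruction, not the CIS-ANALYST implementation; the function name, the merging of overlapping windows, and the example coordinates are assumptions.

```python
def find_clusters(site_positions, window=700, min_sites=13):
    """Return merged (start, end) spans in which at least `min_sites`
    predicted sites fall within fewer than `window` bp of each other."""
    positions = sorted(site_positions)
    clusters = []
    left = 0
    for right, pos in enumerate(positions):
        # Advance the left edge until the span fits inside one window.
        while pos - positions[left] >= window:
            left += 1
        if right - left + 1 >= min_sites:
            start, end = positions[left], pos
            if clusters and start <= clusters[-1][1]:
                # Overlaps the previous qualifying span: extend it.
                clusters[-1] = (clusters[-1][0], end)
            else:
                clusters.append((start, end))
    return clusters

# Hypothetical coordinates: 13 sites spaced 50 bp apart plus two isolated sites.
example_sites = list(range(1000, 1650, 50)) + [5000, 9000]
example_clusters = find_clusters(example_sites)
```

Here sites for all factors are pooled into one coordinate list, matching the "Bcd, Cad, Hb, Kr, or Kni" criterion; tightening `min_sites` corresponds to the more stringent searches described in the text.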

To generalize and quantify these promising results, a broader collection of 19 well-defined CRMs from 9 Drosophila genes known to be required for proper embryonic development was compiled. Each of these CRMs is sufficient to direct the expression of a distinct anterior-posterior pattern in early embryos; genetic evidence suggests that each CRM is regulated by at least one of the following: Bcd, Cad, Hb, Kr, and Kni. Mutation and in vitro DNA binding studies completed on a subset of the CRMs provide evidence for a direct regulatory relationship. The same clustering criteria that were successful for identifying CRMs in eve (700-bp regions with at least 13 predicted binding sites) identified clusters overlapping 14 of these 19 known CRMs (Berman, 2002).

A search of the entire genome for 700-bp windows containing at least 13 predicted binding sites identified 133 clusters in addition to the 19 described above, or ~1 per 700 kb of noncoding sequence. As expected, when more stringent clustering criteria are used, both the number of known CRMs recovered and the number of novel clusters identified decrease. The novel clusters identified with a density of at least 15 binding sites per 700 bp, a level at which half of the known CRMs are still recovered, were further examined. Binding site plots for the 22 novel clusters identified at this high stringency condition, and 6 additional novel clusters identified with an equally stringent search by using only Bcd, Hb, Kr, and Kni, have been published as supporting information on the PNAS web site. Twenty-three of these 28 clusters fall in regions between genes, whereas the remaining 5 fall in introns. There are therefore 49 genes that either contain a novel cluster of binding sites or flank an intergenic region that does. The expression patterns of these 49 genes in early embryos were examined by whole-mount RNA in situ hybridization and DNA microarray hybridization. At least 10 of the 28 clusters were adjacent to a gene that showed localized anterior-posterior expression in the syncytial or cellular blastoderm stages, consistent with early regulation by maternal effect or gap transcription factors. Although the numbers are small, this is significantly more than the 1 or 2 expected if the positions of clusters had been chosen at random (Berman, 2002).
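To give a rough sense of how unlikely 10 of 28 is against a chance expectation of 1 or 2, a binomial tail probability can be computed. This is only a back-of-the-envelope sketch: the source does not state which test was used, and treating each cluster as an independent trial with chance probability 2/28 is an assumption made here.

```python
from math import comb

def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Assumption: 2 of 28 clusters expected by chance, the upper end of "1 or 2".
chance_rate = 2 / 28
p_value = binomial_tail(28, 10, chance_rate)
```

Under these assumptions the tail probability is very small, which is in line with the statement that the observed number is significantly higher than expected at random.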

One of these clusters is located ~2 kb upstream of the gap gene giant (gt). During cellularization, gt is expressed in two broad domains, one in the anterior and one in the posterior portion of the embryo. The pattern of expression of the posterior expression domain is determined by the activities of Cad, Hb, and Kr. However, the cis-regulatory sequence controlling this posterior expression pattern has not been precisely identified. Whether this cluster of binding sites might be the gt posterior enhancer was evaluated. A 1.1-kb fragment containing this cluster was placed in a reporter construct containing the eve minimal promoter fused to a lacZ reporter gene. The expression pattern of this construct largely recapitulates the early expression pattern of the gt posterior expression domain. In the absence of Kr function, the anterior border of the gt posterior domain shifts anteriorly, indicating repression by Kr. The construct containing the gt posterior enhancer exhibits a similar shift in the absence of Kr (Berman, 2002).

The maternal morphogen Bicoid (Bcd) is distributed in an embryonic gradient that is critical for patterning the anterior-posterior (AP) body plan in Drosophila. Previous work identified several target genes that respond directly to Bcd-dependent activation. Positioning of these targets along the AP axis is thought to be controlled by cis-regulatory modules (CRMs) that contain clusters of Bcd-binding sites of different 'strengths.' A combination of Bcd-site cluster analysis and evolutionary conservation has been used to predict Bcd-dependent CRMs. Fourteen predicted CRMs were tested by in vivo reporter gene assays; 11 showed Bcd-dependent activation, which brings the total number of known Bcd target elements to 21. Some CRMs drive expression patterns that are restricted to the most anterior part of the embryo, whereas others extend into middle and posterior regions. However, no strong correlation is detected between AP position of target gene expression and the strength of Bcd site clusters alone. Rather, binding sites for other activators, including Hunchback and Caudal, correlate with CRM expression in middle and posterior body regions. Also, many Bcd-dependent CRMs contain clusters of sites for the gap protein Krüppel, which may limit the posterior extent of activation by the Bcd gradient. It is proposed that the key design principle in AP patterning is the differential integration of positive and negative transcriptional information at the level of individual CRMs for each target gene (Ochoa-Espinosa, 2005).

In reporter gene assays, 11 of the 14 tested fragments directed expression patterns in wild-type embryos that recapitulate all or part of the endogenous patterns of the associated genes. These experiments identified several elements that control segmentation genes, including three new gap gene CRMs. Two CRMs were found in the genomic region that lies 5' of the gap gene gt. One CRM (gt23) is initially expressed in a broad anterior domain and then refines into two stripes. A second CRM (gt1) is expressed later in a small dorsal domain very near the anterior tip. Double stain experiments indicated that the timing and spatial regulation of both patterns are indistinguishable from the anterior expression domains of the endogenous gt gene. A CRM 3' of the gap gene tll was identified that drives expression similar to the anterior tll domain (Ochoa-Espinosa, 2005).

Four novel CRMs were identified near known pair rule genes. One CRM was detected in the 3' region of hairy and drives expression of a small anterior dorsal domain similar to the hairy 0 stripe of the endogenous gene. Another CRM is located 3' of the paired gene and directs expression of an early broad domain that coincides with the later position of the native paired stripes 1 and 2. Two more CRMs (slpA and slpB) were identified in the slp locus, which contains the two related genes, slp1 and slp2. Both slpA and slpB faithfully reproduce parts of the early slp1 and slp2 expression patterns (Ochoa-Espinosa, 2005).

Four other CRMs were identified near the genes bowl, CG9571, D/fsh, and bl/Mir7. In three cases (bowl, CG9571, and D/fsh), the newly identified CRMs direct patterns similar to their associated endogenous genes. The final CRM (bl/Mir7) is located in the sixth intron of the bl gene and directs a strong anterior domain of expression. However, the endogenous bl gene is expressed nearly ubiquitously, which makes it an unlikely target of regulation by this CRM. One potential target of this element is the microRNA gene (Mir7), which is located 7 kb downstream in the eighth intron of bl. Four of the CRMs reported here (gt1, gt23, slpA, and D/fsh) were also identified in a recent genome-wide search for new patterning elements based on clusters of combinations of different binding sites including Bcd. The fragments used in that study were significantly larger in size but show very similar patterns to those in this study (Ochoa-Espinosa, 2005).

Transcriptional Regulation

The anterior stripe of giant is activated by Bicoid (Eldon, 1991). Caudal also serves to activate the anterior stripe: removal of zygotic caudal expression removes anterior giant expression (Schulz, 1995).

Most of the thoracic and abdominal segments of Drosophila are specified early in embryogenesis by the overlapping activities of the gap genes hunchback, Krüppel, knirps, and giant. The orderly expression of these genes depends on two maternal determinants: Bicoid, which activates hb anteriorly, and Nanos, which blocks translation of hb posteriorly. The resulting gradient of HB protein dictates where Krüppel, knirps, and giant genes are expressed by providing a series of concentration thresholds that regulate each gene independently (Struhl, 1992). Subsequent studies point to the importance of bicoid and caudal in activating both the anterior and posterior stripes of giant. Posterior gt expression is almost completely abolished in embryos lacking maternal and zygotic caudal. In the absence of bicoid and zygotic caudal, gt expression is weak (Rivera-Pomar, 1995).

caudal is both maternally and zygotically expressed in Drosophila whereby the two phases of expression can functionally replace each other. The zygotic expression under the control of the HB repressor forms an abdominal and a posterior domain. The cad domain functions by activating the expression of the abdominal gap genes knirps (kni) and giant (gt) (Schulz, 1995).

Krüppel represses giant in the central domain, thus assuring separated anterior and posterior expression (Kraut, 1991b).

Anteroposterior polarity of the Drosophila embryo is initiated by the localized activities of the maternal genes, bicoid and nanos, which establish a gradient of the Hunchback morphogen. nanos determines the distribution of the maternal HB protein by regulating its translation. To identify further components of this pathway, suppressors of nanos have been isolated. In the absence of nanos, high levels of HB protein repress the abdomen-specific genes knirps and giant. In suppressor-of-nanos mutants, knirps and giant are expressed in spite of high HB levels. The suppressors are alleles of Enhancer of zeste (E[z]), a member of the Polycomb group (PC-G) of genes. E(z), and likely other PC-G genes, are required for maintaining the expression domains of knirps and giant initiated by the maternal HB protein gradient. A small region of the knirps promoter mediates the regulation by E(z) and HB. Because PC-G genes are thought to control gene expression by regulating chromatin, it is proposed that imprinting at the chromatin level underlies the determination of anteroposterior polarity in the early embryo (Pelegri, 1994).

The requirements for the multi sex combs (mxc) gene during development have been examined to gain further insight into the mechanisms and developmental processes that depend on the important trans-regulators forming the Polycomb group (PcG) in Drosophila. Although mxc has not yet been cloned, it is known to be allelic with the tumor suppressor locus lethal (1) malignant blood neoplasm [l(1)mbn]. The mxc product is dramatically needed in most tissues because its loss leads to cell death after a few divisions. mxc also has a strong maternal effect. Hypomorphic mxc mutations are found to enhance other PcG gene mutant phenotypes and cause ectopic expression of homeotic genes, confirming that PcG products are cooperatively involved in repression of selector genes outside their normal expression domains. The mxc product is needed for imaginal head specification, through regulation of the ANT-C gene Deformed. This analysis reveals that mxc is involved in the maternal control of early zygotic gap gene expression known to involve some other PcG genes and suggests that the mechanism of this early PcG function could be different from the PcG-mediated regulation of homeotic selector genes later in development (Saget, 1998).

Induction of uncontrolled growth and deregulation of Hox genes are linked in mammals, where Hox products can induce leukemia. In Drosophila, modification of homeotic gene expression causes homeosis, sometimes associated with increased proliferation but not with uncontrolled tumorous growth, possibly because the identity of each segment is specified by a combination of HOM products. Loss or gain of one HOM gene will likely lead to a new combination that is found elsewhere in wild type, and cells expressing this combination could be expected to follow the corresponding developmental pathway and give rise to homeotic transformations. However, because each cellular identity apparently corresponds to a given proliferation rate, loss or ambiguity of identity due to deregulation of several selector genes in a single cell, such as mxc mutations apparently induce, could lead to loss of proliferation control. Identification of mxc partners and targets, as well as of the molecular nature of the mxc product, may help throw light on the genes and mechanisms involved in this process (Saget, 1998).

It has been proposed that certain PcG genes are required for the maintenance of the expression domains of knirps and giant, through a mechanism similar to the regulation of homeotic genes. The regionalization of the Drosophila embryo depends on the maternally supplied products of bicoid (bcd), hunchback (hb), and nanos (nos). Nos represses the translation of the maternal HB mRNA in the posterior embryonic region. This permits the expression of the zygotic gap genes knirps (kni) and giant (gt), which specify posterior identities. These genes would otherwise be repressed by Hb. Embryos from nos/nos mothers form no abdominal segments, but this phenotype can be rescued by a total lack of hb in the maternal germline. It can also be dominantly rescued by the mutation of maternally supplied regulator molecules that normally repress kni and gt in the zygote. Pelegri and Lehmann (1994) have shown that certain mutant products of the PcG genes E(z), Psc, and pleiohomeotic can partially rescue nos by such a maternal effect. To determine if mutation of mxc also affects this regulation, the cuticles of embryos were examined from mxc/+;hb nos/nos mothers that were heterozygous for different mxc mutations. This genetic background was used because a decrease in the amount of maternal hb product can partially rescue the nos phenotype in F1 embryos. Such embryos can differentiate a few abdominal denticle belts and form an adequate background to evaluate increased rescue of nos. Thus loss-of-function PcG mutations should have a strong effect on rescue, and the embryos from hb nos/nos mothers that have two PcG mutations in their genetic background should permit increased rescue of the nos phenotype (Saget, 1998).

Any of three E(z)^son (suppressor of nanos) alleles or a hypomorphic pleiohomeotic allele partially rescue the phenotypes of hb nos/nos progeny by a maternal effect; deficiencies covering E(z) or the Psc/Su(z)2 complex also allow some maternal rescue of hb nos/nos progeny, yet the strongest effect is observed with the gain-of-function E(z)^son alleles. The EMS-induced allele mxc^G48 rescues the hb nos/nos progeny phenotype, whereas a deficiency of mxc does not. Some rescue with the Psc/Su(z)2 complex deletion Df(2)vgB is also observed and strong rescue (consistently >50%) is observed with an EMS-induced pleiohomeotic allele pho^b, described as amorphic. This suggests that pho^b and mxc^G48 are probably not amorphic alleles, and that maternal rescue of hb nos/nos progeny by a PcG gene is most efficient with a non-null mutation (Saget, 1998).

Segmentation of embryos from transheterozygous mothers was also examined. Because neither a reduction of wild-type PcG product nor two PcG mutations in trans in the hb nos/nos mothers increases nos rescue, these data strongly suggest that, whatever the mechanism of gap gene regulation by these PcG mutations may be, it does not function like the PcG-mediated maintenance of homeotic gene expression in embryos and in imaginal discs. The strong rescue provided by several non-null EMS-induced mutations, which may produce mutant proteins, leads to a proposal that modified PcG proteins are poisoning a normal process. How this process depends on wild-type regulation by PcG products has yet to be established (Saget, 1998).

Chip is a homolog of the recently discovered mouse Nli/Ldb1/Clim-2 and Xenopus Xldb1 proteins, which bind nuclear LIM domain proteins. Chip protein interacts with the LIM domains in the Apterous homeodomain protein, and Chip interacts genetically with apterous, showing that these interactions are important for Apterous function in vivo. Importantly, Chip also appears to have broad functions beyond interactions with LIM domain proteins. Chip is present in all nuclei examined and at numerous sites along the salivary gland polytene chromosomes. Embryos without Chip activity lack segments and show abnormal gap and pair-rule gene expression, although no LIM domain proteins are known to regulate segmentation. Lack of active Chip affects Giant more severely than the other gap proteins. In wild-type precellular and early cellular blastoderm embryos, Gt is restricted to two broad bands, whereas in Chip germ-line clone embryos, Gt is expressed at low to moderate levels in the entire embryo, including the pole cells. In later stages, Gt expression is similar to wild type. Ectopic expression of Gt in embryos lacking active Chip can explain the decreased expression of Kr and Kni because Gt represses Kr and kni. It is conceivable that abnormal Gt expression also weakens Eve stripe 2. It is concluded that Chip is a ubiquitous chromosomal factor required for normal expression of diverse genes at many stages of development. It is suggested that Chip cooperates with different LIM domain proteins and other factors to structurally support remote enhancer-promoter interactions (Morcillo, 1997).

Cooperative interactions by DNA-binding proteins have been implicated in cell-fate decisions in a variety of organisms. To date, however, there are few examples in which the importance of such interactions has been explicitly tested in vivo. This study tests the importance of cooperative DNA binding by the Bicoid protein in establishing a pattern along the anterior-posterior axis of the early Drosophila embryo. bicoid mutants specifically defective in cooperative DNA binding fail to direct proper development of the head and thorax, leading to embryonic lethality. The mutants do not faithfully stimulate transcription of downstream target genes such as hunchback (hb), giant, and Krüppel. Quantitative analysis of gene expression in vivo indicates that bcd cooperativity mutants are unable to accurately direct the extent to which hb is expressed along the anterior-posterior axis; they display a reduced ability to generate sharp on/off transitions for hb gene expression. These failures in precise transcriptional control demonstrate the importance of cooperative DNA binding for embryonic patterning in vivo (Lebrecht, 2005).

The Drosophila gap gene network is composed of two parallel toggle switches

Drosophila gap genes provide the first response to maternal gradients in the early fly embryo. Gap genes are expressed in a series of broad bands across the embryo during the first hours of development. The gene network controlling the gap gene expression patterns includes inputs from maternal gradients and mutual repression between the gap genes themselves. In this study a modular design is proposed for the gap gene network, involving two relatively independent network domains. The core of each network domain includes a toggle switch corresponding to a pair of mutually repressive gap genes, operated in space by maternal inputs. The toggle switches present in the gap network are evocative of the phage lambda switch, but they are operated positionally (in space) by the maternal gradients, so the synthesis rates for the competing components change along the embryo anterior-posterior axis. A dynamic model constructed on the proposed principle, with elements of fractional site occupancy, required 5-7 parameters to fit quantitative spatial expression data for gap gradients. The identified model solutions (parameter combinations) reproduced major dynamic features of the gap gradient system and explained gap expression in a variety of segmentation mutants (Papatsenko, 2011).

Fertilized eggs of Drosophila contain several spatially distributed maternal determinants (morphogen gradients) that initiate spatial patterning of the embryo. One of the first steps of Drosophila embryogenesis is the formation of several broad gap gene expression patterns within the first 2 hrs of development. Gap genes are regulated by the maternal gradients, so their expression appears to be hardwired to the spatial (positional) cues provided by the maternal gradients; in addition, gap genes are involved in mutual repression. How the maternal positional cues and the mutual repression contribute to the formation of the gap stripes has been a subject of active discussion (Papatsenko, 2011).

Accumulated genetic evidence and results of quantitative modeling suggest the occurrence of maternal positional cues (position-specific activation potentials), contributing to spatial expression of four trunk gap genes: knirps (kni), Krüppel (Kr), hunchback (hb) and giant (gt). Existing data suggest that the central Knirps domain stripe is largely the result of activation by Bicoid (Bcd) and repression by Hunchback. The central-domain Krüppel stripe is the result of both activation and repression by Hunchback, which acts as a dual transcriptional regulator on Kr. Hunchback is one of the most intriguing among the segmentation genes. Maternal hb mRNA is deposited uniformly, but its translation is limited to the anterior; zygotic anterior expression of hb is under the control of Bcd and Hb itself. Zygotic posterior expression of Hunchback (not included in the current model) is under the control of the terminal torso signaling system. Giant is activated by opposing gradients of Bicoid and Caudal and initially expressed in a broad domain, which refines later into anterior and posterior stripes. This late pattern appears to be the consequence of Krüppel repression (Papatsenko, 2011).

Predicting the functional properties of a gene network combining even a dozen genes may be a difficult task. To facilitate functional exploration, gene regulatory networks are often split into network domains or smaller units, network motifs with known or predictable properties. Models based on network motifs can explain the dynamics of developmental gradients and even the evolution of gradient systems and their underlying gene regulatory networks. The gene network that forms the spatial gap gene expression patterns is an example in which simple logic has lagged far behind the system's complexity. Gap genes provide the first response to maternal gradients in the early fly embryo and form a series of broad stripes of gene expression in the first hours of embryo development. While the system has been extensively studied in the past two decades both in vivo and in silico, a simple and comprehensive model explaining the function of the entire network has been missing (Papatsenko, 2011).

In the current study, a modular design has been proposed for the gap gene network; the network has been represented as two similar parallel modules (or two sub networks). Each module involved three network motifs: two for maternal inputs (one per gap gene) and a toggle switch describing mutual repression in the pair of gap genes. Formally, the toggle switches present in the gap gene network are evocative of the bistable phage lambda switch; however, they are operated by maternal inputs and their steady state solutions depend on spatial position in the embryo, not environmental variables. The proposed modular design accommodated 5-7 realistic parameters and reproduced major known features of the gap gene network (Papatsenko, 2011).
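The idea of a toggle switch operated by position can be sketched with a generic pair of mutually repressive genes whose synthesis rates depend on location along the axis. This is not the published fractional-site-occupancy model: the Hill-type repression term, the exponential input gradients, and every parameter value below are assumptions chosen only to illustrate the principle.

```python
import math

def simulate_toggle(x, steps=5000, dt=0.01, K=0.3, n=2, decay=1.0, lam=0.3):
    """Euler-integrate a two-gene toggle switch at relative position x (0 to 1).

    Gene A is driven by a hypothetical anterior input, gene B by a
    hypothetical posterior input; each product represses the other.
    """
    s_a = math.exp(-x / lam)          # assumed anterior-high synthesis rate
    s_b = math.exp(-(1.0 - x) / lam)  # assumed posterior-high synthesis rate
    a = b = 0.0
    for _ in range(steps):
        da = s_a / (1.0 + (b / K) ** n) - decay * a
        db = s_b / (1.0 + (a / K) ** n) - decay * b
        a += dt * da
        b += dt * db
    return a, b

anterior_a, anterior_b = simulate_toggle(0.1)
posterior_a, posterior_b = simulate_toggle(0.9)
```

Because the synthesis rates change with `x`, the switch settles into the A-dominated state near the anterior end and the B-dominated state near the posterior end, which is the sense in which the maternal inputs operate the switch in space.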

Mapping Polycomb Response Elements at the Drosophila melanogaster giant Locus

Polycomb-group (PcG) proteins are highly conserved epigenetic transcriptional regulators. They are capable of either maintaining the transcriptional silence of target genes through many cell cycles or enabling a dynamic regulation of gene expression in stem cells. In Drosophila melanogaster, recruitment of PcG proteins to targets requires the presence of at least one Polycomb Response Element (PRE). Although the sequence requirements for PREs are not well defined, the presence of Pho, a PRE-binding PcG protein, is a very good PRE indicator. This study identified two PRE-containing regions at the PcG target gene, giant, one at the promoter and another approximately 6 kb upstream. PRE-containing fragments, which coincide with localized presence of Pho in chromatin immunoprecipitations, were shown to maintain restricted expression of a lacZ reporter gene in embryos and to cause pairing-sensitive silencing of the mini-white gene in eyes. The results also reinforce previous observations that, although PRE maintenance and pairing-sensitive silencing activities are closely linked, the sequence requirements for these functions are not identical (Alhaj Abed, 2013).

Targets of Activity

Ectopically expressed giant represses the expression of both the Krüppel and knirps segmentation gap genes. An analysis of the interactions between Krüppel, knirps and giant reveals a network of negative regulation. The apparent positive regulation of knirps by Krüppel is in fact mediated by a negative effect of Krüppel on giant and a negative effect of Giant on knirps (Capovilla, 1992).

A 480 bp region of the even-skipped promoter is both necessary and sufficient to direct a stripe of LacZ expression within the limits of the endogenous eve stripe 2. The maternal morphogen Bicoid and the gap proteins Hunchback, Krüppel and Giant all bind with high affinity to closely linked sites within this small promoter element. GT is directly involved in the formation of the anterior border, although additional repressors may participate in this process (Small, 1992).

The entire functional even-skipped locus of Drosophila is contained within a 16 kilobase region. As a transgene, this region is capable of rescuing even-skipped mutant flies to fertile adulthood. Detailed analysis of the 7.7 kb of regulatory DNA 3' of the transcription unit reveals ten novel, independently regulated patterns. Most of these patterns are driven by non-overlapping regulatory elements, including ones for syncytial blastoderm stage stripes 1 and 5, while a single element specifies both stripes 4 and 6. Expression analysis in gap gene mutants shows that stripe 5 is restricted anteriorly by Krüppel and posteriorly by giant, the same repressors that regulate stripe 2. Consistent with the coregulation of stripes 4 and 6 by a single cis-element, both the anterior border of stripe 4 and the posterior border of stripe 6 are set by zygotic hunchback, and the region between the two stripes is carved out by knirps. Thus the boundaries of stripes 4 and 6 are set through negative regulation by the same gap gene domains that regulate stripes 3 and 7, but at different concentrations (Fujioka, 1999).

The striped expression pattern of the pair-rule gene even skipped (eve) is established by five stripe-specific enhancers, each of which responds in a unique way to gradients of positional information in the early Drosophila embryo. The enhancer for eve stripe 2 (eve 2) is directly activated by the morphogens Bicoid (Bcd) and Hunchback (Hb). Since these proteins are distributed throughout the anterior half of the embryo, formation of a single stripe requires that enhancer activation is prevented in all nuclei anterior to the stripe 2 position. The gap gene giant (gt) is involved in a repression mechanism that sets the anterior stripe border, but genetic removal of gt (or deletion of Gt-binding sites) causes stripe expansion only in the anterior subregion that lies adjacent to the stripe border. A well-conserved sequence repeat, (GTTT)₄ has been identified that is required for repression in a more anterior subregion. This site is bound specifically by Sloppy-paired 1 (Slp1), which is expressed in a gap gene-like anterior domain. Ectopic Slp1 activity is sufficient for repression of stripe 2 of the endogenous eve gene, but is not required, suggesting that it is redundant with other anterior factors. Further genetic analysis suggests that the (GTTT)₄-mediated mechanism is independent of the Gt-mediated mechanism that sets the anterior stripe border, and suggests that a third mechanism, downregulation of Bcd activity by Torso, prevents activation near the anterior tip. Thus, three distinct mechanisms are required for anterior repression of a single eve enhancer, each in a specific position. Ectopic Slp1 also represses eve stripes 1 and 3 to varying degrees, and the eve 1 and eve 3+7 enhancers each contain GTTT repeats similar to the site in the eve 2 enhancer. These results suggest a common mechanism for preventing anterior activation of three different eve enhancers (Andrioli, 2002).

Previous experiments suggested that the gap gene gt and the Gt-binding sites are required for the correct positioning of the anterior eve 2 border. To test the relationship between the (GTTT)₄-binding activity and Gt-mediated repression, the eve2Delta(GTTT)₄-lacZ construct was crossed into a gt mutant background. If the two repression mechanisms are independent, an additive effect would be expected from combining these two perturbations. If, however, Gt-mediated repression is partially redundant with the (GTTT)₄-binding activity, removing both might cause a more severe derepression. In this cross there is an anterior shift and slight expansion of stripe 2 that is similar to the effects on the wild-type eve 2 transgene in gt mutants. No new effect is detected on the band of derepression created by deleting the (GTTT)₄ site, and a small repressed area is still maintained between the two parts of the pattern. This result is consistent with an additive effect, and suggests that the (GTTT)₄-binding activity functions independently of Gt-mediated repression. The failure to derepress in the region between the two parts of the pattern probably reflects the activity of the unknown protein X, which normally participates with Gt in repression (Andrioli, 2002). To further test the roles of Slp1 and Gt in eve patterning, a fragment of the snail (sna) promoter and the yeast FLP-FRT system were used to drive ectopic domains of each gene along the ventral surface of the embryo. This method is an efficient way to test whether any gene is sufficient for repression of individual stripes because the ventral expression domain intersects all seven eve stripes. In this assay, Slp1 expression alone distorts the expression of eve 1 in ventral regions by shifting it posteriorly, and causes a strong repression of eve 2 and a weaker repression of eve 3. By contrast, there is no detectable effect on the posterior stripes. 
Thus, Slp1 activity is sufficient for repression of specific anterior stripes including eve 2. By contrast, ventrally misexpressed Gt causes only a weak repression of eve 1 and 2, but strongly affects eve 5, a repression target of the posterior gt expression domain. The minor effect of Gt on eve 2 is transient, and the stripe recovers and expands posteriorly later in cycle 14. This expansion is probably caused by repression of Kr, which forms the posterior border of eve 2. These results confirm that Gt is not sufficient for effective repression of eve 2, and that its effect is much weaker than Slp1-mediated repression. Embryos that contain ventral expression domains of both Slp1 and Gt were constructed. While effects of both genes are detected within the same embryos, there is no evidence of synergistic repression activity in these experiments. This is consistent with the demonstration that the (GTTT)₄ site is independent of Gt-mediated repression (Andrioli, 2002).

A genetic and molecular analysis of two Hairy (H) pair-rule stripes has been carried out in order to determine how gradients of gap proteins position adjacent stripes of gene expression in the posterior of Drosophila embryos. Regulatory sequences of hairy have been identified that are critical for the expression of h stripes 5 and 6. Fragments of 302 bp and 526 bp are required for stripe 5 and 6 activation, respectively. Posterior stripe boundaries are established by gap protein repressors unique to each stripe: the h stripe 5 posterior border is repressed by the Giant protein and the h stripe 6 posterior border is repressed by the Hunchback protein (Langeland, 1994).

Giant regulates fushi tarazu in two regions: stripes 1 and 2 and stripes 5 and 6 (Reinitz, 1990).

Giant regulates the establishment of the expression patterns of Antennapedia and Abdominal-B. In particular, Giant is the factor that controls the anterior limit of early Antennapedia expression (Reinitz, 1990).

Analysis of the initial paired expression suggests that the gap genes hunchback, Krüppel, knirps and giant activate paired expression in stripes. Two exceptions are noted: stripe 1 is activated by even-skipped, and stripe 8 depends upon runt (Gutjahr, 1993).

Early developmental patterning of the Drosophila embryo is driven by the activities of a diverse set of maternally and zygotically derived transcription factors, including repressors encoded by gap genes such as Krüppel, knirps, and giant and the mesoderm-specific snail gene. At a molecular level, the mechanism of repression by gap transcription factors is not well understood. Initial characterization of these transcription factors suggests that they act as short-range repressors, interfering with the activity of enhancer or promoter elements 50 to 100 bp away. To better understand the molecular mechanism of short-range repression, the properties of the Giant gap protein have been investigated. The ability of endogenous Giant to repress when bound close to the transcriptional initiation site was tested. Tandem 'CD1' Giant binding sites derived from the Krüppel (Kr) promoter were inserted 5' of a transposase basal promoter (55 bp from the initiation of transcription of the lacZ reporter gene). Strong repression of the lacZ gene is detected in embryos bearing this transgene; expression of the lacZ gene in both anterior and posterior Giant regions is strongly attenuated. The repression of lacZ in anterior and posterior regions is relieved when the transgene is assayed in gt mutant embryos, confirming that the Giant protein is mediating the repression. Giant effectively represses a heterologous promoter when binding sites are located at -55 bp with respect to the start of transcription. Consistent with its role as a short-range repressor, as the binding sites are moved to more distal locations, repression is diminished (Hewitt, 1999).

It is probable that more distally located repressor sites (gt-110 and gt-160, located respectively 110 and 160 bp from the start of transcription) may be effective only at higher concentrations of Giant because the binding sites may not be filled effectively. A partially filled site may still be effective at close range because, when Giant is close to the promoter, chance interactions between Giant and its target will occur more frequently owing to the short diffusion distance, obviating the need for saturation of the binding site by Giant. At a greater distance, these chance interactions would be less frequent, leading to weak repression. Partial filling of binding sites may simply be an indication that the levels of Giant protein are below the Kd for the binding site, so the site is empty for a fraction of the time. Alternatively, even identical sites may not bind Giant protein equally well; Giant may bind cooperatively to its cognate sites with a target protein, so that moving the sites farther from the target may break repressor-target cooperative interactions. Raising the level of Giant protein (moving up the gradient) would suffice to give greater occupancy of Giant sites, and reestablish repression. Quantitation of relative Giant protein levels in embryos suggests that differences in repressor protein levels of less than twofold are sufficient to switch a gene from an active to an inactive state. Similar cooperative effects have been reported for activators, including the Bicoid activator.
The mechanism whereby endogenous promoters might differentially respond to different concentrations of repressors is not known, but the results obtained from this study suggest that exact placement of short-range repressors with respect to other promoter elements and the number of binding sites might suffice to endow a promoter with high or low sensitivity. Additional factors, such as binding site affinity, may also contribute to differential promoter sensitivity toward repression. For example, a binding site located within the eve stripe 2 enhancer was found to bind Giant protein less well than a site in the Kr promoter: consistent with this finding, eve has been found to be less sensitive to Giant repression than Kr. Thus, in addition to binding site affinity and number, cis element positioning within a promoter can affect the response of a gene to a repressor gradient (Hewitt, 1999 and references).
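
The occupancy argument above can be illustrated with a simple equilibrium binding model. The following is a minimal sketch and not the analysis used in the study: it assumes a single site whose fractional occupancy is [Giant] / ([Giant] + Kd), and it uses a Hill coefficient as a hypothetical stand-in for cooperative binding. The function name, the concentrations, and the Kd value are illustrative and in arbitrary units.

```python
def occupancy(conc, kd, hill=1.0):
    """Fractional occupancy of a binding site at equilibrium.

    hill=1.0 is simple non-cooperative binding; hill>1 is a hypothetical
    stand-in for cooperative binding (an assumption, not from the study).
    """
    if conc < 0 or kd <= 0:
        raise ValueError("conc must be >= 0 and kd must be > 0")
    return conc**hill / (conc**hill + kd**hill)


if __name__ == "__main__":
    kd = 1.0  # arbitrary units; illustrative value only
    for hill in (1.0, 4.0):
        low = occupancy(0.7 * kd, kd, hill)
        high = occupancy(1.4 * kd, kd, hill)  # twofold more repressor
        print(f"hill={hill}: occupancy {low:.2f} -> {high:.2f}")
```

With these illustrative numbers, a twofold increase in repressor concentration raises occupancy from about 0.41 to 0.58 without cooperativity, and from about 0.19 to 0.79 with a Hill coefficient of 4. This shows how a change of less than twofold around the Kd could, in principle, act as a switch when binding is cooperative.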

A chimeric Gal4-Giant protein lacking the basic leucine zipper domain can specifically repress reporter genes, suggesting that the Giant effector domain is an autonomous repression domain. The three Giant sites in the even-skipped promoter cover large regions and do not closely resemble the compact sites from the Kr promoter. Giant is likely to interact as a homodimer with the Kr promoter, and as a heterodimer with the complex eve stripe 2 enhancer sites, using an as-yet unidentified partner basic leucine zipper protein. Such a gene product may be encoded by a locus identified on the left arm of chromosome 2, mutations in which can cause a gt-like phenotype. Consistent with the possibility that Giant protein repression may involve other factors, recent experiments suggest that eve might be co-regulated by a Giant partner protein localized to the anterior portion of the embryo (Wu, 1998). However, the activity of the Gal4-Giant protein on a variety of activators shows Giant can act as a homodimer on reporter genes (the Gal4 DNA binding domain binds to its cognate site as a dimer). If Giant binds to DNA as a heterodimer in the embryo, the partner protein may serve to modulate DNA binding rather than to effect transcriptional repression (Hewitt, 1999 and references).

The analysis of transgenes with successively greater repressor-promoter spacing demonstrates that short-range repression can act on more than an all-or-nothing basis; however, the activity of these repressors is sharply attenuated over relatively short distances. It has been suggested that short-range repression may directly target adjacent transcription factors in a process akin to quenching; in this case, Giant might alternately quench transcription activators or the basal machinery, depending on the location of the binding site. Such lack of specificity may be consistent with quenching acting through chromatin modification on an extremely local level, as has been reported for Ume6-mediated repression in yeast. Alternatively, repression by direct targeting of the basal transcription machinery is possible, as has been reported for the Krüppel and Even-Skipped proteins. Cofactors such as dCtBP are implicated in the activity of some short-range repressors, such as Knirps, Krüppel and Snail, but apparently not Giant. Further molecular characterization of Giant repression activity will be necessary to distinguish between these alternatives (Hewitt, 1999 and references).

The gene proboscipedia (pb) is a member of the Antennapedia complex in Drosophila and is required for the proper specification of the adult mouthparts. In the embryo, pb expression serves no known function despite having an accumulation pattern in the mouthpart anlagen that is conserved across several insect orders. Several of the genes necessary to generate this embryonic pattern of expression have been identified. These genes can be roughly split into three categories based on their time of action during development. (1) Prior to the expression of pb, the gap genes are required to specify the domains where pb may be expressed. (2) The initial expression pattern of pb is controlled by the combined action of the genes Deformed (Dfd), Sex combs reduced (Scr), cap'n'collar (cnc), and teashirt (tsh). cnc and tsh act as negative regulators of pb expression in the mandible and first thoracic segments, respectively. (3) Maintenance of this expression pattern later in development is dependent on the action of a subset of the Polycomb group genes. These interactions are mediated in part through a 500-bp regulatory element in the second intron of pb. Dfd protein binds in vitro to sequences found in this fragment. This is the first clear demonstration of autonomous positive cross-regulation of one Hox gene by another in Drosophila and the binding of Dfd to a cis-acting regulatory element indicates that this control might be direct (Rusch, 2000).

Many of the genes that are members of either the gap, pair rule, or segment polarity genes have some effect on the pattern of pb accumulation. For the most part, mutations in genes of these classes reduce the number of cells expressing pb but do not eliminate Pb entirely from the affected segments. In no case do they cause pb to accumulate ectopically. The most striking results were caused by zygotic mutations in the genes buttonhead (btd), giant (gt), and hunchback (hb). btd is a head gap gene required for formation of the mandibular segment. In btd mutants, no mandibular structures are seen and no pb accumulation occurs anterior of the maxillary segment. pb accumulation is normal in the other gnathal segments. Mutations in both gt and hb disrupt the formation of the labial lobe and result in concomitant loss of pb expression therein. When the pb reporter #7 was tested in a hb mutant background no lacZ expression in the presumptive labial segment was found. In the case of gt, pb expression is not entirely extinguished. Weak pb accumulation can sometimes be seen in the most dorsal and posterior cells of the presumptive labial segment, overlapping with the few remaining cells of the engrailed stripe in the labial segment. For both gt and hb, this reduction or loss of pb expression in the labial lobe cannot be attributed to alterations in the Scr pattern because Scr accumulates in the cells posterior to the maxillary segment (Rusch, 2000).

Dynamical analysis of regulatory interactions in the gap gene system of Drosophila

Genetic studies have revealed that segment determination in Drosophila melanogaster is based on hierarchical regulatory interactions among maternal coordinate and zygotic segmentation genes. The gap gene system constitutes the most upstream zygotic layer of this regulatory hierarchy, responsible for the initial interpretation of positional information encoded by maternal gradients. A detailed analysis of regulatory interactions involved in gap gene regulation is presented based on gap gene circuits, which are mathematical gene network models used to infer regulatory interactions from quantitative gene expression data. The models reproduce gap gene expression at high accuracy and temporal resolution. Regulatory interactions found in gap gene circuits provide consistent and sufficient mechanisms for gap gene expression, which largely agree with mechanisms previously inferred from qualitative studies of mutant gene expression patterns. These models predict activation of Kr by Cad and clarify several other regulatory interactions. This analysis suggests a central role for repressive feedback loops between complementary gap genes. Repressive interactions among overlapping gap genes show anteroposterior asymmetry with posterior dominance. Finally, these models suggest a correlation between timing of gap domain boundary formation and regulatory contributions from the terminal maternal system (Jaeger, 2004b).

Although activating contributions from Bcd and Cad show some degree of localization, positioning of gap gene boundaries during cycle 14A is largely under the control of repressive gap-gap cross-regulatory interactions. Thereby, activation is a prerequisite for repressive boundary control, which counteracts broad activation of gap genes in a spatially specific manner. In addition, gap genes show a tendency toward autoactivation, which increasingly potentiates activation by Bcd and Cad during cycle 14A. Autoactivation is involved in maintenance of gap gene expression within given domains and sharpening of gap domain boundaries during cycle 14A (Jaeger, 2004b).

Regulatory loops of mutual repression create positive regulatory feedback between complementary gap genes, providing a straightforward mechanism for their mutually exclusive expression patterns. Such a mechanism of 'alternating cushions' of gap domains has been proposed previously. The results suggest that this mechanism is complemented by repression among overlapping gap genes. Overlap in expression patterns of two repressors imposes a limit on the strength of repressive interactions between them. Accordingly, repression between neighboring gap genes is generally weaker than that between complementary ones. Moreover, repression among overlapping gap genes is asymmetric, centered on the Kr domain. Posterior to this domain, only posterior neighbors contribute functional repressive inputs to gap gene expression, while anterior neighbors do not. This asymmetry is responsible for anterior shifts of posterior gap gene domains during cycle 14A (Jaeger, 2004b).

Repression by Tll mediates regulatory input to gap gene expression by the terminal maternal system. Tll provides the main repressive input to early regulation of the posterior boundary of posterior gt, and activation by Tll is required for posterior hb expression. Note that these two features form only during cycle 13 and early cycle 14A, while other gap domain boundaries are already present at the transcript level during cycles 10-12 and largely depend on the anterior and posterior maternal systems for their initial establishment. The delayed formation of posterior patterning features and their distinct mode of regulation are reminiscent of segment determination in primitive dipterans and intermediate germ-band insects, supporting a conserved dynamical mechanism across different insect taxa (Jaeger, 2004b).

The set of regulatory interactions presented here provides a consistent and sufficient dynamical mechanism for gap gene expression. In summary, this set of interactions consists of the following five basic regulatory mechanisms: (1) broad activation by Bcd and/or Cad, (2) autoactivation, (3) strong repressive feedback between mutually exclusive gap genes, (4) asymmetric repression between overlapping gap genes, and (5) feed-forward repression of posterior domain boundaries by the terminal gap gene tll. In the following subsections, evidence is discussed concerning specific regulatory interactions involved in each of these basic mechanisms in some detail (Jaeger, 2004b).
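
To make the idea of a gap gene circuit more concrete, the following is a minimal sketch of a model of this general kind. The text above does not state the equations, so everything here is an assumption made for illustration: each gene's rate of change in each nucleus is taken as regulated synthesis (a sigmoid of the summed regulatory inputs), plus diffusion between neighboring nuclei, minus first-order decay. The parameter names (T, m, h, R, D, lam) and all numerical values are hypothetical, and the two-gene example is a toy, not a fitted circuit.

```python
import math


def sigmoid(u):
    """Sigmoid regulation-expression function, ranging from 0 to 1."""
    return 0.5 * (u / math.sqrt(u * u + 1.0) + 1.0)


def euler_step(conc, T, m, h, bcd, R, D, lam, dt):
    """Advance all gene concentrations in all nuclei by one Euler step.

    conc[i][a]: concentration of gene a in nucleus i
    T[a][b]:    effect of regulator b on gene a (>0 activates, <0 represses)
    m[a]:       effect of maternal Bcd on gene a
    h[a]:       threshold (bias) term for gene a
    bcd[i]:     Bcd concentration in nucleus i
    R, D, lam:  per-gene synthesis rate, diffusion rate, decay rate
    """
    n_nuc, n_genes = len(conc), len(conc[0])
    new = [row[:] for row in conc]
    for i in range(n_nuc):
        for a in range(n_genes):
            u = sum(T[a][b] * conc[i][b] for b in range(n_genes))
            u += m[a] * bcd[i] + h[a]
            # Diffusion between neighboring nuclei (no flux at the ends).
            left = conc[i - 1][a] if i > 0 else conc[i][a]
            right = conc[i + 1][a] if i < n_nuc - 1 else conc[i][a]
            diffusion = D[a] * ((left - conc[i][a]) + (right - conc[i][a]))
            rate = R[a] * sigmoid(u) + diffusion - lam[a] * conc[i][a]
            new[i][a] = conc[i][a] + dt * rate
    return new


if __name__ == "__main__":
    # Two hypothetical, mutually repressing genes in a row of 10 nuclei.
    T = [[0.1, -2.0], [-2.0, 0.1]]  # weak autoactivation, strong cross-repression
    m, h = [1.0, -1.0], [-0.5, 0.5]
    bcd = [math.exp(-0.4 * i) for i in range(10)]  # anterior-to-posterior decay
    conc = [[0.0, 0.0] for _ in range(10)]
    for _ in range(2000):
        conc = euler_step(conc, T, m, h, bcd,
                          [1.0, 1.0], [0.05, 0.05], [1.0, 1.0], 0.01)
    for i, (g0, g1) in enumerate(conc):
        print(f"nucleus {i}: gene0={g0:.2f} gene1={g1:.2f}")
```

In this toy example the two genes settle into complementary anterior and posterior domains, which is the qualitative behavior expected from mutual repression combined with broad maternal activation. The signs and magnitudes in a matrix like T are what a fitting procedure would infer from quantitative expression data.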

Activation by Bcd and Cad: Activation of gap gene expression by Bcd and Cad is supported by the following. Bcd binds to the regulatory regions of hb, Kr, and kni. The kni regulatory region also contains binding sites for Cad. The anterior domains of gt and hb are absent in embryos from bcd mothers. The posterior domain of gt is missing in embryos mutant for both maternal and zygotic cad, while the posterior domain of kni is absent in embryos mutant for maternal bcd plus maternal and zygotic cad. These results suggest partial redundancy of activation of kni by Bcd, consistent with evidence from zygotic cad embryos from bcd mothers, where maternally provided Cad is sufficient to activate kni (Jaeger, 2004b).

Kr expression expands anteriorly in embryos from bcd mothers, which is due to the absence of the anterior gt and hb domains. Bcd has been shown to activate expression of Kr reporter constructs. The fact that Kr is still expressed in embryos from bcd mutant mothers has been attributed to activation by general transcription factors or low levels of Hb. In contrast, the models predict that this activation is provided by Cad. Although Kr expression is normal in embryos overexpressing cad, repressive control of Kr boundaries could account for the lack of expansion of the Kr domain in such embryos (Jaeger, 2004b).

The activating effect of Cad on hb found in gap gene circuits is likely to be spurious. The anterior hb domain is absent in embryos from bcd mutant mothers, which show uniformly high levels of Cad. Moreover, the complete absence of the posterior hb domain in tll mutants suggests activation of posterior hb by Tll rather than by Cad. It is believed that this spurious activation of hb by Cad is due to the absence of hkb in gap gene circuits. The posterior hb domain fails to retract from the posterior pole in hkb mutants, suggesting a repressive role of Hkb in regulation of the posterior hb border. Consistent with this, the posterior boundary of the posterior hb domain never fully forms in any of the circuits. Moreover, Tll is constrained to a very small or no interaction with hb due to the absence of the posterior repressor Hkb, since activation of hb by Tll would lead to increasing hb expression extending to the posterior pole (Jaeger, 2004b).

Autoactivation: A role for autoactivation in the late phase of hb regulation is supported by the fact that the posterior border of anterior hb is shifted anteriorly in a concentration-dependent manner in embryos with decreasing doses of zygotic Hb. Weakened and narrowed expression of Kr in mutants encoding a functionally defective Kr protein suggests Kr autoactivation. Similarly, a delay in the expression of gt in mutants encoding a defective Gt protein indicates gt autoactivation. However, the results suggest that gt autoactivation is not essential. It is generally weaker than autoactivation of other gap genes, and circuits lacking gt autoactivation show no specific defects in gt expression. Finally, in the case of kni, there is no experimental evidence for autoactivation, while some authors have even suggested kni autorepression. No such autorepression has been detected in any gap gene circuit (Jaeger, 2004b).

Repression between complementary gap genes: Mutual repression of gt and Kr is supported by the following. gt expression expands into the region of the central Kr domain in Kr embryos. In contrast, Kr expression is not altered in gt mutants before germ-band extension. However, Gt binds to the Kr regulatory region, and the central domain of Kr is absent in embryos overexpressing gt. Moreover, Kr expression extends further anterior in hb gt double mutants than in hb mutants alone. The above is consistent with this analysis, which shows no significant derepression of Kr in the absence of Gt even though repression of Kr by Gt is quite strong (Jaeger, 2004b).

Hb binds to the kni regulatory region, and the posterior kni domain expands anteriorly in hb mutants. Embryos overexpressing hb show no kni expression at all, and embryos misexpressing hb show spatially specific repression of kni expression. There is no clear posterior expansion of kni in hb mutants. This could be due to the relatively weak and late repressive contribution of Hb on the posterior kni boundary or due to partial redundancy with repression by Gt and Tll. The posterior hb domain expands anteriorly in kni mutants, but anterior hb expression is not altered in these embryos. Nevertheless, a role of Kni in positioning the anterior hb domain is suggested by the fact that misexpression of kni leads to spatially specific repression of both anterior and posterior hb domains. Moreover, only slight posterior expansion of anterior hb is observed in Kr mutants, while hb is completely derepressed between its anterior and posterior domains in Kr kni double mutants (Jaeger, 2004b).

Repression between overlapping gap genes: gt, kni, and Kr show repression by their immediate posterior neighbors hb, gt, and kni, respectively. Retraction of posterior Gt from the posterior pole during midcycle 14A fails to occur in hb mutants, and no gt expression is observed in embryos overexpressing hb. The posterior kni boundary is shifted posteriorly in gt mutant embryos, and kni expression is reduced in embryos overexpressing gt. Note that these effects are very subtle and were not reported in similar studies by different authors. A weak but functional interaction of Gt with kni is consistent with these results. This interaction was found to be essential even in a circuit where it was deemed below significance level. Finally, Kni has been shown to bind to the Kr regulatory region, and the central Kr domain expands posteriorly in kni mutants (Jaeger, 2004b).

In contrast, no effect of Kr on hb was detected. However, hb expression expands posteriorly in Kr mutants. This effect is likely to involve repression of hb by Kni. Kni levels are reduced in Kr embryos. hb is completely derepressed between its anterior and posterior domains in Kr kni double mutants, whereas anterior hb does not expand at all in kni mutants alone. Taken together, these results suggest that there is direct repression of hb by Kr in the embryo, but it is at least partially redundant with repression of hb by Kni (Jaeger, 2004b).

Unlike repression by posterior neighbors, no or only weak repression was found of posterior kni, gt, and hb by their anterior neighbors Kr, kni, and gt, respectively. Most gap gene circuits show weak activation of hb by Gt. Graphical analysis failed to reveal any functional role for such activation. Moreover, no functional interaction was found between gt and Kni. Although relatively weak repression of kni by Kr was found in 6 out of 10 circuits, no specific patterning defects could be detected in the other 4. Consistent with the above, expression of posterior hb is normal in gt mutants, and both the anterior boundaries of posterior gt and kni are positioned correctly in kni and Kr mutant embryos, respectively (Jaeger, 2004b).

Note that activation of kni by Kr, which has been proposed to explain decreased expression levels of kni in Kr mutants, was never found. The results strongly support the view that this interaction is indirect through Gt, which is further corroborated by the fact that kni expression is completely restored in Kr gt double mutants compared to that in Kr mutants alone (Jaeger, 2004b).

A significant repressive effect of Hb on Kr was found. Consistent with this, Hb has been shown to bind to the Kr regulatory region, and the central Kr domain expands anteriorly in hb mutants. However, partial redundancy of this interaction is suggested by correct positioning and shape of the anterior Kr domain in a circuit that does not show repression of Kr by Hb (Jaeger, 2004b).

It has been proposed that Hb plays a dual role as both activator and repressor of Kr. In the framework of the gene circuit model, concentration-dependent switching of regulative action could be implemented by allowing genetic interconnection parameters to switch sign at certain regulator concentration thresholds. The current model explicitly does not include such a possibility. Nevertheless, circuits have been obtained that reproduce Kr expression faithfully, suggesting that a dual role of Hb is not required for proper Kr expression. Moreover, activation of Kr by Hb was never observed in any of the circuits. Therefore, the results support a mechanism in which the activation of Kr by Hb is indirect through derepression of kni (Jaeger, 2004b).

Repression by Tll: Only a few earlier theoretical approaches have considered terminal gap genes. Gap gene circuits accurately reproduce tll expression. However, in gene circuits, tll is subject to regulation by other gap genes, which is inconsistent with experimental evidence. In contrast, the correct expression pattern of tll in gap gene circuits allows its effect on other gap genes to be studied in great detail. Strong repressive effects of Tll on Kr, kni, and gt have been found. Tll binding sites have been found in the regulatory regions of Kr and kni. In tll mutants, Kr expression is normal, whereas expression of kni expands posteriorly, and the posterior gt domain fails to retract from the posterior pole. No expression of Kr, kni, or gt can be detected in embryos overexpressing tll under a heat-shock promoter (Jaeger, 2004b).

Presence of TAF1-mediated histone acetylation at the transcriptionally silent gt promoter

TAF1 contains two kinase domains, an N-terminal (NTK, amino acids 1 to 496) and a C-terminal (CTK, amino acids 1496 to 2132) domain (Dikstein, 1996). In vitro, the NTK and the CTK autophosphorylate and the NTK transphosphorylates the RAP74 subunit of the general transcription factor (GTF) TFIIF. In contrast to the NTK, the CTK did not phosphorylate RAP74 but strongly phosphorylated H2B, indicating that the CTK possesses H2B-specific kinase activity. In addition to kinase domains, TAF1 has a histone acetyltransferase (HAT) domain that acetylates H3 Lys¹⁴ (H3-K14) and unidentified lysines in H4 in vitro (Maile, 2004).

Dynamic changes in chromatin structure, induced by posttranslational modification of histones, play a fundamental role in regulating eukaryotic transcription. Histone H2B is phosphorylated at evolutionarily conserved Ser³³ (H2B-S33) by the carboxyl-terminal kinase domain (CTK) of the Drosophila TFIID subunit TAF1. Phosphorylation of H2B-S33 at the promoter of the cell cycle regulatory gene string and the segmentation gene giant coincides with transcriptional activation. Elimination of TAF1 CTK activity in Drosophila cells and embryos reduces transcriptional activation and phosphorylation of H2B-S33. These data reveal that H2B-S33 is a physiological substrate for the TAF1 CTK and that H2B-S33 phosphorylation is essential for transcriptional activation events that promote cell cycle progression and development (Maile, 2004).

Transcription initiation in eukaryotes involves dynamic changes in chromatin structure that permit assembly of the transcription machinery at a gene promoter. The fundamental structural unit of chromatin is the nucleosome, which contains 146 base pairs of DNA wrapped around a histone octamer composed of two copies each of histones H2A, H2B, H3, and H4. Distinct patterns of histone modifications (e.g., acetylation, phosphorylation, and methylation) may act as 'modification cassettes' that facilitate DNA-dependent events. For example, in vertebrates phosphorylation of H2B Ser¹⁴ is associated with apoptotic chromatin, and in all eukaryotes phosphorylation of H3 Ser¹⁰ is associated with transcriptionally active and mitotic chromatin. Although all histones are phosphorylated in vivo, the function of many of these modifications and the kinases that carry them out are not known (Maile, 2004).

With the use of an in vitro kinase assay, it was found that the Drosophila general transcription factor (GTF) TFIID phosphorylates histone H2B but not H1, H2A, H3, or H4. TFIID is a multiprotein complex composed of the TATA box–binding protein (TBP) and numerous TBP-associated factors (TAFs). TFIID functions during transcription initiation by nucleating assembly of GTFs and RNA polymerase II at the promoter. TAF1 (formerly TAF_II250) is the only TFIID subunit that possesses kinase activity, suggesting that it phosphorylates H2B (Wassarman, 2001). In fact, recombinant TAF1 and denatured and renatured recombinant TAF1 phosphorylated H2B in vitro, demonstrating that TAF1 has intrinsic, H2B-specific kinase activity. Collectively, these results indicate that TAF1 alone and in the context of TFIID phosphorylates H2B (Maile, 2004).

Protein kinases contain two essential functional motifs, an adenosine triphosphate (ATP) binding motif and an amino acid–specific kinase motif. Computational sequence comparison analyses identified a putative serine and threonine (S/T) kinase motif (amino acids 1534 to 1546) and two tandem ATP binding domains (amino acids 1747 to 1780) in the CTK. To test whether the identified motifs mediate H2B phosphorylation, in vitro kinase assays were performed with the use of CTK polypeptides lacking the S/T kinase motif (CTKDelta1600) or the ATP binding motifs (CTKDeltaATP). Relative to the wild-type CTK, CTKDelta1600 and CTKDeltaATP weakly phosphorylated H2B. To confirm the role of the S/T kinase motif, a catalytically important aspartic acid was mutated to an alanine (D1538A) in the motif. Like CTKDelta1600, CTK(D1538A) exhibited weak autophosphorylation and H2B transphosphorylation activities. Interestingly, the S/T kinase motif is located in the first bromodomain of the double bromodomain module (DBD), which binds acetylated lysines. However, CTK(D1538A) retains the ability to bind acetylated H4 in vitro, indicating that the introduced mutation disrupts kinase activity rather than acetylated lysine-based substrate recognition. Thus, the identified S/T kinase and ATP binding motifs of the TAF1 CTK are essential for H2B phosphorylation (Maile, 2004).

To identify the H2B residue(s) phosphorylated by the CTK, the N-terminal tail of Drosophila H2B (amino acids 1 to 39, H2BT) and the tailless H2B core domain (amino acids 40 to 123) were tested as substrates; the CTK phosphorylated H2BT but not the H2B core domain. Next, to pinpoint which residue(s) in H2BT is phosphorylated, mutant H2BT peptides were generated in which alanines replaced all or individual serines or threonines. The CTK did not phosphorylate peptides lacking all serines, suggesting that it phosphorylates either Ser⁵ (H2B-S5) or Ser³³ (H2B-S33). To test this, H2BT peptides were used as substrates that contained alanines in place of H2B-S5, H2B-S33, or both (H2BT-S5A, H2BT-S33A, and H2BT-S5/33A, respectively). The CTK phosphorylated H2BT-S5A but not H2BT-S33A or H2BT-S5/33A, indicating that H2B-S33 is the target of the CTK (Maile, 2004).

To investigate whether H2B-S33 is phosphorylated in vivo, a polyclonal antibody was raised recognizing phosphorylated H2B-S33 (H2B-S33P). On Western blots, the antibody recognized H2BT containing H2B-S33P but not recombinant, unphosphorylated H2B or an H3 peptide (amino acids 1 to 32) containing phosphorylated Ser¹⁰ and Ser²⁸. In addition, the H2B-S33P antibody recognized H2BT and recombinant H2B that was phosphorylated in vitro by the CTK or TFIID, indicating that the antibody specifically recognizes phosphorylated H2B-S33. The H2B-S33 antibody also recognized a protein with a molecular weight similar to that of H2B from histone preparations from Drosophila embryos or S2 cells, providing evidence that H2B-S33 is a target for phosphorylation in vivo. To determine whether TAF1 mediates H2B-S33 phosphorylation in vivo, RNA interference (RNAi) was used to eliminate TAF1 expression in S2 cells. As shown by Western blot analysis, both TAF1 expression and H2B-S33 phosphorylation were reduced in TAF1 RNAi cells compared with mock RNAi cells, suggesting that TAF1 is a major H2B-S33 kinase in vivo (Maile, 2004).

Flow cytometry analysis of TAF1 RNAi cells revealed that loss of TAF1 results in G2-M phase cell cycle arrest. To test the hypothesis that TAF1 controls the transcription of genes whose activities contribute to G2-M progression, microarray expression profiling and reverse transcription polymerase chain reaction (RT-PCR) were used to monitor transcription in mock and TAF1 RNAi cells. Both methods showed that transcription of string (stg), which encodes a Drosophila homolog of yeast Cdc25, was reduced. The Stg protein phosphatase is predominantly expressed during G2 and activates the cell cycle by dephosphorylating Cdc2. Because loss of stg from S2 cells by RNAi causes G2-M arrest, TAF1 may regulate G2-M progression by activating stg transcription (Maile, 2004).

Chromatin immunoprecipitation (XChIP) was used to establish whether there is a direct correlation between transcriptional activation of stg and TAF1-mediated phosphorylation of H2B-S33 at the stg promoter. Cross-linked chromatin was isolated from mock and TAF1 RNAi S2 cells and immunoprecipitated with TAF1 or H2B-S33P antibodies. Immunoprecipitated DNA was purified and used as a template for PCR to detect the stg promoter or coding region and actin5C promoter. In contrast, TAF1 is not essential for actin5C transcription, and H2B-S33P antibodies do not precipitate the actin5C promoter. Thus, the transcriptional dependence of a gene for TAF1 is correlated with H2B-S33 phosphorylation, not with TAF1 association (Maile, 2004).

To distinguish whether loss of H2B-S33 phosphorylation at the stg promoter is due directly to loss of TAF1 or indirectly to G2-M arrest, XChIP analysis was performed on S2 cells arrested in G2-M by RNAi of the SIN3 transcriptional corepressor. Stg transcription is repressed in SIN3 RNAi cells, yet the stg promoter remains associated with H2B-S33P and TAF1, indicating that loss of H2B-S33 phosphorylation in TAF1 RNAi cells is because of elimination of TAF1 rather than G2-M arrest (Maile, 2004).

In addition to kinase domains, TAF1 has a histone acetyltransferase (HAT) domain that acetylates H3 Lys¹⁴ (H3-K14) and unidentified lysines in H4 in vitro (Mizzen, 1996). XChIP analysis detected acetylated H3-K14 and H4 at the transcriptionally active stg promoter in mock RNAi cells but not at the inactive stg promoter in TAF1 RNAi cells. In contrast, TAF1-independent histone modifications did not correlate with activation of stg in mock and TAF1 RNAi cells. Taken together, these results indicate that TAF1-mediated phosphorylation of H2B-S33 and acetylation of H3 and H4 potentiate transcriptional activation in Drosophila cells (Maile, 2004).

To investigate the role of TAF1-mediated phosphorylation of H2B-S33 during fly development, a recessive lethal TAF1 allele, TAF1^CTK, was used; it contains a nonsense mutation at amino acid 1728 that truncates the CTK downstream of the DBD. The corresponding protein (TAF1DeltaCTK) is expressed in Drosophila but presumably does not have CTK activity, because it does not phosphorylate H2B in vitro. In situ hybridization was used to monitor transcription in embryos homozygous mutant for TAF1^CTK and heterozygous mutant for the maternal activator Caudal (Cad). In this genetic background, transcription of the gap gene giant (gt) was reduced. Gt is transcribed in two domains along the anterior-posterior axis of blastoderm-stage embryos. Transcription of the posterior gt domain (pgt) is Cad-dependent, whereas transcription of the anterior gt domain (agt) is Cad-independent. Relative to controls (cad/+ or TAF1^CTK), pgt transcription was reduced in cad/+;TAF1^CTK embryos (Maile, 2004).

XChIP analysis was used to examine whether TAF1-mediated phosphorylation of H2B-S33 contributes to pgt transcription. Cross-linked chromatin was isolated from the posterior halves of cad/+;TAF1^CTK and control embryos and immunoprecipitated with antibodies to H2B-S33P, acetylated histones, or TAF1. PCR analysis detected H2B-S33P at the transcriptionally active gt promoter in control embryos, but not at the transcriptionally repressed promoter in cad/+;TAF1^CTK embryos. To monitor TAF1 binding, two antibodies, TAF1-M and TAF1-C, were used that recognize the middle domain and the CTK of TAF1, respectively. Both antibodies precipitated the gt promoter from control embryos, indicating that TAF1DeltaCTK and maternally contributed, wild-type TAF1 are present at the gt promoter in the pgt. In contrast, although the TAF1-M antibody precipitated the gt promoter from cad/+;TAF1^CTK embryos, TAF1-C did not. Because TAF1DeltaCTK is present at a higher concentration in cad/+;TAF1^CTK embryos than maternal TAF1, this result indicates that TAF1DeltaCTK is preferentially recruited to the gt promoter in the pgt. This result is supported by the presence of TAF1-mediated histone acetylation at the transcriptionally silent gt promoter. Thus, TAF1-mediated phosphorylation of H2B-S33 contributes to transcriptional activation during Drosophila embryogenesis (Maile, 2004).

Ser³³ is the only evolutionarily conserved serine or threonine in the N-terminus of metazoan H2Bs. In the crystal structure of the Xenopus laevis nucleosome, the equivalent serine links the H2B DNA-binding N-terminal tail to the histone fold domain. Thus, replacing the hydroxyl group on Ser³³ with a bulkier, negatively charged phosphate group may drastically affect H2B tail interactions with DNA. This is important because the H2B tail regulates nucleosome mobility. Deletion of the tail bypasses the requirement for the SWI/SNF nucleosome-remodeling complex in yeast, and the tail is critical for maintaining the position of histone octamers in in vitro sliding assays. These findings support a model in which TAF1-mediated phosphorylation of H2B-S33 disrupts DNA-histone interactions, resulting in local decondensation of chromatin. Decondensation may trigger chromatin remodeling and formation of a chromatin structure that facilitates assembly of other GTFs at a promoter, a function that is primarily attributed to TFIID (Maile, 2004).

These data indicate that the S/T kinase motif of the CTK is located in the DBD. In the crystal structure of the DBD, the position of the S/T kinase motif does not overlap with the acetylated lysine-binding surface of the DBD, suggesting that it is an independent functional unit of the DBD. Members of the fsh/RING3 (BET) family of DBD proteins have kinase activity, suggesting that TAF1 is a member of a kinase family whose catalytic motif resides within the DBD (Maile, 2004).

Phosphorylation of H2B-S33 by TAF1 is essential for transcriptional activation of stg/cdc25 and, consequently, cell cycle progression. Similarly, depletion of yeast TAF5, human TAF2, or a twofold reduction in chicken TBP results in G2-M arrest. Like TAF1, TBP regulates stg/cdc25 expression, providing support for the finding that the H2B-S33 kinase activity of TAF1 occurs in the context of TFIID. Interestingly, depletion of yeast TAF1, which does not possess a CTK, and inactivation of TAF1 HAT activity induce G1 arrest because of reduced transcription of B- and D-type cyclins, respectively (Apone, 1996; Dunphy, 2001). Thus, loss of all TAF1 activities causes G2-M arrest whereas loss of TAF1 HAT activity causes G1 arrest, suggesting gene-specific requirements for TAF1 CTK and HAT activities. In contrast, the presence of phosphorylated H2B-S33 and acetylated H3 and H4 at the stg and gt promoters implies that TAF1 CTK and HAT activities can cooperate in transcriptional activation of some genes. This proposal is supported by the finding that loss of H2B-S33P from the gt promoter results in reduced transcription, despite the presence of TAF1-mediated histone acetylation. Thus, TAF1-mediated phosphorylation of H2B-S33 may work in concert with other TAF1-mediated histone modifications, H1 ubiquitination, and H3 and H4 acetylation to contribute to the chromatin-based mechanisms underlying transcription activation of eukaryotic genes (Maile, 2004).

Reverse engineering the gap gene network of Drosophila

A fundamental problem in functional genomics is to determine the structure and dynamics of genetic networks based on expression data. A new strategy is described for solving this problem and it is applied to recently published data on early Drosophila development. The method is orders of magnitude faster than current fitting methods and allows fitting of different types of rules for expressing regulatory relationships. Specifically, this approach is used to fit models using a smooth nonlinear formalism for modeling gene regulation (gene circuits) as well as models using logical rules based on activation and repression thresholds for transcription factors. The technique also allows inference of regulatory relationships de novo or testing network structures suggested by the literature. A series of models is fitted to test several outstanding questions about gap gene regulation, including regulation of and by hunchback and the role of autoactivation. Based on the modeling results and validation against the experimental literature, a revised network structure is proposed for the gap gene system. Interestingly, some relationships in standard textbook models of gap gene regulation appear to be unnecessary for, or even inconsistent with, the details of gap gene expression during wild-type development (Perkins, 2006).
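
The two kinds of production rule mentioned above can be illustrated with a minimal sketch. This is not the fitting code of Perkins (2006). It assumes the sigmoid often used in gene circuit models, g(u) = ½(u/√(u² + 1) + 1), and a simple on/off threshold rule for the logical form. The function names, weights, thresholds, and concentrations are hypothetical and chosen only for illustration.

```python
import math

def gene_circuit_rate(conc, weights, bias, max_rate):
    """Smooth production rate: max_rate * g(sum_j w_j * conc_j + bias).

    g rises smoothly from 0 to 1 (assumed sigmoid form).
    Positive weights act as activation, negative weights as repression.
    """
    u = sum(w * conc.get(tf, 0.0) for tf, w in weights.items()) + bias
    g = 0.5 * (u / math.sqrt(u * u + 1.0) + 1.0)
    return max_rate * g

def logical_rate(conc, activators, repressors, max_rate):
    """Threshold rule (assumed form): produce at max_rate if at least one
    activator is at or above its threshold and no repressor is at or above
    its threshold; otherwise produce nothing."""
    activated = any(conc.get(tf, 0.0) >= thr for tf, thr in activators.items())
    repressed = any(conc.get(tf, 0.0) >= thr for tf, thr in repressors.items())
    return max_rate if activated and not repressed else 0.0

# Hypothetical nucleus: a target activated by Bcd and repressed by Hb.
nucleus = {"Bcd": 50.0, "Hb": 10.0}
smooth = gene_circuit_rate(nucleus, {"Bcd": 0.05, "Hb": -0.1}, bias=-1.0, max_rate=20.0)
logical = logical_rate(nucleus, {"Bcd": 30.0}, {"Hb": 40.0}, max_rate=20.0)
logical_high_hb = logical_rate({"Bcd": 50.0, "Hb": 60.0}, {"Bcd": 30.0}, {"Hb": 40.0}, max_rate=20.0)
```

In the smooth form every regulator shifts the rate gradually, whereas in the logical form a regulator has no effect until it crosses its threshold.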

The regulatory structure of the Combined model is itself sufficient to reproduce all six gap gene domains using either the gene circuit or logical formalisms for production rate functions. Support for the Combined model is cited, and the results of the individual models are then discussed in light of several outstanding questions about gap gene regulation (Perkins, 2006).

The maternal proteins Bcd and Cad are largely responsible for activating the trunk gap genes, with Bcd being more important for the anterior domains and Cad more important for the posterior domains. Bcd is a primary activator of the anterior hb domain, the anterior gt domain, and the Kr domain. Cad activates posterior gt. The kni domain is present in bcd mutants and in cad mutants, but not in bcd;cad double mutants. This suggests redundant activation by the two maternal factors. Such redundant activation of kni is present in the Unc-GC model. For the other models, the optimization selected one or the other as activators, but not both. Tll is crucial for activating the posterior hb domain, while it represses Kr, kni, and gt, preventing their expression in the extreme posterior. All the regulatory relationships between the gap genes in the Combined model are repressive. The complementary gap gene pairs, hb-kni and Kr-gt are known to be strongly mutually repressive, as was found in nearly all the models. [Repression of hb by Kni is not part of the Rivera-Pomar and Jäckle (RPJ) regulatory relationships (Rivera-Pomar, 1996), but the unconstrained gene circuit (Unc-GC) model and Unc-Logic model (that employs the regulatory structure discovered by the unconstrained gene circuit fit, except that Gt activation of hb and Kni activation of gt were removed) included the link.] The models also suggest that mutual repression between hb and Kr helps to set the boundary between those two domains. A chain of repressive relationships, hb-gt-kni-Kr, causes the shifts in the Kr, kni, and posterior gt domains. Autoactivation by hb is well-established, and there is also some evidence for autoactivation by Kr and gt (Perkins, 2006).
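
The relationships named in the paragraph above can be written down as a signed edge list, which makes them easy to query. The sketch below is one reading of that prose, not the network figure of Perkins (2006). Genes and their products share one lowercase key for simplicity, and activation of kni by both bcd and cad reflects the redundancy suggested by the mutant data rather than any particular fitted model. The helper names are hypothetical.

```python
# +1 = activation, -1 = repression; keys are (regulator, target).
LINKS = {
    # Maternal and terminal inputs
    ("bcd", "hb"): +1, ("bcd", "gt"): +1, ("bcd", "Kr"): +1, ("bcd", "kni"): +1,
    ("cad", "gt"): +1, ("cad", "kni"): +1,
    ("tll", "hb"): +1, ("tll", "Kr"): -1, ("tll", "kni"): -1, ("tll", "gt"): -1,
    # Mutual repression between complementary pairs and between hb and Kr
    ("hb", "kni"): -1, ("kni", "hb"): -1,
    ("Kr", "gt"): -1, ("gt", "Kr"): -1,
    ("hb", "Kr"): -1, ("Kr", "hb"): -1,
    # Repressive chain hb-gt-kni-Kr
    ("hb", "gt"): -1, ("gt", "kni"): -1, ("kni", "Kr"): -1,
    # Autoactivation
    ("hb", "hb"): +1, ("Kr", "Kr"): +1, ("gt", "gt"): +1,
}

GAP_GENES = {"hb", "Kr", "kni", "gt"}

def regulators_of(target, sign):
    """Return the sorted regulators acting on `target` with the given sign."""
    return sorted(src for (src, dst), s in LINKS.items() if dst == target and s == sign)

def gap_cross_regulation():
    """Return the signs of all links between two different gap genes."""
    return [s for (src, dst), s in LINKS.items()
            if src in GAP_GENES and dst in GAP_GENES and src != dst]

kr_repressors = regulators_of("Kr", -1)
all_cross_links_repressive = all(s == -1 for s in gap_cross_regulation())
```

Encoded this way, the statement that all regulation between different gap genes is repressive becomes a property that can be checked rather than asserted.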

Does Hb have a dual regulatory effect on Kr? There is a long-running debate about whether or not low levels of Hb activate Kr. In hb mutants, the Kr domain expands anteriorly, suggesting that Hb represses Kr. However, Kr expression in these mutants is lower than in wild-type and expands posteriorly in embryos overexpressing Hb. Further, in embryos lacking Bcd and Hb, the Kr domain is absent, but can be restored in a dosage-dependent manner by reintroducing Hb. These observations suggest that Hb activates Kr. It has been suggested, therefore, that low levels of Hb activate Kr while high levels repress it. An alternative explanation, however, is that the apparently activating effects of Hb are indirect, via Hb's repression of kni and Kni's repression of Kr. Optimization of the Unc-GC model, which could have resulted in activation or repression of Kr by Hb, but not both, resulted in repression. The RPJ models allow for a dual effect, but activation by Hb was eliminated during optimization of the RPJ-Logic model. The RPJ-GC model retained functional activation and repression of Kr by Hb. However, Kr expression in this model is defective. Kr is not properly repressed in the anterior. Further, Kr is ectopically expressed in a small domain in the posterior of the embryo. Thus, the current models provide no support for activation of Kr by Hb. The only support found, which is crucial in all models except Unc-Logic and also consistent with the mutant and overexpression studies, is for repression of Kr by Hb (Perkins, 2006).

What represses hb between the anterior and posterior domains? Another point of disagreement in the literature is what prevents the expression of hb between its two domains. In the model of Rivera-Pomar and Jäckle (1996), repression by Kr is the explanation. The RPJ models confirm that this mechanism is sufficient. Specifically, in these models Kr repression prevents hb expression just to the posterior of the anterior hb domain. Between the Kr and posterior hb domains, there is no explicit repression of hb. Rather, Hb is not produced simply because of a lack of activating factors. In contrast, the models of Jaeger (2004a and b) detected no effect of Kr and attributed repression solely to Kni. The Unc-GC and Unc-Logic models found repression by Kni, but in addition to repression by Kr, not instead of it. Kr is more responsible for repression near the anterior hb domain and Kni is more responsible for repression near the posterior hb domain. This is consistent with observations of expression in mutant embryos. Embryos mutant for Kr show slight expansion of the anterior hb domain, while kni embryos show expansion of the posterior hb domain. In Kr;kni double mutants, hb is completely derepressed between its two usual domains. This suggests, as seen in the Unc-GC and Unc-Logic models, that Kr and Kni are both repressors of hb, that their activity is redundant in the center of the trunk, and that Kr and Kni are the dominant repressors for setting the boundaries of the anterior and posterior domains, respectively. This interpretation was also favored by Jaeger (2004a and b), on the basis of the mutant data, even though Jaeger's models did not find repression by Kr (Perkins, 2006).

The posterior hb domain. In all of the current models, the posterior hb domain is activated by Tll and sustained by Tll and hb autoactivation. Rivera-Pomar (1996) did not consider the posterior hb domain, and did not include activation by Tll in his model. That link was added to the RPJ network structure because otherwise it was not possible to capture the posterior hb domain. The model of Jaeger (2004a and b) captured the domain without Tll activation by substituting activation from cad. However, there is no confirming evidence for such an interaction. The absence of posterior hb in tll mutants and the inability of the models to explain posterior hb by other means, leads to the straightforward hypothesis that Tll activates posterior hb. Posterior hb is unique in that the domain begins to form later than the other five domains modeled. In the RPJ models, this happens simply because high levels of Tll are needed to activate hb -- levels that are reached only at about t = 30 min. The Unc-GC and Unc-Logic models also employ repression by Cad to slightly delay Hb production in the posterior. However, there is no confirming evidence for such repression, and it is omitted from the Combined model (Perkins, 2006).

Shifting of the Kr, kni, and posterior gt domains. Domain shifting was first observed by Jaeger (2004a and b) and attributed to a chain of repressive regulatory relationships, hb-gt-kni-Kr. The current models largely support the importance of this regulatory chain, particularly the final two links. Repression of Kr by Kni was significant in all of the current models. Repression of kni by Gt was present in all models except RPJ-Logic, where it would be of little impact anyway, since RPJ-Logic has a defective posterior gt domain. Consistent with these findings, Kni binds to the regulatory region of Kr, and the Kr domain expands towards the posterior in kni mutants. Similarly, the kni domain expands posteriorly in gt mutants, while embryos overexpressing gt show reduced kni expression (Perkins, 2006).

Repression of gt by Hb is not as well supported by the current models. The Unc-GC model included the link, though the regulatory weight was the smallest of all those in the model. The link was eliminated from Unc-Logic and, of course, not present in the RPJ network structure. Instead, the models utilized decreasing activation by Cad (Unc-GC, Unc-Logic) and repression by Tll (Unc-GC, RPJ-GC) to shift the posterior gt domain. Even with these links, however, shifting of the domain is not well-captured. RPJ-GC appears to capture the posterior gt shift best (Figure 3E). However, it relies on its small ectopic Kr domain to repress gt, a completely incorrect mechanism. Interestingly, a gene circuit fit using the network structure of Sanchez and Thieffry (2001), captured the shift of posterior gt better than any of the other current models, and it did so using repression of gt by Hb, providing additional modeling support for the relationship. There also is strong mutant evidence in favor of the relationship. In hb mutants, the posterior gt domain does not retract from the posterior pole. Further, Gt is absent in embryos that have ubiquitous Hb, such as maternal oskar or nanos mutants or embryos expressing Hb ubiquitously under a heat-shock promoter. Thus, sufficient evidence was found to include a repressive link from hb to gt in the Combined model (Perkins, 2006).

Activating or repressing links that oppose the direction of the repressive chain were eliminated by optimization of the Unc-Logic, RPJ-GC, and RPJ-Logic models. In agreement with this result, the boundaries of the kni and posterior gt domains are correctly positioned in Kr and kni mutants, respectively. Thus, the simplest picture supported by the current models and consistent with the mutant studies is that there is no regulation from Kr, kni, or posterior gt to any of their immediate posterior neighbors, and that the repressive chain highlighted by Jaeger (2004a and b) is indeed responsible for domain shifting (Perkins, 2006).

Do gap genes autoregulate? All four of the current models include autoactivation by hb. This is supported by the observation that late anterior hb expression is absent in embryos lacking maternal and early zygotic Hb. The models suggest hb autoactivation also plays a crucial role in sustaining the posterior domain, once it has been initiated by Tll, a role not previously emphasized. Autoactivation for the other genes was found by the Unc-GC model, but is not part of the RPJ network structure. Autoactivation was included only for Kr and gt in the Combined model, on the basis of a weakened and narrowed Kr domain in embryos producing defective Kr protein and a delay in gt expression in embryos producing defective gt protein. Interestingly, the gene circuit models of Jaeger (2004a and b) also found autoactivation for all four gap genes, but they considered autoactivation by gt to be the weakest and least certain. In contrast, the Unc-Logic model retained gt autoactivation while eliminating autoactivation for Kr and kni. The RPJ-Logic model was unable to reproduce the posterior gt domain. However, it was found that by adding gt autoactivation to the model, it was able to create and sustain posterior gt correctly, bringing the error of the model down to 15.34. This suggests that, after hb, gt is the most likely candidate for autoactivation. However, even this is not strictly necessary. The RPJ-GC model is able to reproduce and sustain the posterior gt domain without autoactivation by relying on cooperative activation from Bcd and Cad (Perkins, 2006).

Comparison of regulatory architectures. The regulatory relationships proposed by Rivera-Pomar and Jäckle (1996) are not fully consistent with the data and require amending. Repression of gt by Kni, which contradicts the mechanism of domain shifts described by Jaeger (2004a and b), was eliminated by the optimization in both of the current models based on the RPJ regulators. Activation of kni by Kr was never observed. No support was found for a dual regulatory effect of Hb on Kr. Activation of Kr at low levels of Hb was eliminated in the RPJ-Logic model. It was retained in the RPJ-GC model, but resulted in serious patterning defects. Inclusion of Tll as an activator of hb was sufficient to produce the posterior hb domain. Based on the current fits and the primary experimental literature, there are likely other regulatory links missing from the model of Rivera-Pomar and Jäckle, though they are not strictly required to reproduce the wild-type gap gene patterns. Foremost is repression of hb by Kni, which appears important for eliminating hb expression anterior of the posterior domain. Fits based on the Sanchez and Thieffry (2001) regulatory relationships also support these conclusions (Perkins, 2006).

In contrast, the regulatory relationships in the Combined model and both the Unc-GC and Unc-Logic models are able to capture the wild-type gap patterns without gross defects. The relationships in the Unc-GC model are very similar to those obtained by Jaeger (2004a and b). For example, the regulation of Kr and kni is qualitatively equivalent in both models, and there is a single minor difference in the regulation of gt. The optimizations correctly identified activation of hb by Tll, which was missed by Jaeger (2004a and b), though the current models did less well at capturing shifting of the posterior gt domain. These regulatory relationships are also similar to those found by Gursky (2004), though that study was based on gap gene expression data with much lower accuracy and temporal resolution than the data used in this study. These similarities show that differences in the mathematical formulations of these models (ordinary versus partial differential equations, how diffusion and nuclei doubling are modeled, and choice of boundary conditions and other simulation parameters) are not important for the reproduction of the gap gene patterns nor for the inference of regulatory relationships from the data (Perkins, 2006).

Estimating binding properties of transcription factors from genome-wide binding profiles

The binding of transcription factors (TFs) is essential for gene expression. One important characteristic is the actual occupancy of a putative binding site in the genome. In this study, an analytical model is proposed to predict genomic occupancy that incorporates the preferred target sequence of a TF in the form of a position weight matrix (PWM), DNA accessibility data (in the case of eukaryotes), the number of TF molecules expected to be bound specifically to the DNA and a parameter that modulates the specificity of the TF. Given actual occupancy data in the form of ChIP-seq profiles, copy number and specificity are backwards inferred for five Drosophila TFs during early embryonic development: Bicoid, Caudal, Giant, Hunchback and Krüppel. The results suggest that these TFs display thousands of molecules that are specifically bound to the DNA and that whilst Bicoid and Caudal display a higher specificity, the other three TFs (Giant, Hunchback and Krüppel) display lower specificity in their binding (despite having PWMs with higher information content). This study gives further weight to earlier investigations into TF copy numbers that suggest a significant proportion of molecules are not bound specifically to the DNA (Zabet, 2014: 25432957).
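
How the four ingredients listed above (PWM, accessibility, number of bound molecules, specificity parameter) might combine can be sketched as follows. This is a simplified statistical-thermodynamics form chosen for illustration and is not necessarily the expression used by Zabet (2014). Each site j gets a weight a_j·exp(w_j/λ), where w_j is its PWM score, a_j its accessibility, and λ the specificity parameter, and occupancy is N·weight_j / (N·weight_j + Σ_i weight_i). The toy PWM, sequence, and parameter values are hypothetical.

```python
import math

def pwm_score(site, pwm):
    """Sum the per-position PWM entries for a candidate site."""
    return sum(pwm[i][base] for i, base in enumerate(site))

def occupancy(seq, pwm, accessibility, n_bound, lam):
    """Predicted occupancy at every start position of `seq` (assumed model form).

    accessibility[j] lies in [0, 1]; n_bound is the number of specifically
    bound molecules; a larger `lam` flattens the scores (lower specificity).
    """
    k = len(pwm)
    weights = [
        accessibility[j] * math.exp(pwm_score(seq[j:j + k], pwm) / lam)
        for j in range(len(seq) - k + 1)
    ]
    total = sum(weights)
    return [n_bound * w / (n_bound * w + total) for w in weights]

def column(preferred):
    """Toy PWM column: +2 for the preferred base, -1 for the others."""
    return {b: (2.0 if b == preferred else -1.0) for b in "ACGT"}

toy_pwm = [column("T"), column("A"), column("A")]   # toy consensus "TAA"
toy_seq = "GCTAAGCGC"
open_chromatin = [1.0] * (len(toy_seq) - len(toy_pwm) + 1)

occ_specific = occupancy(toy_seq, toy_pwm, open_chromatin, n_bound=10, lam=1.0)
occ_relaxed = occupancy(toy_seq, toy_pwm, open_chromatin, n_bound=10, lam=3.0)
```

In this toy case the consensus site at position 2 has the highest occupancy under both settings, while the weaker neighbouring sites gain occupancy when λ is raised, which is the qualitative sense in which a TF with lower specificity spreads over more sites.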

Protein Interactions

There are at least three short-range gap repressors in the precellular Drosophila embryo: Krüppel, Knirps, and Giant. Krüppel and Knirps contain related repression motifs, PxDLSxH and PxDLSxK, respectively, which mediate interactions with the dCtBP corepressor protein. Giant might also interact with dCtBP. The misexpression of Giant in ventral regions of transgenic embryos results in the selective repression of eve stripe 5. A stripe5-lacZ transgene exhibits an abnormal staining pattern in dCtBP mutants that is consistent with attenuated repression by Giant. The analysis of Gal4-Giant fusion proteins has identified a minimal repression domain that contains a sequence motif, VLDLS, which is conserved in at least two other sequence-specific repressors. Removal of this sequence from the native Giant protein does not impair its repression activity in transgenic embryos. It is proposed that Giant-dCtBP interactions might be indirect and mediated by an unknown bZIP subunit that forms a heteromeric complex with Giant (Nibu, 2001).

The minimal Giant repression domain spans amino acid residues 60-133. Alignment of this sequence with the Drosophila database identifies significant homology with the zinc finger repressor, Odd-skipped (Odd). Odd represses the expression of engrailed within the even-numbered parasegments and thereby defines which of the Ftz-expressing cells activate engrailed. Giant and Odd share the following sequence: VLDLSxxxxSxExP. A third transcriptional repressor in the early embryo, Tailless, also contains the VLDLS motif. Tailless is important for repressing segmentation gene expression in the anterior and posterior poles. It is unclear whether this sequence participates in Giant-dCtBP interactions, even though it is related to the dCtBP motif (PxDLSxR/K/H). Perhaps VLDLS helps recruit an unknown corepressor protein that mediates the residual repression activity of Gal4-Giant fusion proteins in dCtBP mutants (Nibu, 2001).

The low levels of Giant produced by a twi-giant transgene are sufficient to repress the endogenous eve stripe 5 pattern but not stripe 2. The failure to repress stripe 2 is consistent with previous studies, which suggested that Giant might interact with a localized 'partner' in anterior regions of the early embryo. It is also possible that stripe 2 regulation depends on high concentration of the Giant protein. There are two alternative explanations for the sufficiency of low levels of Giant to repress stripe 5. First, the stripe 5 enhancer might contain optimal high-affinity Giant operator sites. Alternatively, Giant might interact with an unknown bZIP subunit, X, that is broadly expressed in the early embryo (Nibu, 2001).

The second possibility, whereby Giant-X heterodimers regulate stripe 5 expression, is favored. Putative Giant operator sites in the stripe 5 enhancer lack obvious dyad symmetry, which might be expected for Giant-Giant homodimers. Moreover, the VLDLS motif is essential for the repression activity of Gal4-Giant fusion proteins but is dispensable in the context of the twi-giant transgene. For example, a deletion that removes the entire minimal repression domain (amino acids 60-133) does not significantly impair the ability of a twi-giant transgene to repress eve stripe 5 and hairy stripes 3, 4, and 5. Presumably, Gal4-Giant fusion proteins function as homomultimers, so that mutations in the repression domain attenuate or eliminate activity. In contrast, the same mutations might not disrupt the activities of a heterodimeric Giant-X complex because of the ability of subunit X to recruit dCtBP. Future studies will focus on the identification of subunit X and the corepressor(s) that interact with the conserved VLDLS motif (Nibu, 2001).

The Giant protein is a short-range transcriptional repressor that refines the expression pattern of gap and pair-rule genes in the Drosophila blastoderm embryo. Short-range repressors including Knirps, Krüppel, and Snail utilize the CtBP cofactor for repression, but it is not known whether a functional interaction with CtBP is a general property of all short-range repressors. Giant repression activity was studied in a CtBP mutant and it has been found that this cofactor is required for Giant repression of some, but not all, genes. While targets of Giant such as the even-skipped stripe 2 enhancer and a synthetic lacZ reporter show clear derepression in the CtBP mutant, another Giant target, the hunchback gene, is expressed normally. A more complex situation is seen with regulation of the Krüppel gene, in which one enhancer is repressed by Giant in a CtBP-dependent manner, while another is repressed in a CtBP-independent manner. These results demonstrate that Giant can repress both via CtBP-dependent and CtBP-independent pathways, and that promoter context is critical for determining Giant-CtBP functional interaction. To initiate mechanistic studies of the Giant repression activity, a minimal repression domain within Giant has been identified that encompasses residues 89-205, including an evolutionarily conserved region bearing a putative CtBP binding motif (Strunk, 2001).

The deletional analysis of Gal4-Giant chimeras indicates that Giant repression function can be localized to residues 89-205, an area of the protein that contains several tracts of highly conserved residues. Chimeras containing other portions of the Giant protein do not exhibit significant repression activity, suggesting that these regions cannot act autonomously to mediate repression, and might instead contribute to protein stability or expression. In particular, residues 266-322 appear to correlate with significantly higher repression activity of these proteins. The low levels of chimeric protein expression in the embryo precluded direct quantitation of each protein, thus this analysis is based primarily on those that did show significant activity (Strunk, 2001).

No significant physical interaction could be detected between Giant and CtBP in vitro, and the Giant protein lacks a perfect match to the consensus CtBP binding motif P-DLS-K/R/H found in the Knirps, Krüppel, and Snail proteins. However, a partial match is present: VLDLSRR (residues 98-104). The motif is evolutionarily conserved and is found within the defined minimal repression domain, consistent with a possible role in repression. Indeed, deletion of residues 89-107 inactivates the chimeric repressor. This region is clearly not sufficient for high-level repression, however, suggesting that other portions of the protein play important structural or functional roles. If CtBP directly contacts Giant in vivo, the lack of strong interaction in vitro may indicate that Giant must be posttranslationally modified to facilitate interaction with CtBP, perhaps via phosphorylation. Posttranslational modifications are known to play a role in CtBP binding in some instances; E1A-CtBP interactions have been shown to be regulated by acetylation of a conserved lysine residue in the CtBP binding motif. Alternatively, Giant may bind CtBP indirectly through a cofactor, much as BRCA1 has been suggested to bind CtBP through CtIP, or CtBP might be recruited via a heterodimeric basic-zipper partner of Giant. To determine whether CtBP-dependent and CtBP-independent repression activities are mediated by the same or distinct portions of the Giant protein, future studies will need to focus on identifying mutant proteins that are deficient in each of these activities (Stunk, 2001).
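The partial match between VLDLSRR and the CtBP consensus can be checked position by position. The Python sketch below is illustrative only. It assumes that each dash in P-DLS-K/R/H stands for any residue, and the pattern encoding and function name are hypothetical rather than taken from the study.

```python
# Hypothetical encoding of the CtBP binding consensus P-DLS-K/R/H.
# Assumption: each dash in the consensus accepts any residue (None here).
CTBP_CONSENSUS = ["P", None, "D", "L", "S", None, "KRH"]


def motif_mismatches(peptide, consensus=CTBP_CONSENSUS):
    """Return the 1-based positions where the peptide departs from the consensus."""
    if len(peptide) != len(consensus):
        raise ValueError("peptide and consensus must have the same length")
    mismatches = []
    for position, (residue, allowed) in enumerate(zip(peptide, consensus), start=1):
        # None marks a position where any residue is accepted.
        if allowed is not None and residue not in allowed:
            mismatches.append(position)
    return mismatches


# Giant residues 98-104, as quoted in the text.
print(motif_mismatches("VLDLSRR"))  # prints [1]
```

Under this reading of the consensus, the only departure is at the first position, where Giant carries valine in place of proline.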

What characteristics of a regulatory region dictate CtBP-dependent or CtBP-independent repression? In considering which features of a gene determine CtBP-dependence or -independence, the structure of the basal promoter cannot be the deciding factor, for the same Kr promoter is regulated by distinct elements, some that exhibit CtBP-dependence and some that show CtBP-independence. Similarly, the eve gene is repressed by Knirps via CtBP-dependent and CtBP-independent regulatory elements. While the eve enhancers in question are kilobases apart, the Kr regulatory elements driving anterior and central domain (CD) expression are closely intertwined, and appear to share at least some of the same activator binding sites, suggesting that subtle differences in enhancer architecture or differences in levels of regulatory proteins interacting with those elements may dictate CtBP dependence. The Giant binding site in the Kr CD2 enhancer site was shown to be of higher affinity than the gt1 site in the eve stripe 2 enhancer. Thus, there may be a correlation between Giant binding site affinity and the requirement for CtBP, with elements containing Giant sites of lower affinity showing CtBP-dependence. A consensus binding site has been derived for the Giant protein by aligning Giant binding sites from eve, Kr, and the recently identified abdA iab-2 enhancer site. The consensus features an extended half-site inverted repeat TNTTAC, consistent with the dimeric nature of basic zipper proteins, and a central ACGT core common to recognition motifs for many basic zipper proteins. The higher affinity sequences from the CtBP-independent Kr CD element are closer to the consensus than those of the CtBP-dependent eve stripe 2 enhancer. Weaker sites may be only partially occupied, resulting in an overall lower level of Giant-mediated repression. A loss of CtBP might further reduce repression activity below a critical threshold, leading to the derepression observed. Repression of the lacZ reporter containing the Giant CD1 site from Kr is CtBP-dependent, a result that contrasts with the CtBP independence of the CD itself, but this particular site may not be optimal, since it contains two mismatches. Full Giant activity may also be mediated on the native CD element through the additional high-affinity CD2 site (Stunk, 2001).
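The idea of ranking Giant sites by their distance from the consensus can be sketched in a few lines of Python. This is a rough illustration, not the method used in the study. It assumes the full site is simply the half-site TNTTAC followed by its reverse complement, with N matching any base. The function names are hypothetical, and the candidate sequences in the usage lines are invented, not the actual eve or Kr sites.

```python
# Assumption: the full site is the half-site TNTTAC followed by its reverse
# complement, which places ACGT at the centre. N matches any base.
HALF_SITE = "TNTTAC"
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (N stays N)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def build_consensus(half_site=HALF_SITE):
    """Join a half-site to its reverse complement to form an inverted repeat."""
    return half_site + reverse_complement(half_site)


def count_mismatches(site, consensus):
    """Count positions where the site differs from a non-N consensus base."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    return sum(
        1
        for base, expected in zip(site.upper(), consensus)
        if expected != "N" and base != expected
    )


consensus = build_consensus()
print(consensus)  # prints TNTTACGTAANA

# Invented example sequences, for illustration only.
print(count_mismatches("TATTACGTAACA", consensus))  # prints 0
print(count_mismatches("TATTGCGTAACG", consensus))  # prints 2
```

A mismatch count is a crude stand-in for binding affinity. A position weight matrix built from the aligned sites would weight each position by how strongly it is conserved.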

Other factors besides binding site affinity can affect Giant's activity, and possibly its CtBP-dependence. Small alterations in the location of Giant binding sites are sufficient to strongly affect the ability of Giant to repress in transgenic embryo assays. Thus, both the location and the affinity of Giant sites need to be considered in studying CtBP-dependent repression. It is not believed that differences in the nature of the activators explain CtBP-dependence or -independence, because both AD and CD enhancers of Kr are activated by Bicoid protein, as is the eve stripe 2 enhancer. Detailed studies illuminating how the general properties of short-range transcriptional repressors are integrated into the design of promoter elements will promote an understanding of the control of complex developmentally regulated genes (Stunk, 2001).

Transcriptional repressors can be classified as short- or long-range, according to their range of activity. Functional analysis of identified short-range repressors has been carried out largely in transgenic Drosophila, but it is not known whether general properties of short-range repressors are evident in other types of assays. To study short-range transcriptional repressors in cultured cells, chimeric tetracycline repressors were created based on Drosophila transcriptional repressors Giant, Drosophila C-terminal-binding protein (dCtBP), and Knirps. Giant and dCtBP are found to be efficient repressors in Drosophila and mammalian cells, whereas Knirps is active only in insect cells. The restricted activity of Knirps, in contrast to that of Giant, suggests that not all short-range repressors possess identical activities, consistent with recent findings showing that short-range repressors act through multiple pathways. The mammalian repressor Kid is more effective than either Giant or dCtBP in mammalian cells but is inactive in Drosophila cells. These results indicate that species-specific factors are important for the function of the Knirps and Kid repressors. Giant and dCtBP repress reporter genes in a variety of contexts, including genes that are introduced by transient transfection, carried on episomal elements, or stably integrated. This broad activity indicates that the context of the target gene is not critical for the ability of short-range repressors to block transcription, in contrast to other repressors that act only on stably integrated genes (Ryu, 2002).

The regulation of inducible promoters via chimeric tetracycline repressor (TetR) proteins has attracted considerable interest for use in ectopic expression systems in cell culture, microbes, plants, and whole animals. In these systems, a chimeric protein consisting of the Escherichia coli TetR protein fused to an activation domain binds to promoters containing Tet response elements (TREs). On addition of tetracycline or doxycycline, the chimeric protein is released from the promoter and the gene is inactivated. TetR DNA-binding domains with reverse specificity have been developed to permit activation of target genes on addition of the drug. Although this system can be highly regulated, low-level basal expression can be a problem in the case of potentially toxic gene products. To overcome this problem, higher specificity Tet DNA-binding domains have recently been developed. Many endogenous genes accomplish tight regulation by the coordinated action of repressors and activators. To mimic such composite systems, a Tet repressor can be combined with a Tet activator to give repression and activation in the absence and presence of doxycycline, respectively. Such combined Tet-based activation/repression systems have been developed for yeast and mammalian systems. Most of these systems use the KRAB repressor domain. Whether KRAB repressors can work in nonvertebrate cell types has not been reported, however. In this study, a panel of transcriptional repressors has been created based on well characterized short-range repressors from Drosophila. The chimeric proteins show reproducible repression activity in the Tet system in a variety of cell types and on stably integrated or transiently introduced reporter genes. Compared with the mammalian Kid repressor, these repressors may be the preferred alternative for regulation of expression in some cell types and with certain transgene configurations (Ryu, 2002).

The Interactive Fly resides on the Society for Developmental Biology's Web server.