Trithorax-like: Biological Overview

Gene name - Trithorax-like

Synonyms - GAGA

Cytological map position - 70E--70F

Function - transcription factor

Keywords - trithorax group

Symbol - Trl

FlyBase ID - FBgn0013263

Genetic map position - 3-[41]

Classification - zinc finger domain

Cellular location - nuclear


Recent literature
Quijano, J.C., Wisotzkey, R.G., Tran, N.L., Huang, Y., Stinchfield, M.J., Haerry, T.E., Shimmi, O. and Newfeld, S.J. (2016). lolal is an evolutionarily new epigenetic regulator of dpp transcription during dorsal-ventral axis formation. Mol Biol Evol [Epub ahead of print]. PubMed ID: 27401231
Secreted ligands in the Dpp/BMP family drive dorsal-ventral (D/V) axis formation in all Bilaterian species. However, maternal factors regulating Dpp/BMP transcription in this process are largely unknown. This study identified the BTB domain protein longitudinals lacking-like (lolal) as a modifier of decapentaplegic (dpp) mutations. It was shown that Lolal is evolutionarily related to the Trithorax group of chromatin regulators and that lolal interacts genetically with the epigenetic factor Trithorax-like during Dpp D/V signaling. Maternally driven LolalHA is found in oocytes and translocates to zygotic nuclei prior to the point at which dpp transcription begins. lolal maternal and zygotic mutant embryos display significant reductions in dpp, pMad and zerknullt expression, but expression is never absent. The data suggest that lolal is required to maintain dpp transcription during D/V patterning. Phylogenetic data reveal that lolal is an evolutionarily new gene present only in insects and crustaceans. The study concludes that Lolal is the first maternal protein with a role in dpp D/V transcriptional maintenance, that Lolal and the epigenetic protein Trithorax-like are essential for Dpp D/V signaling, and that the architecture of the Dpp D/V pathway evolved in the arthropod lineage after the separation from vertebrates via the incorporation of new genes such as lolal.
Duarte, F. M., Fuda, N. J., Mahat, D. B., Core, L. J., Guertin, M. J. and Lis, J. T. (2016). Transcription factors GAF and HSF act at distinct regulatory steps to modulate stress-induced gene activation. Genes Dev 30: 1731-1746. PubMed ID: 27492368
The coordinated regulation of gene expression at the transcriptional level is fundamental to development and homeostasis. Inducible systems are invaluable when studying transcription because the regulatory process can be triggered instantaneously, allowing the tracking of ordered mechanistic events. This study used precision run-on sequencing (PRO-seq) to examine the genome-wide heat shock (HS) response in Drosophila and the function of two key transcription factors on the immediate transcription activation or repression of all genes regulated by HS. The primary HS response genes and the rate-limiting steps in the transcription cycle were identified that are regulated by GAGA-associated factor (GAF) and HS factor (HSF). GAF acts upstream of promoter-proximally paused RNA polymerase II (Pol II) formation (likely at the step of chromatin opening), and GAF-facilitated Pol II pausing is critical for HS activation. In contrast, HSF is dispensable for establishing or maintaining Pol II pausing but is critical for the release of paused Pol II into the gene body at a subset of highly activated genes. Additionally, HSF has no detectable role in the rapid HS repression of thousands of genes.
Tsai, S. Y., Chang, Y. L., Swamy, K. B., Chiang, R. L. and Huang, D. H. (2016). GAGA factor, a positive regulator of global gene expression, modulates transcriptional pausing and organization of upstream nucleosomes. Epigenetics Chromatin 9: 32. PubMed ID: 27468311
Promoter-proximal pausing is believed to represent a critical step in transcriptional regulation. GAGA sequence motifs have frequently been found in the upstream region of paused genes in Drosophila, implicating a prevalent binding factor, GAF, in transcriptional pausing. Using newly isolated mutants that retain only ~3% of the normal GAF level, this study analyzed its impact on transcriptional regulation in whole animals. The abundance of three major isoforms of RNA-Pol on Hsp70 was examined during heat shock. Paused RNA-Pol on Hsp70 was shown to be substantially reduced in mutants. Conversely, a global increase in paused RNA-Pol is observed when GAF is over-expressed. Coupled analyses of the transcriptome and GAF genomic distribution show that 269 genes enriched for upstream GAF binding are down-regulated in mutants. Interestingly, ~15% of them encode transcription factors, which might control ~2000 additional genes down-regulated in mutants. A positive correlation exists between promoter-proximal RNA-Pol density and GAF occupancy in WT, but not in mutants. Nucleosome occupancy is preferentially attenuated by GAF in the upstream region, a region that otherwise strongly favors nucleosome assembly. Significant genetic interactions were detected between GAF and the nucleosome remodeler NURF (see Iswi), the pausing factor NELF (see Nelf-A and Nelf-E), and BAB1, whose binding sites are enriched specifically in genes displaying GAF-dependent pausing. These results provide direct evidence to support a critical role of GAF in global gene expression, transcriptional pausing and upstream nucleosome organization of a group of genes.
Blythe, S. A. and Wieschaus, E. F. (2016). Establishment and maintenance of heritable chromatin structure during early Drosophila embryogenesis. Elife 5. PubMed ID: 27879204
During embryogenesis, the initial chromatin state is established during a period of rapid proliferative activity. This study measured with three-minute time resolution how heritable patterns of chromatin structure are initially established and maintained during the midblastula transition (MBT). Regions of accessibility are established sequentially, where enhancers are opened in advance of promoters and insulators. These open states are stably maintained in highly condensed mitotic chromatin to ensure faithful inheritance of prior accessibility status across cell divisions. The temporal progression of establishment is controlled by the biological timers that control the onset of the MBT. In general, acquisition of promoter accessibility is controlled by the biological timer that measures the nucleo-cytoplasmic (N:C) ratio whereas timing of enhancer accessibility is regulated independently of the N:C ratio. These different timing classes each associate with binding sites for two transcription factors, GAGA-factor and Zelda, previously implicated in controlling chromatin accessibility at ZGA.
Lomaev, D., Mikhailova, A., Erokhin, M., Shaposhnikov, A. V., Moresco, J. J., Blokhina, T., Wolle, D., Aoki, T., Ryabykh, V., Yates, J. R., Shidlovskii, Y. V., Georgiev, P., Schedl, P. and Chetverina, D. (2017). The GAGA factor regulatory network: Identification of GAGA factor associated proteins. PLoS One 12(3): e0173602. PubMed ID: 28296955
The Drosophila GAGA factor (GAF) has an extraordinarily diverse set of functions that include the activation and silencing of gene expression, nucleosome organization and remodeling, higher order chromosome architecture and mitosis. One hypothesis that could account for these diverse activities is that GAF is able to interact with partners that have specific and dedicated functions. To test this possibility, affinity purification coupled with high throughput mass spectrometry was used to identify GAF-associated partners. Consistent with this hypothesis, the GAF interacting network includes a large collection of factors and complexes that have been implicated in many different aspects of gene activity, chromosome structure and function. Moreover, GAF interactions with a small subset of partners were shown to be direct; however, for many others the interactions could be indirect and depend upon intermediates that serve to diversify the functional capabilities of the GAF protein.
Moshe, A. and Kaplan, T. (2017). Genome-wide search for Zelda-like chromatin signatures identifies GAF as a pioneer factor in early fly development. Epigenetics Chromatin 10(1): 33. PubMed ID: 28676122
The protein Zelda was shown to play a key role in early Drosophila development, binding thousands of promoters and enhancers prior to maternal-to-zygotic transition (MZT), and marking them for transcriptional activation. Zelda has been shown to act through specific chromatin patterns of histone modifications to mark developmental enhancers and active promoters. Intriguingly, some Zelda sites still maintain these chromatin patterns in Drosophila embryos lacking maternal Zelda protein. This suggests that additional Zelda-like pioneer factors may act in early fly embryos. A computational method was developed to analyze and refine the chromatin landscape surrounding early Zelda peaks, using multichannel spectral clustering. This allowed characterization of their chromatin patterns through MZT (mitotic cycles 8-14). Specifically, focus was placed on H3K4me1, H3K4me3, H3K18ac, H3K27ac, and H3K27me3, and three different classes of chromatin signatures were identified, matching "promoters," "enhancers" and "transiently bound" Zelda peaks. The genome was then further scanned using these chromatin patterns, and additional loci with no Zelda binding were identified that show similar chromatin patterns, resulting in hundreds of Zelda-independent putative enhancers. These regions were found to be enriched with GAGA factor (GAF, Trl) and are typically located near early developmental zygotic genes. Overall, this analysis suggests that GAF, together with Zelda, plays an important role in activating the zygotic genome. The computational approach offers an efficient algorithm for characterizing chromatin signatures around loci of interest and allows genome-wide identification of additional loci with similar chromatin patterns.


Recent results suggest that the Drosophila transcriptional activator known as GAGA factor, or Trithorax-like, functions by influencing chromatin structure (Granok, 1995). Chromatin is the complex of DNA and proteins that bind DNA into a highly ordered structure. Before further discussion of Trithorax-like, a word about chromatin is in order. Chromatin gets its name from the affinity that this DNA-protein complex has for dyes used to stain chromosomes and cell nuclei. At the earliest stage in Drosophila development, the chromatin is transcriptionally silent. All developmental decisions are based on maternal proteins that exist in the highly ordered oocyte prior to fertilization. The transition to zygotic transcription occurs at the stage of mid-blastula transition.

The chromosomal protein histone H1 is considered a linker histone, since it is involved, by self-association, in generating the superhelical 30 nm fiber of chromatin in chromosomes. Pre-blastoderm chromatin does not contain histone H1, but instead is saturated with HMG-D, the Drosophila homolog of HMG1. As maternal HMG-D is depleted (at the mid-blastula transition, approximately cell cycle 10), histone H1 accumulates, coincident with the start of zygotic transcription. At this time, the nuclei become more compact; this is paralleled by a reduction in the size of mitotic chromatin (Ner, 1994).

GAGA transcription factor has been shown to counteract chromatin repression at all levels and, by so doing, to trigger the active transcription of genes subject to repression. The Drosophila gene hsp70 has proven a useful tool for building a better understanding of the process. HSP70 is a so-called heat shock protein. It functions on an as-needed, emergency basis to repair or discard proteins denatured by high temperatures. The main transcription factor regulating hsp70 is HSF, the heat shock transcription factor. The promoter of hsp70 contains sites for HSF and for Trithorax-like/GAGA, a constitutively expressed transcription factor that binds to GA-rich sites present in the DNA of many Drosophila genes.
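As a rough illustration of what a "GA-rich site" means at the sequence level, the sketch below scans a DNA string for runs of GA dinucleotides on both strands. It is a minimal sketch only: the function names, the `min_repeats` threshold and the example sequence are illustrative assumptions, not values taken from the studies cited here, and real GAGA binding is not determined by simple repeat matching alone.

```python
import re

def reverse_complement(seq):
    """Return the reverse complement of a DNA string (unknown bases become N)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp.get(base, "N") for base in reversed(seq.upper()))

def find_ga_sites(seq, min_repeats=2):
    """Find runs of at least `min_repeats` GA dinucleotides on either strand.

    Returns sorted (start, end, strand) tuples in forward-strand, 0-based,
    half-open coordinates. `min_repeats` is a hypothetical threshold chosen
    for illustration.
    """
    seq = seq.upper()
    pattern = re.compile(r"(?:GA){%d,}" % min_repeats)
    # Forward-strand hits can be reported directly.
    hits = [(m.start(), m.end(), "+") for m in pattern.finditer(seq)]
    # Reverse-strand hits (TC runs on the forward strand) are mapped back
    # to forward-strand coordinates.
    n = len(seq)
    for m in pattern.finditer(reverse_complement(seq)):
        hits.append((n - m.end(), n - m.start(), "-"))
    return sorted(hits)

# Made-up example sequence with one GA run on each strand.
example = "TTGAGAGATTTCTCTCAA"
sites = find_ga_sites(example)
```

Searching both strands matters because a d(GA·TC)n element reads as GA repeats on one strand and TC repeats on the other.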

The hsp70 promoter binds two other factors in addition to GAGA and HSF: TFIID, which serves as the TATA-binding protein complex, and RNA polymerase II. In the inactive state of hsp70, the polymerase has paused after synthesizing a short transcript. Raising the temperature relieves the pause, releasing the polymerase so that transcription of the gene can continue to completion.

What is the role of GAGA in the activation of hsp70 transcription? An artificial system had to be constructed in order to investigate the question. Plasmid DNA containing the hsp70 promoter was assembled into chromatin, consisting of DNA and histones, in the absence of GAGA. When GAGA and an additional protein complex are added to this mixture, disruption of the chromatin structure ensues in an energy-dependent process. In other words, disruption of the chromatin structure requires energy from the hydrolysis of ATP (Tsukiyama, 1994).

Subsequent biochemical work has resulted in the purification of a nucleosome remodeling multiprotein complex (NURF) responsible for the energy-dependent remodeling of chromatin. One of its constituents is ISWI, a homolog of the yeast chromatin remodeling factor SWI2/SNF2. Thus, addition of GAGA along with NURF is sufficient to remodel the chromatin, relieving its repressive effects and allowing other transcription factors to gain access to the gene and initiate transcription. The various effects of GAGA could be explained by its ability to rearrange nucleosomal positions. Chromatin remodeling by GAGA and other factors in vitro requires activities that maintain a highly dynamic state of chromatin (Becker, 1995).

The role of GAGA is a model of how trithorax group proteins activate silenced genes. GAGA may turn out to have no essentially different properties from other transcription factors. It has helped, however, in a conceptual switch in the understanding of gene activation. It is no longer sufficient to know which factors interact in transcriptional activation; the question is taken one step further. Now one must know what roles these factors play in overcoming the repressive effects of chromatin, resulting in gene activation.

One additional property of GAGA warrants mention. Many transcription factors dissociate from DNA during mitosis: the chromosomes become protected by polyamines, small basic molecules that have a high affinity for nucleic acid. GAGA remains associated with DNA during mitosis (Raff, 1994; O'Brien, 1995). What is the special role of GAGA in preserving the continuity of the state of gene activation during mitosis, and how does the loss of affinity of other transcription factors relate to the preservation of the differentiated state? For reviews on the role of GAGA in transcription, see Granok, 1995 and Becker, 1995.

In summary, Trithorax-like belongs to the trithorax group of genes required for normal expression of homeotic genes. Trl is involved in modifying the accessibility of promoters by altering the nucleosome structure, so that other transcription factors can bind (Farkas, 1994). TRL causes nucleosome disruption in an energy-dependent reaction that requires other proteins as well (Wall, 1995). Trithorax group genes oppose the action of Polycomb group genes. The latter function to silence active genes. TRL is associated with specific regions of heterochromatin during all stages of the cell cycle, including mitosis. The continual association of TRL with promoters, even during mitosis, may help explain the continuity of the differentiated state, since most transcription factors dissociate from DNA during mitosis (O'Brien, 1995).

The function of GAGA is not restricted to that of a gene-specific transcriptional activator. Trl mutations are dominant enhancers of position-effect variegation, indicating that GAGA counteracts heterochromatic silencing (Farkas, 1994). GAGA has also been implicated in the functioning of the polycomb response elements (Strutt, 1997). Immunolocalization studies revealed a strong association of GAGA with the GA-rich centric heterochromatin throughout the cell cycle in early embryos (Raff, 1994). More recent studies suggested a mitosis-specific association of GAGA with GA-rich satellite DNA (Platero, 1998). This observation might be related to a variety of nuclear cleavage cycle defects, displayed by Trl mutants, that include asynchrony and failure in chromosome condensation and segregation (Bhat, 1996). Thus, GAGA is a multipurpose protein that mediates gene-specific regulation but also plays a global role in chromosome function.

The Drosophila GAGA factor self-oligomerizes both in vivo and in vitro. GAGA oligomerization depends on the presence of the N-terminal POZ domain. The formation of dimers, tetramers, and oligomers of high stoichiometry is observed in vitro. GAGA oligomers bind DNA with high affinity and specificity. As a consequence of its multimeric character, the interaction of GAGA with DNA fragments carrying several GAGA binding sites is multivalent and of higher affinity than its interaction with fragments containing single short sites. A single GAGA oligomer is capable of binding adjacent GAGA binding sites spaced by as many as 20 base pairs. GAGA oligomers are functionally active, being transcriptionally competent in vitro. GAGA-dependent transcription activation depends strongly on the number of GAGA binding sites present in the promoter. The POZ domain is not necessary for in vitro transcription, but in its absence no synergism is observed upon an increase in the number of binding sites contained within the promoter (Espinás, 1999).
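The spacing observation above can be turned into a simple bookkeeping exercise: given the positions of candidate GAGA sites, group together those that lie close enough for a single oligomer to span. The sketch below is only an illustration under stated assumptions. The 20 bp default comes from the text, but measuring the gap from the end of one site to the start of the next, the function name `cluster_sites` and the example coordinates are all hypothetical choices.

```python
def cluster_sites(sites, max_gap=20):
    """Group binding sites whose gaps do not exceed `max_gap` base pairs.

    `sites` is a list of (start, end) half-open intervals on one sequence.
    The gap is measured from the furthest end seen so far in the current
    cluster to the start of the next site (an assumption made here for
    illustration). Returns a list of clusters, each a list of sites.
    """
    clusters = []
    cluster_end = None
    for start, end in sorted(sites):
        if clusters and start - cluster_end <= max_gap:
            # Close enough to the current cluster: extend it.
            clusters[-1].append((start, end))
            cluster_end = max(cluster_end, end)
        else:
            # Too far away (or first site): start a new cluster.
            clusters.append([(start, end)])
            cluster_end = end
    return clusters

# Made-up coordinates: two pairs of sites separated by a long gap.
example_sites = [(0, 6), (20, 26), (60, 66), (70, 75)]
clusters = cluster_sites(example_sites)
```

With these example coordinates the first two sites fall into one cluster and the last two into another, since the 34 bp gap between positions 26 and 60 exceeds the 20 bp limit.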

GAGA is known to enhance transcription from promoters containing d(GA·TC)n sequences, both in vitro and in vivo. To analyze the contribution of multiple binding sites to the transcription activity of GAGA, the rate of GAGA-dependent transcription activation from promoters containing an increasing number of GAGA binding sites was determined. For these experiments, the GAGA binding site found at the C-region of the engrailed promoter was multimerized and fused to a minimal promoter, which efficiently drives transcription of a G-less cassette. The constructs used in these experiments contain from 1 to 6 copies of this engrailed site. The extent of maximal activation obtained in the presence of GAGA strongly depends on the number of binding sites present at the promoter. No significant activation is observed from constructs containing only one or two GAGA binding sites, and only a moderate 3-fold activation is observed in the presence of three binding sites. However, a strong increase in activation, to about 8-9-fold, is seen from constructs containing five or six binding sites. This behavior depends on the presence of the POZ domain. When the transcription activity of the DeltaPOZ245 peptide is analyzed, a significant activation is observed in the presence of two GAGA binding sites, and it increases only slightly as the number of binding sites increases. In this case, a low though reproducible activation is detected even in the presence of a single site. The synergism in transcription activation detected upon increasing the number of binding sites is consistent with the higher affinity of GAGA oligomers for fragments carrying multiple GAGA sites. Consistent with this hypothesis, this synergism depends on the presence of the POZ domain (Espinás, 1999).

Several observations suggest that, to some extent, GAGA functions at the chromatin level, participating in the formation of an open chromatin structure. GAGA is the product of the Trithorax-like (Trl) gene which, being a member of the Trithorax group, antagonizes the chromatin-mediated repression that Polycomb genes induce upon the expression of the homeotic genes. A more direct link to chromatin structure is indicated by the fact that Trl is an enhancer of position effect variegation. Moreover, in collaboration with nucleosome remodeling factor, GAGA was shown to help nucleosome disruption at specific regions of the hsp70 promoter, encompassing GAGA binding sites. At present, little is known about the specific contribution of GAGA to chromatin remodeling, but GAGA appears to be particularly efficient in this respect. Although a direct interaction with the chromatin remodeling machinery cannot be excluded, the simultaneous interaction of GAGA oligomers with multiple adjacent sites could significantly contribute to the higher efficiency of GAGA in disrupting nucleosomes. In this context, it would be interesting to know whether a functional POZ domain is required for efficient nucleosome disruption. GAGA can also activate transcription in vitro, suggesting a possible interaction with the basal transcription machinery. These results indicate that the presence of several independent GAGA sites is required for efficient transcription activation in vitro, indicating that the oligomeric character of GAGA might also be functionally relevant in this context. Interestingly, in the case of the DeltaPOZ245 peptide, significant transcription activation is detected in the presence of a single binding site, and no synergism is observed upon increasing the number of GAGA binding sites. These results suggest that the synergism observed with full GAGA arises from specific features of the GAGA-DNA complex rather than from the simple recruitment of multiple GAGA molecules to the promoter (Espinás, 1999).

A functionally conserved boundary element from the mouse HoxD locus requires GAGA factor in Drosophila

Hox genes are necessary for proper morphogenesis and organization of various body structures along the anterior-posterior body axis. These genes exist in clusters and their expression pattern follows spatial and temporal colinearity with respect to their genomic organization. This colinearity is conserved during evolution and is thought to be constrained by the regulatory mechanisms that involve higher order chromatin structure. Earlier studies, primarily in Drosophila, have illustrated the role of chromatin-mediated regulatory processes, which include chromatin domain boundaries that separate the domains of distinct regulatory features. In the mouse HoxD complex, Evx2 and Hoxd13 are located ∼9 kb apart but have clearly distinguishable temporal and spatial expression patterns. This study reports the characterization of a chromatin domain boundary element from the Evx2-Hoxd13 region that functions in Drosophila as well as in mammalian cells. The Evx2-Hoxd13 region has sequences conserved across vertebrate species, including a GA repeat motif, and the Evx2-Hoxd13 boundary activity in Drosophila is dependent on GAGA factor that binds to the GA repeat motif. These results show that Hox genes are regulated by chromatin-mediated mechanisms and highlight the early origin and functional conservation of such chromatin elements (Vasanthi, 2010).

The role of chromatin organization in developmental gene regulation has been well established. In particular, chromatin organization that involves domain boundary elements has been shown to be a key feature of the regulation of homeotic genes in Drosophila. As the organization of Hox genes is well conserved among bilaterians, it is reasonable to speculate that the constraint that led to this conservation of organization is due to chromatin elements that regulate Hox genes. In general, when differentially expressed genes are in close proximity, as is often the case in Hox complexes, boundary elements are likely to be present between the genes to establish and maintain their distinct expression states. In the mouse HoxD complex, Evx2 and Hoxd13 are ∼9 kb apart and they are expressed in distinct regions in the developing embryo. This suggests the presence of a boundary within this 9 kb region that prevents crosstalk between regulatory elements of the two flanking genes (Vasanthi, 2010).

In order to identify this putative boundary, sequence comparisons of the Evx2-Hoxd13 region from different vertebrates were carried out, and a cluster of conserved sites along with a GA repeat motif was identified in all the species checked, from fish to mammals. The ∼3 kb fragment that included the GA repeats showed enhancer-blocking activity in Drosophila embryos, as well as in a human cell line, indicating the presence of a complex, evolutionarily conserved boundary between the Evx2 and Hoxd13 genes. The boundary activity was shown by both overlapping fragments, ED1a and ED1b, suggesting that the Evx2-Hoxd13 boundary is spread over several kilobases, unlike Drosophila boundaries, which tend to be smaller, often less than 1 kb. Spread-out boundary function in this region has also been suggested by an earlier study (Yamagishi, 2007). The complex nature of the Evx2-Hoxd13 boundary is also indicated by the observation that only early enhancers of ftz are effectively blocked, whereas late enhancers are able to drive expression of the lacZ reporter gene even in the presence of this boundary. This boundary activity was examined in the adult eye using a white gene enhancer and promoter interaction assay, and the results clearly showed no enhancer-blocking activity in this tissue. These observations indicate that Evx2-Hoxd13 is a developmentally regulated boundary that functions in early embryos but not in the late embryonic CNS or the adult eye (Vasanthi, 2010).

It was also found that the boundary activity shown by the fragment containing the GA-repeat motif is dependent on GAF in Drosophila. This indicates that the conserved GA sites are functionally relevant in Drosophila. Evx2 is the homolog of the even skipped (eve) gene of Drosophila, and both are thought to have evolved from a common ancestral gene, Evx. In vertebrates, Evx is located near Hox clusters: Evx1 near HoxA and Evx2 near HoxD. In Drosophila, eve has moved away from the Hox cluster. The finding that a GAF-dependent boundary is present in the Evx2-Hoxd13 region is of particular interest in the light of a previous study showing that the eve gene in the fly is also associated with a GAF-dependent boundary. These observations suggest that the boundary function evolved early on near the ancestral Evx gene and that the same combination has been conserved during evolution, even in organisms where the linkage between eve and the Hox complex has been lost (Vasanthi, 2010).

Although several boundary-interacting factors are known in Drosophila, in vertebrates CTCF is the only protein that has been well studied for its role in boundary function. A CTCF homolog is also present in Drosophila and is known to play a role in the Fab-8 boundary function in the BX-C. Interestingly, however, the Fab-7 boundary of the BX-C does not involve CTCF, and instead GAF plays an important role in its function and regulation. In the case of the Evx2-Hoxd13 boundary, and in agreement with earlier studies, no CTCF-binding sites are found. As in Fab-7, this boundary appears to be dependent on GAF. These observations suggest that although several factors act together to establish a boundary, some of them may be mutually exclusive. Further studies in this direction will help in understanding the function and regulation of boundaries during development (Vasanthi, 2010).

These results strongly indicate the presence of a GAGA-binding protein in vertebrates with functional similarity to that of Drosophila GAF. Earlier studies have also indicated that transcription of the st-3 gene in Xenopus is regulated by GAGA sequences and GAGA factor, but the identity of vertebrate GAF has been elusive. In a separate study, c-krox/Th-POK was identified as the vertebrate homolog of GAF and was shown to bind to the Evx2-Hoxd13 region in vertebrates (Matharu, 2010). These findings suggest that eve/Evx2 dependence on GAF is a feature acquired early in evolution and that even after eve separated from the Hox context, it retained this association and the functional features as seen in Drosophila. This work indicates that, in vertebrates, the ancient organization (as well as the GAF-dependent regulation) has been maintained at least at one of the Hox complexes. Finally, it is suggested that using this approach, other evolutionarily conserved cis elements and trans-acting factors involved in genomic organization and developmental gene regulation can be explored (Vasanthi, 2010).

HOT regions function as patterned developmental enhancers and have a distinct cis-regulatory signature

HOT (highly occupied target) regions bound by many transcription factors are considered to be one of the most intriguing findings of the recent modENCODE reports, yet their functions have remained unclear. This study tested 108 Drosophila melanogaster HOT regions in transgenic embryos with site-specifically integrated transcriptional reporters. In contrast to prior expectations, 102 (94%) were found to be active enhancers during embryogenesis and to display diverse spatial and temporal patterns, reminiscent of expression patterns for important developmental genes. Remarkably, HOT regions strongly activate nearby genes and are required for endogenous gene expression, as was shown using bacterial artificial chromosome (BAC) transgenesis. HOT enhancers have a distinct cis-regulatory signature with enriched sequence motifs for the global activators Vielfaltig, also known as Zelda, and Trithorax-like, also known as GAGA. This signature allows the prediction of HOT versus control regions from the DNA sequence alone (Kvon, 2012).

Taken together, these data show that Drosophila HOT regions function as cell type-specific transcriptional enhancers to up-regulate nearby genes during early embryo development. In contrast to prior expectations, HOT enhancers display diverse spatial and temporal activity patterns, which are reminiscent of expression patterns of important developmental genes. It was further found that the activity of many HOT enhancers appears to be unrelated to the expression of the bound transcriptional activators, suggesting that neutral TF binding to HOT regions is frequent. Interestingly, for Twi, Kr, and five additional TFs, it was found that HOT enhancers with functional footprints of the TFs are significantly enriched in the TFs' motifs compared with HOT enhancers to which the TFs seem to bind neutrally (e.g., 2.2-fold for Twi). This supports previous suggestions that the recruitment of TFs to HOT regions might be independent of the TFs' motifs and mediated by protein-protein interactions or nonspecific DNA binding. This seems to be particularly true for HOT regions to which the TFs bind neutrally without impact on the regions' transcriptional enhancer activity (Kvon, 2012).

By uncovering a distinct cis-regulatory signature that is characteristic and predictive of HOT regions, computational analysis establishes a link between HOT regions, early embryonic enhancers (EEEs), and maternal TFs that are ubiquitously present in the early Drosophila embryo. Specifically, the results suggest that ZLD might be more generally important for the establishment of regulatory elements in the early embryo, while GAGA appears to be a distinguishing feature of HOT regions. This is supported by an analysis of genome-wide data on ZLD and GAGA binding in early Drosophila embryos: While 71.4% of HOT regions and 75.0% of EEEs are bound by ZLD (compared with 42.2% and 13.0% of control WARM and COLD regions), GAGA binds to 53.4% of HOT regions but only 20.0% of EEEs (compared with 28.3% and 7.8% for WARM and COLD regions). Even when considering only regions that are functioning as transcriptional enhancers in the early embryo (all EEEs from CAD and this study combined), GAGA binds to significantly more HOT enhancers than to enhancers that are not HOT. An instructive role for ZLD in defining chromatin that is open and accessible to other factors is further supported by its unusual property to bind to the majority (64%) of all occurrences of its sequence motif in the Drosophila genome. ZLD might thus be a prerequisite for both HOT regions and EEEs more generally. Similarly, a role for GAGA in nucleating or promoting the formation of TF complexes is consistent with its ability to self-oligomerize via its BTB/POZ domain and also form heteromeric complexes with the TF Tramtrack and potentially other BTB/POZ domain-containing TFs (e.g., Abrupt, Bric-a-brac, Broad complex, and others). GAGA, with its ability to recruit other TFs by protein-protein interactions, might contribute to HOT regions independent of the specific cellular or developmental context. Interestingly, C. elegans HOT regions are also strongly enriched in the GAGA motifs, and the motif is the most important sequence feature when classifying C. elegans HOT versus control regions. GAGA-like factors or their putative homologs or functional analogs across species might be a conserved feature of metazoan HOT regions (Kvon, 2012).
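The contrast between ZLD and GAGA binding can be made explicit by comparing the quoted percentages directly. The following is a minimal sketch in Python, using only the binding fractions given in the text; the table name `bound` and the helper `fold` are hypothetical and are not part of the original analysis.

```python
# Percent of regions bound by each factor, as quoted in the text (Kvon, 2012).
bound = {
    "ZLD": {"HOT": 71.4, "EEE": 75.0, "WARM": 42.2, "COLD": 13.0},
    "GAGA": {"HOT": 53.4, "EEE": 20.0, "WARM": 28.3, "COLD": 7.8},
}


def fold(factor, region, reference):
    """Ratio of percent-bound in `region` to percent-bound in `reference`."""
    return bound[factor][region] / bound[factor][reference]


if __name__ == "__main__":
    for factor in bound:
        # A ratio near 1 means HOT regions and EEEs are bound about equally often.
        print(factor, "HOT vs EEE:", round(fold(factor, "HOT", "EEE"), 2))
```

On these numbers, ZLD binds HOT regions and EEEs at similar rates, whereas GAGA binds HOT regions between two and three times as often as EEEs, which is the sense in which GAGA distinguishes HOT regions.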


The GAGA transcription factor of Drosophila is ubiquitous and plays multiple roles. Characterization of cDNA clones and detection by domain-specific antibodies has revealed that the 70-90 kDa major GAGA species are encoded by two open reading frames producing GAGA factor proteins of 519 amino acids (GAGA-519) and 581 amino acids (GAGA-581), that share a common N-terminal region which is linked to two different glutamine-rich C-termini. Purified recombinant GAGA-519 and GAGA-581 proteins can form homomeric complexes that bind specifically to a single GAGA sequence in vitro. The two GAGA isoforms also function similarly in transient transactivation assays in tissue culture cells and in chromatin remodeling experiments in vitro. Only GAGA-519 protein accumulates during the first 6 h of embryogenesis. Thereafter, both GAGA proteins are present in nearly equal amounts throughout development; in larval salivary gland nuclei they colocalize completely to specific regions along the euchromatic arms of the polytene chromosomes. Coimmunoprecipitation of GAGA-519 and GAGA-581 from crude nuclear extracts and from mixtures of purified recombinant proteins, indicates direct interactions. It is suggested that homomeric complexes of GAGA-519 may function during early embryogenesis; both homomeric and heteromeric complexes of GAGA-519 and GAGA-581 may function later (Benyajati, 1997).

cDNA clone length - 2.4 kb with other variants from 3.0 kb to 4.4 kb, developmentally regulated.

Bases in 5' UTR - 177

Bases in 3' UTR - 100


Amino Acids - 519

Structural Domains

Trl has two major structural domains: a zinc finger domain and an N-terminal BTB domain, also known as a POZ domain, responsible for transcriptional activation. Trithorax (Trx) itself has no BTB domain (Farkas, 1994 and Soeller, 1993).

Two other Drosophila proteins with zinc finger domains, Tramtrack and Broad Complex, also contain N-terminal domains highly related to that of TRL (Soeller, 1993).

To better define the molecular basis of the pleiotropic effects of Trithorax-like mutations, cDNAs were cloned that encode the GAGA isoforms of D. melanogaster and a distantly related species, D. virilis. The genomic organizations of both the D. melanogaster and D. virilis genes were characterized, and the expression patterns of isoform-specific mRNAs were analysed. The D. virilis GAGA isoforms show high similarity to their D. melanogaster counterparts, particularly within the BTB/POZ protein-interaction and the zinc finger DNA-binding domains. Interestingly, conservation clearly extends beyond the previously defined limits of these domains. Moreover, the comparison reveals a completely conserved block of amino acid residues located between the BTB/POZ and DNA-binding domains, and a high conservation of the C-terminus specific for one of the GAGA isoforms. Thus, sequences of as yet unknown functions are defined as rewarding targets for further mutational analyses. The high conservation of the GAGA proteins of the two species is in accord with the nearly identical genomic organization and expression patterns of the corresponding genes (Lintermann, 1998).

The protein coding sequences of Trl class A transcripts are split between four exons, which are separated by three introns of 2.2 kb, 118 bp and 160 bp. Class B transcripts are derived by the use of an alternative splice site within exon IV. Transcripts of both classes thus share exons I to III and the 5' portion of exon IV. The BTB/POZ domain is encoded, in about equal shares, by exons I and II. In addition to the C-terminal half of the BTB/POZ domain, exon II also encodes a putative nuclear localization signal. The minimal DNA-binding domain of the GAGA factor consists of a single C2H2 zinc finger and two regions of basic amino acids located immediately N-terminal to the zinc finger, and is encoded by exons III and IV. The 3' end of exon III contains basic region I and the rest of the binding domain is located in that part of exon IV that is common to both transcript classes. This part also contains a third region of basic amino acid residues located C-terminal to the zinc finger, which seems to be dispensable for DNA binding. The polypeptide encoded by the 3' part of exon IV, which is specific for class A transcripts, is characterized by stretches of polyglutamine. Similar regions of high glutamine content are also found in the polypeptide encoded by the class B-specific exon V. The C-terminal sequences specific for the D. melanogaster GAGA-581 (class B) and the D. virilis class B isoform show a significantly higher conservation than the C-terminal sequences specific for the D. melanogaster class A and D. virilis class A isoform. Since functional differences between the different isoforms must be based on these C-terminal sequences, the class B isoforms may have adopted specialized functions that are more sensitive to changes in the amino acid sequence (Lintermann, 1998).

A class A-specific probe detects a small transcript (2.5-kb) restricted to adult females and early embryos, suggesting that it represents a maternal mRNA. The pattern of class A and B transcript expression strikingly changes during development. While the class A transcripts dominate in early embryos, transcripts of the two classes are present in similar amounts at later stages of embryogenesis. In first and second instar larvae, the 3.4-kb class B mRNA is predominant. In third instar larvae the 2.5-kb class A transcript increases to a level comparable to that of the 3.4-kb class B transcript, but the 3.9-kb class B transcript is underrepresented. This situation changes in the pupa, where the ratios between the three transcripts are comparable to the ratios seen in late embryos. Class A transcripts are not detected in males; the 3.4-kb class B transcript is clearly the dominating species in males. The 3.9-kb class B transcript seems to be strongly underrepresented in both males and females. A similar sex-specific expression of class A and class B transcripts is observed in D. melanogaster and D. virilis (Lintermann, 1998).

The effect of GAGA protein on chromatin structure and promoter function has been the subject of much attention, yet little is known of the actual mechanism and the specific contributions of individual GAGA domains to its function. The DNA-binding activity of GAGA, as specified by the single zinc finger binding domain (Zn), has been examined in some detail; however, the functions of the POZ/BTB and glutamine domain (Q) remain poorly understood. Three separate activities of the Q domain of GAGA are reported: promoter distortion, single-strand binding, and multimerization. In vitro, GAGA binding to the hsp70 promoter produces extended DNase I protection and KMnO4 hypersensitivity. These activities require both the Zn domain and Q domain of GAGA, and appear independent of the POZ/BTB domain. GAGA also has a single-stranded DNA binding affinity, as does the Q-rich region alone. GAGA forms multimers both in vitro and in vivo, and the Q domain itself forms multimers. Protein-protein interactions mediated by the Q domain may, therefore, be at least partially responsible for the multimerization capabilities of GAGA (Wilkins, 1999).

The BTB/POZ domain defines a conserved region of about 120 residues; it has been found in over 40 proteins to date. It is located predominantly at the N terminus of Zn-finger DNA-binding proteins, where it may function as a repression domain, and less frequently in actin-binding and poxvirus-encoded proteins, where it may function as a protein-protein interaction interface. A prototypic human BTB/POZ protein, PLZF (promyelocytic leukemia zinc finger) is fused to RARalpha (retinoic acid receptor alpha) in a subset of acute promyelocytic leukemias (APLs), where it acts as a potent oncogene. The exact role of the BTB/POZ domain in protein-protein interactions and/or transcriptional regulation is unknown. The BTB/POZ domain from PLZF (PLZF-BTB/POZ) has been overexpressed, purified, characterized, and crystallized. Gel filtration, dynamic light scattering, and equilibrium sedimentation experiments show that PLZF-BTB/POZ forms a homodimer with a Kd below 200 nM. Differential scanning calorimetry and equilibrium denaturation experiments are consistent with the PLZF-BTB/POZ dimer undergoing a two-state unfolding transition. Circular dichroism shows that the PLZF-BTB/POZ dimer has significant secondary structure including about 45% helix and 20% beta-sheet. Crystals of the PLZF-BTB/POZ have been prepared that are suitable for a high resolution structure determination using x-ray crystallography. The data support the hypothesis that the BTB/POZ domain mediates a functionally relevant dimerization function in vivo. The crystal structure of the PLZF-BTB/POZ domain will provide a paradigm for understanding the structural basis underlying BTB/POZ domain function (Li, 1997).
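To give a feel for what a dimerization Kd of this size implies, the following minimal Python sketch computes the fraction of protein chains found in dimers for a simple monomer-dimer equilibrium (2M <-> D, Kd = [M]^2/[D]). The two-species model, the function name `dimer_fraction`, and the use of 200 nM as the Kd are illustrative assumptions; the text reports only that the Kd lies below 200 nM, so the true dimer fraction at a given concentration would be higher than this sketch suggests.

```python
import math


def dimer_fraction(total_nM, kd_nM):
    """Fraction of chains in dimers for 2M <-> D with Kd = [M]^2 / [D].

    total_nM is the total chain concentration, [M] + 2[D].
    """
    # Mass balance: total = [M] + 2[M]^2/Kd, a quadratic in the free monomer [M].
    free = kd_nM * (-1.0 + math.sqrt(1.0 + 8.0 * total_nM / kd_nM)) / 4.0
    return 1.0 - free / total_nM


if __name__ == "__main__":
    # Assumed Kd of 200 nM, the upper bound quoted in the text.
    for total in (20.0, 200.0, 2000.0):
        print(total, "nM total:", round(dimer_fraction(total, 200.0), 3))
```

Under this model, half of the chains are dimeric when the total concentration equals the Kd, and the dimer fraction rises steadily with concentration.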

A novel zinc finger protein, ZID (standing for zinc finger protein with interaction domain) was isolated from humans. ZID has four zinc finger domains and a BTB domain, also known as a POZ (standing for poxvirus and zinc finger) domain. At its amino terminus, ZID contains the conserved POZ or BTB motif present in a large family of proteins that include otherwise unrelated zinc fingers, such as Drosophila Abrupt, Bric-a-brac, Broad complex, Fruitless, Longitudinals lacking, Pipsqueak, Tramtrack, and Trithorax-like. The POZ domains of ZID, TTK and TRL act to inhibit the interaction of their associated finger regions with DNA. This inhibitory effect is not dependent on interactions with other proteins and does not appear dependent on specific interactions between the POZ domain and the zinc finger region. The POZ domain acts as a specific protein-protein interaction domain: The POZ domains of ZID and TTK can interact with themselves but not with each other, or POZ domains from ZF5, or the viral protein SalF17R. However, the POZ domain of TRL can interact efficiently with the POZ domain of TTK. In transfection experiments, the ZID POZ domain inhibits DNA binding in NIH-3T3 cells and appears to localize the protein to discrete regions of the nucleus (Bardwell, 1994).

Specific DNA binding to the core consensus site GAGAGAG has been shown with an 82-residue peptide (residues 310-391) taken from the Drosophila transcription factor GAGA. Using a series of deletion mutants, it was demonstrated that the minimal domain required for specific binding (residues 310-372) includes a single zinc finger of the Cys2-His2 family and a stretch of basic amino acids located on the N-terminal end of the zinc finger. In gel retardation assays, the specific binding seen with either the peptide or the whole protein is zinc dependent and corresponds to a dissociation constant of approximately 5 x 10(-9) M for the purified peptide. It has previously been thought that a single zinc finger of the Cys2-His2 family is incapable of specific, high-affinity binding to DNA. The combination of an N-terminal basic region with a single Cys2-His2 zinc finger in the GAGA protein can thus be viewed as a novel DNA binding domain. This raises the possibility that other proteins carrying only one Cys2-His2 finger are also capable of high-affinity specific binding to DNA (Pedone, 1996).
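As an illustration of what searching a sequence for this core consensus involves, here is a minimal Python sketch that reports exact matches to GAGAGAG on both strands. It assumes exact string matching only; the function names `revcomp` and `find_sites` and the example sequence are hypothetical, and real GAGA factor binding tolerates variation that this sketch does not model.

```python
CORE = "GAGAGAG"  # core consensus site quoted in the text (Pedone, 1996)


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_sites(seq, motif=CORE):
    """Return sorted (start, strand) pairs for exact motif matches.

    Starts are 0-based positions on the given sequence; "-" hits are
    matches to the reverse complement. Overlapping matches are reported.
    """
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", revcomp(motif))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)
    return sorted(hits)


if __name__ == "__main__":
    # Made-up sequence with one site on each strand.
    print(find_sites("TTGAGAGAGTTCTCTCTCAA"))
```

Because GA repeats are internally repetitive, a longer run such as GAGAGAGAG yields overlapping hits, so counts from a scan like this describe matching windows rather than independent binding sites.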

GAGA is a nuclear protein encoded by the Trithorax-like gene in Drosophila that is expressed in at least two isoforms generated by alternative splicing. By means of its specific interaction with DNA, GAGA has been implicated in several nuclear transactions including regulation of gene expression. The GAGA519 isoform has been studied as a transcription factor. In vitro, the transactivation domain has been assigned to the 93 C-terminal residues that correspond to a glutamine-rich domain (Q-domain). It presents an internal modular structure and acts independently of the rest of the protein. In vivo, in Drosophila SL2 cells, the Q-domain can transactivate reporter genes either in the form of GAGA or Gal4BD-Q fusions, whereas a GAGA mutant in which the Q-domain has been deleted cannot. These results give support to the notion that GAGA can function as a transcription activating factor (Vaquero, 2000).

Cullins (CULs) are subunits of a prominent class of RING ubiquitin ligases. Whereas the subunits and substrates of CUL1-associated SCF complexes and CUL2 ubiquitin ligases are well established, they are largely unknown for other cullin family members. S. pombe CUL3 (Pcu3p) forms a complex with the RING protein Pip1p and all three BTB/POZ domain proteins encoded in the fission yeast genome. The integrity of the BTB/POZ domain, which shows similarity to the cullin binding proteins SKP1 and elongin C, is required for this interaction. Whereas Btb1p and Btb2p are stable proteins, Btb3p is ubiquitylated and degraded in a Pcu3p-dependent manner. Btb3p degradation requires its binding to a conserved N-terminal region of Pcu3p that precisely maps to the equivalent SKP1/F box adaptor binding domain of CUL1. It is proposed that the BTB/POZ domain defines a recognition motif for the assembly of substrate-specific RING/cullin 3/BTB ubiquitin ligase complexes (Geyer, 2003).

These results identified BTB/POZ proteins as components of Pcu3p/Pip1p ubiquitin ligase complexes. Four pieces of evidence suggest that BTB/POZ domain proteins are functionally equivalent to the SKP1/F box adaptor dimers determining the substrate specificity of CUL1-associated SCF complexes: (1) all three BTB/POZ proteins present in the fission yeast genome interact with Pcu3p/Pip1p complexes; (2) BTB/POZ domains are structurally related to SKP1; (3) N-terminal residues invariably conserved in all CUL3 homologs, including Pcu3p, cluster in the same region of CUL1 that mediates its interaction with SKP1/F box adaptor dimers, and both the Btb3p/Pcu3p interaction and Pcu3p-dependent Btb3p degradation depend on the integrity of this conserved N-terminal region; (4) Btb3p is ubiquitylated in vitro in a Pcu3p-dependent manner, a finding reminiscent of CUL1-dependent ubiquitylation and degradation of F box proteins. Taken together, these findings strongly suggest that the BTB/POZ domain proteins ubiquitously present in eukaryotes define a family of substrate-specific adaptors for CUL3. Since fission yeast encodes three different BTB/POZ domain proteins, all of which interact with Pcu3p and Pip1p, it may form a minimum of three distinct RING/cullin 3/BTB complexes (Geyer, 2003).


Vertebrate homologue of Drosophila GAGA factor

Polycomb group (PcG) and trithorax group (trxG) proteins are chromatin-mediated regulators of a number of developmentally important genes including the homeotic genes. In Drosophila, one of the trxG members, Trithorax-like (Trl), encodes the essential multifunctional DNA binding protein called GAGA factor (GAF). While most of the PcG and trxG genes are conserved from flies to humans, a Trl-GAF homologue has been conspicuously missing in vertebrates. This study reports the first identification of c-Krox/Th-POK as the vertebrate homologue of GAF on the basis of sequence similarity and comparative structural analysis. The in silico structural analysis of the zinc finger region showed preferential interaction of vertebrate GAF with GAGA sites similar to that of fly GAF. Cross-immunoreactivity studies show that fly and vertebrate GAFs are highly conserved and share a high degree of structural similarity. Electrophoretic mobility shift assays show that vertebrate GAF binds to GAGA sites in vitro. Finally, in vivo studies by chromatin immunoprecipitation confirmed that vertebrate GAF binds to GAGA-rich DNA sequences present in hox clusters. Identification of vertebrate GAF and the presence of its target sites at various developmentally regulated loci, including hox complexes, highlight the evolutionarily conserved components involved in developmental mechanisms across the evolutionary lineage and answer a long-standing question of the presence of vertebrate GAF (Matharu, 2010).


date revised: 30 August 2000

Home page: The Interactive Fly © 1995, 1996 Thomas B. Brody, Ph.D.

The Interactive Fly resides on the
Society for Developmental Biology's Web server.