Gene name - pipsqueak

Synonyms -

Cytological map position - 47A13--B1

Function - transcription factor

Keywords - Polycomb group, transcriptional activation and silencing of homeotic genes, posterior group, eye

Symbol - psq

FlyBase ID: FBgn0004399

Genetic map position - 2R

Classification - Psq motif, BTB/POZ domain

Cellular location - nuclear



Recent literature
Chetverina, D. A., Gorbenko, F. V., Lomaev, D. V., Georgiev, P. G. and Erokhin, M. M. (2022). Recruitment to Chromatin of (GA)n-Associated Factors GAF and Psq in the Transgenic Model System Depends on the Presence of Architectural Protein Binding Sites. Dokl Biochem Biophys 506(1): 210-214. PubMed ID: 36303054
Summary:
Polycomb group (PcG) repressors and Trithorax group (TrxG) activators of transcription are essential for the proper development and maintenance of gene expression profiles in multicellular organisms. In Drosophila, PcG/TrxG proteins interact with DNA elements called PREs (Polycomb response elements). Previous work has shown that the repressive activity of an inactive PRE in transgenes can be induced by architectural protein-binding sites. It was shown that the induction of repression is associated with the recruitment of PcG/TrxG proteins, including the DNA-binding factors Pho and Combgap. The present study examined the association of two other PRE DNA-binding factors, GAF and Psq, with bxdPRE in the presence and absence of sites for architectural proteins. Both factors were found to be efficiently recruited to the bxdPRE only in the presence of adjacent binding sites for the architectural proteins Su(Hw), CTCF, or Pita.
BIOLOGICAL OVERVIEW

pipsqueak is a sequence-specific DNA binding protein that targets a Polycomb group protein complex to Polycomb response elements (PREs). The Polycomb (Pc) group (Pc-G) of repressors is essential for transcriptional silencing of homeotic genes that determine the axial development of metazoan animals. It is generally believed that the multimeric complexes formed by these proteins nucleate certain chromatin structures to silence promoter activity upon binding to PREs. Little is known, however, about the molecular mechanism involved in sequence-specific binding of these complexes. An immunoaffinity-purified Pc protein complex has been shown to contain a DNA binding activity specific to the (GA)n motif in a PRE from the bithoraxoid region of Ultrabithorax. This activity can be attributed primarily to the large protein isoform encoded by pipsqueak (psq) rather than to the well-characterized GAGA factor Trithorax-like (Trl). The functional relevance of psq to the silencing mechanism is strongly supported by its synergistic interactions with a subset of Pc-G that cause misexpression of homeotic genes (Huang, 2002).

The biological properties of Pipsqueak are not, however, confined to targeting of Pc-G repressors. Pipsqueak can directly bind to Trl and is associated with Trl in vivo. Genetic interaction studies provide evidence that Psq and Trl act together in the transcriptional activation as well as the transcriptional silencing of homeotic genes. A complete colocalization of Psq and Trl on polytene interphase chromosomes and mitotic chromosomes suggests that the two proteins cooperate as general partners not only at homeotic loci, but also at hundreds of other chromosomal sites (Schwendemann, 2002).

Several salient features have been noted about PREs. For example, PREs can silence a distant marker gene. PREs can also exhibit a pairing-sensitive silencing effect, resulting in much stronger silencing of the marker gene when the PRE is present on the homologous chromosome. A high incidence of PRE insertion occurs at sites that contain preexisting PRE or PRE-like sequences. In general, PRE insertion creates a new chromosomal binding site for many Pc-G proteins. Further, PREs can confer transcriptional repression on Ultrabithorax (Ubx) in a Pc-dependent manner in cultured cells. Thus, PREs appear to act as the core sequences upon which Pc-G proteins assemble into large functional silencing complexes. It has been speculated that PREs at different chromosomal sites, when spatially juxtaposed, might cooperate and become more effective (Huang, 2002 and references therein).

How Pc-G can accomplish these tasks remains largely unclear. To date, fewer than half a dozen Pc-G genes have been thoroughly studied. Some Pc-G proteins contain domains that are capable of homophilic or heterophilic interaction, potentially facilitating formation and/or interaction of multimeric protein complexes. Consistently, large protein complexes containing Pc-G proteins have been identified. For example, Pc, Polyhomeotic (Ph), and Posterior sex combs (Psc) are found in the Pc repression complex 1 of approximately 2 MDa. A smaller protein complex containing Enhancer of Zeste [E(Z)] and Extra sex combs (Esc) has also been reported. Since some Pc-G proteins have not been shown to copurify with these complexes, additional complexes might be expected. Germ line clones of many Pc-G mutations display similar but distinct patterns of embryonic defects, suggesting partially overlapping functions. Chromatin immunoprecipitation has also revealed substantial variation in the composition of the Pc-G complexes at different sites. Surprisingly, some of these sites are found in actively expressed genes. Thus, multiple Pc-G complexes might function in different contexts during development (Huang, 2002 and references therein).

A fundamental question yet to be addressed fully is how the Pc-G protein complexes recognize specific sequences in PREs. With the exception of pleiohomeotic (pho), which encodes the homolog of mammalian YY1, no existing Pc-G has been shown to bind specific DNA sequences. The Pho binding site is a functional constituent of PREs; however, the inability of a LexA-Pho fusion protein to silence a linked reporter gene as other Pc-G fusion proteins do suggests that Pho alone may not be sufficient to target functional Pc-G complexes. The (GA)n motif present in PREs has been suggested to be critical for homeotic gene silencing. It has been further suggested that the GAGA factor (GAF, encoded by Trithorax-like), a well-characterized DNA binding protein for the GAGA motif, is involved in PRE binding. Contrary to the expected silencing effect, Trl has also been shown to act either as an antirepressor to alleviate the negative effects of histone H1 or as a transactivator in vitro, in cultured cells, and in stress response. In addition, Trl has been formerly classified as a member of the trithorax group of genes (trx-G) that antagonize Pc-G. Therefore, the role of GAF remains unresolved (Huang, 2002 and references therein).

An ~440-bp DNA fragment from the bithoraxoid region of Ubx can recapitulate both positive and negative effects of trx and Pc, respectively. Immunoaffinity chromatography has been used to purify tagged Pc-G complexes and then their DNA binding activity was assayed. The (GA)n motif in this fragment has been found to be the primary binding site for the Pc-G complexes. Several lines of evidence are presented to show that the DNA binding protein for the Ubx PRE is encoded by pipsqueak (Huang, 2002).
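As an illustration of what locating a (GA)n motif in a DNA fragment involves, the following is a minimal Python sketch, not the method used by Huang (2002). It scans a sequence for runs of GA dinucleotides on the given strand and for runs of TC, which correspond to (GA)n on the opposite strand. The function name `find_ga_repeats`, the `min_repeats` threshold, and the example sequence are all hypothetical choices made for this sketch.

```python
import re


def find_ga_repeats(seq, min_repeats=3):
    """Return sorted (start, end, strand) tuples for (GA)n runs.

    Coordinates are 0-based, half-open, on the supplied strand.
    min_repeats is an arbitrary illustrative threshold, not a value
    taken from the literature.
    """
    seq = seq.upper()
    hits = []
    # (GA)n on the supplied strand.
    for m in re.finditer(r"(?:GA){%d,}" % min_repeats, seq):
        hits.append((m.start(), m.end(), "+"))
    # (TC)n on the supplied strand is (GA)n on the complementary strand.
    for m in re.finditer(r"(?:TC){%d,}" % min_repeats, seq):
        hits.append((m.start(), m.end(), "-"))
    return sorted(hits)


# Hypothetical toy sequence: one GA run and one TC run.
example = "ACGT" + "GAGAGAGA" + "TT" + "TCTCTCTC" + "AA"
print(find_ga_repeats(example))
```

A scan of this kind only identifies candidate sites; whether a given site is bound by Psq, GAF, or a Pc-G complex has to be established experimentally, as in the binding assays described above.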

Several lines of evidence are provided to show that a novel DNA binding factor encoded by psq is a constituent of CHRASCH (chromatin-associated silencing complex for homeotics), a previously characterized major Pc-G protein complex (Chang, 2001). Since CHRASCH also contains a histone modification factor, HDAC1, it is suggested that this complex may represent a fully functional entity that can nucleate certain chromatin structures at and around specific sequences (i.e., PRE) of homeotic genes (Huang, 2002).

Biochemical purification of Pc-G protein complexes has been limited by their apparent instability. Thus, a balance between biochemical purity and functional integrity might be considered. Different approaches are required subsequently to substantiate the physiological relevance of copurified proteins. To meet these criteria, the strategy was adopted of purifying Pc-G protein complexes to sufficient homogeneity mainly by immunoaffinity chromatography under moderate conditions, then examining the biochemical functions potentially relevant to these complexes, followed by identifying the functional constituents of the complex and corresponding genes, and finally validating their roles with genetic studies (Huang, 2002).

The bxd region has been extensively examined for Polycomb response elements. Although different fragments ranging from ~400 bp to ~1 kb have been studied, they share a common region represented almost entirely by the B-151 fragment analyzed in this study. Among the three binding motifs of this fragment, it was found that the (GA)n motif represents the most prominent binding site for CHRASCH. In recent studies, the role of this motif in silencing has been demonstrated in transgenic flies. Thus, it is believed that this motif plays a critical role in anchoring one of the major Pc-G complexes (i.e., CHRASCH). These results, however, do not exclude the possibility that other motifs may be required for different functional aspects of PREs (Huang, 2002).

One of the most critical issues concerning the specific targeting of the Pc-G complex appears to reside in the identity of the DNA binding factor. These results support the conclusion that Psq-A plays a primary role in such a function for the following reasons: (1) Psq-A, but not Trithorax-like, is copurified with CHRASCH; (2) UV cross-linking studies strongly indicate that Psq-A binds directly to the (GA)n motif. Additional proteins, however, were also evident in these studies. At present, it is not possible to distinguish between the possibilities that these proteins represent degradation products of Psq-A, other novel binding proteins, or spurious cross-linking to sterically adjacent proteins in the complex. Nevertheless, it is clear that Psq-A is involved in the binding of the (GA)n motif in vitro. (3) Psq is colocalized with Pc-G protein at both ANTP-C and BX-C sites on polytene chromosomes. (4) There is a remarkably strong genetic interaction between Pc-G and psq that gives rise to leg transformation and ectopic Ubx expression. (5) It has been shown that the lack of Psq-A in one mutant (i.e., psqDelta18) is sufficient to account for genetic interaction with Pc (Huang, 2002).

Recent studies have indicated that Trithorax-like (Horard, 2000) or a combination of novel forms of Trithorax-like and Psq (Hodgson, 2001) is responsible for the binding of the Pc-G complex to the (GA)n motif. In one study, embryonic nuclear extracts were used to form the DNA-protein complex, followed by immunodetection with Trithorax-like antibody. Since multiple (GA)n motifs are present in the probes, it is difficult to exclude the possibility that Trithorax-like and Pc-G complexes might bind these motifs independently. Similar problems also arise from subsequent studies in which fusion proteins of LexA and Pc-G have been used to bind probes containing LexA binding sites, since the minimal Trithorax-like binding site, the GAG trinucleotide, is also present in the LexA probe. Although more purified fractions were used for DNA binding analysis in the other study (Hodgson, 2001), a combination of Bio-Rex 70 and Q-Sepharose may not provide sufficient resolving power to exclude the possibility that a large number of unrelated proteins are copurified. In addition, the final fractions appear to be enriched for a GAGA factor of ~54 kDa and to exclusively contain Psq (~70 kDa). Both proteins appear substantially smaller than the smallest forms detected in the original extracts (~67 kDa for Trithorax-like and ~95 kDa for Psq). Since both Trithorax-like and Psq antisera have nonspecific cross-reactivities (see Horowitz, 1996 for Psq), the identities of these proteins remain obscure. Nonetheless, despite these uncertainties, it is possible that Trithorax-like may play a role in certain aspects of the silencing mechanism as suggested by genetic studies (Huang, 2002 and references therein).

Other sequence-specific DNA binding factors have also been implicated for Pc-G targeting by genetic and/or biochemical studies. Pho is the only one that has been formally categorized as a Pc-G. Its binding sites are present in many PRE. In addition, mutations of the Pho binding site compromise the ability of PRE to silence reporter genes in larval tissues. However, Pho does not appear to be directly associated with many Pc-G proteins. Thus, despite its important role in homeotic gene silencing, it is not clear whether Pho is directly involved in the targeting of Pc-G complexes (Huang, 2002 and references therein).

Another potential candidate involved in the binding of the Pc-G complex is the Zeste protein, because it copurifies with Pc repression complex 1. It has been speculated that Zeste proteins act as a scaffold via self-multimerization to bring together regulatory sequences situated on the same chromosome or on different chromosomes. Its binding site has also been found in several PREs. In contrast to the proposed role in silencing, however, previous molecular and genetic studies have shown that the Zeste protein is most likely an activator. For example, it stimulates transcription of the Ubx promoter in vitro. Expression of a Ubx-LacZ transgene is completely abolished by a zeste mutation. Because of its transactivating effect, zeste has been considered a trx-G. Consistent with this notion, direct physical interaction has recently been demonstrated between the Zeste protein and two trx-G proteins, Moira and Osa, of the Brahma nucleosome remodeling complex. Genetically, zeste has also been defined as a transactivator involved in transvection of several genes, including Ubx. In addition, several Pc-G have been identified as suppressors of zeste. These observations cast some doubt on the physiological relevance of the Zeste protein in homeotic gene silencing. It is important to note that the two best characterized PREs (i.e., bxd and Fab7) also respond to trx-G. Thus, the mere existence of binding sites in PREs may not necessarily provide an unambiguous indication of their functions. While the manuscript was under review, however, a study showed that zeste mutations result in an extended expression of a Ubx transgene containing a replacement of the proximal promoter with a combination of multiple Zeste and NTF-1 binding sites (Hur, 2002), suggesting a role for zeste in Ubx silencing.
However, since extended expression was also observed for a Ubx transgene containing multiple NTF-1 binding sites at the proximal promoter region, the exact role of zeste may need to be more thoroughly examined (Huang, 2002 and references therein).

In conclusion, these results provide direct evidence that a specific Psq isoform is critically involved in the targeting of a major Pc-G protein complex CHRASCH to the (GA)n motifs that are commonly found in PRE. Earlier studies have demonstrated that a functional HDAC1 is associated with CHRASCH and is required for the silencing in vivo (Chang, 2001). A simple model is suggested for homeotic gene silencing that involves the assembly of multimeric complexes by known Pc-G proteins and other novel proteins yet to be identified, direct binding to specific sequences of PRE, and subsequent modification of N-terminal tails of core histones to establish a silencing code for stable maintenance of an inactive state (Huang, 2002).

It is also relevant to note that the functions of Pc-G silencing complexes may not be fully revealed by previous genetic or biochemical approaches because of the lack of suitable mutations, easily tractable phenotypes, or sufficient stability of the protein complexes. In the case of psq, a grandchildless class of mutations, sufficient amounts of Psq remain detectable in most homozygous mutant adults, yet embryos produced by these adults become severely defective before the manifestation of homeotic genes (Horowitz, 1996). In addition, the presence of more Psq sites than Pc-G sites on polytene chromosomes suggests a much wider spectrum of target genes for Psq. These effects altogether could conceivably obscure the homeotic effect caused by psq mutations, unless a more-sensitized genetic background (e.g., Pc mutations) is provided. The roles of MI-2 and HDAC1 in homeotic gene silencing also become apparent with similar approaches. It is speculated that some novel functions of the silencing complex may be defined by more-systematic studies (Huang, 2002).

Ecdysone-induced 3D chromatin reorganization involves active enhancers bound by Pipsqueak and Polycomb

Evidence suggests that Polycomb (Pc) is present at chromatin loop anchors in Drosophila. Pc is recruited to DNA through interactions with the GAGA binding factors GAF and Pipsqueak (Psq). Using HiChIP in Drosophila cells, this study found that the psq gene, which has diverse roles in development and tumorigenesis, encodes distinct isoforms with unanticipated roles in genome 3D architecture. The BR-C, ttk, and bab domain (BTB)-containing Psq isoform (Psq(L)) colocalizes genome-wide with known architectural proteins. Conversely, Psq lacking the BTB domain (Psq(S)) is consistently found at Pc loop anchors and at active enhancers, including those that respond to the hormone ecdysone. After stimulation by this hormone, chromatin 3D organization is altered to connect promoters and ecdysone-responsive enhancers bound by Psq(S). These findings link Psq variants lacking the BTB domain to Pc-bound active enhancers, thus shedding light on their molecular function in chromatin changes underlying the response to hormone stimulus (Gutierrez-Perez, 2019).

Genomes are organized in the three-dimensional (3D) nuclear space to ensure that processes such as transcription are fine-tuned in time and space. The first experiments using Hi-C described the segregation of chromatin into A (active) and B (inactive) compartments that interact with other genomic regions in a similar transcriptional state. More recently, experiments using high-resolution Hi-C data have found that the segregation of active and inactive chromatin scales to small compartmental domains of tens to hundreds of kilobases (kb). In addition, high-resolution Hi-C in mammalian cells has led to the discovery of thousands of point-to-point interactions representing CCCTC-binding factor (CTCF) loops. Drosophila cells lack loops anchored by CTCF. Instead, Hi-C heatmaps in Drosophila cultured cells and embryos have shown the existence of two classes of loops formed by contacts between specific sites. The first class represents hundreds of point-to-point interactions present in early embryos and whose anchors are enriched in RNA polymerase II (Pol II) and the transcription factor Zelda. The second class of loops was originally discovered in Kc167 cells and represents a few hundred point-to-point interactions whose anchors are enriched in other architectural proteins. These loops are frequently located within B compartmental domains, and their anchors are enriched in Polycomb (Pc), a member of the Polycomb repressor complex 1 (PRC1) that mediates recognition and binding to the histone modification histone H3 lysine 27 trimethylation (H3K27me3). Pc and most components of PRC1 lack DNA binding activity and are recruited to Polycomb response elements (PREs) containing GAGA sequence motif (GAGA) consensus binding sites by sequence-specific transcription factors. Deletion of PREs or GAGA motifs present at loop anchors results in loss of the corresponding loops and decreased Polycomb group (PcG)-mediated gene silencing during development. 
However, the GAGA binding factor or factors that mediate point-to-point interactions leading to the formation of these loops remain undefined. Two sequence-specific transcription factors, Trithorax-like/GAF and Pipsqueak (Psq) bind GAGA sequences. Interactions among GAF, Psq, and Pc have also been identified, and these physical interactions are supported by genetic interactions among these genes (Gutierrez-Perez, 2019).

The psq gene is a complex locus encoding two types of isoforms, containing or lacking a BTB domain. The BR-C, ttk, and bab domain (BTB)-containing Psq isoforms are referred to as PsqL and the Psq isoforms lacking the BTB domain as PsqS. Both types of isoforms share a helix-turn-helix (HTH) DNA binding domain. The long PsqL isoforms, like GAF, contain a conserved BTB domain involved in protein-protein interactions. BTB domain-containing proteins have the ability to oligo- and multimerize in solution with other BTB or non-BTB-containing proteins. This ability and their location in distinct nuclear substructures suggest that BTB-containing proteins may interact with distant proteins in the genome, altering chromatin structure. Several Drosophila architectural proteins, such as CP190 and Mod(mdg4), contain BTB domains, and they colocalize in different combinations and levels of occupancy at architectural protein binding sites (APBSs). Several studies have suggested that interactions among GAF, Psq, and Pc involve the BTB domains of GAF and Psq. Therefore, it has been assumed that the BTB-containing Psq is responsible for the recruitment of Pc to GAGA sequences. However, this notion is at odds with earlier findings showing that the presence of a BTB domain inhibits DNA binding. Thus, the role of the BTB domain in the function of sequence-specific transcription factors and in the recruitment of PcG complexes remains unresolved. Given the expanding role of BTB domain-containing proteins and the PcG machinery in chromatin structure and cancer, it is important to characterize how variants lacking the BTB domain act in transcriptional regulation and cell differentiation (Gutierrez-Perez, 2019).

This study characterized the in vivo function of Psq isoforms containing or lacking the BTB domain, their differential chromatin binding, and their associated long-range chromatin contacts using chromatin immunoprecipitation sequencing (ChIP-seq) and chromosome conformation capture (3C) coupled with sequencing combined with chromatin immunoprecipitation (HiChIP). PsqL colocalizes with Suppressor of Hairy wing (Su(Hw)) and other architectural proteins at sequences classically defined as insulators. In contrast to what was previously assumed, PsqS, rather than BTB-containing PsqL, colocalizes with GAF and Pc at enhancer elements and may therefore be responsible for the classical GAGA binding function assigned to the Psq protein. HiChIP analysis identifies two types of Pc-associated interactions. The first corresponds to Pc loops established by high-frequency point-to-point interactions between anchors containing PsqS. The second type corresponds to contacts between large, repressive Pc domains that form broad interactions similar to those mediated by B compartmental domains. To analyze the functional role of these interactions, changes in the 3D organization of chromatin were examined during the ecdysone-inducible response; PsqS-bound enhancers undergo dramatic changes in their contacts with the promoters of ecdysone-induced genes. These findings suggest distinct roles for Psq isoforms containing or lacking the BTB domain in Pc function and 3D chromatin architecture in response to developmental cues elicited by ecdysone (Gutierrez-Perez, 2019).
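Statements that a protein "colocalizes" with loop anchors or with another factor ultimately rest on comparing genomic intervals. The following minimal Python sketch shows one simple way such a comparison could be made: computing the fraction of loop anchors that overlap at least one ChIP-seq peak. It is not the analysis pipeline of Gutierrez-Perez (2019); the function names and all coordinates are hypothetical toy values chosen for illustration.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals share a base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def fraction_overlapping(anchors, peaks):
    """Fraction of anchors that overlap at least one peak.

    Returns 0.0 for an empty anchor list. This is a quadratic-time
    comparison, adequate for a small illustration only.
    """
    if not anchors:
        return 0.0
    n_hit = sum(1 for a in anchors if any(overlaps(a, p) for p in peaks))
    return n_hit / len(anchors)


# Hypothetical toy data: three loop anchors and two peaks.
toy_anchors = [("2R", 100, 200), ("2R", 500, 600), ("3L", 100, 200)]
toy_peaks = [("2R", 150, 250), ("3L", 300, 400)]
print(fraction_overlapping(toy_anchors, toy_peaks))
```

In practice, genome-wide comparisons of this kind would normally use an indexed interval structure and a statistical background model; the sketch only makes the notion of overlap concrete.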

The BTB domains of human PLZF, Bcl-6, and Drosophila Psq have been shown to contribute to the oncogenic roles of these proteins. Most BTB-containing transcription factors also encode isoforms that lack the BTB domain, and the role of these short isoforms is uncertain. This study shows that different isoforms of Psq appear to play different roles in nuclear function, which may explain the opposing roles in tumorigenesis ascribed to the gene. The BTB-containing PsqL isoform colocalizes with a specific class of architectural proteins that includes Su(Hw), CP190, and Mod(mdg4)2.2. In contrast, the PsqS isoform, which lacks the BTB domain, colocalizes with GAF and Pc at dCP enhancers and is mainly associated with active chromatin states. Therefore, PsqS appears to contribute to enhancer function, whereas PsqL is an architectural protein that binds to sequences that have insulator function. How these two isoforms display different genomic distributions while sharing the same DNA binding domain is unclear. However, based on previous findings, it is speculated that the conformation adopted by the protein in the presence of the BTB-interaction domain might inhibit its direct binding to DNA. In addition, the two isoforms coincide in regions in which both Pc and architectural proteins are found. This may explain the reported involvement of PsqL in the recruitment of PcG proteins to chromatin, where it might act with the help of other architectural proteins. In addition to its canonical role, Pc is found, together with PsqS, ISWI, GAF, and CBP, in regions containing H3K27ac and previously characterized experimentally as hkCP or dCP enhancers. These findings, suggesting an association of Pc with active enhancers, agree with previous observations showing that PRC1 can be recruited to active genes by the cohesin complex, where it affects phosphorylation of Pol II and Spt5 occupancy (Gutierrez-Perez, 2019).

H3K27me3 is present in the genome of Kc167 cells at very high levels in Pc-repressed domains such as Hox genes. The rest of the genome containing silenced genes in Kc167 cells has low but significant levels of H3K27me3 that represent B compartment sequences. Pc HiChIP analysis provides insights into the dual role of Pc in regulating chromatin organization. Classical Pc-repressed domains interact with each other and with other B compartments with a frequency that correlates with the amount of H3K27me3 present in these compartments. Distinct from these interactions, Pc also forms punctate point-to-point contacts. Two types of loops, defined as puncta of an intense signal in Hi-C heatmaps, have been identified when analyzing changes in 3D organization during Drosophila embryonic development. These loops were classified as active loops containing H3K27ac, Zelda, and Pol II at their anchors or as Pc loops bound by GAF. Zelda loops are absent from Kc167 cells. Like Pc loop anchors observed in embryos, loops represented by puncta in Hi-C heatmaps of Kc167 cells are located within regions enriched in H3K27me3. However, this study found that the center of these sites in Kc167 cells is depleted of H3K27me3 and enriched in H3K27ac. The exact roles of H3K27ac, Pc, PsqS, and GAF found at these loop anchors are unknown, but it is speculated that maintaining a localized active chromatin state may be important for the binding of these proteins and the establishment of these loops. These results suggest a dual and context-dependent function of regulatory elements and agree with previous studies showing that dCP enhancers can act as PREs, and vice versa, during Drosophila embryogenesis (Gutierrez-Perez, 2019).

Analysis of the distribution of sites containing Pc and PsqS in the genome also uncovered enrichment of these proteins around ecdysone-inducible genes, although most EcR, Pc, and PsqS peaks do not change significantly after ecdysone treatment. This is consistent with previous observations indicating that EcR does not change at most enhancers induced by ecdysone. Results from PsqS HiChIP experiments in control and ecdysone-treated cells suggest that hormone treatment leads to the establishment of new ecdysone-induced enhancer-promoter interactions without changes to pre-established Pc/PsqS loops. Early genes directly activated by ecdysone are paused before induction, and their expression is regulated at the level of Pol II release from promoter-proximal pausing. This suggests that activation of early gene expression by ecdysone requires the establishment of new enhancer-promoter interactions. The possible involvement of PsqS and Pc in the establishment of these interaction networks in the Drosophila embryo will be an interesting topic for future analyses (Gutierrez-Perez, 2019).


GENE STRUCTURE

cDNA clone length - 5162 (primary ovarian transcript coding for PsqA)

Bases in 5' UTR - 733

Exons - 10

Bases in 3' UTR - 1230


PROTEIN STRUCTURE

Amino Acids - 1065 (PsqA), 1085 and other smaller splice variants

Structural Domains

At the amino terminus, PsqA contains a BTB domain (Godt, 1993; also referred to as a POZ domain by Bardwell, 1994), a motif that has been shown to function in protein-protein interactions. Although BTB domains are often found near the N terminus of Cys2-His2 zinc finger proteins, PsqA does not appear to contain a zinc finger. Downstream of the BTB domain, PsqA contains 34 alternating histidine residues, (HX)n, a motif that is present in a number of other Drosophila proteins, primarily transcription factors. It has been proposed that these histidine repeats could mediate protein-protein interactions by coordinating metal ions to form a 'histidine-metal zipper' between two proteins containing the repeats. The presence of two potential protein-protein interaction domains suggests that PsqA monomers may interact with each other or with heterologous protein species. Additionally, PsqA contains four tandem copies of a conserved sequence of unknown function at its carboxy terminus, termed the psq motif (Horowitz, 1996).

Pipsqueak (Psq) belongs to a family of proteins defined by a phylogenetically old protein-protein interaction motif. Like the GAGA factor and other members of this family, Psq is an important developmental regulator in Drosophila, having pleiotropic functions during oogenesis, embryonic pattern formation, and adult development. The GAGA factor controls the transcriptional activation of homeotic genes and other genes by binding to control elements containing the GAGAG consensus motif. Binding is associated with formation of an open chromatin structure that makes the control regions accessible to transcriptional activators. Psq contains a novel DNA-binding domain, which binds, like the GAGA factor zinc finger DNA-binding domain, to target sites containing the GAGAG consensus motif. Binding is suppressed, as in the GAGA factor and other proteins of the family, by the associated protein-protein interaction motif. The DNA-binding domain, which is called the Psq domain, is identical with a previously identified region consisting of four tandem repeats of a conserved 50-amino acid sequence, the Psq motif. The Psq domain seems to be structurally related to known DNA-binding domains, both in its repetitive character and in the putative three-alpha-helix structure of the Psq motif, but it lacks the conserved sequence signatures of the classical eukaryotic DNA-binding motifs. Psq may thus represent the prototype of a new family of DNA-binding proteins (Lehmann, 1998).

It was asked if the Psq domain of D. melanogaster would exhibit DNA binding specificity. A 0.8-kilobase polymerase chain reaction fragment encoding the Drosophila Psq domain was cloned and the polypeptide was expressed by in vitro-translation. When this polypeptide is incubated with the hspGAGA2 oligonucleotide, a strong complex is formed. Formation of this complex is inhibited by increasing amounts of unlabeled hspGAGA2 but not by unrelated oligonucleotides shown to be ineffective in competing for binding of the A. mellifera Psq domain. Psq is thus the second GAGA-binding protein, other than the GAGA factor, that has been identified in D. melanogaster (Lehmann, 1998).

The similarity of the target sites recognized by Psq and GAGA factor suggests that binding of full-length Psq to GAGA sites in vivo might require the help of the GAGA factor. Binding of full-length Psq and Psq Delta240 to hspGAGA2 was tested in the absence and presence of the full-length GAGA-519 isoform of D. melanogaster. GAGA-519 in vitro translation products proved to be able to bind to hspGAGA2 with high affinity. Full-length Psq showed no binding and Psq Delta240 showed strong binding to this oligonucleotide. When either of these two proteins is mixed with the in vitro translated GAGA-519 isoform, the resulting pattern of DNA-protein complexes is the sum of the complex patterns observed in the presence of only the single proteins. Thus, the GAGA-519 isoform does not seem to be able to promote DNA-binding of full-length Psq in vitro. It remains to be shown if Psq isoforms containing both the BTB/POZ and Psq domains in fact bind to GAGA sites or other DNA-binding sites in vivo or if they exert their functions independent of DNA binding. Since isoforms also lacking the BTB/POZ domain seem to be expressed in vivo, binding to GAGA sites or related target sites may be reserved to these isoforms (Lehmann, 1998).

Psq cannot be easily assigned to any of the known families of eukaryotic DNA-binding proteins. Repeats with homology to the Psq motif are present in at least one additional Drosophila protein, the TKR protein (Haller, 1987), suggesting that this protein is also able to bind to DNA. Interestingly, a Drosophila BTB/POZ domain-encoding gene (BTB-III) has been identified (Zollman, 1994) that has an embryonic RNA distribution pattern very similar to that of Tkr and that maps to the same chromosomal position. It is thus tempting to speculate that the Tkr locus is more complex than previously supposed, encoding several protein isoforms, one of which contains a BTB/POZ domain in addition to the Psq domain. Beyond Drosophila, homology to the Psq motif is found in a polypeptide predicted by an open reading frame of Caenorhabditis elegans cosmid T01C1. The Psq domain may thus define a new class of DNA-binding domains.

At present, an extensive search of the protein sequence databases reveals no other eukaryotic proteins with clear-cut homology to the Psq motif. However, searching the Blocks Database with a multiple alignment of the eight Psq repeats reveals significant sequence similarities to the DNA-binding domain of prokaryotic recombinases and thereby provides a link between the Psq domain and the homeodomain, for which such similarities have been described as well. Cocrystal structures of two recombinases, Hin recombinase and gamma-delta resolvase, bound to DNA show that their DNA-binding domains consist of three alpha-helices flanked by extended arms, which make contacts with the minor groove. The highest similarity between the Psq motif and the recombinase DNA-binding domain is observed within the C-terminal recognition helix, which forms a helix-turn-helix motif with helix 2 and inserts into the major groove. Remarkably, the recognition helix of members of the Hin recombinase family makes specific major groove contacts with a sequence that is clearly related to the GAGA motif. The Psq motif has the same size, about 50 amino acid residues, as the recombinase DNA-binding domains, and secondary structure predictions for the Psq motif are compatible with the triple-helix structure of these domains. A similar triple-helix structure is formed by the homeodomain and the Myb DNA-binding domain. It is interesting to note that, like the Psq domain, the Myb DNA-binding domain also consists of imperfect tandem repeats of a conserved sequence motif. The Psq domain thus seems to be structurally related, both in its conformation and in its repetitive structure, to known DNA-binding motifs, but it eludes classification into one of the prevalent categories of eukaryotic DNA-binding domains. Identification of additional members of the Psq family and determination of the structure of the Psq domain complexed with DNA will help to better define this new class of DNA-binding domains (Lehmann, 1998).



date revised: 2 February 2023

Home page: The Interactive Fly © 1995, 1996 Thomas B. Brody, Ph.D.

The Interactive Fly resides on the Society for Developmental Biology's Web server.