
Gene name - extradenticle

Synonyms -

Cytological map position - 14A1-B1

Function - transcription factor

Keyword(s) - cofactor with homeodomain transcription factors, oncogene

Symbol - exd

FlyBase ID:FBgn0000611

Genetic map position - 14A1-B1

Classification - homeodomain PBX class

Cellular location - nuclear and cytoplasmic

Recent literature
Jimenez-Mejia, G., Montalvo-Mendez, R., Hernandez-Bautista, C., Altamirano-Torres, C., Vazquez, M., Zurita, M. and Resendez-Perez, D. (2022). Trimeric complexes of Antp-TBP with TFIIEbeta or Exd modulate transcriptional activity. Hereditas 159(1): 23. PubMed ID: 35637493
Hox proteins finely coordinate the antero-posterior axis during embryonic development, and through their action specific target genes are expressed at the right time and place to determine the embryo body plan. This study reports Antennapedia (Antp) Hox protein-protein interaction with the TATA-binding protein (TBP) and the formation of novel trimeric complexes with TFIIEβ and Extradenticle (Exd), as well as its participation in transcriptional regulation. Using Bimolecular Fluorescence Complementation (BiFC), this study detected the interaction of Antp-TBP and, in combination with Förster Resonance Energy Transfer (BiFC-FRET), the formation of the trimeric complex with TFIIEβ and Exd in living cells. Mutational analysis showed that Antp interacts with TBP through their N-terminal polyglutamine-stretches. The trimeric complexes of Antp-TBP with TFIIEβ and Exd were validated using different Antp mutations to disrupt the trimeric complexes. Interestingly, the trimeric complex Antp-TBP-TFIIEβ significantly increased the transcriptional activity of Antp, whereas Exd diminished its transactivation. These findings provide important insights into the Antp interactome with the direct interaction of Antp with TBP and the two new trimeric complexes with TFIIEβ and Exd. These novel interactions open the possibility to analyze promoter function and gene expression to measure transcription factor binding dynamics at target sites throughout the genome.
Winant, M., Buhler, K., Clements, J., De Groef, S., Hens, K., Vulsteke, V. and Callaerts, P. (2022). Genome-wide analysis identifies Homothorax and Extradenticle as regulators of insulin in Drosophila Insulin-Producing cells. PLoS Genet 18(9): e1010380. PubMed ID: 36095003
Drosophila Insulin-Producing Cells (IPCs) are the main production site of the Drosophila Insulin-like peptides or dilps, which have key roles in regulating growth, development, reproduction, lifespan and metabolism. To better understand the signalling pathways and transcriptional networks that are active in the IPCs, publicly available transcriptome data of over 180 highly inbred fly lines were queried for dilp expression, and dilp expression was used as the input for a Genome-wide association study (GWAS). This resulted in the identification of variants in 125 genes that were associated with variation in dilp expression. The function of 57 of these genes in the IPCs was tested using an RNAi-based approach. IPC-specific depletion of most genes was found to result in differences in expression of one or more of the dilps. Then one of the candidate genes with the strongest effect on dilp expression, Homothorax, a transcription factor known for its role in eye development, was examined further. Homothorax and its binding partner Extradenticle were found to be involved in regulating dilp2, -3 and -5 expression; genetic depletion of both TFs shows phenotypes associated with reduced insulin signalling. Furthermore, evidence is provided that other transcription factors involved in eye development are also functional in the IPCs. In conclusion, this study showed that this expression level-based GWAS approach identified genetic regulators implicated in IPC function and dilp expression.
Buffry, A. D., Kittelmann, S. and McGregor, A. P. (2023). Characterisation of the role and regulation of Ultrabithorax in sculpting fine-scale leg morphology. Front Cell Dev Biol 11: 1119221. PubMed ID: 36861038
Hox genes are expressed during embryogenesis and determine the regional identity of animal bodies along the antero-posterior axis. However, they also function post-embryonically to sculpt fine-scale morphology. To better understand how Hox genes are integrated into post-embryonic gene regulatory networks, this study further analysed the role and regulation of Ultrabithorax (Ubx) during leg development in Drosophila melanogaster. Ubx regulates several aspects of bristle and trichome patterning on the femurs of the second (T2) and third (T3) leg pairs. Repression of trichomes in the proximal posterior region of the T2 femur by Ubx is likely mediated by activation of the expression of microRNA-92a and microRNA-92b by this Hox protein. Furthermore, this study identified a novel enhancer of Ubx that recapitulates the temporal and regional activity of this gene in T2 and T3 legs. Transcription factor (TF) binding motif analysis was used in regions of accessible chromatin in T2 leg cells to predict and functionally test TFs that may regulate the Ubx leg enhancer. The role of the Ubx co-factors Homothorax (Hth) and Extradenticle (Exd) in T2 and T3 femurs was also tested. Several TFs were found that may act upstream of or in concert with Ubx to modulate trichome patterning along the proximo-distal axis of developing femurs, and the repression of trichomes was found to also require Hth and Exd. Taken together, these results provide insights into how Ubx is integrated into a post-embryonic gene regulatory network to determine fine-scale leg morphology.
BIOLOGICAL OVERVIEW

extradenticle behaves like a homeotic gene, in that mutations cause transformations of segmental identity. EXD is the best described example of how other proteins cooperatively interact with homeotic proteins to increase the specificity of homeotic protein binding to DNA. EXD acts as a cofactor for homeotic proteins in transcriptional activation. This can be shown phenotypically by generating clones of exd mutant cells in flies and observing the resulting adult patterns. Such clones show ectopic transformations of head structures and legs, similar to those found in homeotic gene mutants (Gonzalez-Crespo, 1995; Rauskolb, 1995; Mann, 1995).

One area of research investigates how transcription factors interact to achieve specific binding to DNA and consequent transcriptional activation. This question is important because of the pervasive influence of homeotic proteins on morphogenesis. Homeodomain proteins recognize more or less the same consensus DNA sequence, which raises a problem: with such low sequence specificity, how does each homeotic protein select its own targets? Cofactors such as EXD increase the DNA-binding specificity of homeotic proteins, and thus provide a mechanism that helps explain the specific effects of homeobox genes.

The biochemical basis of EXD interaction with homeotic proteins has been studied intensively. Ultrabithorax is one EXD partner in gene activation. The best characterized DNA binding site for the EXD-UBX interaction lies in the parasegment 7 enhancer responsible for activation of decapentaplegic in the visceral mesoderm. The interaction between EXD and UBX requires three surface-exposed homeodomain residues of UBX and the UBX C-tail, adjacent to the C-terminal end of the homeodomain (Chan, 1994).

Binding sites for the two transcription factors EXD and UBX on a dpp midgut enhancer fragment are close enough to partially overlap. On this fragment EXD and UBX bind cooperatively, with UBX binding increasing 6 to 30 fold in the presence of EXD. Dissociation studies reveal that EXD stabilizes the DNA-bound form of UBX. Although ANTP binds on the fragment, EXD does not work cooperatively with ANTP (Chan, 1994).

One effect of homeotic gene function, phenotypic suppression (described in detail at the labial site), resembles these results. When two homeotic genes are expressed in the same segment, the resulting segment identity is usually governed by only one of the two. Ubiquitous expression of Ubx does not activate dpp posterior to PS7 because UBX cannot override repression by ABD-A and ABD-B. In effect UBX is phenotypically suppressed by ABD-A and ABD-B. Homeotic genes may compete for an interaction with EXD or the binding of some homeotic proteins may block EXD binding (Chan, 1994).

The homeodomain proteins encoded by the Hox complex genes do not bind DNA with high specificity. In vitro, Hox specificity can be increased by binding to DNA cooperatively with the homeodomain protein Extradenticle or its vertebrate homologs, the PBX proteins (together known as the PBC family). One of the best characterized Hox-PBC binding sites is present in a 20 bp oligonucleotide, repeat 3, which was identified in the 5' promoter region of the mouse Hoxb-1 gene. Hoxb-1 protein and its Drosophila ortholog Labial are both able to bind cooperatively with Exd to the binding site, whereas other Hox proteins, such as Ultrabithorax or Hoxb-4, cannot. A two base pair change in a Hox-PBC binding site, from GG to TA, switches the Hox-dependent expression pattern generated in vivo from labial to Deformed. The change in vivo correlates with an altered Hox binding specificity in vitro. Similar Deformed-PBC binding sites were identified in the Deformed and Hoxb-4 genes. The Deformed sites include well characterized epidermal (EAE) and neural (NAE) autoregulatory enhancers. Two repeats containing TA sequence binding sites were found in the 2.7 kb EAE and two were found in the 600 bp NAE. These sites generate Deformed or Hoxb-4 expression patterns in Drosophila and mouse embryos, respectively. These results suggest a model in which Hox-PBC binding sites play an instructive role in Hox specificity by promoting the formation of different Hox-PBC heterodimers in vivo. Thus, the choice of Hox partner, and therefore Hox target genes, depends on subtle differences between Hox-PBC binding sites (Chan, 1997).
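The GG-to-TA switch described above can be illustrated with a short sketch. It uses the Lab/Exd site TGATGGATTG quoted elsewhere on this page for the labial autoregulatory enhancer, and it assumes, purely for illustration, that the two central bases of a TGATNNAT-type site are the only feature that matters. The function names and the "labial-like"/"Deformed-like" labels are hypothetical and are not taken from Chan (1997).

```python
# Illustrative sketch only: classify a Hox-PBC site by its two central bases.
# The site below is quoted in the text; the classification rule is a simplification.
LAB_EXD_SITE = "TGATGGATTG"


def central_dinucleotide(site: str) -> str:
    """Return the two bases between the TGAT and AT cores of a TGATNNAT... site."""
    site = site.upper()
    if len(site) < 8 or not site.startswith("TGAT") or site[6:8] != "AT":
        raise ValueError(f"{site!r} does not fit the assumed TGATNNAT layout")
    return site[4:6]


def predicted_pattern(site: str) -> str:
    """Map the central dinucleotide to the expression pattern described in the text."""
    labels = {"GG": "labial-like", "TA": "Deformed-like"}
    return labels.get(central_dinucleotide(site), "unclassified")


def swap_central(site: str, new_pair: str) -> str:
    """Replace the central dinucleotide, mimicking the two base pair change."""
    central_dinucleotide(site)  # validates the layout before editing
    return site[:4] + new_pair.upper() + site[6:]


mutated_site = swap_central(LAB_EXD_SITE, "TA")
print(LAB_EXD_SITE, predicted_pattern(LAB_EXD_SITE))
print(mutated_site, predicted_pattern(mutated_site))
```

In reality the flanking bases and the cellular context also contribute, so this lookup should be read as a mnemonic for the result rather than as a predictive model.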

To regulate their target genes, the Hox proteins of Drosophila often bind to DNA as heterodimers with the homeodomain protein Extradenticle. For Exd to bind DNA, it must be in the nucleus, and its nuclear localization requires a third homeodomain protein, Homothorax (Hth). A conserved N-terminal domain of Hth directly binds to Exd in vitro, and is sufficient to induce the nuclear localization of Exd in vivo. However, mutating a key DNA binding residue in the Hth homeodomain abolishes many of its in vivo functions. Hth binds to DNA as part of a Hth/Hox/Exd trimeric complex; this complex is essential for the activation of a natural Hox target enhancer. Using a dominant negative form of Hth, evidence is provided that similar complexes are important for several Hox- and exd-mediated functions in vivo. These data suggest that Hox proteins often function as part of a multiprotein complex, composed of Hth, Hox, and Exd proteins, bound to DNA (Ryoo, 1999).

Exd directly binds to Hth and to the mammalian Hth homolog, MEIS1 (Rieckhof, 1997), suggesting that Exd interacts with a domain that is conserved between these two proteins. Hth and MEIS1 have two highly conserved domains: the HM (Homothorax-Meis) domain near the N terminus, and the homeodomain near the C terminus. In addition, based on sequence comparisons with the related vertebrate protein PREP1, the HM domain can be considered to have two subdomains, HM A and HM B, that are more highly conserved. A glutathione S-transferase (GST) pull-down assay was used to determine which part of Hth interacts with Exd. GST-Hth and GST-HM are both able to interact with Exd protein in vitro. In contrast, neither GST-(HM B + HD), which begins in the middle of the HM domain and extends to the end of the protein, nor GST-HD, which spans the homeodomain, interacts with Exd. These results demonstrate that the HM domain of Hth is necessary and sufficient for the interaction with the PBC-A domain of Exd [the region of EXD spanning amino acids 144-376, which is necessary for the HTH-EXD interaction]. Further, these results are consistent with the interaction domains defined in the vertebrate proteins MEIS1 and PBX1 (Ryoo, 1999).

To determine the function of the HM and homeo domains in vivo, mutant and wild-type Hth coding sequences were fused to green fluorescent protein (GFP), and these fusion genes were expressed in flies under the control of the yeast transcription factor Gal4. In wild-type Drosophila imaginal wing discs, Exd is cytoplasmic in cells that will generate the future wing blade, but is nuclear in cells surrounding the wing blade region. Exd is usually nuclear only in those cells where Hth is present, but when expressed at high levels or when fused to an additional nuclear localization sequence (NLS-Exd), Exd becomes partially nuclear. When GFP-Hth expression is driven in wing discs by the ptc:Gal4 driver line (which is expressed in a stripe of cells that bisects the wing blade), the endogenous Exd is shifted into the nucleus in GFP-Hth-expressing cells. To test if the Hth homeodomain is required for Exd’s nuclear localization, two mutant proteins were tested: GFP-HM and GFP-Hth 51A (which has Asn 51 of the Hth homeodomain mutated to alanine). Asn 51 is conserved in all known homeodomains and makes essential DNA contacts. GFP-Hth 51A is able to induce the nuclear localization of Exd in wing pouch cells, suggesting that the Hth homeodomain does not need to bind to DNA for this function. GFP-HM is also able to induce the nuclear localization of Exd, demonstrating that the HM domain is sufficient for this activity. GFP-HD, which lacks the HM domain but contains an intact homeodomain, is unable to induce Exd’s nuclear localization. These data suggest that hth does not induce the nuclear localization of Exd by transcriptionally regulating a third factor. Instead, together with the in vitro interaction data, they suggest that Hth induces the nuclear localization of Exd via a direct interaction between the Hth HM domain and the Exd PBC-A domain (Ryoo, 1999).

During leg development, expression of the homeobox gene Distal-less, which is required for ventral limb development, is mutually antagonistic with Hth/Exd function: Dll is a repressor of hth and Hth can also repress Dll. Hth’s ability to repress Dll requires Hth's homeodomain. From ectopic expression assays, it is concluded that although the Hth homeodomain is not required to induce Exd’s nuclear localization, it is necessary for many Hth functions, including the regulation of specific target genes such as Dll. The one known exception is that all forms of Hth, including GFP-Hth 51A and GFP-HM, are able to interfere with distal leg development when expressed with the Dll:Gal4 driver. This phenotype, however, is also observed when wild-type Exd is expressed with this driver, and therefore does not require any Hth input. The different in vivo activities of Hth and Hth 51A indicate that Hth has functions in addition to localizing Exd to nuclei, and that these functions require Hth to bind DNA (Ryoo, 1999).

The tight interaction between Hth and Exd proteins, together with the requirement for the Hth homeodomain for many of Hth’s functions, suggested that Hth might be binding to the same target enhancers as Hox/Exd heterodimers. One well characterized Hox/Exd target is an autoregulatory enhancer from the labial (lab) gene, called lab550. A 48 bp fragment of lab550, lab48/95, is necessary for lab550 activity and, in one copy, is sufficient to direct a labial- and exd-dependent pattern of expression in endodermal cells. In lab48/95 there is a single Lab/Exd heterodimer binding site, TGATGGATTG; this binding site is necessary for the activity of lab550. Also in lab48/95 is a sequence, GACTGTCA, that resembles a high affinity binding site for MEIS1, a murine Hth homolog. To test if this site is a bona fide Hth binding site, band shift experiments were performed with Lab, Hth, and Exd proteins on the wild-type lab48/95 oligo, and on an oligo with point mutations in the putative Hth binding site, GACTtatA (lab48/95 hth). None of Lab, Exd, or Hth is able to bind lab48/95 on its own. The combination of Exd plus Hth is able to weakly bind this DNA. Because binding is diminished on lab48/95 hth, these data suggest that Exd and Hth exhibit weak cooperative binding to lab48/95, consistent with previous studies with MEIS1 and PBX1. Lab cooperatively binds with Exd to lab48/95 and the binding of this heterodimer requires both the Exd and Lab half sites. In contrast, no complex formation is observed when Hth and Lab are combined. However, when increasing amounts of Hth are added to a constant amount of Lab plus Exd, the Lab/Exd band disappears and in its place a Hth/Lab/Exd trimeric complex is observed. The Hth/Lab/Exd band is more intense than the Lab/Exd band, suggesting that Hth contributes to the DNA binding affinity of the trimeric complex. Additional tests show that the Hth/Lab/Exd complex requires the putative Hth binding site; use of truncated proteins shows that protein-protein interaction between Hth and Exd is necessary for the formation of the Hth/Lab/Exd complex, but that DNA binding by the Hth homeodomain contributes to the stability of this complex. Also, the Hth binding site is required for lab48/95 activity in embryos. Thus a DNA bound Hth/Lab/Exd triple complex is capable of activating lab48/95-lacZ in vivo. This was confirmed by interfering with the stable assembly of this complex by expressing the HM domain, which binds to Exd and therefore competes with the interaction between Exd and Hth (Ryoo, 1999).
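The relationship between the wild-type and mutated Hth sites can be checked with a small sketch. The three motifs are quoted from the text (the lowercase bases in GACTtatA mark the point mutations). The full lab48/95 sequence is not given here, so the oligo below is a hypothetical toy construct with invented spacer bases, and the helper names are likewise illustrative.

```python
# Motifs quoted in the text; everything else in this sketch is hypothetical.
LAB_EXD_SITE = "TGATGGATTG"
HTH_SITE = "GACTGTCA"
HTH_SITE_MUTANT = "GACTtatA"


def mismatch_positions(a: str, b: str) -> list[int]:
    """Return 0-based positions at which two equal-length sites differ."""
    if len(a) != len(b):
        raise ValueError("sites must have the same length")
    return [i for i, (x, y) in enumerate(zip(a.upper(), b.upper())) if x != y]


def site_start(oligo: str, site: str) -> int:
    """Return the 0-based start of site in oligo, or -1 if it is absent."""
    return oligo.upper().find(site.upper())


# Toy oligos: NOT the real lab48/95 sequence, spacers are invented.
toy_wild_type = "CC" + LAB_EXD_SITE + "AAC" + HTH_SITE + "GG"
toy_hth_mutant = "CC" + LAB_EXD_SITE + "AAC" + HTH_SITE_MUTANT + "GG"

print(mismatch_positions(HTH_SITE, HTH_SITE_MUTANT))
print(site_start(toy_wild_type, HTH_SITE), site_start(toy_hth_mutant, HTH_SITE))
```

The mutant oligo keeps the Lab/Exd site intact while removing the exact match to the putative Hth site, which is the logic behind comparing band shifts on lab48/95 and lab48/95 hth.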

If GFP-HM is interfering with Hth and Exd function in vivo, its over-expression should be able to phenocopy other hth or exd mutant phenotypes. One function of hth is to direct antennal development; in the absence of either hth or exd activities, antennal structures are autonomously transformed into leg identities. Consistent with GFP-HM acting as a dominant negative, its expression in the Dll domain transforms distal antenna into distal leg. The antenna to leg transformations observed in GFP-HM-expressing animals show bristles with bracts, typical of a distal leg identity. In contrast, expression of GFP-Hth 51A does not generate this transformation. Together with the noted effect on the reporter genes, these data suggest that GFP-HM, but not GFP-Hth 51A, interferes with hth function. This would indicate that GFP-HM has dominant negative activity whereas GFP-Hth 51A behaves as a hypomorph. GFP-HM can also alter the segment identity of the adult abdomen which, unlike antennal development, requires input from both exd and Hox genes. In wild-type male abdomens, posterior tergites have darker pigmentation and a lower density of small hairs (trichomes) than anterior tergites. hth minus clones, like exd minus clones, in the second or third tergite of a male fly show an increase in pigmentation and a decrease in trichome density, consistent with a transformation into a more posterior abdominal identity. When GFP-HM is expressed using pnr-Gal4, an increase in pigmentation in anterior tergites results, consistent with an anterior-to-posterior transformation of abdominal segment identity. However, no effect on trichome density is observed following GFP-HM expression, suggesting that this transformation is incomplete. In contrast, expression of wild-type GFP-Hth using pnr-Gal4 results in a decrease in pigmentation and an increase in trichome density in tergites 5 and 6, consistent with a posterior-to-anterior shift in cell fate. Expression of GFP-Hth 51A generates a weak version of this transformation. These results suggest that interfering with hth function by expressing the HM domain can interfere with a Hox-dependent function, such as tergite identity in the adult abdomen. Moreover, they suggest that different amounts of hth activity in the abdomen contribute to differences in tergite identity (Ryoo, 1999).

Functional specificity of a Hox protein mediated by the recognition of minor groove structure

The recognition of specific DNA-binding sites by transcription factors is a critical yet poorly understood step in the control of gene expression. Members of the Hox family of transcription factors bind DNA by making nearly identical major groove contacts via the recognition helices of their homeodomains. In vivo specificity, however, often depends on extended and unstructured regions that link Hox homeodomains to a DNA-bound cofactor, Extradenticle (Exd). Using a combination of structure determination, computational analysis, and in vitro and in vivo assays, this study showed that Hox proteins recognize specific Hox-Exd binding sites via residues located in these extended regions that insert into the minor groove but only when presented with the correct DNA sequence. These results suggest that these residues, which are conserved in a paralog-specific manner, confer specificity by recognizing a sequence-dependent DNA structure instead of directly reading a specific DNA sequence (Joshi, 2007).

It is well established that homeodomain-DNA recognition utilizes hydrogen bonds formed between recognition helix side chains and base-specific moieties in the major groove. However, the residues making these contacts are identical in all Hox proteins. While some N-terminal arm residues have been seen in the minor groove, these interactions have not been sufficient to account for specificity differences among Hox proteins. In particular, although Arg5 is often observed in the minor groove, it is common to all homeodomains. Conversely, residues 1 to 4 are important for Hox specificity, but are often not observed in homeodomain-DNA structures. The structure reported in this study, of a complex formed between a Scr-Exd dimer and an in vivo paralog-specific binding site, fkh250 (see Overview of structures and sequences), reveals Hox-DNA contacts that provide new insights into the molecular basis of Hox specificity. Minor groove contacts from linker (His-12) and N-terminal arm (Arg3) residues are critical for Scr's specific in vitro and in vivo properties. Moreover, both residues insert into an unusually narrow region of the minor groove, which in turn creates a local dip in electrostatic potential through the phenomenon of electrostatic focusing (see Protein-DNA contacts). In contrast, in the fkh250con* complex, the minor groove does not have these features, and, like many of the previous structures, there are no DNA contacts N-terminal to Arg5 (Joshi, 2007).

Based on these findings, it is suggested that there are two conceptually separable components to Hox-DNA binding. First, contacts between the DNA major groove and the recognition helix are sufficient to target Hox homeodomains to 'AT-rich' DNA sequences. Second, contacts made between the DNA minor groove and N-terminal arm/linker residues help to discriminate among AT-rich binding sites. Unlike recognition-helix residues in the major groove, the residues that insert into the minor groove recognize a specific DNA structure instead of forming base-specific hydrogen bonds. Below the implications are discussed of these findings for binding site recognition by Hox proteins as well as other DNA binding proteins (Joshi, 2007).

Consecutive ApA, TpT, or ApT base pair steps are known to result in a narrow minor groove due to negative propeller twisting that is stabilized by inter-base pair interactions in the major groove. In contrast, due to poor base stacking interactions, TpA steps tend to widen the minor groove and, for example, produce significant unwinding in the case of the TATA box. It is suggested that these sequence-dependent effects on DNA structure can account for the conformations of the two DNAs observed in this study. The fkh250con* binding site is TGATTTATGG, which contains a single TpA step. ATTT is expected to have the observed narrow minor groove where Arg5 binds. The AT sequence 3' to the TpA step is too short to produce the pattern of inter-base pair contacts required for minor groove narrowing. Moreover, the minor groove that is widened by this TpA step remains wide, in part due to the 3' guanines which introduce amino groups into the minor groove. In contrast, the fkh250 binding site is AGATTAATCG. Here, the ATT and AAT sequences flanking the TpA step both have the pattern of inter-base pair contacts and propeller twisting required for minor groove narrowing. Consequently, two minor groove width minima are observed. The second minimum, where His-12/Arg3 insert, is reinforced by a positive roll introduced by a 3′-CpG step (Joshi, 2007).

The DNA conformations observed in the crystal structures were qualitatively reproduced by Monte Carlo simulations, and the importance of the TpA steps and 3' flanking G-C base pairs in affecting DNA structure was supported by simulations of DNAs containing individual base pair differences. Interestingly, the standard deviations observed in these simulations are different for fkh250 and fkh250con*. This difference, which may reflect an inherent difference in flexibility, is also consistent with known sequence-dependent properties of DNA. The fkh250con* sequence, which shows a smaller standard deviation, is expected to be rigid due to the presence of an 'A-tract', a sequence that consists of at least three consecutive ApA, ApT or TpT steps. In contrast, the larger deviations seen in the fkh250 simulations indicate greater conformational flexibility that can be attributed to the absence of an A-tract and the presence of a TpA step in the middle of the sequence (Joshi, 2007).
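The sequence rules used in this argument (TpA steps widen the groove; runs of ApA, ApT or TpT steps narrow it; an A-tract is at least three such steps in a row) can be applied directly to the two binding sites quoted above. The following is a minimal sketch of that bookkeeping only. It is not the Monte Carlo simulation from the study and says nothing quantitative about groove width; the function names are illustrative.

```python
# Dinucleotide-step bookkeeping for the two sites quoted in the text.
FKH250 = "AGATTAATCG"
FKH250_CON = "TGATTTATGG"  # the fkh250con* site

NARROWING_STEPS = {"AA", "AT", "TT"}  # ApA, ApT and TpT steps


def steps(seq: str) -> list[str]:
    """Split a sequence into its overlapping dinucleotide steps."""
    seq = seq.upper()
    return [seq[i:i + 2] for i in range(len(seq) - 1)]


def tpa_step_positions(seq: str) -> list[int]:
    """Return 0-based positions of TpA steps (position of the T)."""
    return [i for i, step in enumerate(steps(seq)) if step == "TA"]


def longest_narrowing_run(seq: str) -> int:
    """Length, in steps, of the longest run of consecutive ApA/ApT/TpT steps."""
    best = current = 0
    for step in steps(seq):
        current = current + 1 if step in NARROWING_STEPS else 0
        best = max(best, current)
    return best


def has_a_tract(seq: str, min_steps: int = 3) -> bool:
    """Apply the definition in the text: at least three consecutive narrowing steps."""
    return longest_narrowing_run(seq) >= min_steps


for name, site in (("fkh250", FKH250), ("fkh250con*", FKH250_CON)):
    print(name, tpa_step_positions(site), longest_narrowing_run(site), has_a_tract(site))
```

Under this definition fkh250con* contains an A-tract (the ATTT run) while fkh250 does not, because its central TpA step splits the AT-rich core into two shorter runs, in line with the flexibility difference described above.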

The N-terminal arm has been known for some time to play an important role in Hox specificity. Consistent with this idea, this study found Arg3 and Arg5 in the minor groove of fkh250. However, Arg5 is conserved in all homeodomains and Arg3 is present in many Hox proteins, raising the question of what makes Scr's N-terminal arm unique. One answer is that other N-terminal arm differences are important for Scr's properties. In agreement with this notion, it was found that changing RQR to RGR reduced the affinity for fkh250 by ∼six-fold, similar to the effect observed when Arg3 was mutated to Ala. These data suggest that, unlike RQR of Scr, it is energetically unfavorable for the RGR motifs of Antp, Ubx, and AbdA to assume the conformation of the RQR motif as seen in the fkh250 complex. This may be due in part to the increased entropic cost associated with fixing a Gly in any given conformation but also to the fact that its lack of a Cβ precludes the formation of the hydrophobic contact formed between Gln4 and Thr6 in the fkh250 complex (the distance between the Cδ of Gln4 and the Cγ of Thr6 is about 4.7 Å) (Joshi, 2007).

Taken together, these results suggest that the conformational preferences of Hox N-terminal arms are an important determinant of Hox specificity. However, there is clearly more to the story because, like Scr, Deformed (Dfd) also has an RQR motif in its N-terminal arm, but Dfd does not activate fkh250-lacZ in vivo. Thus, while the sequence of the N-terminal arm plays an important role, and allows Hox proteins to be categorized into RGR and RQR subgroups, other specificity-determining factors must also exist. Based on these results, and as discussed below, it is suggested that other important contributors are the paralog-specific residues neighboring the YPWM motif (Joshi, 2007).

His-12 is located in Scr's linker region, four residues away from its YPWM motif. Interestingly, not only is His-12 conserved in all Scr orthologs, residues on both sides of its YPWM motif are also well conserved (see Overview of structures and sequences). This pattern is not unique to Scr and its orthologs: residues in the vicinity of Hox YPWM motifs are generally conserved in a paralog-specific manner. In fact, the evolutionarily conserved sequences in the vicinity of YPWM are sufficient to distinguish between Hox paralogs, and can even discriminate between Scr and Deformed (Dfd), which, like Scr, also has a His in the same position relative to its YPWM motif. These observations suggest that paralog-specific residues near the YPWM motif, together with the N-terminal arm, may be considered as specificity-determining 'signature' residues. Analogous to the findings with Scr-fkh250, it is suggested that these paralog-defining residues in other Hox proteins are critical for the recognition of specific binding sites in vivo. These residues may, as shown here for His-12 and Arg3 of Scr, contact DNA. Alternatively, as shown here for Scr's Gln4, they may be important for specifying the correct conformation of the DNA-contacting residues. A general role for linker and N-terminal arm residues in Hox specificity is supported by the in vivo specificities of Hox protein chimeras (Joshi, 2007).

Although His-12 is conserved among all Scr orthologs, mutating it to an Ala had, for most readouts, only a partial effect on binding or in vivo activity. In contrast, the Arg3 to Ala mutation had a much larger effect, and the strongest effect was observed when both His-12 and Arg3 were mutated to Ala. Some simple considerations can in principle account for the data. First, it is suggested that the main contribution of His-12/Arg3 is to provide a positive charge and, consequently, a favorable electrostatic interaction between Scr and fkh250. Second, given the N-N distance of 2.9 Å in the His-Arg hydrogen bond, His-12 is likely neutral in the fkh250 complex, so that the net charge for both residues is +1. In the double mutant this charge is lost. The His-12 to Ala mutation leaves Arg3 intact and the net charge unchanged. The Arg3 to Ala mutation would likely result in the protonation of His-12 given the negative electrostatic environment in the minor groove, also leaving the net charge of the protein unchanged. While these considerations can explain why the effect of the double mutant is stronger than of either single mutant, they do not explain why ScrArg3A binds more weakly to fkh250 than ScrWT or ScrHis-12A. One possibility is that there is an unfavorable free energy cost of proton uptake to His-12 when it is bound to DNA since, as opposed to Arg3, the free His is only partially protonated (Joshi, 2007).
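The charge argument in this paragraph can be tabulated with a few lines of code. The sketch below simply encodes the two assumptions stated in the text: when Arg3 is present it carries +1 and His-12 stays neutral, and when Arg3 is absent a remaining His-12 is taken to become protonated in the negatively charged minor groove. The function and variable names are illustrative, and no binding energies are modelled.

```python
def pair_net_charge(his12_present: bool, arg3_present: bool) -> int:
    """Net charge contributed by the His-12/Arg3 pair under the stated assumptions."""
    if arg3_present:
        # Arg3 carries +1; His-12, if present, is assumed to be neutral.
        return 1
    if his12_present:
        # Without Arg3, His-12 is assumed to pick up a proton in the minor groove.
        return 1
    # Neither basic residue remains, so the positive charge is lost.
    return 0


# (His-12 present, Arg3 present) for each Scr variant discussed in the text.
VARIANTS = {
    "wild type": (True, True),
    "His-12 to Ala": (False, True),
    "Arg3 to Ala": (True, False),
    "double mutant": (False, False),
}

charges = {name: pair_net_charge(*flags) for name, flags in VARIANTS.items()}
for name, charge in charges.items():
    print(f"{name}: net charge {charge:+d}")
```

Only the double mutant loses the charge in this tally, which matches its having the strongest effect. The table cannot explain why the Arg3 to Ala protein binds more weakly than the His-12 to Ala protein; that difference is attributed in the text to a possible free energy cost of protonating His-12.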

These results suggest that the interaction of Hox proteins with Exd/Pbx through the YPWM motif is important, not only because the presence of two homeodomains allows for a larger and more specific DNA sequence readout in the major groove, but also because it favors conformations of the linker and N-terminal arm residues such that they can recognize structural patterns in the minor groove. Indeed, it appears that these residues are unable to assume these conformations in the absence of Exd/Pbx. That these residues have not been observed in two other Hox-Exd/Pbx ternary complexes may suggest that their intrinsic flexibility is designed to inhibit binding to the wrong DNA site. That is, only when the protein sequence is compatible with the structure of the minor groove will the stabilizing interaction be strong enough to overcome the entropic loss associated with binding (Joshi, 2007).

Studies on homeodomain-DNA binary complexes also suggest that the N-terminal arm has a tendency to be disordered, unless presented with a DNA structure that provides sufficient stabilizing interactions to compete with conformational entropy. For example, residues 1 to 4 are not observed in the Antp and Engrailed X-ray complexes. In contrast, most of the N-terminal arm is structured in an Even-skipped-DNA complex where, notably, both Arg3 and Tyr4 insert into the minor groove. In that complex the minor groove is quite narrow where Arg3 inserts, consistent with the idea that a narrow groove is required to structure a region of the protein which is intrinsically disordered. In the HoxA9-Pbx-DNA ternary complex, the N terminal arm is also ordered but in that case, a very short linker severely limits the conformational freedom of the N-terminal arm (Joshi, 2007).

As seen in the crystal structure, binding of Scr-Exd to fkh250con* involves residues that are present in all Hox proteins, thus providing an explanation for why this site is not specific for a particular paralog. As discussed above, the answer to the inverse question, of why fkh250 preferentially binds Scr-Exd, involves the insertion of His-12 and Arg3 into the minor groove, which is narrower than the equivalent region in fkh250con*. That a narrow groove is an inherent feature of the fkh250 site suggests the more general idea that Hox proteins recognize their specific binding sites by reading a sequence-dependent DNA structure which, in turn, enhances the negative electrostatic potential and attracts the positively charged Arg/His pair. Thus, local differences in electrostatic potential provide an explanation for why sequence-dependent DNA conformations can attract basic amino acids. This shape-dependent DNA recognition mechanism is distinct from 'direct readout' mechanisms that involve specific hydrogen bond formation and hydrophobic contacts between amino acid side chains and bases. It is also distinct from 'indirect readout' where protein binding is influenced by the global shape of a DNA molecule or by sequence-dependent DNA bending and deformability (Joshi, 2007).

Scr's ability to recognize the shape of the minor groove via basic residues may provide an example of a more general class of protein-DNA recognition mechanisms. For example, an Arg of phage 434 repressor inserts into the minor groove of its operator and a His in the DNA binding domains of interferon regulatory factors (IRFs) inserts into a compressed minor groove. Moreover, the sequence (either FGR, RGR or RGGR) in the minor groove binding region of monomeric human estrogen related receptors, hERR, is an important specificity determinant for that family of transcription factors. The analogy between Hox and hERR2, a nuclear receptor, is particularly striking as the Zn finger domain of nuclear receptors makes major groove contacts while a normally extended peptide expands the binding site by making minor groove contacts. It will be interesting to determine if, as suggested in this study for Hox proteins, other families of DNA binding proteins use a common set of major groove contacts to recognize large sets of degenerate binding sites with individual family members distinguishing among these sites via more specific minor groove contacts. For Hox proteins, it is suggested that such a two-tiered recognition system gives them the flexibility to bind both shared and paralog-specific binding sites (Joshi, 2007).

Transcription factor paralogs orchestrate alternative gene regulatory networks by context-dependent cooperation with multiple cofactors

In eukaryotes, members of transcription factor families often exhibit similar DNA binding properties in vitro, yet orchestrate paralog-specific gene regulatory networks in vivo. The serially homologous first (T1) and third (T3) thoracic legs of Drosophila, which are specified by the Hox proteins Scr and Ubx, respectively, offer a unique opportunity to address this paradox in vivo. Genome-wide analyses using epitope-tagged alleles of both Hox loci in the T1 and T3 leg imaginal discs, the precursors to the adult legs and ventral body regions, show that ~8% of Hox binding is paralog-specific. Binding specificity is mediated by interactions with distinct cofactors in different domains: the Hox cofactor Exd acts in the proximal domain and is necessary for Scr to bind many of its paralog-specific targets, while in the distal leg domain, the homeodomain protein Distal-less (Dll) enhances Scr binding to a different subset of loci. These findings reveal how Hox paralogs, and perhaps paralogs of other transcription factor families, orchestrate alternative downstream gene regulatory networks with the help of multiple, context-specific cofactors (Feng, 2022).

This study used a combination of whole-genome and mechanistic approaches to understand how serially homologous appendages, such as the fly T1 and T3 legs, obtain their unique morphologies due to the activities of parallel Hox gene networks. The very similar transcriptomes in the three pairs of leg discs suggest that the different morphologies are largely a consequence of changing the expression patterns of the same sets of genes. By comparing the genome-wide DNA-binding profiles of the two relevant Hox paralogs, Scr and Ubx, in their native physiological contexts, this study found hundreds of paralog-specific Hox targets, accounting for ~8% of all binding events for these two Hox proteins. Next, differences in chromatin accessibility and Hox monomer binding preferences were shown to be unlikely to account for paralog-specific binding. Instead, it was demonstrated that interaction with the Hox cofactor Exd explains a large fraction of Scr's paralog-specific binding events. Finally, this study identified Dll as a Hox cofactor in the complementary distal domain of the leg disc. Results from RNA-seq, CBP ChIP, and reporter assays suggest that about 1/3 of the paralog-specific Scr-binding events are functional and lead to tissue-specific gene regulation. Thus, paralog-specific Hox-DNA binding, which is mediated by multiple cofactors including Exd and Dll, contributes significantly to paralog-specific Hox gene networks (Feng, 2022).

Previous in vitro studies provided compelling evidence that the DNA-binding specificities of different Hox-Exd dimers are more divergent from each other than those of Hox monomers, a phenomenon termed latent specificity. There have also been several in vivo examples in which paralog-specific Hox-DNA binding and target regulation were shown to depend on an interaction with Exd. This study shows that, on a genome-wide scale, the interaction with Exd explains a significant fraction of paralog-specific Hox binding, which often leads to paralog-specific gene regulation (Feng, 2022).

Earlier work also suggested that there is a tradeoff between specificity and affinity for Hox-Exd-binding motifs, where high-affinity binding motifs are more likely to have low specificity for different Hox-Exd heterodimers. The paralog-specific, Exd-dependent CRMs characterized in this study (ac-1, h-1, and fj-1) have higher affinity Hox-Exd-binding motifs than those previously described in the shavenbaby (svb) gene: the major Scr-Exd motif in the fj-1 CRM has an affinity of about 0.06 relative to the optimal motif in the genome, while the motifs in ac-1 and h-1 have even higher relative affinities of nearly 0.15 and 0.2, respectively. In contrast, the Ubx-Exd-binding motifs in CRMs from svb have a relative affinity of <0.01. One possible explanation for this difference is that the svb CRMs are active in embryos, which have many different cell types, while the CRMs characterized in this study are active in leg discs, which have significantly less cell-type complexity. Embryonic CRMs may require especially low-affinity binding motifs to distinguish their activities in a context with many cell types. Consistent with this idea, the fkh250 CRM, which is also active in embryos, uses an Scr-Exd-binding motif with a low relative affinity of 0.017. Notably, the relative affinities for the Scr-Exd-binding motifs in ac-1, h-1, and fj-1 are at least eightfold higher than for Ubx-Exd. Manual inspection of other intergenic and intronic loci with ScrT1 > UbxT3 binding suggests that there are many other CRMs that follow this same rule. Thus, for specificity to occur, the most relevant feature may be that the affinity for the 'correct' TFs, in this case Scr-Exd, must be significantly greater than the affinity for other 'incorrect' TFs that are co-expressed in the same or homologous cells (Feng, 2022).
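To put the quoted relative affinities side by side, the following Python sketch computes how many fold higher each leg-disc motif is than the embryonic fkh250 motif. It is an illustration rather than an analysis from the study: the variable names are hypothetical, the leg-disc values are the approximate figures quoted above, and the fkh250 value is taken to be 0.017, which is an assumption about the reported figure. These ratios compare different motifs with one another; they are not the Scr-Exd versus Ubx-Exd comparison on the same motif that underlies the eightfold statement.

```python
# Approximate relative affinities of the major Scr-Exd motifs in the
# leg-disc CRMs, as quoted in the text (relative to the optimal motif).
leg_disc_motifs = {"fj-1": 0.06, "ac-1": 0.15, "h-1": 0.2}

# Assumed relative affinity of the Scr-Exd motif in the embryonic fkh250 CRM.
FKH250_REL_AFFINITY = 0.017

# Fold difference of each leg-disc motif over the fkh250 motif.
fold_over_fkh250 = {
    crm: affinity / FKH250_REL_AFFINITY
    for crm, affinity in leg_disc_motifs.items()
}

for crm, fold in fold_over_fkh250.items():
    print(f"{crm}: about {fold:.1f}-fold higher than fkh250")
```

With these inputs the leg-disc motifs come out roughly 3.5-fold (fj-1), 8.8-fold (ac-1) and 11.8-fold (h-1) above the fkh250 motif, consistent with the suggestion that embryonic CRMs use lower-affinity motifs.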

Because Exd is only nuclear in a subset of cells during Drosophila development, such as the proximal domain of the leg disc, it was unlikely that Exd was the only Hox cofactor. In fact, CRMs that are directly regulated by Ubx have been described in cells where Exd is not available to be a cofactor. However, it has remained an unresolved question whether non-Exd cofactors are used in these examples. More generally for the leg imaginal disc, the entire distal domain, extending from the trochanter to the tarsus, is without nuclear Exd, yet has Hox-dependent segment-specific morphological characteristics, such as the sex combs on the male T1 leg. Although several candidate TFs have been proposed to be Hox cofactors, none have been confirmed. This study provides evidence that Dll is a distally acting Hox cofactor in leg discs (Feng, 2022).

There are many differences between how Exd and Dll interact with Hox proteins when bound to DNA. The Scr-Exd-binding motif is comprised of two partially overlapping half-sites, while the Scr-Dll motif consists of two HD-binding motifs separated by a spacer of several base pairs. Another difference is that the amount of cooperativity observed for Hox-Exd is far greater than that observed for Hox-Dll. The overlapping nature of the Hox- and Exd-binding motifs may be important for latent specificity, which for Scr requires an Exd-induced conformational change of the homeodomain. In contrast, there is no evidence that latent specificity occurs as a consequence of Hox-Dll binding. Instead, the modest cooperativity observed for the Scr-Dll heterodimer is likely a consequence of increasing Scr-DNA-binding affinity via a protein-protein interaction and closely spaced Dll and Scr-binding motifs (Feng, 2022).

More generally, it is suggested that the mode of DNA binding exhibited by Hox-Exd, which is highly cooperative and reveals latent specificity, may be the exception rather than the rule for TF-TF interactions within CRMs, and that the Scr-Dll example, with weak cooperativity between TFs stemming from a protein-protein interaction, may be the more common mode of interaction to distinguish the binding of paralogous TFs. In support of this notion, a systematic in vitro study identified 315 TF-TF interactions, only five of which exhibited latent specificity (Feng, 2022).
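As a quick check on how rare latent specificity appears to be in that in vitro survey, the counts quoted above can be turned into a percentage. The variable names in this small Python sketch are hypothetical; only the two counts come from the text.

```python
# Counts quoted in the text for the systematic in vitro survey.
total_interactions = 315   # TF-TF interactions identified
latent_specificity = 5     # interactions that exhibited latent specificity

fraction = latent_specificity / total_interactions
print(f"{fraction:.1%} of the interactions showed latent specificity")
```

This works out to roughly 1.6% of the surveyed interactions.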

The TALE homeodomain proteins, which include Exd and Hth, are very ancient TFs that were present before the split of plants and animals, and TALE-mediated nuclear localization analogous to the Hth-Exd example in flies has been described in plants. In contrast, the Hox gene family is only present in metazoans, and Dll is specific to bilaterians. Moreover, it has been proposed that Dll initially functioned in the CNS, and was later co-opted to pattern the distal appendage. Based on these observations, it is plausible that the Hox-Dll interaction evolved more recently than the Hox-Exd interaction, accounting for why Exd interacts with all Hox paralogs, while Dll may be a more limited Hox cofactor. This is supported by the results from a small-scale bimolecular fluorescence complementation (BiFC) screen that revealed Dll interacts with some Hox proteins, but not others (Feng, 2022).

Notably, the combined activities of Exd and Dll still do not account for all ScrT1 > UbxT3-binding events genome-wide, and reporter analysis suggests the presence of additional, yet to be identified Hox cofactors that have the capacity to promote Scr-specific binding. It is suggested that the Hox-Dll mode of binding uncovered in this study may be representative of additional TFs that also have the ability to promote paralog-specific Hox binding and activity at specific CRMs. Further, it is noted that the differentiation of the T1 and T3 leg fates is a continuous developmental process and that the observations described in this study are limited to the late 3rd instar stage. Nevertheless, it is expected that the principles governing Hox paralog specificity uncovered in this study will extend to other developmental stages and tissues. Finally, although this study focused on the role of paralog-specific TF-DNA binding, there may be additional mechanisms that do not depend on differences in DNA binding between paralogous TFs that also contribute to their specific functions (Feng, 2022).


Bases in 5' UTR - 206

Introns - none

Bases in 3' UTR - 1541


Amino Acids - 376

Structural Domains

Exd has a homeodomain of the pbx class. Exd is homologous to human proto-oncogene PBX1 and two other family members, PBX2 and PBX3 (Rauskolb, 1993).


date revised: 22 August 2023 

Home page: The Interactive Fly © 1995, 1996 Thomas B. Brody, Ph.D.

The Interactive Fly resides on the
Society for Developmental Biology's Web server.