POU domain protein 2

Gene name - POU domain protein 2

Synonyms - dOct2, miti-mere

Cytological map position - 33F1--33F2

Function - transcription factor

Keyword(s) - neuron fate determination

Symbol - pdm-2

FlyBase ID: FBgn0004394

Genetic map position - 2-[46]

Classification - homeodomain and POU domain

Cellular location - nuclear

NCBI link: Entrez Gene

GENE orthologs: Biolitmine

Recent literature

Corty, M.M., Tam, J. and Grueber, W.B. (2016). Dendritic diversification through transcription factor-mediated suppression of alternative morphologies. Development 143: 1351-1362. PubMed ID: 27095495
Summary:
Neurons display a striking degree of functional and morphological diversity, and the developmental mechanisms that underlie diversification are of significant interest for understanding neural circuit assembly and function. This study finds that the morphology of Drosophila sensory neurons is diversified through a series of suppressive transcriptional interactions involving the POU domain transcription factors Pdm1 (Nubbin) and Pdm2, the homeodomain transcription factor Cut, and the transcriptional regulators Scalloped and Vestigial. Pdm1 and Pdm2 are expressed in a subset of proprioceptive sensory neurons and function to inhibit dendrite growth and branching. A subset of touch receptors show a capacity to express Pdm1/2, but Cut represses this expression and promotes more complex dendritic arbors. Levels of Cut expression are diversified in distinct sensory neurons by selective expression of Scalloped and Vestigial. Different levels of Cut impact dendritic complexity and, consistent with this, it was found that Scalloped and Vestigial suppress terminal dendritic branching. This transcriptional hierarchy therefore acts to suppress alternative morphologies to diversify three distinct types of somatosensory neurons.

BIOLOGICAL OVERVIEW

pdm-1 and pdm-2 are closely linked genes. For the most part they are coordinately regulated and have overlapping functions. pdm-2 is transcribed in elements of the developing nervous system, suggesting a functional role in neurogenesis. It determines cell fates in a very specific subset of neuroblasts (Yang, 1993). Both pdm-1 and pdm-2 are transcribed in the first progeny of neuroblast NB4-2, the ganglion mother cell GMC4-2a.

One gene required for specification or formation of NB4-2 is wingless, on a cell nonautonomous basis. fushi tarazu and even-skipped are also expressed in GMC4-2a and the RP2 motor neuron, one of the progeny of GMC4-2a.

prospero is yet another gene expressed in GMC4-2a. Loss of prospero results in loss of the RP2 motor neuron. Neither pdm-1 nor pdm-2 is required for the birth of GMC4-2a, but mutant embryos fail to produce mature RP2 neurons. pdm-1 and pdm-2 are expressed downstream of pros and ftz, but upstream of eve. Thus pdm-1 and pdm-2 are involved in cell fate specification of GMC4-2a. Quantitatively, loss of pdm-2 results in a more severe mutant phenotype than loss of pdm-1 (Yeo, 1995).

Upregulation of Mitimere and Nubbin acts through Cyclin E to confer self-renewing asymmetric division potential to neural precursor cells

In the Drosophila CNS, neuroblasts undergo self-renewing asymmetric divisions, whereas their progeny, ganglion mother cells (GMCs), divide asymmetrically to generate terminal postmitotic neurons. It is not known whether GMCs have the potential to undergo self-renewing asymmetric divisions. It is also not known how precursor cells undergo self-renewing asymmetric divisions. Maintaining high levels of Mitimere or Nubbin, two POU proteins, in a GMC causes it to undergo self-renewing asymmetric divisions. These asymmetric divisions are due to upregulation of Cyclin E in late GMC and its unequal distribution between two daughter cells. GMCs in an embryo overexpressing Cyclin E, or in an embryo mutant for archipelago, also undergo self-renewing asymmetric divisions. Although the GMC self-renewal is independent of inscuteable and numb, the fate of the differentiating daughter is inscuteable and numb-dependent. These results reveal that regulation of Cyclin E levels, and asymmetric distribution of Cyclin E and other determinants, confer self-renewing asymmetric division potential to precursor cells, and thus define a pathway that regulates such divisions. These results add to understanding of maintenance and loss of pluripotential stem cell identity (Bhat, 2004).

Maintenance of a self-renewing fate can be viewed as a state where activities of certain genes maintain that state. Once the activity of such genes is switched off, the cells become committed to a differentiation pathway. The results reported in this study indeed support this type of mechanism. That POU genes might be a class of genes that maintain a self-renewing capacity is indicated by the fact that the Oct4 POU gene (Pou5f1 -- Mouse Genome Informatics), which is expressed in pluripotent stem cells of the mouse early embryo, is turned off when these cells begin to differentiate (Rosner, 1990). Similarly, SCIP is expressed in the progenitors of oligodendrocytes, but it is downregulated when these cells are induced to differentiate (Collarini, 1992). The current results provide direct evidence that these genes can induce a cell that is committed to a differentiation pathway to acquire a self-renewing capability in a lineage-specific manner. Moreover, studies undertaken in the past several years using the Drosophila nervous system as a paradigm have revealed how asymmetry can be generated during cell division to produce two distinct postmitotic cells. However, there is very little information on how an asymmetric self-renewing division pattern is determined. In this paper, results are presented that provide insight into this particular process (Bhat, 2004).

The strongest evidence that a GMC-1 undergoes a self-renewing type of asymmetric division in embryos overexpressing miti/nub or CycE, and in embryos mutant for ago, comes from the presence of hemisegments with two sibs and one RP2. There are two ways the second sib cell can be generated: (1) a self-renewed GMC-1 generates another sib when it divides, and (2) some other cell is transformed into a sib. The following set of evidence indicates the former scenario: (1) the second sib cell always appears later in development, i.e. at ~8.5 hours of age (as opposed to in wild type where the GMC-1 terminally divides by ~7.5 hours of age into an RP2 and a sib); (2) the dynamics of Eve expression itself in the sib -- expression of eve is switched off in a sib during the asymmetric division of GMC-1 and there is no de novo synthesis of Eve thereafter. If a postmitotic cell from an Eve-negative lineage transforms into a sib, it would be negative for Eve and would not be detected. The development of the other Eve-positive neuronal lineages is normal in these embryos, thus it is unlikely that a cell from those Eve-positive lineages is transformed into a sib. (3) The Eve and Spectrin staining of UAS-nub; ftz-GAL4 embryos provides more direct evidence for the self-renewal of GMC-1. In ~8.5-hour-old UAS-nub; ftz-GAL4 embryos, the larger GMC-1 (this Eve-positive cell is Zfh1 negative, indicating that it is indeed a GMC-1) can be observed undergoing asymmetric cytokinesis for the second time. From the heat-shock induction experiments of nub or miti mutant embryos, it can be argued that higher levels of these proteins in the parental NB4-2 cause later born GMCs to adopt a GMC-1 fate. However, the GMC-1 self-renewing phenotype observed following targeted expression of nub using the ftz-GAL4 driver makes this scenario unlikely. (4) The results obtained with the miti^P; insc and miti^P; nb double mutant embryos (P referring to prolonged expression), and the mis-localization of Insc in GMC-1 of these embryos, are also consistent with this conclusion (Bhat, 2004).

These results indicate that the level, timing and duration of presence of Miti or Nub proteins determine the dynamics of the GMC-1 division pattern. For example, the asymmetric divisions (which generate the 3-cell phenotypes) and the symmetric divisions (which generate the 4-cell phenotype) were observed when the transgenes were induced for 20-25 minutes. However, the multi-cell phenotype was observed only when the transgenes were induced for 90 minutes. Once the induction was stopped and the levels returned to normal, the two GMC-1s appeared to exit from the cell cycle to generate postmitotic cells. Similarly, when the transgene was induced with ftz-GAL4, only the 3-cell phenotypes, and not the 4-cell or multi-cell phenotypes, were observed. Thus, the following picture emerges from these results. Although high levels of Miti and Nub proteins are required for the specification of GMC-1 identity, their level must be downregulated in order for the GMC-1 to divide asymmetrically into postmitotic RP2 and sib. Maintaining a high level of these proteins in GMC-1 commits that cell to adopt a self-renewing stem cell type of division pattern. The results described here also show that Miti and Nub prevent GMC-1 from exiting the cell cycle by upregulation of CycE (Bhat, 2004).

The results clearly show that upregulation of CycE in late GMC-1 is the cause for the adoption of a self-renewing asymmetric division pattern. In other words, presence of high levels of CycE in late GMC-1 and its unequal distribution to one of the two daughter cells prevents this cell from exiting the cell cycle. Since this daughter cell still maintains the GMC-1 identity and has sufficient CycE to divide again, a further asymmetric division(s) is ensured. The cell that has lower amounts of CycE becomes committed to a differentiation pathway (RP2 or sib) (Bhat, 2004).

What lines of evidence support this conclusion? (1) In contrast with wild type, there is a significant amount of CycE present in a late GMC-1 in embryos overexpressing miti or nub. This CycE preferentially segregates to one of the two daughters of that GMC-1, usually the larger cell. When miti or nub genes are overexpressed only briefly, the level of CycE is downregulated after just one additional round of division, whereas with prolonged induction, the level is maintained at high levels in one or two cells of the multi-cell cluster for a prolonged duration of time (Bhat, 2004).

(2) Upregulation of CycE in a late GMC-1 is also observed in embryos mutant for ago, which is known to regulate CycE levels. In ago mutants, the two daughter cells of such a GMC-1 have unequal CycE levels accompanied by a self-renewing asymmetric division phenotype. The CycE is always downregulated after one additional GMC-1 division, which is consistent with the finding that the self-renewal occurs only once in these embryos. Since penetrance in ago mutants is partial, and CycE is downregulated in this lineage after just one additional division, there must be additional factors that mediate the downregulation of CycE in this lineage (Bhat, 2004).

(3) Embryos expressing high levels of CycE from a CycE transgene exhibit the same GMC-1 phenotypes as embryos expressing high levels of Miti or Nub. Thus, these results indicate that upregulation of CycE alone is sufficient for the GMC-1 to adopt a self-renewing type of division pattern. Finally, miti^P phenotypes are found to be dependent on CycE. That is, no multi-cell clusters were observed in miti^P; CycE double mutant embryos (Bhat, 2004).

In wild type, the downregulation of CycE in GMCs appears to occur through switching off CycE transcription and degradation of the protein by factors such as Ago. At what level does Miti or Nub regulate CycE? Since POU genes are thought to be transcriptional activators, they can regulate transcription of CycE either directly or indirectly. However, this does not seem to be the case since expressing high levels of miti does not have a discernible effect on the levels of CycE mRNA in GMC-1, as assessed by whole-mount RNA in situ hybridization. In addition, the putative promoter/enhancer region of CycE gene does not contain any consensus POU protein-binding sites. Therefore, it seems likely that Miti and Nub regulate factors that are involved in the degradation of CycE in late GMC-1 (Bhat, 2004).

The question arises as to how only one cell has a high level of CycE. There are several ways this can happen. There might be an asymmetric degradation of CycE. This scenario seems unlikely since there is only one of two daughter cells with high levels of CycE in ago mutants. Given that Ago downregulates CycE via a protein degradation mechanism, if there was an asymmetric degradation, in those hemisegments where the levels of CycE were elevated in GMC-1, it would initially be expected that both the daughter cells would have high CycE levels. However, this was not the case. An asymmetric transcription of the CycE gene also seems unlikely since the transcription of CycE ceases prior to GMC-1 division, as judged by whole-mount RNA in situ hybridization. The most likely possibility is that CycE is unequally distributed between the two daughter cells of GMC-1. The unequal distribution of CycE could be a passive process due to the size difference between daughter cells, especially in the GMC-1-->RP2/sib lineage. Moreover, no cytoplasmic crescent of CycE was observed during mitosis. By contrast, it could also be an active process. For instance, the size difference between an aCC and a pCC (or between a GMC1-1a and an aCC) is very small, and the fact that GMC1-1a undergoes a self-renewing asymmetric division suggests that the segregation of CycE may not be entirely a passive process (Bhat, 2004).

Finally, the results indicate that while a GMC that does not normally express Miti or Nub is insensitive to its ectopic expression (e.g., GMC1-1a of NB1-1; this GMC produces an aCC/pCC pair of neurons), a brief induction of CycE in the same GMC causes it to undergo self-renewing asymmetric division. Therefore, CycE can confer a stem cell type of division potential to more than one GMC. Another important conclusion one can draw from this result is that the segregation of CycE may be an active process. In the case of the GMC-1-->RP2/sib lineage, the cytokinesis of GMC-1 is asymmetric, and the size difference between an RP2 and a sib is significant. Thus, CycE can be asymmetrically segregated because of this size difference. However, the size difference between an aCC and a pCC (or between a GMC1-1a and an aCC) is very small, and the fact that GMC1-1a undergoes a self-renewing asymmetric division suggests that the segregation of CycE may not be entirely a passive process. It is also possible, however, that the difference between the level of CycE needed to retain a cell within the cell cycle and the level that fails to do so is quite small. In that case, even a minor change in the amount that a cell receives during division might be sufficient to make a difference, and the segregation of CycE could still be a passive process. Nonetheless, these results reveal how a cell can adopt a self-renewing asymmetric division potential through CycE (Bhat, 2004).
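The passive-segregation argument above can be made concrete with a toy threshold model. The sketch below is only an illustration of the reasoning, not a model from Bhat (2004): the function names, the arbitrary CycE units and the threshold value are all hypothetical. It assumes that CycE is partitioned in proportion to daughter cell volume and that a daughter stays in the cell cycle only if it inherits at least a threshold amount.

```python
from typing import Tuple

def passive_split(total_cyce: float, large_fraction: float) -> Tuple[float, float]:
    """Partition CycE in proportion to daughter volume (assumed passive segregation)."""
    if not 0.5 <= large_fraction <= 1.0:
        raise ValueError("large_fraction must lie between 0.5 and 1.0")
    return total_cyce * large_fraction, total_cyce * (1.0 - large_fraction)

def daughter_fates(total_cyce: float, large_fraction: float,
                   threshold: float) -> Tuple[str, str]:
    """Fate of (larger, smaller) daughter under a hypothetical CycE threshold."""
    amounts = passive_split(total_cyce, large_fraction)
    return tuple(
        "self-renews" if amount >= threshold else "differentiates"
        for amount in amounts
    )

# Hypothetical numbers: elevated CycE (100 units), threshold of 50 units.
print(daughter_fates(100.0, 0.70, 50.0))  # clearly unequal daughters (RP2/sib-like)
print(daughter_fates(100.0, 0.52, 50.0))  # nearly equal daughters (aCC/pCC-like)
print(daughter_fates(40.0, 0.70, 50.0))   # low CycE: both daughters differentiate
```

In this toy setting a size difference of only a few percent still yields one self-renewing and one differentiating daughter whenever the threshold happens to fall between the two inherited amounts, which is the sense in which a small difference could suffice under passive segregation.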

Pros has been implicated in inhibiting the ability of GMCs to divide more than once by preventing continued expression of cell-cycle genes. The caveat of this study, however, is that no GMC lineage was examined using cell-specific markers to determine whether GMCs continue to divide in embryos mutant for pros. The conclusion that Pros inhibits GMC division was mainly based on the presence of additional BrdU-positive cells in late stage (post 15-hours-old) pros mutant embryos. Pros is expressed in GMC-1 of the RP2/sib lineage and, in null alleles, this GMC-1 identity is not specified. In pros¹⁷, a loss-of-function allele, ~5% of the hemisegments had an RP2/sib lineage specified. In these hemisegments, the GMC-1 divides only once to generate an RP2 and a sib cell as in wild type. Moreover, specification of U and CQ lineages was observed in ~20% and ~13% of the hemisegments, respectively, and no additional cell division appeared to occur in these lineages. A previous study found that the aCC/pCC neurons (from GMC1-1a) have an abnormal axon morphology, but it did not find any additional neurons in this lineage. Similarly, NB6-4 of the thoracic segment produced the normal number of progeny in pros mutant embryos. These results suggest that Pros does not regulate cell division in RP2/sib, U and CQ lineages, and possibly not in many other neuronal lineages, and therefore it is unlikely to function in the miti/nub pathway (Bhat, 2004).

How is the specification of identity of one of the two progeny, either as an RP2 or as a sib, from a self-renewing asymmetric division of GMC-1 regulated? (Specification of the other progeny as GMC-1 is by high levels of CycE.) The results indicate that specification of an RP2 versus a sib identity to this differentiating cell is through a combination of low levels of CycE and localization of Insc. This is indicated by the finding that overexpression of Miti and Nub causes localization of Insc to be non-asymmetric. Non-asymmetric Insc also causes non-asymmetric localization of Numb. The cell that has lower levels of CycE and also has Numb becomes an RP2. Whenever the cell with lower levels of CycE fails to inherit Numb (the effect of overexpression of Miti or Nub on the localization of Insc is partially penetrant) that cell will become a sib. That the generation of an RP2 during the asymmetric division of GMC-1 is tied to Numb is also indicated by the analysis of miti^P; numb embryos. Although the self-renewal of GMC-1 in miti^P embryos is numb-independent, the commitment of a progeny to become a sib is numb-dependent. Thus, in ~13-hour-old miti^P; numb embryos, multiple cells are observed adopting a sib fate. An often overlooked fact is that in insc mutants the GMC-1 division is normal in ~30% of the hemisegments despite having no insc. Similarly, the penetrance of the symmetrical division of GMC-1 in pins (where Insc localization is affected as in miti^P embryos) is also partial, indicating the presence of additional (partially redundant) pathways for Insc that mediate asymmetric fate specification. These very same additional pathways must also influence the choice between a sib and an RP2 when the GMC-1 in miti^P embryos undergoes a self-renewing type of asymmetric division (Bhat, 2004).

CycE and Ago are part of a mechanism that converts a normal cell into a cancer cell. In ago mutants, CycE protein is not degraded and a number of cancer cell lines carry a mutation in ago. The current results showing that these genes are also involved in a stem cell type of division suggest a commonality between stem cells and cancer cells. These results also provide a molecular mechanism of how self-renewing asymmetric divisions are possible (Bhat, 2004).

GENE STRUCTURE

pdm-1 and pdm-2 are both transcribed in the same direction with respect to the centromere (distal to proximal), and are separated by 60 kb (Yeo, 1995).

cDNA clone length - 2.2 kb

Bases in 5' UTR - 211

Exons - three

Bases in 3' UTR - 578

PROTEIN STRUCTURE

Amino Acids - 601

Structural Domains

PDM-2 has a homeodomain (the POU homeodomain) and a POU domain (Lloyd, 1991).

The 75 amino acid POU-specific (POUs) domain and a 60 amino acid carboxy-terminal homeo (POUh) domain are joined by a hypervariable linker segment that can vary from 15 to 56 amino acids in length in different POU domain proteins. Thus the POU domain is not a single structural domain; indeed, the POUs and POUh segments form separate structurally independent domains. The POUs and POUh domains are, however, always found together and have therefore coevolved. Both POUs and POUh domains contain helix-turn-helix motifs. The POUs-domain structure is very similar to that of lambda and 434 bacteriophage proteins, but there are significant differences in the length of the first alpha helix, and the "turn" connecting the two HTH alpha helices is also longer. Both POUs and POUh bind DNA, and the length of the linker regulates the efficacy of binding various DNA sequence motifs, especially because POUs and POUh DNA binding sites have different spacings in different promoter elements (Herr, 1995).
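The lengths quoted above imply a range for the total span of the bipartite POU domain. The short sketch below works through that arithmetic; the constant and function names are illustrative only, and the lengths are the ones given in the text (Herr, 1995).

```python
# Lengths in amino acids, as quoted in the text (Herr, 1995).
POU_SPECIFIC_LEN = 75     # POUs domain
POU_HOMEO_LEN = 60        # POUh domain
LINKER_RANGE = (15, 56)   # hypervariable linker: shortest and longest

def pou_domain_span(linker_len: int) -> int:
    """Total span of POUs + linker + POUh for a given linker length."""
    shortest, longest = LINKER_RANGE
    if not shortest <= linker_len <= longest:
        raise ValueError(
            f"linker length {linker_len} is outside {shortest}-{longest}"
        )
    return POU_SPECIFIC_LEN + linker_len + POU_HOMEO_LEN

print(pou_domain_span(LINKER_RANGE[0]), pou_domain_span(LINKER_RANGE[1]))  # 150 191
```

On these figures the complete POU domain spans roughly 150 to 191 amino acids, depending on the linker.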

Evolutionary Homologies

pdm-1 and pdm-2 belong to the class II type of POU genes.

As a POU domain transcription factor, VVL is related to an evolutionarily complex group of genes consisting of at least 5 classes. Pit-1, a class I POU domain transcription factor, is involved in the development of the anterior pituitary gland in mammals. Class II POU domain transcription factors include mammalian Oct1, Oct2, Oct11 and Drosophila PDM-1 and PDM-2. Mammalian Brn1, Brn2, Brn4, SCIP/Oct6 and Xenopus XLPOU1 and XLPOU2 as well as Drosophila Ventral veins lacking (Drifter/Cf1a) and C. elegans ceh6 are Class III proteins. I-POU is in POU domain group IV, along with C. elegans unc86 and vertebrate Brn3. Oct-3/4 is a class V POU domain protein. There is no known Drosophila class I or class V homolog (Verrijzer, 1993).

The POU-specific domain seems to confer high-affinity, site-specific DNA binding and to mediate cooperative protein-protein interactions on DNA, while the homeodomain is critical for DNA binding and for specific protein-protein interactions (Treacy, 1991). An additional POU-homeodomain protein, Ventral veins lacking/Drifter, is found in Drosophila, and is also involved in regulation of neurally transcribed genes.

The ExPASy World Wide Web (WWW) molecular biology server of the Geneva University Hospital and the University of Geneva provides extensive documentation for 'POU' domain signatures.

The murine homologue of high mobility group (HMG) protein 2 interacts with Oct2. HMG2 and Oct2 interact via their HMG domains and POU homeodomains, respectively. This interaction is not restricted to Oct2, as other members of the octamer transcription factor family like Oct1 and Oct6 also interact with HMG2. The interaction with HMG2 results in a marked increase in the sequence-specific DNA binding activity of the Oct proteins. A chimeric protein, in which the strong transactivation domain of VP16 was fused directly to the HMG domains of HMG2, stimulated the activity of an octamer-dependent reporter construct upon cotransfection. Furthermore, the expression of antisense RNA for HMG2 specifically reduces octamer-dependent transcription. These results suggest that one of the functions of HMG2 is to support the octamer transcription factors in their role as transcriptional activators (Zwilling, 1995).

TBP efficiently associates with Oct1 and Oct2. The interaction is direct and does not depend on the presence of DNA or additional proteins. N- and C-terminal deletions of the different proteins were used to localize the domains involved in the interaction. The POU homeodomain of Oct2 and the evolutionarily conserved C-terminal core domain of TBP are both required and sufficient for the interaction. The Oct1 POU domain, which is highly homologous to the Oct2 POU domain, likewise mediates interaction with TBP. The interaction can also be observed in vivo, as TBP can be co-precipitated with either Oct1 or Oct2. Co-transfection of human TBP and Oct2 expression vectors into B cells resulted in a synergistic activation of an octamer motif containing promoter (Zwilling, 1994).

The bipartite POU domain of transcription factor Oct-1 stimulates adenovirus DNA replication through an interaction with the octamer sequence present in the auxiliary origin. The POU domain enhances the formation of a pre-initiation complex composed of the viral precursor terminal protein-DNA polymerase (pTP-pol) complex and the origin. Protein-protein interactions between the POU domain and the pTP-pol complex could be detected. A direct correlation between pTP-pol binding and stimulation of DNA replication in vitro was observed, suggesting that stimulation by the POU domain is caused by an interaction with the viral pTP-pol complex (Coenjaerts, 1994).

Whereas the C. elegans POU homeodomain protein, UNC-86, a class IV POU domain protein, alone is able to activate transcription of the mec-3 promoter in vitro, the LIM homeodomain protein, MEC-3, fails to bind DNA or activate transcription on its own. However, in the presence of both UNC-86 and MEC-3, cooperative binding to the mec-3 promoter and synergistic activation of transcription in vitro are observed. Protein-protein interaction assays revealed that UNC-86 can bind directly to MEC-3, and in vitro transcription studies indicate that both proteins contain a functional activation domain. Thus, formation of a heteromeric complex containing two activation domains results in a highly potent activator. These studies provide direct functional evidence for coordinated transcriptional activation by two C. elegans DNA binding proteins that have been defined genetically as regulators of gene expression during embryogenesis (Lichtsteiner, 1995).

A pituitary LIM homeodomain factor, P-Lim, is expressed as Rathke's pouch forms and as specific pituitary cell phenotypes are established, suggesting functional roles throughout pituitary development. While selectively expressed in both anterior and intermediate pituitary in mature mice, P-Lim is also transiently expressed in the developing ventral neural cord and brainstem. P-Lim binds to and activates the promoter of the alpha-glycoprotein subunit gene, a marker of early pituitary development, and synergizes with Pit-1, a class 1 POU domain protein, in transcriptional activation of genes encoding terminal differentiation markers. The LIM domain of P-Lim specifically interacts with the Pit-1 POU domain and is required for synergistic interactions with Pit-1, but not for basal transcriptional activation events (Bach, 1995).

Binding sites for the tissue-specific POU-homeodomain transcription factor, Pit-1, are required for basal and hormonally induced prolactin gene transcription. The 3P DNA element of the rat prolactin gene is found to contain a consensus binding site for the Ets family of proteins (homologs of Drosophila proteins Pointed and Yan). Mutation of the Ets binding site greatly decreases the ability of epidermal growth factor, phorbol esters, Ras, or the Raf kinase to induce reporter gene activity. Mutation of the Ets site had little effect on basal enhancer activity. In contrast, mutation of the consensus Pit-1 binding site in the 3P element essentially abolished all basal enhancer activity. Overexpression of Ets-1 in GH3 pituitary cells enhanced both basal and Ras-induced activity from the 3P enhancer. These data describe a composite element in the prolactin gene containing binding sites for two different factors, and the studies suggest a mechanism by which Ets proteins and Pit-1 functionally cooperate to permit transcriptional regulation by different signaling pathways (Howard, 1995).

REGULATION

cis-Regulatory Sequences and Functions

The identification of sequences that control transcription in metazoans is a major goal of genome analysis. Searching for clusters of predicted transcription factor binding sites can discover active regulatory sequences; 37 regions of the Drosophila melanogaster genome have been identified with high densities of predicted binding sites for five transcription factors involved in anterior-posterior embryonic patterning. Nine of these clusters overlapped known enhancers. This study reports the results of in vivo functional analysis of 27 remaining clusters. Transgenic flies were generated carrying each cluster attached to a basal promoter and reporter gene, and embryos were assayed for reporter gene expression. Six clusters are enhancers of adjacent genes: giant, fushi tarazu, odd-skipped, nubbin, squeeze and pdm2; three other clusters drive expression in patterns unrelated to those of neighboring genes; the remaining 18 clusters do not appear to have enhancer activity. The Drosophila pseudoobscura genome was used to compare patterns of evolution in and around the 15 positive and 18 false-positive predictions. Although conservation of primary sequence cannot distinguish true from false positives, conservation of binding-site clustering accurately discriminates functional binding-site clusters from those with no function. Conservation of binding-site clustering was incorporated into a new genome-wide enhancer screen, and several hundred new regulatory sequences, including 85 adjacent to genes with embryonic patterns, have been predicted. It is concluded that measuring conservation of sequence features closely linked to function (such as binding-site clustering) makes better use of comparative sequence data than commonly used methods that examine only sequence identity (Berman, 2004).
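The idea of searching for regions with a high density of predicted binding sites can be sketched as a sliding-window count. The code below is a minimal illustration and not the pipeline used by Berman (2004): the function name, the window width and the minimum site count are assumptions chosen for the example, and the input is simply a pooled list of predicted site positions (in bp) for all factors of interest.

```python
from typing import List, Tuple

def find_site_clusters(
    site_positions: List[int],
    window: int = 700,     # assumed window width in bp (illustrative)
    min_sites: int = 13,   # assumed minimum sites per window (illustrative)
) -> List[Tuple[int, int, int]]:
    """Return merged (start, end, n_sites) regions dense in predicted sites."""
    sites = sorted(site_positions)
    spans: List[List[int]] = []   # [first_index, last_index] into `sites`
    left = 0
    for right, pos in enumerate(sites):
        # Advance the left edge until the window spans fewer than `window` bp.
        while pos - sites[left] >= window:
            left += 1
        if right - left + 1 >= min_sites:
            if spans and left <= spans[-1][1]:
                spans[-1][1] = right          # overlaps the previous dense window
            else:
                spans.append([left, right])
    return [(sites[a], sites[b], b - a + 1) for a, b in spans]

# Hypothetical site positions with two dense stretches and isolated sites.
example = [100, 150, 200, 250, 5000, 9000, 9100, 9150]
print(find_site_clusters(example, window=300, min_sites=3))
# [(100, 250, 4), (9000, 9150, 3)]
```

Overlapping dense windows are merged so that each reported region corresponds to one candidate cluster; a real screen would also need the binding-site predictions themselves, which are outside the scope of this sketch.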

Each of the 37 predicted cis-regulatory modules (pCRMs) was assigned an identifier (of the form PCEXXXX). The first nine overlap previously known enhancers of runt, even-skipped, hairy, knirps and hunchback. To determine whether any of the remaining 28 pCRMs also function as enhancers, P-element constructs were generated containing the pCRM sequence with minimal flanking sequence on both sides fused to the eve basal promoter and a lacZ reporter gene. Since the margins of the tested sequences do not precisely correspond to the margins of the clusters, a unique identifier (of the form CEXXXX) was assigned to each tested fragment (identical CE and PCE numbers correspond to the same pCRM) (Berman, 2004).

Multiple independent transgenic fly lines were successfully generated for 27 of the 28 pCRMs. Transgenes containing CE8007 could not be generated. This sequence contains five copies of an approximately 358 base-pair (bp) degenerate repeat. One additional pCRM (CE8002) also contains tandem repeats. While it was possible to generate transgenes for CE8002 and assay its expression, these two tandem repeat-containing pCRMs (CE8007 and CE8002) were excluded from subsequent analyses (Berman, 2004).

The expression of these constructs was examined by in situ RNA hybridization to the lacZ transcript in embryos at different stages in at least three independent transformant lines. Nine of the 27 transgenes showed mRNA expression during embryogenesis, while the remaining 18 assayed transgenes showed no detectable expression at any stage during embryogenesis.

To identify the genes regulated by the nine pCRMs with embryonic expression, the expression patterns were examined of genes containing the pCRM in an intron and genes with promoters within 20 kb of the pCRM. The embryonic microarray and whole-mount in situ expression data available in the Berkeley Gene Expression Database were used, supplemented with additional whole-mount in situ experiments where necessary. Six of the active pCRMs drive lacZ expression in patterns that recapitulate portions of the expression of a gene adjacent to or containing the pCRM. Four of these new enhancers act in the blastoderm and two during germ-band elongation (Berman, 2004).

CE8001 is 5' of the gene for the gap transcription factor giant and recapitulates the posterior domain (65%-85% egg length measuring from the anterior end of the embryo) of gt expression in the blastoderm. CE8011 is 5' of the gene for the POU-homeobox transcription factor nubbin (nub). The CRM recapitulates the endogenous blastoderm expression pattern of nub, first detected as a broad band extending from 50% to 75% egg length. Although nub expression continues in later embryonic stages, CE8011 expression is limited to the blastoderm stage. CE8010 is 5' of the pair-rule gene odd-skipped (odd) and drives expression of two of its seven stripes: stripe 3 at 55% and stripe 6 at 75% egg length. This CRM also has the ability to drive later, more complex, patterns of expression. During stages 6 and 7, expression is detected in the procephalic ectoderm anlage and in the primordium of the posterior midgut. By stage 13, expression is also detected in the anterior cells of the midgut which will give rise to the proventriculus, the first midgut constriction, the posterior midgut and Malpighian tubule primordia as well as cells in the hindgut, all similar to portions of the pattern of wild-type Odd protein expression. CE8024 is 3' of the pair-rule gene fushi tarazu and drives expression of two of its stripes: stripe 1 at 35% and stripe 5 at 65% egg length. CE8012 is in the third intron of POU domain protein 2 (pdm2) and appears to completely recapitulate its stage-12 expression pattern, which is limited to a subset of the developing neuroblasts and ganglion mother cells of the developing central nervous system. A similar pattern of expression was previously described for the protein product of pdm2. It is worth noting that expression of CE8012 was not detected in the blastoderm stage, whereas the endogenous gene exhibits a blastoderm expression pattern similar to nub. CE8027 is 3' of the gene for the Zn-finger transcription factor squeeze (sqz) and recapitulates the wild-type expression pattern of sqz RNA in a subset of cells in the neuroectoderm at stage 12 (Berman, 2004).

The remaining three active pCRMs cannot be easily associated with a specific gene. CE8005 drives expression in the ventral region of the embryo. It is 3' of a gene encoding a ubiquitously expressed Zn-finger containing protein (CG9650) that is maternally expressed and deposited in the embryo. This strong maternal expression potentially obscures a zygotic expression pattern. Two additional adjacent genes, CG32725 and CG1958, showed no expression in whole-mount in situ hybridization of embryos. CE8016 drives a seven-stripe expression pattern in the blastoderm. It is in the first intron of CG14502 which shows very low level expression by microarrays in the blastoderm, and has no obvious detectable pattern of expression in whole-mount in situ hybridization of embryos. This pCRM is approximately 2 kb 5' of scribbler (sbb), which is expressed maternally, possibly obscuring an early zygotic expression pattern (a few in situ images show a hint of striping). sbb is also expressed later in development in the ventral nervous system. An additional potential target, Otefin (Ote), is also expressed maternally and relatively ubiquitously through germ-band extension. All other nearby genes showed no embryonic expression in whole-mount in situ hybridization or by microarray. CE8020 drives an atypical four-stripe pattern in the blastoderm -- two stripes at 7% and 26% that are anterior to the first ftz stripe and two stripes at 39% and 87%. It is in the first intron of ome (CG32145), which is not expressed maternally and has no blastoderm expression, but is expressed late in salivary gland, trachea, hindgut and a subset of the epidermis. All other nearby genes showed no embryonic expression in whole-mount in situ hybridization or by microarray (Berman, 2004).

Enhancer loops appear stable during development and are associated with paused polymerase

Developmental enhancers initiate transcription and are fundamental to our understanding of developmental networks, evolution and disease. Despite their importance, the properties governing enhancer-promoter interactions and their dynamics during embryogenesis remain unclear. At the β-globin locus, enhancer-promoter interactions appear dynamic and cell-type specific, whereas at the HoxD locus they are stable and ubiquitous, being present in tissues where the target genes are not expressed. The extent to which preformed enhancer-promoter conformations exist at other, more typical, loci and how transcription is eventually triggered is unclear. This study generated a high-resolution map of enhancer three-dimensional contacts during Drosophila embryogenesis, covering two developmental stages and tissue contexts, at unprecedented resolution. Although local regulatory interactions are common, long-range interactions are highly prevalent within the compact Drosophila genome. Each enhancer contacts multiple enhancers, and promoters with similar expression, suggesting a role in their co-regulation. Notably, most interactions appear unchanged between tissue context and across development, arising before gene activation, and are frequently associated with paused RNA polymerase. These results indicate that the general topology governing enhancer contacts is conserved from flies to humans and suggest that transcription initiates from preformed enhancer-promoter loops through release of paused polymerase (Ghavi-Helm, 2014).

Drosophila embryogenesis proceeds very rapidly, taking 18 h from egg lay to completion. Underlying this dynamic developmental program are marked changes in transcription, which are in turn regulated by characterized changes in enhancer activity. However, the role and extent of dynamic enhancer looping during this process remain unknown. To address this, 4C-seq (chromosome conformation capture sequencing) experiments were performed, anchored on 103 distal or promoter-proximal developmental enhancers (referred to as 'viewpoints'), and absolute and differential interaction maps were constructed for each, varying two important parameters: (1) developmental time, using embryos at two different stages, early in development when cells are multipotent (3-4 h after egg lay; stages 6-7), and mid-embryogenesis during cell-fate specification (6-8 h; stages 10-11); and (2) tissue context, comparing enhancer interactions in mesodermal cells versus whole embryo. To perform cell-type-specific 4C-seq in embryos, a modified version of BiTS-ChIP (batch isolation of tissue-specific chromatin for immunoprecipitation) was established. Nuclei from covalently crosslinked transgenic embryos, expressing a nuclear-tagged protein only in mesodermal cells, were isolated by fluorescence-activated cell sorting (FACS; >98% purity) and used for 4C-seq on 92 enhancers at 6-8 h and a subset of 14 enhancers at 3-4 h. The same 92 enhancers, and 11 additional regions, were also used as viewpoints in whole embryos at both time points. The enhancers were selected based on dynamic changes in mesodermal transcription factor occupancy between these developmental stages and the expression of the closest gene. This study was thereby primed to detect dynamic three-dimensional (3D) interactions, focusing on developmental stages during which the embryo undergoes marked morphological and transcriptional changes (Ghavi-Helm, 2014).

All 4C-seq experiments had the expected signal distribution, with high concordance between replicates. To assess data quality further, ten known enhancer-promoter pairs (of the ap, Abd-B, E2f, pdm2, Con, eya, stumps, Mef2, sli and slp1 genes) were compared, and in all cases the expected interactions were recovered. For example, using an enhancer of the apterous (ap) gene, the expected interaction was detected with the ap promoter, 17 kilobases (kb) away, illustrating the high quality and resolution of the data (Ghavi-Helm, 2014).

In chromosome conformation capture assays, interaction frequencies decrease with genomic distance between regions. To adjust for this, the 4C signal decay was modelled as a function of distance using a monotonically decreasing smooth function. Subtracting this trend, the residual interaction signal was converted to z-scores and interacting regions defined by merging neighbouring high-scoring fragments within 1 kb. Using this stringent approach, 4,247 high-confidence interactions were identified across all viewpoints and conditions, representing 1,036 unique interacting regions (Ghavi-Helm, 2014).
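The three steps described above (fit a distance-decay trend, convert residuals to z-scores, merge neighbouring fragments within 1 kb) can be illustrated with a minimal sketch. The study does not specify its smooth function here, so a straight line in log-log space (a power-law decay) is assumed as a stand-in; a fitted line is only decreasing when its slope is negative. All function names and numbers are hypothetical.

```python
import math
import statistics


def fit_loglog_trend(distances, signals):
    """Least-squares line for log(signal) vs log(distance): (slope, intercept).

    Assumption: a power-law decay stands in for the study's smooth function.
    """
    xs = [math.log(d) for d in distances]
    ys = [math.log(s) for s in signals]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def residual_zscores(distances, signals):
    """Subtract the fitted trend and standardise the residuals."""
    slope, intercept = fit_loglog_trend(distances, signals)
    resid = [math.log(s) - (intercept + slope * math.log(d))
             for d, s in zip(distances, signals)]
    mu, sd = statistics.fmean(resid), statistics.pstdev(resid)
    return [(r - mu) / sd for r in resid]


def merge_fragments(fragments, max_gap=1000):
    """Merge (start, end) fragments separated by at most max_gap bp."""
    merged = []
    for start, end in sorted(fragments):
        if merged and start - merged[-1][1] <= max_gap:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Made-up signal: a clean power-law decay with one elevated fragment (index 2).
toy_distances = [1000, 2000, 4000, 8000, 16000]
toy_signals = [100.0, 50.0, 100.0, 12.5, 6.25]
toy_z = residual_zscores(toy_distances, toy_signals)
print([round(z, 2) for z in toy_z])
print(merge_fragments([(0, 500), (1200, 1800), (5000, 5500)]))
```

In this toy input the elevated fragment receives the highest z-score, and the first two fragments are merged because the gap between them is under 1 kb.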

Each enhancer (viewpoint) interacted with, on average, ten distinct genomic regions, less than half (41%) of which were annotated enhancers or promoters. Distal enhancers had a higher than expected interaction frequency with other enhancers. Similarly, promoter-proximal elements had extensive interactions with distal active promoters, 98% of which are >10 kb away. Enhancer-promoter interactions, although not significantly enriched, involve active promoters, with high enrichment for H3K27ac and H3K4me3, and active enhancers, defined by H3K27ac, RNA Pol II and H3K79me3. These results are similar to recent findings in human cells and the mouse β-globin locus, indicating similarities in 3D regulatory principles from flies to human (Ghavi-Helm, 2014).

The extent of 3D connectivity is surprising given the relative simplicity of the Drosophila genome. On average, each promoter-proximal element interacted with four distal promoters and two annotated enhancers, whereas each distal enhancer interacted with two promoters and three other enhancers. These numbers are probably underestimates, as 60% of interactions involved intragenic or intergenic fragments containing no annotated cis-regulatory elements. Despite this, the level of connectivity is similar to that recently observed in humans, where active promoters contacted on average 4.75 enhancers and 25% of enhancers interacted with two or more promoters. The multi-component contacts that were observed for Drosophila enhancers indicate topologically complex structures and suggest that, despite its non-coding genome being an order of magnitude smaller than humans, Drosophila may require a similar 3D spatial organization to ensure functionality (Ghavi-Helm, 2014).

Insulators, and associated proteins, are thought to have a major role in shaping nuclear architecture by anchoring enhancer-promoter interactions or by acting as boundary elements between topologically associated domains (TADs). Occupancy data from 0 to 12 h Drosophila embryos revealed a 50% overlap of interacting regions with occupancy of one or more insulator proteins. Insulator-bound interactions are enriched in enhancer elements, suggesting that insulators may have a role in promoting enhancer-enhancer interactions. In contrast to mammalian cells, this study observed no association between insulator occupancy and the genomic distance spanned by chromatin loops, although there was a modest increase in average interaction strength. Conversely, 50% of interacting regions are not bound by any of the six Drosophila insulator proteins, suggesting that these 3D contacts are formed in an insulator-independent manner, or are being facilitated by neighbouring interacting regions (Ghavi-Helm, 2014).

If enhancer 3D contacts are involved in transcriptional regulation, then genes linked by interactions with a common enhancer should share spatio-temporal expression. For the four loci examined (pdm2, engrailed, unc-5 and charybde), this is indeed the case. For example, the pdm2 CE8012 enhancer interacts with both the pdm2 and nubbin (nub, also known as pdm1) promoters, located 2.5 and 47 kb away, respectively. Both genes, producing structurally related proteins, are co-expressed in the ectoderm, overlapping the activity of the pdm2 enhancer. Although there are examples of long-range interactions in Drosophila, often involving Polycomb response elements (PREs) and insulator elements, the vast majority of characterized enhancers are within 10 kb of their target gene, with few known to act over 50 kb. However, as investigators historically tested regions close to the gene of interest, characterized Drosophila enhancers are generally close to the gene they regulate. In contrast, although 4C cannot assess the full extent of short-range interactions, it provides an unbiased systematic measurement of the distance of enhancer interactions, far beyond 10 kb (Ghavi-Helm, 2014).

The distance distribution of all significant interactions reveals extensive long-range interactions within the ~180 megabase (Mb) Drosophila genome; 73% span >50 kb, with the median interaction-viewpoint distance being 110 kb. Two striking examples of long-range interactions are the unc-5 and charybde loci. The unc-5 promoter interacts with multiple regions, including a weak but significant interaction with the promoter of slit (sli), at a distance of >500 kb. These genes produce structurally unrelated proteins that are co-expressed in the heart, and are essential for heart formation (Ghavi-Helm, 2014).

A promoter-proximal element near the charybde (chrb) promoter has a strong interaction with the promoter of the scylla (scyl) gene, almost 250 kb away. Both genes are closely related in sequence and co-expressed throughout embryogenesis. These long-range interactions were confirmed by reciprocal 4C, using either the promoter of chrb or scyl, or an interacting putative enhancer as viewpoint. This interaction was further verified using DNA fluorescence in situ hybridization (FISH) in embryos. As a control, the distance was assessed between the chrb promoter (probe A) and an overlapping probe A' or a region on another chromosome (probe D), to determine the distances between regions very close or far away, respectively. Comparing the distance between the chrb and scyl promoters (probes A and B) showed a high, statistically significant co-localization, in contrast to the distance between the chrb promoter and a non-interacting region with equal genomic distance (probes A and C) (Ghavi-Helm, 2014).

The reciprocal 4C revealed several intervening interactions that are consistently associated with loops to both the scyl and chrb promoters. The activity of two of these was examined in transgenic embryos. Both interacting regions can function as enhancers in vivo, recapitulating chrb expression in the visceral mesoderm and nervous system (Ghavi-Helm, 2014).

When considering a 1-Mb scale around this region, the 4C interaction signal drops to almost zero just after the promoters of both genes. This 'contained block' of interactions is reminiscent of TADs, although the boundaries don't exactly match TADs defined at late stages of embryogenesis, which may reflect differences in the developmental stages used. However, the boundaries do overlap a block of conserved microsynteny between drosophilids spanning ~50 million years of evolution, suggesting a functional explanation underlying the maintained synteny. Expanding this analysis across all viewpoints, ~60% of interactions are located within the same TAD and the same microsyntenic domain as the viewpoint. In the case of the chrb and scyl genes, this constraint may act to maintain a regulatory association between a large array of enhancers, facilitating their interaction with both genes' promoters (Ghavi-Helm, 2014).

These examples, and the other 555 unique interactions >100 kb, provide strong evidence that long-range interactions are widely used within the Drosophila genome, potentially markedly increasing the regulatory repertoire of each gene. As enhancer-promoter looping can trigger gene expression, it follows that enhancer contacts should reflect the dynamics of transcriptional changes during development and therefore be temporally associated with gene expression. To assess this, looping interactions were directly compared between the two different time points and tissue contexts. Given the non-discrete nature of chromatin contacts, the quantitative 4C-seq signal was used to identify differential interactions based on a Gamma-Poisson model, and they were defined as having >2-fold change and false discovery rate <10% (Ghavi-Helm, 2014).
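The final filtering step described above (>2-fold change and false discovery rate <10%) can be sketched as follows. The Gamma-Poisson test itself is not implemented; p-values are assumed to come from such a test upstream. The source does not name its FDR procedure, so the Benjamini-Hochberg adjustment is used here as an assumption, and all input numbers are made up.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


def differential_calls(log2_fold_changes, pvalues, min_abs_log2fc=1.0, max_fdr=0.10):
    """True where |log2 fold change| > 1 (i.e. >2-fold) and adjusted p < max_fdr."""
    adjusted = benjamini_hochberg(pvalues)
    return [abs(lfc) > min_abs_log2fc and q < max_fdr
            for lfc, q in zip(log2_fold_changes, adjusted)]


# Hypothetical per-fragment statistics for illustration only.
toy_lfc = [1.5, -2.0, 0.5, 3.0]
toy_p = [0.001, 0.04, 0.03, 0.5]
print(differential_calls(toy_lfc, toy_p))
```

Requiring both thresholds means that a fragment with a large fold change but weak statistical support, or a significant but small change, is not called differential.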

Despite the marked differences in development and enhancer activity between these conditions, surprisingly few changes were found in chromatin interaction frequencies, with ~6% of interacting fragments showing significant changes between conditions. Of these, 87 interactions were significantly reduced during mid-embryogenesis (6-8 h) compared to the early time point (3-4 h), and 90 interactions significantly increased. Similarly, 105 interactions had a higher frequency in mesodermal cells compared to the whole embryo. For example, a promoter-proximal viewpoint in the vicinity of the Antp promoter identified many interactions, two of which are significantly decreased at 6-8 h, although the expression of the Antp gene itself increases. For one region, the reduction in 4C interaction at 6-8 h corresponds to a loss in a H3K4me3 peak from 3-4 h to 6-8 h, suggesting that this 3D contact is associated with the transient expression of an unannotated transcript. The activity of the other interacting peak was examined in transgenic embryos, and it was shown to act as an enhancer, driving specific expression in the nervous system overlapping the Antp gene at 6-8 h. Along with the two enhancers discovered at the chrb locus, this demonstrates the value of 3D interactions to identify new enhancer elements, even for well-characterized loci like Antp (Ghavi-Helm, 2014).

A viewpoint in the vicinity of the Abd-B promoter interacted with a number of regions spanning the bithorax locus, three of which correspond to previously characterized Abd-B enhancers; iab-5, iab-7 and iab-8. The iab-7 and iab-8 enhancers are active in early embryogenesis, and have much reduced or no activity at the later time point. Notably, although the loop to those two enhancers is strong at the early time point, it becomes significantly reduced later in development, when both enhancers' activities are reduced. Conversely, the iab-5 enhancer contacts the promoter at a much higher frequency later in development, at the stage when the enhancer is most active. This locus therefore exhibits dynamic 3D promoter-enhancer contacts that reflect the transient activity of three developmental enhancers. It is interesting to note that in all loci examined, the dynamic contacts of specific elements are neighboured by stable contacts, as seen in the Antp and Abd-B loci. Dynamic changes, therefore, appear to operate in the context of larger, more-stable 3D landscapes (Ghavi-Helm, 2014).

Ninety-four per cent of enhancer interactions showed no evidence of dynamic changes across time and tissue context, which is remarkable given the marked developmental transitions during these stages. To investigate this further, the enhancer-promoter interactions of genes switching their expression state between time points or tissue contexts were examined. The ap gene, for example, is not expressed at 2-4 h but is highly expressed during mid-embryogenesis (6-8 h). Despite the absence of expression, the interaction between the apME680 enhancer and the ap promoter is already present at 3-4 h, several hours before the gene's activation. To examine this more globally, differentially expressed genes, going either from on-to-off or off-to-on, were selected. Even for these dynamically expressed genes, there was no correlation with changes in their promoter-enhancer contacts. Similar 'stable' interactions were observed between tissue contexts. Genes predominantly expressed in the neuroectoderm at 6-8 h, for example, have interactions at the same locations in whole embryos and purified mesodermal nuclei at 6-8 h, despite the fact that they are not expressed in the mesoderm at this stage (Ghavi-Helm, 2014).

Pre-existing loops were recently observed in human and mouse cells, and suggested to prime a locus for transcriptional activation. However, why they are formed and how transcription is eventually triggered remains unclear. To investigate this, this study focused on the subset of genes that have both off-to-on expression and no evidence for differential interactions (20 genes; differentially expressed with stable loops (DS) genes). Despite changes in their overall expression, DS genes have similar levels of RNA polymerase II (Pol II) promoter occupancy at both time points. The presence of promoter-bound Pol II in the absence of full-length transcription is indicative of Pol II pausing. Using global run-on sequencing (GRO-seq) data to define a stringent set of paused genes, it was observed that most DS genes are paused (15 of 20; 75%), and have a significantly higher pausing index. This percentage is significantly higher than expected by chance when sampling over all off-to-on genes, and is robust to using a strict or more relaxed definition of Pol II pausing. This association is very evident when examining specific loci, showing Pol II occupancy, short abortive transcripts, and loop formation before the gene's expression. Taken together, these results indicate that 'stable' chromatin loops are associated with the presence of paused Pol II at the promoter (Ghavi-Helm, 2014).
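The enrichment statement above (15 of 20 DS genes paused, more than expected when sampling over all off-to-on genes) can be illustrated with a hypergeometric upper-tail probability, used here as an analytic analogue of random sampling. This is an assumption about the form of the test, not the study's documented procedure. The background numbers below (200 off-to-on genes, 80 of them paused) are hypothetical placeholders, since the source does not report them.

```python
from math import comb


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when n genes are drawn without replacement from N genes,
    K of which are paused."""
    total = comb(N, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i)
                     for i in range(k, min(n, K) + 1))
    return favourable / total


# From the text: 15 of 20 DS genes are paused.
paused_ds, total_ds = 15, 20
print(paused_ds / total_ds)

# HYPOTHETICAL background, for illustration only (not reported in the source).
background_total, background_paused = 200, 80
print(hypergeom_upper_tail(paused_ds, total_ds, background_paused, background_total))
```

With the real background counts substituted, a small tail probability would correspond to the reported enrichment.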

To understand how transcription is ultimately activated, changes in DNase I hypersensitivity at the promoters of DS genes were examined. DNase I hypersensitivity is significantly increased at interacting promoters at the stages when the gene is expressed, suggesting that the recruitment of additional transcription factor(s) later in development might act as the trigger for transcriptional activation (Ghavi-Helm, 2014).

In summary, these data reveal extensive long-range interactions in an organism with a relatively compact genome, including pairs of co-regulated genes contacting common enhancers often at distances greater than 200 kb. Comparing enhancer contacts in different contexts revealed that chromatin interactions are very similar across developmental time points and tissue contexts. Enhancers therefore do not appear to undergo long-range looping de novo at the time of gene expression, but are rather already in close proximity to the promoter they will regulate. Within this 3D topology, highly dynamic and transient contacts would not be visible when averaging over millions of nuclei. As transcription factor binding is sufficient to force loop formation, these results suggest a model where through transcription factor-enhancer occupancy, an enhancer loops towards the promoter and polymerase is recruited, but paused in the majority of cases. The subsequent recruitment of transcription factor(s) or additional enhancers at preformed 3D hubs most likely triggers activation by releasing Pol II pausing. Such preformed topologies could thereby promote rapid activation of transcription. At the same time, as paused promoters can exert enhancer-blocking activity, the presence of paused polymerase within these 3D landscapes could safeguard against premature transcriptional activation, but yet keep the system poised for activation (Ghavi-Helm, 2014).

cis-regulatory analysis of the Drosophila pdm locus reveals a diversity of neural enhancers

One of the major challenges in developmental biology is to understand the regulatory events that generate neuronal diversity. During Drosophila embryonic neural lineage development, cellular temporal identity is established in part by a transcription factor (TF) regulatory network that mediates a cascade of cellular identity decisions. Two of the regulators essential to this network are the POU-domain TFs Nubbin and Pdm-2, encoded by adjacent genes collectively known as pdm. The focus of this study is the discovery and characterization of cis-regulatory DNA that governs their expression. Phylogenetic footprinting analysis of a 125 kb genomic region that spans the pdm locus identified 116 conserved sequence clusters. To determine which of these regions function as cis-regulatory enhancers that regulate the dynamics of pdm gene expression, this study tested each for in vivo enhancer activity during embryonic development and postembryonic neurogenesis. The screen revealed 77 unique enhancers positioned throughout the noncoding region of the pdm locus. Many of these activated neural-specific gene expression during different developmental stages and many drove expression in overlapping patterns. Sequence comparisons of functionally related enhancers that activate overlapping expression patterns revealed that they share conserved elements that can be predictive of enhancer behavior. To facilitate data accessibility, the results of this analysis are catalogued in cisPatterns, an online database of the structure and function of these and other Drosophila enhancers. These studies reveal a diversity of modular enhancers that most likely regulate pdm gene expression during embryonic and adult development, highlighting a high level of temporal and spatial expression specificity. In addition, clusters of functionally related enhancers were discovered throughout the pdm locus. A subset of these enhancers share conserved elements including sequences that correspond to known TF DNA binding sites. Although comparative analysis of the nubbin and pdm-2 encoding sequences indicates that these two genes most likely arose from a duplication event, only partial evidence of sequence duplication between their enhancers was found, suggesting that after the putative duplication their cis-regulatory DNA diverged at a higher rate than their coding sequences (Ross, 2015).

This study found 41 enhancers that directed embryonic expression, an overlapping set of 46 that activated larval expression, and another overlapping set of 46 that activated expression in the adult CNS. While many of these enhancers were activated only in the nervous system, a subset activated reporter gene expression outside of the nervous system, including in larval appendages and in the trachea. Roughly a third of the tested CSCs did not exhibit any detectable cis-regulatory activity in the nervous system. Since this study focused on identifying neural enhancers, the possibility exists that some or all of the CSCs that lack neural activity may regulate gene expression in larval and adult tissues that were not examined (Ross, 2015).

There are other online resources of documented enhancers in the Drosophila genome, namely, FlyLight and Vienna Tiles. While these cis-regulatory libraries provide useful information, the coverage of the pdm locus in these databases is not complete. For example, FlyLight analysis did not detect 14 enhancers that flank the nub transcribed sequence. These include those located upstream of the nub long transcript (nub-12 and nub-13), in its first intron (nub-28), second exon (nub-32a), second intron (nub-32b, nub-32c, nub-33, nub-36, nub-40b, nub-41, nub-42, nub-44, and nub-45a), and third intron (nub-49b). The FlyLight library also does not include seven pdm-2 enhancers: one located in the upstream region (pdm2-21), two within the second intron (pdm2-27 and pdm2-28), and four in the downstream region, for which the library lacks information (pdm2-45, pdm2-46, pdm2-47 and pdm2-48). Vienna Tiles also provides only partial coverage of the pdm locus, omitting the following 11 pdm locus enhancers: nub-58a, nub-58b, pdm2-13, pdm2-17, pdm2-21, pdm2-22, pdm2-23a, pdm2-31b, pdm2-32, pdm-33, and pdm2-48. While the Vienna Tiles database provides information on embryonic and adult enhancers, it does not supply information on cis-regulatory activity during larval development. In addition, based on the current analysis, most of the reporter transgenes in these two libraries contain multiple enhancers. For example, the Vienna Tiles enhancer denoted VT6436 is made up of two embryonic enhancers (nub-28 and nub-29) (Ross, 2015).

Analysis of the pdm locus enhancers identified four functionally related enhancers (nub-46, nub-49b, pdm2-34, and pdm2-37a) that activated expression during NB lineage development. The nub-46 and pdm2-34 enhancers are both located in the third intron of the nub and pdm-2 long transcript, respectively, whereas nub-49b and pdm2-37a are positioned immediately 5' to the transcriptional start site of their respective short isoform. While the nub-46 and pdm2-34 enhancers drove overlapping but nonidentical expression during embryonic and larval NB lineage development, nub-49b and pdm2-37a regulated similar expression patterns during postembryonic NB lineage development. Analysis of nub-46 and pdm2-34 revealed that these enhancers share multiple conserved DNA elements, albeit in largely unique configurations. Although these observations suggest these enhancers are related, additional studies are needed to further resolve subtle differences between their regulatory activities (Ross, 2015).

Comparative analysis of the nub and pdm-2 coding sequences revealed that their sequence relationship was mostly limited to the exons that encode their POU domains and homeodomains. In contrast, no evidence of collinearity was detected within their noncoding regions, suggesting that they have diverged at a faster rate than the coding sequences. Only one pdm ortholog was found in the mosquito, whereas the medfly and housefly carry both genes. Given this observation and accounting for the divergence of Drosophila from these distant Diptera, the pdm duplication event may have occurred in the Dipteran line between 100 and 260 million years ago (Ross, 2015).

Given the presence of the pdm genes in the medfly and housefly genomes, it was asked whether some or all of the Drosophila CSCs could also be identified in these distant species. Submitting the D. melanogaster genomic sequences surrounding nub and pdm-2 to BLAST searches using the medfly and housefly genomes revealed sequences conserved in the three Dipteran species within several pdm locus CSCs (see Three-way alignment of ultraconserved sequences in conserved sequence clusters identified in Drosophila, housefly, and medfly) that were typically found within their longest conserved sequence blocks (CSBs). For example, a 48 bp sequence within the pdm2-26 CSC is conserved in all drosophilids, in addition to the medfly and housefly (see The pdm2-26 enhancer contains ultraconserved sequences detected in multiple Diptera) (Ross, 2015).

These studies revealed that two-thirds of the CSCs function as cis-regulatory enhancers that regulate gene expression in a diverse array of spatiotemporal aspects, which taken together reflect pdm expression domains. These observations suggest that the pdm genes are dynamically regulated by multiple cis-regulatory modules, and that these enhancers are more amenable to evolutionary restructuring than their protein encoding exons. This is in agreement with recent reviews on the evolution of Dipteran enhancers highlighting the flexibility of enhancers to maintain their function after loss and/or gain of TF DNA binding sites. Also consistent with these observations, functionally related enhancers were found within the pdm locus that share conserved sequences, albeit in different arrangements and orientations (Ross, 2015).

From a mechanistic perspective, these observations suggest that enhancer behavior can be predicted based on the combination of the conserved elements shared among functionally related enhancers. Similar observations have been made by others. Hierarchical clustering analysis of shared conserved sequences revealed that pdm SOG enhancers may be grouped based on shared elements that are for the most part not present within other pdm locus CSCs. A similar analysis of adult median neurosecretory cell (mNSC) enhancers revealed that they grouped together, as evidenced by sharing of conserved sequence elements, which were largely absent in non-mNSC CSCs within the pdm locus. While further work is required to determine whether these shared elements are important for enhancer activity, these findings suggest a level of structural complexity in the presence and clustering of enhancers that requires further analysis. To construct a better representation of enhancer structure and thus cis-regulatory prediction, one would ideally prefer to use a larger training set of enhancers to improve the accuracy of prediction. These approaches will be addressed in future studies (Ross, 2015).
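The idea of grouping enhancers by the conserved elements they share can be sketched with a small agglomerative clustering over Jaccard distances. This is an illustration of the general technique only: the distance measure, the average-linkage rule, the stopping threshold, and the enhancer and element names are all assumptions, not the procedure or data of Ross (2015).

```python
from itertools import combinations


def jaccard_distance(a, b):
    """1 - (shared elements / all elements) for two sets of element identifiers."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def agglomerate(elements_by_enhancer, max_distance=0.5):
    """Average-linkage clustering; stops once the closest pair of clusters
    is farther apart than max_distance."""
    clusters = [[name] for name in sorted(elements_by_enhancer)]

    def linkage(c1, c2):
        pairs = [(x, y) for x in c1 for y in c2]
        return sum(jaccard_distance(elements_by_enhancer[x], elements_by_enhancer[y])
                   for x, y in pairs) / len(pairs)

    while len(clusters) > 1:
        (i, j), d = min(
            (((i, j), linkage(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if d > max_distance:
            break
        clusters[i] = sorted(clusters[i] + clusters[j])
        del clusters[j]  # j > i, so index i is unaffected
    return sorted(clusters)


# Hypothetical enhancers and conserved-element identifiers.
toy_enhancers = {
    "enhA": {"e1", "e2", "e3"},
    "enhB": {"e1", "e2", "e4"},
    "enhC": {"e7", "e8"},
    "enhD": {"e7", "e8", "e9"},
}
print(agglomerate(toy_enhancers))
```

In the toy input, enhancers sharing most of their elements fall into the same group, while groups with no elements in common remain separate.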

One of the principal findings of this study is the discovery of 77 enhancers that exhibit a remarkably diverse range of cis-regulatory activities during embryonic and postembryonic development. The biological significance of this enhancer diversity most likely reflects the diversity of the developmental programs in which these transcription factors participate. Functionally related enhancers that share multiple conserved DNA sequences were also identified, and these enhancers could be classified using hierarchical clustering techniques. In addition, this analysis has revealed that the collinearity between the pdm genes is predominantly confined to their POU domain and homeodomain exons, suggesting that their noncoding sequences are diverging at a faster rate than their coding sequences. These results should provide further insight into the regulatory logic that controls cis-regulatory function and thus gene regulation (Ross, 2015).

Transcriptional Regulation

The closely linked POU domain genes pdm-1 and pdm-2 are first expressed early during cellularization in the presumptive abdomen in a broad domain that soon resolves into two stripes. This expression pattern is regulated by the same mechanisms that define gap gene expression domains. The borders of pdm-1 expression are set by the terminal system genes torso and tailless, and the gradient morphogen encoded by hunchback. The resolution into two stripes is controlled by the gap gene knirps. Ectopic expression of pdm-1 at the cellular blastoderm stage leads to disruptions in pair rule gene expression and in anterior segmentation. The broad abdominal domain of pdm-1 protein is lacking in nanos- mutant embryos, and ectopic pdm-1 expression in nanos- embryos leads to a partial restoration of abdominal segmentation (Cockerill, 1993).

To determine whether Castor is a pdm repressor, Pdm-1 and Pdm-2 expression were analyzed in cas null embryos. In stage 9 and in younger embryos, no differences were detected between the cas- and wild-type expression patterns of Pdm-1 or Pdm-2. However, starting at stage 10, NBs fail to terminate expression of both Pdms. Ectopic Pdm expression is observed in most, if not all, late developing sublineages in all CNS ganglia. The sustained Pdm expression is most likely due to transcriptional derepression. In a cas- background, transgenes bearing a pdm-1 proximal promoter fragment are ectopically expressed in NBs during late sublineage development. This result demonstrates that the enhancer(s) within the 6.3 kb regulatory DNA are negatively regulated by Cas. Binding studies reveal that Cas can bind to the same DNA sites as Hb, raising the possibility that it modulates transcriptional activities of genes also regulated by Hb. DNA sequence analysis of Cas fragments reveals 32 potential DNA-binding sites, all sharing at least 8 out of the 10 bp with the Hb consensus sites. Cas is shown to be able to bind to these sites. These results suggest that Hb and Cas regulate pdm expression by interacting directly with their cis-regulators to deactivate controlling enhancer(s), with Hb repressing the pdm genes early in CNS development and Cas silencing them late in CNS development (Kambadur, 1998).
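The "at least 8 out of 10 bp" criterion for candidate binding sites can be sketched as a simple sliding-window scan. This is an illustration only: the 10-mer below is an arbitrary placeholder, not the Hb consensus used by Kambadur (1998), the toy sequence is invented, and the function names are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def count_matches(window, motif):
    """Count positions at which the window and the motif agree."""
    return sum(1 for w, m in zip(window, motif) if w == m)


def find_candidate_sites(sequence, consensus, min_matches=8):
    """Scan both strands; report (start, strand, score) for every window
    that matches the consensus at `min_matches` or more positions."""
    sequence = sequence.upper()
    k = len(consensus)
    hits = []
    for strand, motif in (("+", consensus), ("-", reverse_complement(consensus))):
        for start in range(len(sequence) - k + 1):
            score = count_matches(sequence[start:start + k], motif)
            if score >= min_matches:
                hits.append((start, strand, score))
    return sorted(hits)


# Arbitrary placeholder motif (assumption, NOT the real Hb consensus).
PLACEHOLDER_CONSENSUS = "GATTACAGTC"

# Invented sequence with one perfect site (offset 4) and one site
# carrying two mismatches (offset 18).
TOY_SEQUENCE = "TTTT" + "GATTACAGTC" + "TTTT" + "GATTACATTA" + "TTTT"

hits = find_candidate_sites(TOY_SEQUENCE, PLACEHOLDER_CONSENSUS)
print(hits)
```

Both planted sites are reported on the forward strand. A real analysis would also have to handle degenerate consensus positions, which this sketch ignores.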

The bipotential ganglion mother cells, or GMCs, in the Drosophila CNS asymmetrically divide to generate two distinct post-mitotic neurons. The midline repellent Slit (Sli), via its receptor Roundabout (Robo), promotes the terminal asymmetric division of GMCs. In GMC-1 of the RP2/sib lineage, Slit promotes asymmetric division by down regulating two POU proteins, Nubbin and Mitimere. The down regulation of these proteins allows the asymmetric localization of Inscuteable, leading to the asymmetric division of GMC-1. Consistent with this, over-expression of these POU genes in a late GMC-1 causes mis-localization of Insc and symmetric division of GMC-1 to generate two RP2s. Similarly, increasing the dosage of the two POU genes in sli mutant background enhances the penetrance of the RP2 lineage defects whereas reducing the dosage of the two genes reduces the penetrance of the phenotype. These results tie a cell-non-autonomous signaling pathway to the asymmetric division of precursor cells during neurogenesis (Mehta, 2001).

Since previous results tie the two POU genes, miti and nub, to the normal elaboration of the GMC-1->RP2/sib lineage, the expression of these genes was examined in sli mutant embryos. In wild type, the levels of Nub (or Miti), which are normally high in a newly formed GMC-1, are down regulated prior to the asymmetric division of GMC-1. In sli mutants, the expression of Nub (or Miti) in a newly formed GMC-1 is comparable to that of wild type, but in a late GMC-1 the level remains high compared to wild type. A brief ectopic expression of these POU genes from the hsp70 promoter prior to GMC-1 division induces GMC-1 to divide symmetrically to generate two GMC-1s; each then divides asymmetrically to generate an RP2 and a sib. If the symmetric division of GMC-1 in these mutants is caused by the lack of down regulation of Nub and Miti in GMC-1, ectopic expression of miti or nub should also induce GMC-1 to divide symmetrically to generate two RP2 neurons. Indeed, a brief over-expression of miti (or nub) in a late GMC-1 causes this GMC to divide symmetrically into two RP2 neurons in 27% of the hemisegments (Mehta, 2001).

The loss-of-function effects of sli on the distribution of Insc in GMC-1 (and thus the symmetrical division of GMC-1) could be due to this lack of down regulation of Miti and Nub in GMC-1. To test this possibility, the miti transgene was ectopically expressed from the hsp70 promoter. A 25-minute induction of miti was sufficient to alter the localization of Insc and the distribution of Insc in these embryos resembled the distribution of Insc in sli embryos (Mehta, 2001).

The penetrance of the symmetrical division phenotype in sim mutants is sensitive to the dosage of the nub and miti genes. The penetrance of the symmetric division of GMC-1 phenotype in sli and sim mutants is ~10%, indicating a partial genetic redundancy for this pathway. Since the loss of asymmetric division of GMC-1 in sli or sim appears to be due to a failure in the down regulation of Nub and Miti, it was reasoned that the penetrance of the phenotype might be enhanced by increasing the copy number of these POU genes in a sli or sim background. Using a duplication for nub and miti, embryos were examined for the GMC-1 division phenotype. The penetrance of the phenotype in these embryos was enhanced to 42%. Similarly, halving the copy number of the two POU genes in a sim background suppresses the phenotype to 1.4% (Mehta, 2001).

The above results indicate that the symmetrical division of GMC-1 in sli mutants is due to the up regulation of the two POU genes and that these two POU genes are the targets of Sli signaling in GMC-1; however, the partial penetrance of these phenotypes in sli mutants indicates that additional pathways also mediate this process and regulate the levels of the two POU proteins in GMC-1. Since the penetrance in insc mutants is also partial, additional pathways must exist to mediate the asymmetric division of GMC-1 and partially compensate for the loss of the Insc/Sli pathway (Mehta, 2001).

The following picture emerges from this study. The Sli-Robo signaling down regulates the levels of Nub and Miti in late GMC-1, allowing the asymmetric localization of Insc and the asymmetric division of GMC-1. The possibility is entertained that loss of sibling cells in sli mutants would mean that some projections will be duplicated, while others are eliminated. Depending upon the extent, this might have an overall bearing on the pathfinding defects in sli mutants. Since Sli signaling is conserved in vertebrates, it is possible that this signaling may regulate generation of asymmetry during vertebrate neurogenesis as well (Mehta, 2001).

Recombineering Hunchback identifies two conserved domains required to maintain neuroblast competence and specify early-born neuronal identity

The Hunchback/Ikaros family of zinc-finger transcription factors is essential for specifying the anterior/posterior body axis in insects, the fate of early-born pioneer neurons in Drosophila, and for retinal and immune development in mammals. Hunchback/Ikaros proteins can directly activate or repress target gene transcription during early insect development, but their mode of action during neural development is unknown. This study used recombineering to generate a series of Hunchback domain deletion variants and assay their function during neurogenesis in the absence of endogenous Hunchback. Previous studies have shown that Hunchback can specify early-born neuronal identity and maintain 'young' neural progenitor (neuroblast) competence. Two conserved domains required for Hunchback-mediated transcriptional repression were identified; transcriptional repression is necessary and sufficient to induce early-born neuronal identity and maintain neuroblast competence. pdm2 was identified as a direct target gene that must be repressed to maintain competence, but additional genes must also be repressed. It is proposed that Hunchback maintains early neuroblast competence by silencing a suite of late-expressed genes (Tran, 2010).

Hb acts as an activator and repressor of gene expression in the CNS, but only its transcriptional repressor function is essential for maintaining neuroblast competence and specifying early-born neuronal identity. Two repression domains within the Hb protein were identified: the Mi2-binding D domain and the dimerization (DMZ) domain (Tran, 2010).

How do the D and DMZ domains repress gene expression? It is interesting to note that the D and DMZ domains are not dedicated repression domains, such as the one found in Engrailed. Instead, both are known to mediate protein-protein interactions. The DMZ allows Hb dimerization, leading to the proposal that high Hb levels promote dimerization and thus transcriptional repression (Papatsenko, 2008). For example, at cellular blastoderm stages, high levels of Hb in the anterior of the embryo are required to repress Kr, whereas low Hb levels activate Kr, and mutations in the DMZ lead to an anterior expansion of the Kr expression domain (Hulskamp, 1994). Yet it remains unknown how Hb dimerization leads to gene repression. The D domain is also involved in protein-protein interactions. The region of Hb containing the D domain is known to bind the chromatin regulator Mi2, and this interaction promotes epigenetic silencing of the Hb target gene Ubx during early embryonic patterning. The current results suggest that the D and DMZ domains could act in distinct processes that are both required for transcriptional repression, or that they could act in a common pathway such as dimerization-dependent recruitment of Mi2 and/or other repressor proteins to the D domain (Tran, 2010).

Hb proteins lacking the D or DMZ domain have very similar phenotypes in the CNS. Although both the D and DMZ domains appear to be required for Hb-mediated transcriptional repression, they do not have identical functions. Overexpression of Hb^ΔD leads to the specification of two U5 neurons at the expense of the U4 cell identity, whereas overexpression of Hb^ΔDMZ results in normal U4 and U5 identities. Perhaps Hb^ΔDMZ retains some ability to repress cas expression, allowing the production of the Cas^- U4 identity. Alternatively, Hb might use the D and DMZ domains to repress different target genes. Currently, it is not possible to distinguish between these models owing to the limited number of known Hb direct target genes (Tran, 2010).

Both Hb and the related mammalian protein Ik have major roles as transcriptional repressors, but are also weak transcriptional activators. How does Hb activate gene expression within the CNS? It was not possible to identify a discrete activation domain despite the fact that the systematic deletion series covered the entire protein. It can be ruled out that the activation domain maps to the D region, as it does in the closely related Ik protein, because the Hb^ΔD protein has no effect on Kr transcriptional activation or the specification of U3 neuronal identity. The presence of a single activation domain within the A, B, B', E or DMZ domains can also be ruled out for the same reason. Mechanisms for Hb-mediated transcriptional activation consistent with these data are: (1) Hb activates transcription indirectly by blocking DNA binding of a repressor; (2) Hb has multiple activation domains; or (3) the Hb activation domain is tightly linked to an essential domain, such as the DBD. In any case, VP16::Hb experiments, together with repression domain deletion experiments, show that Hb-mediated transcriptional repression, not transcriptional activation, is essential for maintaining neuroblast competence and specifying early-born neuronal identity (Tran, 2010).

What are the Hb-repressed target genes that are involved in extending neuroblast competence? One negatively regulated target is pdm, as co-expression of Pdm with wild-type Hb failed to extend neuroblast competence. However, overexpression of VP16::Hb in a pdm mutant background (lacking both pdm1 and pdm2) was incapable of extending neuroblast competence, showing that Hb must repress multiple genes to extend competence. In the future, further characterization of Hb function in the CNS will require genomic analyses, such as chromatin immunoprecipitation to identify Hb binding sites within the genome, or TU-tagging experiments to identify all the genes regulated by Hb within the CNS. Such comparative analyses might help to elucidate the complex gene interactions involved in regulating neuroblast competence (Tran, 2010).

Targets of Activity

When pdm-1 and pdm-2 functions are removed, there is no even-skipped expression in GMC4-2a or its progeny, the RP2 motor neuron (Yeo, 1995).

DEVELOPMENTAL BIOLOGY

Embryonic

Pdm-2 is first detected in the cellular blastoderm stage, as two bands, each 8-10 cells wide, in the primordia of the abdominal segment, and in the head region in the anlage of the clypeolabium. Pdm-2 exhibits a striped pattern at germ band extension, where its pattern overlaps that of ftz. Later it is expressed in a subset of CNS and PNS cells (Lloyd, 1991).

A comparison of Drosophila virilis and Drosophila melanogaster expression patterns finds that virtually all aspects of pdm-2 expression are conserved: expression in a gap gene-like posterior domain, expression in ectodermal stripes during germ band extension, broad expression in the neuroectoderm followed by limitation to discrete subsets of CNS cells, and expression in specific PNS neurons and support cells (Poole, 1995). Thus, very specific neuroblasts express pdm-2. pdm-1 and pdm-2 are expressed in the early part of the NB4-2 lineage, from which the RP2 motor neuron is derived. These genes are not required for the birth of the first ganglion mother cell (GMC4-2a) but both are involved in specifying its identity (Yeo, 1995). pdm-2 plays a more dominant role in the process (Yeo, 1995).

Effects of mutation or deletion

The pdm-2 gene product specifies the cell fate of GMC-1 in the NB4-2 lineage. Its ectopic expression in the two progeny cells of GMC-1 is sufficient to cause both cells to adopt a GMC-1 cell identity (Yang, 1993).

Embryos carrying excessive pdm-2 show deletions in abdominal segment 2, and occasionally in segment 6. The head is also deleted at times. Low levels of pdm-2 produce similar phenotypes (Bhat, 1994).

The role of pdm1 has been investigated during the elaboration of the GMC-1-->RP2/sib lineage. Also studied in this lineage was the functional relationship between pdm1 and pdm2. Deletion of pdm1 causes a partially penetrant GMC-1 defect, while deletion of both pdm2 and pdm1 results in a fully penetrant defect. This GMC-1 defect in pdm2 and pdm1 mutant embryos can be rescued by the pdm1 or pdm2 transgene. Rescue is observed only when these genes are expressed at the time of GMC-1 formation. Overexpression of pdm1 or pdm2 well after GMC-1 is formed results in the duplication of RP2 and/or sib cells. These results indicate that both genes are required for the normal development of this lineage and that the two collaborate during the specification of GMC-1 identity (Bhat, 1995).

REFERENCES

Bach, I., et al. (1995). P-Lim, a LIM homeodomain factor, is expressed during pituitary organ and cell commitment and synergizes with Pit-1. Proc. Natl. Acad. Sci. 92: 2720-2724. PubMed Citation: 7708713

Berman, B. P., et al. (2004). Computational identification of developmental enhancers: conservation and function of transcription factor binding-site clusters in Drosophila melanogaster and Drosophila pseudoobscura. Genome Biol. 5(9): R61. PubMed Citation: 15345045

Bhat, K. and Schedl, P. (1994). The Drosophila miti-mere gene, a member of the POU family is required for the specification of the RP2/sibling lineage during neurogenesis. Development 120: 1483-1501. PubMed Citation: 8050358

Bhat, K. M., Poole, S. J. and Schedl, P. (1995). The miti-mere and pdm1 genes collaborate during specification of the RP2/sib lineage in Drosophila neurogenesis. Mol Cell Biol 15: 4052-4063. PubMed Citation: 7623801

Bhat, K. M. and Apsel, N. (2004). Upregulation of Mitimere and Nubbin acts through Cyclin E to confer self-renewing asymmetric division potential to neural precursor cells. Development 131: 1123-1134. PubMed Citation: 14973280

Cockerill, K.A., Billin, A.N. and Poole, S.J. (1993). Regulation of expression domains and effects of ectopic expression reveal gap gene-like properties of the linked pdm genes of Drosophila. Mech Dev. 41: 139-153. PubMed Citation: 8518192

Coenjaerts, F. E., van Oosterhout, J. A. and van der Vliet, P. C. (1994). The Oct-1 POU domain stimulates adenovirus DNA replication by a direct interaction between the viral precursor terminal protein-DNA polymerase complex and the POU homeodomain. EMBO J. 13: 5401-9. PubMed Citation: 7957106

Collarini, E. J., Kuhn, R., Marshall, C. J., Monuki, E. S., Lemke, G. and Richardson, W. D. (1992). Down-regulation of the POU transcription factor SCIP is an early event in oligodendrocyte differentiation in vivo. Development 116: 193-200. PubMed Citation: 1483387

Ghavi-Helm, Y., Klein, F. A., Pakozdi, T., Ciglar, L., Noordermeer, D., Huber, W. and Furlong, E. E. (2014). Enhancer loops appear stable during development and are associated with paused polymerase. Nature 512(7512): 96-100. PubMed ID: 25043061

Herr, W. and Cleary, M. A. (1995). The POU domain: versatility in transcriptional regulation by a flexible two-in-one DNA-binding domain. Genes Dev. 9:1679-93. PubMed Citation: 7622033

Hulskamp, M., et al. (1994). Differential regulation of target genes by different alleles of the segmentation gene hunchback in Drosophila. Genetics 138: 125-134. PubMed Citation: 8001780

Howard, P. W. and Maurer, R. R. (1995). A composite Ets/Pit-1 binding site in the prolactin gene can mediate transcriptional responses to multiple signal transduction pathways. J. Biol. Chem. 270: 20930-20936. PubMed Citation: 7673116

Kambadur, R., Koizumi, K., Stivers, C., Nagle, J., Poole, S. J. and Odenwald, W. F. (1998). Regulation of POU genes by castor and hunchback establishes layered compartments in the Drosophila CNS. Genes Dev. 12(2): 246-60. PubMed Citation: 9436984

Lichtsteiner, S. and Tjian, R. (1995). Synergistic activation of transcription by UNC-86 and MEC-3 in Caenorhabditis elegans embryo extracts. EMBO J 14: 3937-3945. PubMed Citation: 7664734

Lloyd, A. and Sakonju, S. (1991). Characterization of two Drosophila POU domain genes related to oct1 and oct2, and the regulation of their expression pattern. Mech Dev 36: 87-102. PubMed Citation: 1685891

Mehta, B. and Bhat, K. M. (2001). Slit signaling promotes the terminal asymmetric division of neural precursor cells in the Drosophila CNS. Development 128: 3161-3168. PubMed Citation: 11688564

Papatsenko, D. and Levine, M. S. (2008). Dual regulation by the Hunchback gradient in the Drosophila embryo. Proc. Natl. Acad. Sci. 105: 2901-2906. PubMed Citation: 18287046

Poole, S. J. (1995). Conservation of complex expression domains of the pdm-2 POU domain gene between Drosophila virilis and Drosophila melanogaster. Mech. Dev 49: 107-116. PubMed Citation: 7748782

Rosner, M. H., Vigano, M. A., Ozato, K., Timmons, P. M., Poirier, F., Rigby, P. W. J. and Staudt, L. M. (1990). A POU-domain transcription factor in early stem cells and germ cells of mammalian embryo. Nature 345: 686-692. PubMed Citation: 1972777

Ross, J., Kuzin, A., Brody, T. and Odenwald, W. F. (2015). cis-regulatory analysis of the Drosophila pdm locus reveals a diversity of neural enhancers. BMC Genomics 16: 700. PubMed ID: 26377945

Tran, K. D., Miller, M. R. and Doe, C. Q. (2010). Recombineering Hunchback identifies two conserved domains required to maintain neuroblast competence and specify early-born neuronal identity. Development 137(9): 1421-30. PubMed Citation: 20335359

Treacy, M. N., He, X. and Rosenfeld, M. G. (1991). I-POU: a POU-domain protein that inhibits neuron-specific gene activation. Nature 350: 577-584. PubMed Citation: 1673230

Verrijzer, C. P. and Van der Vliet, P. C. (1993). POU domain transcription factors. Biochim. Biophys. Acta 1173: 1-21. PubMed Citation: 8485147

Yang, X. et al. (1993). The role of a Drosophila POU homeodomain gene in the specification of neural precursor cell identity in the developing embryonic central nervous system. Genes Dev. 7: 504-516. PubMed Citation: 8095484

Yeo, S.L., Lloyd, A., Kozak, K., Dinh, A., Dick, T., Yang, X., Sakonju, S. and Chia, W. (1995). On the functional overlap between two Drosophila POU homeo domain genes and the cell fate specification of a CNS neural precursor. Genes Dev. 9: 1223-1236. PubMed Citation: 7758947

Zwilling, S., Annweiler, A. and Wirth, T. (1994). The POU domains of the Oct1 and Oct2 transcription factors mediate specific interaction with TBP. Nucleic Acids Res 22: 1655-62. PubMed Citation: 8202368

Zwilling, S., Konig, H. and Wirth, T. (1995). High mobility group protein 2 functionally interacts with the POU domains of octamer transcription factors. EMBO J. 14: 1198-1208. PubMed Citation: 7720710

date revised: 15 November 2015

The Interactive Fly resides on the
Society for Developmental Biology's Web server.