always early: Biological Overview | Evolutionary Homologs | Regulation | Developmental Biology | Effects of Mutation | References
Gene name - always early

Synonyms -

Cytological map position - 63A3--4

Function - regulator of chromatin structure

Keywords - chromatin architecture, spermatid development, spermatogenesis, male meiosis

Symbol - aly

FlyBase ID: FBgn0004372

Genetic map position - 3-4.4

Classification - lin-9 homolog

Cellular location - nuclear and cytoplasmic



Recent literature
Romanov, S. E., Shloma, V. V., Koryakov, D. E., Belyakin, S. N. and Laktionov, P. P. (2023). Insulator Protein CP190 Regulates Expression of Spermatocyte Differentiation Genes in Drosophila melanogaster Male Germline. Mol Biol (Mosk) 57(1): 109-123. PubMed ID: 36976746
Summary:
CP190 protein is one of the key components of Drosophila insulator complexes, and its study is important for understanding the mechanisms of gene regulation during cell differentiation. However, Cp190 mutants die before reaching adulthood, which significantly complicates the study of its functions in the imago. To overcome this problem and to investigate the regulatory effects of CP190 on adult tissue development, a conditional rescue system was designed for Cp190 mutants. Using Cre/loxP-mediated recombination, the rescue construct containing the Cp190 coding sequence is effectively eliminated specifically in spermatocytes, allowing study of the effect of the mutation in male germ cells. Using high-throughput transcriptome analysis, the effect of CP190 on gene expression was determined in germline cells. Cp190 mutation was found to have opposite effects on tissue-specific genes, whose expression is repressed by CP190, and housekeeping genes, which require CP190 for activation. Mutation of Cp190 also promoted expression of a set of spermatocyte differentiation genes that are regulated by the tMAC transcriptional complex. These results indicate that the main function of CP190 in the process of spermatogenesis is the coordination of interactions between differentiation genes and their specific transcriptional activators.
BIOLOGICAL OVERVIEW

In spermatogenesis, a major transition occurs as the mitotically amplifying population of spermatogonia ceases mitosis and develops into primary spermatocytes. These primary spermatocytes become committed to undergoing the meiotic divisions, and then differentiating into spermatozoa. This change in cell behavior is associated with a dramatic switch in the transcript profile: some genes are downregulated and many are upregulated or switched on for the first time. The 'meiotic arrest' genes of Drosophila are crucial for regulating transcription in primary spermatocytes. The Drosophila always early (aly) gene is involved in this switch in spermatocyte transcriptional regulation. aly coordinately regulates meiotic cell cycle progression and terminal differentiation during male gametogenesis. aly is required for transcription of key G2-M cell cycle control genes and of spermatid differentiation genes, and for maintenance of normal chromatin structure in primary spermatocytes. aly encodes a homolog of the C. elegans gene lin-9, a negative regulator of vulval development that acts in the same SynMuvB genetic pathway as the LIN-35 Rb-like protein. The aly gene family is conserved from plants to humans. Aly protein is both cytoplasmic and nuclear in early primary spermatocytes, then resolves to a chromatin-associated pattern. It remains cytoplasmic in a loss-of-function missense allele, suggesting that nuclear localization is critical for Aly function, and that other factors may alter Aly activity by controlling its subcellular localization. MAPK activation occurs normally in aly mutant testes. Therefore aly, and by inference lin-9, acts in parallel to, or downstream of, activation of MAPK by the RTK-Ras signalling pathway (White-Cooper, 2000).

aly appears to play a crucial role at the head of the pathway controlling transcription of both cell cycle and differentiation genes. aly is required for expression of both cyclin B and twine in primary spermatocytes. In addition to regulating the expression of cell cycle genes, aly also regulates spermatid differentiation by controlling transcription in primary spermatocytes of a suite of spermatid differentiation genes (White-Cooper, 1998). The behavior of Aly protein, its homology to the C. elegans SynMuvB gene lin-9, and the abnormal chromatin structure in aly mutant spermatocytes suggest that aly may control transcription of cell cycle and terminal differentiation genes by regulating a chromatin remodelling complex (White-Cooper, 2000).

Numerous genes require the activity of the meiotic arrest genes for their transcription. aly is different from the other meiotic arrest genes cannonball (can), meiosis I arrest (mia) and spermatocyte arrest (sa), in that it is required for the transcriptional activation of more target genes, and for normal chromosome structure. Based on this, it is proposed that aly acts upstream of the can-class genes in primary spermatocytes. cookie monster (comr) is a novel meiotic arrest gene. comr mutant spermatocytes fail to transcribe twine and Male-specific RNA 87F (mst87F) as well as many other target genes, and the cells arrest with abnormal chromatin morphology. The nuclear localizations of the Aly and Comr proteins are mutually dependent. The demonstration that the comr mutant phenotype is indistinguishable from that of aly supports the segregation of the meiotic arrest genes into aly and can classes. The chromosome morphology defect seen in both aly and comr mutant lines supports the idea that the pathway in which they act has a role in the maintenance of normal chromatin structure (Jiang, 2003).

The aly gene and its homologs act at the intersection of tumor suppressor, cell cycle control and terminal differentiation pathways. The Aly protein of Drosophila regulates both male meiotic cell cycle progression and the terminal differentiation program of spermiogenesis by activating the transcription of genes required for both processes. Germ cells in aly mutant testes fail to progress beyond the mature primary spermatocyte stage, owing to lack of both key cell cycle transcripts required to enter the meiotic divisions, and of transcripts for proteins involved in the morphological changes of spermatid differentiation. aly is expressed in primary spermatocytes, the cells that show defects in aly mutants, suggesting a cell autonomous function. Aly protein is localized to the nucleus of maturing primary spermatocytes, where it appears to be associated with chromatin. Mutations in aly cause defects in the appearance of primary spermatocyte chromosomes, consistent with a role for aly in chromatin structure (Lin, 1996; White-Cooper, 2000).

The C. elegans homolog of aly, lin-9, acts in a pathway with the Rb tumor suppressor protein LIN-35 to antagonize RTK-Ras-MAPK signalling during vulval development, influencing the choice of terminal differentiation pathway. Vulval formation in C. elegans is controlled via an inductive signalling pathway in which the anchor cell of the gonad signals to the overlying ventral ectodermal cells of the vulval equivalence group, P3.p-P8.p. This signal activates an RTK-Ras-MAPK signal transduction pathway in P6.p, causing this cell to adopt a primary vulval cell fate, and to induce its neighbours P5.p and P7.p to adopt a secondary vulval fate. The remaining cells in the equivalence group (P3.p, P4.p and P8.p) do not adopt a vulval cell fate; instead, they become hypodermal. The inductive pathway is antagonized by the SynMuv genes, which fall into two groups that represent two genetically redundant pathways. In animals doubly homozygous mutant for any one of the five SynMuvA genes and any one of the 12 SynMuvB genes, including lin-9, all the cells in the vulval equivalence group adopt an induced cell fate. The SynMuvB pathway has been proposed to repress expression of vulval genes via a complex of LIN-35 (an Rb homolog), LIN-53 (an Rb associated protein) and histone deacetylase, with the RTK-Ras-MAPK signal relieving this repression to activate vulval gene transcription (Lu, 1998). Ras pathway signalling has been shown to result directly in inactivation of Rb after mitogen stimulation in proliferating mammalian tissue culture cells via the interaction of Rb with Raf1 (White-Cooper, 2000 and references therein).

Activation of MAP kinase in the testis is not dependent upon the activity of the lin-9 homolog aly. If aly antagonized a MAPK signalling pathway by preventing the phosphorylation and activation of ERK, mutant testes would be expected to have excess di-phosphorylated, active ERK compared with wild type. No differences were detected in the level of total or active ERK between wild-type and mutant testes, indicating that aly acts at the level of downstream effectors, or in a parallel pathway. This is consistent with the model proposed by Lu (1998) that the role of the SynMuvB genes is to maintain a repressor complex at the promoters of vulval differentiation genes. The activation of MAP kinase in P6.p in response to the anchor cell signal would then lead to relief of this repression and transcription of target genes (White-Cooper, 2000).

Genetic mosaic analysis of mutations in the SynMuvB pathway has suggested that some members act in the hypodermis (lin-15B, lin-37), while others function in the vulval precursor cells (lin-35, lin-36, lin-53). The SynMuvB pathway was therefore proposed to comprise an intercellular signalling pathway from the hypodermis to the vulval precursor cells (reviewed in Kornfeld, 1997). The lineage requirement for lin-9 function in C. elegans has not been tested. The cell autonomous activity of aly suggests that lin-9 will also have a cell autonomous role. In the nucleus, Aly protein could interact with homologs of other cell autonomous, nuclear components of the SynMuvB pathway. Drosophila homologs of lin-35 (RbF), lin-53 (p55 subunit of chromatin assembly factor) and hda (histone deacetylase) have been described, although no Drosophila homolog of lin-36 has yet been identified (White-Cooper, 2000).

How might Aly control transcription? Aly protein contains neither a predicted DNA binding domain nor any domain that matches known transcriptional activators, yet it is required for the transcriptional activation of many target genes in primary spermatocytes. The Aly protein could act as a transcriptional co-activator; a physical interaction between Aly and one or more transcription factors could be responsible for the observed localization of Aly protein to chromatin. Drosophila E2F2, which is transcribed in the testis in primary spermatocytes in a pattern very similar to that of aly, is a candidate aly regulated or associated transcription factor since one role of Rb is to bind to and regulate the transcription factor E2F. Drosophila E2F1 promotes S-phase in embryos and induces PCNA expression in tissue culture cells; under the same conditions Drosophila E2F2 inhibits PCNA expression in tissue culture cells (White-Cooper, 2000).

Mutations in several SynMuvB genes dramatically reduce expression of transgenes in repetitive extrachromosomal arrays in C. elegans, without affecting expression of the endogenous genes or transgenes in non-repetitive arrays. Additionally several SynMuv pathway genes encode components of the NURD nucleosomal remodelling and histone deacetylase complex (Solari, 2000). These results suggest that one function of the SynMuvB genes is to activate transcription of genes contained within specialized chromatin architectures. Although de-acetylation of histones is often thought of in the context of transcriptional repression, the yeast histone deacetylase RPD3, a homolog of the NURD complex histone deacetylase HDAC1, was originally identified as a factor that is required to achieve maximal levels of both transcriptional repression and activation. The Aly protein of Drosophila may recruit or regulate a NURD-like complex on the bivalents in primary spermatocytes. Action of this complex could have a dual effect, reducing expression of genes not part of the terminal differentiation program, while allowing transcription of spermatogenic genes in a specialized chromatin domain. The proposed aly modulated specific chromatin domain could then be a target for a downstream transcription factor. The observation that wild-type function of aly is required for the normal appearance of chromatin in primary spermatocytes (Lin, 1996) is consistent with this proposed role for aly in chromatin structure (White-Cooper, 2000).

Translocation of Aly protein from the cytoplasm to the nucleus may represent an important control point. The protein encoded by the alyz3-1393 allele fails to enter the nucleus, despite the presence of two consensus predicted nuclear localization signals. Failure of alyz3-1393 mutant protein to enter the nucleus could be explained if translocation to the nucleus is inhibited by phosphorylation of Aly protein, in a manner similar to that observed for the cell cycle-regulated nuclear localization of the yeast SWI5 transcription factor. Like Aly, SWI5 contains a bipartite NLS. Phosphorylation of three serine residues close to this NLS prevents the nuclear accumulation of the SWI5 protein. S161 of Aly protein, two residues from the second basic domain of the bipartite NLS, is a good match to the consensus for cAMP-dependent protein kinase. This serine is conserved in all the aly homologs identified, where it lies two residues away from a classical NLS. The defective alyz3-1393 protein has an acidic residue close to the NLS, which may allow it to adopt a conformation mimicking that of the phosphorylated form (White-Cooper, 2000).

The C. elegans SynMuvA and SynMuvB pathways are genetically redundant in vulval development, but not in all tissues. Similarly, although defects have been detected only in the male germline in aly mutant flies (Lin, 1996), aly may function at other stages of development if a genetically redundant pathway is active. This remains a possibility since a very low level of aly message is detected in adult females by RT-PCR. Alternatively, the second Drosophila lin-9 homolog, 86E4.4, could carry out the lin-9-like function at earlier stages of Drosophila development. The conservation of aly in many phyla suggests that a SynMuvB-like pathway may be a conserved feature in many different organisms, ranging from plants to vertebrates. It will be interesting to determine whether the aly protein family functions in mammals to coordinate meiotic divisions with gamete production. Transcription of boule lies downstream of aly function (White-Cooper, 1998). If the mechanisms are conserved, it might be expected that transcription of the boule homologs Daz and Dazl in mammalian spermatogenesis is dependent on aly homologs. The recent discovery in plants of both Rb proteins and other components of the Rb pathway supports the hypothesis that a SynMuvB pathway may have a conserved role in allowing multicellular organisms to evolve complex structures consisting of many different cell types, whose normal development depends on the coupling of cell cycle controls with cellular differentiation (White-Cooper, 2000).

The mechanism by which SynMuv genes control the choice of differentiation pathway is still very poorly understood. The SynMuv genes, including Rb and aly, may not always be involved in negative regulation of RTK-Ras-MAP kinase signalling. Rather, SynMuv pathway genes and aly could play a more general role in regulating differentiation, both repressing transcription of certain genes and being required for activation of others in a specialized chromatin context. In the case of the C. elegans vulval precursor cells, Rb and the SynMuv genes could counteract EGFR pathway signalling by affecting chromatin in the region of EGFR-RAS-MAP kinase target genes. In Drosophila primary spermatocytes, aly and its partners could also affect expression of meiotic cell cycle and terminal differentiation genes via effects on chromatin (White-Cooper, 2000).

Blocking promiscuous activation at cryptic promoters directs cell type-specific gene expression

To selectively express cell type-specific transcripts during development, it is critical to maintain genes required for other lineages in a silent state. This study shows in the Drosophila male germline stem cell lineage that a spermatocyte-specific zinc finger protein, Kumgang (Kmg; CG5204), working with the chromatin remodeler dMi-2 prevents transcription of genes normally expressed only in somatic lineages. By blocking transcription from normally cryptic promoters, Kmg restricts activation by Aly, a component of the testis-meiotic arrest complex, to transcripts for male germ cell differentiation. These results suggest that as new regions of the genome become open for transcription during terminal differentiation, blocking the action of a promiscuous activator on cryptic promoters is a critical mechanism for specifying precise gene activation (Kim, 2017).

Highly specialized cell types such as red blood cells, intestinal epithelium, and spermatozoa are produced throughout life from adult stem cells. In such lineages, mitotically dividing precursors commonly stop proliferation and initiate a cell type-specific transcription program that sets up terminal differentiation of the specialized cell type. In the Drosophila male germ line, stem cells at the apical tip of the testis self-renew and produce daughter cells that each undergo four rounds of spermatogonial mitotic transit amplifying (TA) divisions, after which the germ cells execute a final round of DNA synthesis (premeiotic S-phase) and initiate terminal differentiation as spermatocytes. Transition to the spermatocyte state is accompanied by transcriptional activation of more than 1500 genes, many of which are expressed only in male germ cells. Expression of two-thirds of these depends both on a testis-specific version of the MMB (Myb-Muv B)/dREAM (Drosophila RBF, dE2F2, and dMyb-interacting proteins) complex termed the testis meiotic arrest complex (tMAC) and on testis-specific paralogs of TATA-binding protein-associated factors (tTAFs). Although this is one of the most dramatic changes in gene expression in Drosophila, it is not yet understood how the testis-specific transcripts are selectively activated during the 3-day spermatocyte period (Kim, 2017).

To identify the first transcripts up-regulated at onset of spermatocyte differentiation, germ cells were genetically manipulated to synchronously differentiate from spermatogonia to spermatocytes in vivo using bam-/- testes, which contain large numbers of overproliferating spermatogonia. Brief restoration of Bam expression under heat shock control in hs-bam;bam-/- flies induced synchronous differentiation of bam-/- spermatogonia, resulting in completion of a final mitosis, premeiotic DNA synthesis, and onset of spermatocyte differentiation by 24 hours after Bam expression, eventually leading to production of functional sperm. Comparison by means of microarray of transcripts expressed before versus 24 hours after heat shock of hs-bam;bam-/- testes identified 27 early transcripts that were significantly up-regulated more than twofold in testes from hs-bam;bam-/- but not from bam-/- flies subjected to the same heat shock regime. Among these was the early spermatocyte marker RNA binding protein 4 (Rbp4). At this early time point, the transcript for CG5204, now named kumgang (kmg) after the Korean name of mythological guardians at the gate of Buddhist temples, had the greatest increase among all 754 Drosophila predicted transcription factors (Kim, 2017).
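The selection logic described above (up more than twofold after heat shock in hs-bam;bam-/- testes, but not in the bam-/- control given the same heat shock) can be sketched in a few lines of Python. This is only an illustrative sketch: the gene names and expression values are hypothetical placeholders, the pseudocount is an assumption added to avoid division by zero, and the significance testing applied in the study is not reproduced here.

```python
def fold_change(before: float, after: float, pseudo: float = 1.0) -> float:
    """Ratio of expression after vs. before heat shock (pseudocount is an assumption)."""
    return (after + pseudo) / (before + pseudo)


def early_transcripts(hs_bam: dict, bam_ctrl: dict, threshold: float = 2.0) -> list:
    """Genes up more than `threshold`-fold in hs-bam;bam-/- but not in the bam-/- control."""
    selected = []
    for gene, (before, after) in hs_bam.items():
        ctrl_before, ctrl_after = bam_ctrl[gene]
        up_in_rescue = fold_change(before, after) > threshold
        up_in_ctrl = fold_change(ctrl_before, ctrl_after) > threshold
        # Genes that also rise in the control respond to heat shock itself, not to Bam.
        if up_in_rescue and not up_in_ctrl:
            selected.append(gene)
    return sorted(selected)


# Hypothetical (before, after) expression values, for illustration only.
hs_bam = {"g1": (10, 100), "g2": (10, 100), "g3": (50, 60)}
bam_ctrl = {"g1": (10, 12), "g2": (10, 90), "g3": (50, 55)}
early = early_transcripts(hs_bam, bam_ctrl)
print(early)
```

Here only `g1` is retained: `g2` also rises in the control, and `g3` does not reach the twofold threshold.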

Kumgang (CG5204) encodes a 747-amino acid protein with six canonical C2H2-type zinc finger domains expressed in testes but not in ovary or carcass. Kmg protein was expressed independently from the tMAC component Always early (Aly) or the tTAF Spermatocyte Arrest (Sa), and both kmg mRNA and protein were up-regulated before Topi, another component of tMAC. Immunofluorescence staining of wild-type testes revealed Kmg protein expressed specifically in differentiating spermatocytes, where it was nuclear and enriched on the partially condensed bivalent chromosomes. Consistent with dramatic up-regulation of kmg mRNA after the switch from spermatogonia to spermatocyte, expression of Kmg was first detected with immunofluorescence staining after completion of premeiotic S-phase marked by down-regulation of Bam, coinciding with expression of Rbp4 protein (Kim, 2017).

Function of Kmg in spermatocytes was required for male germ cell differentiation. Reducing function of Kmg in spermatocytes, either by means of cell type-specific RNA interference (RNAi) knockdown (KD) or in flies trans-heterozygous for a CRISPR (clustered regularly interspaced short palindromic repeats)-induced kmg frameshift mutant and a chromosomal deficiency (kmgΔ7/Df), resulted in accumulation of mature primary spermatocytes arrested just before the G2/M transition for meiosis I and lack of spermatid differentiation. A 4.3-kb genomic rescue transgene containing the 2.3-kb kmg open reading frame fully rescued the differentiation defects and sterility of kmgΔ7/Df flies, confirming that the meiotic arrest phenotype was due to loss of function of Kmg. In both kmg KD and kmgΔ7/Df, Kmg protein levels were less than 5% that of wild type. kmgΔ7/Df mutant animals were adult-viable and female-fertile but male-sterile, which is consistent with the testis-specific expression (Kim, 2017).

Function of Kmg was required in germ cells for repression of more than 400 genes not normally expressed in wild-type spermatocytes. Although the differentiation defects caused by loss of function of kmg appeared, by means of phase contrast microscopy, to be similar to the meiotic arrest phenotype of testis-specific tMAC component mutants, analysis of gene expression in kmg KD testes showed that many Aly (tMAC)-dependent spermatid differentiation genes were expressed, although some at a lower level than that in wild type. Among the 652 genes with more than 99% lower expression in aly-/- mutant as compared with wild-type testes, only four showed similar reduced expression in kmg KD as compared with that of sibling control (no Gal4 driver) testes. In contrast, transcripts from more than 500 genes were strongly up-regulated in kmg KD testes, with almost no detectable expression in testes from sibling control males. Hierarchical clustering identified 440 genes specifically up-regulated in kmg KD testes compared with testes from wild-type, bam-/-, aly-/-, or sa-/- mutant flies. These 440 genes were significantly associated with Gene Ontology terms such as 'substrate specific channel activity' or 'detection of visible light' that appeared more applicable to non-germ cell types, such as neurons. Analysis of published transcript expression data for a variety of Drosophila tissues revealed that the 440 genes were normally not expressed or were expressed at extremely low levels in wild-type adult testes, but many were expressed in specific differentiated somatic tissues such as eye, brain, or gut. Confirming misexpression of neuronal genes at the protein level, immunofluorescence staining revealed that the neuronal transcription factor Prospero (Pros), normally not detected in male germ cells, was expressed in clones of spermatocytes that are homozygous mutant for kmg induced by Flp-FRT-mediated mitotic recombination. The misexpression of Pros was cell-autonomous, occurring only in mutant germ cells. Mid-stage to mature spermatocytes homozygous mutant for kmg misexpressed Pros, but mutant early spermatocytes did not, indicating that the abnormal up-regulation of Pros occurred only after spermatocytes had reached a specific stage in their differentiation program (Kim, 2017).

A small-scale cell type-specific RNAi screen of chromatin regulators revealed that KD of dMi-2 in late TA cells and spermatocytes resulted in meiotic arrest, similar to loss of function of kmg. Immunofluorescence analysis of testes from a protein trap line in which an endogenous allele of dMi-2 was tagged by green fluorescent protein (GFP) revealed that dMi-2-GFP, like the untagged endogenous protein, was expressed and nuclear in progenitor cells and spermatocytes, as well as in somatic hub and cyst cells. dMi-2-GFP colocalized to chromatin with Kmg in spermatocytes, and the level of dMi-2 protein appeared lower and less concentrated on chromatin in nuclei of kmg-/- spermatocytes than in neighboring kmg+/+ or kmg+/- spermatocytes, suggesting that Kmg may at least partially help recruit dMi-2 to chromatin in spermatocytes. Furthermore, in testis extracts Kmg coimmunoprecipitated with dMi-2 and vice versa, suggesting that Kmg and dMi-2 form a protein complex in spermatocytes. Comparison of microarray data revealed that most of the 440 transcripts up-regulated in testes upon loss of function of kmg were also abnormally up-regulated in dMi-2 KD testes, suggesting that Kmg and dMi-2 may function together to repress expression of the same set of normally somatic transcripts in spermatocytes (Kim, 2017).

Chromatin immunoprecipitation followed by sequencing (ChIP-seq) revealed that Kmg protein localized along the bodies of genes actively transcribed in the testis. ChIP-seq with antibody to Kmg identified 798 genomic regions strongly enriched by immunoprecipitation of Kmg from wild-type but not from kmg KD testes. Of the 798 robust Kmg ChIP-seq peaks, 698 overlapped with exonic regions of 680 different genes actively transcribed in testes. The enrichment was often strongest just downstream of the transcription start site (TSS), but with substantial enrichment along the gene body as well (Kim, 2017).
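The relationship between peak counts and gene counts (698 peaks falling on exons of 680 genes) follows from the fact that more than one peak can land on the same gene. A minimal Python sketch of that kind of peak-to-exon overlap count is shown below. The coordinates, peak names, and gene names are hypothetical, the half-open interval convention is an assumption, and this is not the pipeline used in the study.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive (assumed convention)
    end: int    # exclusive
    name: str


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two half-open intervals on the same chromosome share at least one base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def peaks_on_exons(peaks: list, exons: list) -> tuple:
    """Return (names of peaks overlapping any exon, set of gene names hit)."""
    hit_peaks, genes = [], set()
    for peak in peaks:
        hit = {exon.name for exon in exons if overlaps(peak, exon)}
        if hit:
            hit_peaks.append(peak.name)
            genes |= hit
    return hit_peaks, genes


# Hypothetical peaks and exons, for illustration only.
peaks = [
    Interval("chr2L", 100, 300, "p1"),
    Interval("chr2L", 250, 400, "p2"),
    Interval("chr3R", 100, 200, "p3"),
]
exons = [
    Interval("chr2L", 200, 350, "geneA"),
    Interval("chr3R", 500, 600, "geneB"),
]
hit_peaks, genes = peaks_on_exons(peaks, exons)
print(len(hit_peaks), "peaks on", len(genes), "gene(s)")
```

In this toy input, two peaks overlap exons of a single gene, so the peak count exceeds the gene count, as in the reported figures.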

ChIP-seq with antibody to dMi-2 also showed enrichment along the gene bodies of the same 680 genes bound by Kmg, with a similar bias just downstream of the TSS. The dMi-2 ChIP signal along these genes was partially reduced in kmg KD testes, suggesting that Kmg may recruit dMi-2 to the bodies of genes actively transcribed in the testis (Kim, 2017).

RNA-seq analysis revealed that the 680 genes bound by Kmg were strongly expressed in testes and most strongly enriched in the GO term categories 'spermatogenesis' and 'male gamete generation'. One-third of the genes bound by Kmg were robustly activated as spermatogonia differentiate into spermatocytes and were much more highly expressed in the testes than in other tissues. The median levels of transcript expression of most of the 680 Kmg bound genes did not show appreciable change upon loss of Kmg (Kim, 2017).

Genes that are normally transcribed in somatic cells that became up-regulated upon loss of Kmg function in spermatocytes for the most part did not appear to be bound by Kmg. Only 3 of the 440 genes up-regulated in kmg KD overlapped with the 680 genes with robust Kmg peaks, suggesting that Kmg may prevent misexpression of normally somatic transcripts either indirectly or by acting at a distance (Kim, 2017).

Inspection of RNA-seq reads from kmg and dMi-2 KD testes mapped onto the genome showed that ~80% of the transcripts that were detected with microarray analysis as misexpressed in KD as compared with wild-type testes did not initiate from the promoters used in the somatic tissues in which the genes are normally expressed. Metagene analysis, as well as visualization of RNA expression centered on the TSSs annotated in the Ensembl database, showed that most of the 143 genes that are normally expressed in wild-type heads but not in wild-type testes were misexpressed in kmg or dMi-2 KD testes from a start site different from the annotated TSS used in heads. Transcript assembly from RNA-seq data by using Cufflinks for the 143 genes also showed that the transcripts that are misexpressed in kmg or dMi-2 KD testes most often initiate from different TSSs than the transcripts from the same gene assembled from wild-type heads (Kim, 2017).

Of the 440 genes scored via microarray as derepressed in kmg KD testes, 346 could be assigned with TSSs in kmg KD testes based on visual inspection of the RNA-seq data mapped onto the genome browser. Of these, only 67 produced transcripts in kmg KD testes that started within 100 base pairs (bp) of the TSS annotated in the Ensembl database, based on the tissue(s) in which the gene was normally expressed. In contrast, for the rest of the 346 genes, the transcripts expressed in kmg KD testes started from either a TSS upstream (131 of 346) or downstream (148 of 346) of the annotated TSSs. Of the 346 genes, 262 were misexpressed starting from nearly identical positions in dMi-2 KD as in kmg KD testes, suggesting that Kmg and dMi-2 function together to prevent misexpression from cryptic promoters (Kim, 2017).
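The three-way classification described above (at the annotated TSS, upstream of it, or downstream of it) can be sketched as follows. This is an illustrative sketch under stated assumptions: the coordinates and gene names are hypothetical, the 100-bp window is treated as inclusive, and upstream/downstream is defined relative to the direction of transcription, which the text does not spell out. The final check only confirms that the reported category counts add up to the 346 genes with assigned TSSs.

```python
from collections import Counter


def classify_tss(observed: int, annotated: int, strand: str = "+", window: int = 100) -> str:
    """Classify an observed TSS relative to the annotated TSS of the same gene."""
    # Signed offset in the direction of transcription: negative means upstream.
    offset = observed - annotated
    if strand == "-":
        offset = -offset
    if abs(offset) <= window:  # inclusive window is an assumption
        return "annotated"
    return "upstream" if offset < 0 else "downstream"


# Hypothetical (gene, observed TSS, annotated TSS, strand), for illustration only.
examples = [
    ("geneA", 10_050, 10_000, "+"),  # 50 bp away: within the window
    ("geneB", 9_500, 10_000, "+"),   # 500 bp upstream on the + strand
    ("geneC", 9_500, 10_000, "-"),   # a lower coordinate is downstream on the - strand
]
calls = {name: classify_tss(obs, ann, strand) for name, obs, ann, strand in examples}
counts = Counter(calls.values())

# Counts reported in the text: the three categories partition the 346 genes.
reported = {"annotated": 67, "upstream": 131, "downstream": 148}
total_reported = sum(reported.values())
print(calls, total_reported)
```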

Many of the ectopic promoters from which the misexpressed transcripts originated appeared to be bound by Aly, a component of tMAC, in kmg KD testes. ChIP for Aly was performed by using antibody to hemagglutinin (HA) on testis extracts from flies bearing an Aly-HA genomic transgene able to fully rescue the aly-/- phenotype. Of 346 genes with new TSSs assigned via visual inspection, 181 had a region of significant enrichment for Aly as detected with ChIP, with its peak summit located within 100 bp of the cryptic promoter. Motif analysis by means of MEME revealed that these regions were enriched for the DNA sequence motif (AGYWGGC). This motif was not significantly enriched in the set of 165 cryptic promoters at which Aly was not detected in kmg KD testes. Enrichment of Aly at the cryptic promoters was much stronger in kmg KD as compared with wild-type testes, suggesting that in the absence of Kmg, Aly may bind to and activate misexpression from cryptic promoters (Kim, 2017).
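The motif AGYWGGC uses IUPAC ambiguity codes, in which Y stands for C or T and W stands for A or T. As a rough illustration of what matching such a consensus means, the Python sketch below expands the codes into a regular expression and scans both strands of a sequence. The test sequence is made up, scanning both strands is an assumption, and exact consensus matching is a simplification of the probabilistic motif models that MEME reports.

```python
import re

# Standard IUPAC nucleotide ambiguity codes (subset sufficient for this motif family).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]",
    "K": "[GT]", "M": "[AC]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGTRYWSKMN", "TGCAYRWSMKN")


def revcomp(motif: str) -> str:
    """Reverse complement of an IUPAC motif."""
    return motif.translate(COMPLEMENT)[::-1]


def iupac_to_regex(motif: str) -> str:
    """Expand IUPAC codes into a regular-expression string."""
    return "".join(IUPAC[base] for base in motif)


def find_motif(seq: str, motif: str = "AGYWGGC") -> list:
    """Return sorted (0-based start, strand) pairs for matches on either strand."""
    hits = []
    for strand, pattern in (("+", motif), ("-", revcomp(motif))):
        # A lookahead lets overlapping occurrences be reported as well.
        regex = re.compile(f"(?=({iupac_to_regex(pattern)}))")
        hits.extend((m.start(), strand) for m in regex.finditer(seq.upper()))
    return sorted(hits)


# Made-up sequence with one match on each strand, for illustration only.
sequence = "TTAGCAGGCAAGCCTGCTTT"
hits = find_motif(sequence)
print(hits)  # [(2, '+'), (11, '-')]
```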

Genetic tests revealed that the misexpression of somatic transcripts in kmg KD spermatocytes indeed required function of Aly. The neuronal transcription factor Pros, abnormally up-regulated in kmg KD or mutant spermatocytes, was no longer misexpressed if the kmg KD spermatocytes were also mutant for aly, even though germ cells in kmg KD;aly-/- testes appear to reach the differentiation stage at which Pros turned on in the kmg KD germ cells. Assessment by means of quantitative reverse transcription polymerase chain reaction (RT-PCR) revealed that misexpression of five out of five transcripts in kmg KD testes also required function of Aly. Global transcriptome analysis via microarray of kmg KD versus kmg KD;aly-/- testes showed that the majority of the 440 genes that were derepressed because of loss of function of kmg in spermatocytes were no longer abnormally up-regulated in kmg KD;aly-/- testes. Even genes without noticeable binding of Aly at their cryptic promoters were suppressed in kmg KD;aly-/-, suggesting that Aly may regulate this group of genes indirectly (Kim, 2017).

Together, the ChIP and RNA-seq data show that Kmg and dMi-2 bind actively transcribed genes but are required to block expression of aberrant transcripts from other genes that are normally silent in testes. The mammalian ortholog of dMi-2, CHD4 (Mi-2β), has been shown to bind active genes in mouse embryonic stem cells or T lymphocyte precursors but also plays a role in ensuring lineage-specific gene expression in other contexts. It cannot be ruled out that Kmg and dMi-2 might also act directly at the cryptic promoter sites but that the ChIP conditions did not capture their transient or dynamic binding because several chromatin remodelers or transcription factors, such as the thyroid hormone receptor, have been difficult to detect with ChIP. Kmg and dMi-2 may repress misexpression from cryptic promoters indirectly by activating as-yet-unidentified repressor proteins. However, it is also possible that Kmg and dMi-2 act at a distance by modulating chromatin structure or confining transcriptional initiation or elongation licensing machinery to normally active genes (Kim, 2017).

Changes in the genomic localization of Aly protein in wild-type versus kmg KD testes raised the possibility that Kmg may in part prevent misexpression from cryptic promoters by concentrating Aly at active genes. Of the 1903 Aly peaks identified with ChIP from wild-type testes, the 248 Aly peaks that overlapped with strong Kmg peaks showed an overall reduction in Aly enrichment in kmg KD testes as compared with wild type. In contrast, the Aly peaks at cryptic promoters were more robust in kmg KD testes than in wild type. Genome-wide, ChIP from kmg KD testes identified 4129 new Aly peaks that were absent or did not pass the statistical cutoff in wild-type testes. More than 30% of the genomic regions with new Aly peaks in kmg KD showed elevated levels of RNA expression starting at or near the Aly peak in kmg KD but not in wild-type testes, suggesting that misexpression of transcripts from normally silent promoters in kmg KD testes is more widespread than initially assessed with microarray. Together, these findings raise the possibility that Kmg may prevent misexpression of aberrant transcripts by concentrating Aly at active target genes in wild-type testes, preventing binding and action of Aly at cryptic promoter sites (Kim, 2017).
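The peak comparison described above (Aly peaks shared between conditions versus peaks found only in kmg KD) can be illustrated with a minimal interval-overlap sketch. This is not the pipeline used by Kim (2017), which relied on peak calling with statistical cutoffs. The Peak type, the function names and the toy coordinates are all hypothetical, and intervals are assumed to be half-open (start inclusive, end exclusive).

```python
from typing import List, NamedTuple, Tuple


class Peak(NamedTuple):
    """A called ChIP peak (hypothetical minimal representation)."""
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive


def overlaps(a: Peak, b: Peak) -> bool:
    """True if two peaks lie on the same chromosome and share at least one base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def split_by_overlap(query: List[Peak], reference: List[Peak]) -> Tuple[List[Peak], List[Peak]]:
    """Split query peaks into those overlapping any reference peak and those that do not."""
    shared, unique = [], []
    for peak in query:
        if any(overlaps(peak, ref) for ref in reference):
            shared.append(peak)
        else:
            unique.append(peak)
    return shared, unique


# Toy data only: these coordinates are invented for illustration.
wild_type_peaks = [Peak("2L", 100, 200), Peak("3R", 500, 650)]
kmg_kd_peaks = [Peak("2L", 150, 260), Peak("3R", 900, 1000), Peak("X", 10, 60)]

shared_peaks, new_peaks = split_by_overlap(kmg_kd_peaks, wild_type_peaks)
print(len(shared_peaks), "shared;", len(new_peaks), "new in kmg KD")
```

The nested loop is quadratic in the number of peaks, which is acceptable for a few thousand peaks; a sorted sweep or an interval tree would be the usual choice for larger sets.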

The results suggest that selective gene activation is not always mediated by a precise transcriptional activator but can instead be directed by a combination of a promiscuous activator and a gene-selective licensing mechanism. Cryptic promoters may become accessible as chromatin organization is reshaped to allow expression of terminal differentiation transcripts that were tightly repressed in the progenitor state. It is posited that this chromatin organization makes a number of sites accessible for transcription that depends on the testis-specific tMAC complex component Aly. In this context, activity of Kmg and dMi-2 is required to prevent productive transcript formation from unwanted initiation sites, potentially by confining Aly to genes actively transcribed in the testis and limiting the amount of Aly protein acting at cryptic promoters (Kim, 2017).

The initiation of transcripts from cryptic promoters is reminiscent of loss of function of Ikaros, a critical regulator of T and B cell differentiation and a tumor suppressor in the lymphocyte lineage. Like Kmg, Ikaros is a multiple-zinc finger protein associated with Mi-2β, which binds to active genes in T and B cell precursors. In T cell lineage acute lymphoblastic leukemia (T-ALL) associated with loss of function of Ikaros, cryptic intragenic promoters were activated, leading to expression of ligand-independent Notch1 protein, contributing to leukemogenesis. Thus, in addition to being detrimental for proper differentiation, firing of abnormal transcripts from normally cryptic promoters because of defects in chromatin regulators may contribute to tumorigenesis through generation of oncogenic proteins (Kim, 2017).


GENE STRUCTURE

aly was cloned by a combination of fine structure recombination mapping, deletion analysis and mapping of RFLPs associated with insertion alleles. aly encodes a 1.85 kb germline-dependent transcript in males. A probe encompassing 18 kb of genomic sequence from -24 kb to -42 kb of a chromosome walk detected a single 1.85 kb transcript in Northern blots of poly-A+-selected RNA from wild-type males but not from germline-less males. A 1.5 kb cDNA obtained by screening a testis library with the same probe also recognized the 1.85 kb transcript in poly-A+ RNA from wild-type males. This transcript was detected in RNA from males homozygous for aly1, a temperature-sensitive allele, but not in males homozygous for the hybrid-dysgenesis induced alleles aly4, aly5 or aly6. The 1.85 kb transcript is likely to be a product of the aly locus rather than a downstream transcriptional target of the meiotic arrest gene pathway since it was detected in Northern blots of poly-A+ RNA from males homozygous for can1, mia and sa1. A much less abundant transcript at 1.5 kb was also detected in the more heavily loaded lanes using the cDNA probe (White-Cooper, 2000).

Comparison of the sequences of the 1.5 kb cDNA, a 5' RACE product derived from testis RNA and genomic clones from the aly region indicated that the 1.85 kb transcription unit has two small introns. The first intron is in the 5' UTR; the second lies 92 codons into the predicted protein. Conceptual translation revealed an ORF encoding a predicted protein of 534 amino acids (62 kDa) (White-Cooper, 2000).

cDNA clone length - 2267

Bases in 5' UTR - 199

Exons - 3

Bases in 3' UTR - 463
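
As a quick consistency check, the figures listed above can be reconciled with the 534-residue protein. The sketch below assumes that the cDNA clone length excludes any poly-A tail and that the region between the two UTRs is the coding sequence including its stop codon; the variable names are illustrative only.

```python
# Values taken from the listing above.
cdna_length = 2267
utr5_length = 199
utr3_length = 463

# Assumption: everything between the UTRs is coding sequence, stop codon included.
coding_nt = cdna_length - utr5_length - utr3_length
codons, leftover = divmod(coding_nt, 3)

# Assumption: the final codon is the stop codon and encodes no residue.
residues = codons - 1

print(coding_nt, codons, leftover, residues)
```

Under these assumptions the coding region is 1605 nucleotides, or 535 codons with no remainder, which gives 534 residues once the stop codon is subtracted.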


PROTEIN STRUCTURE

Amino Acids - 534

Structural Domains

aly is a member of a conserved gene family that includes the C. elegans negative regulator of vulval induction, lin-9. BLAST searches of sequence databases identified aly as one of two Drosophila homologs of the C. elegans gene lin-9 (Beitel). The other homolog (86E4.4) had been identified by the European Drosophila genome project. Sequences with significant homology to this family of proteins were also identified from the Arabidopsis thaliana genomic sequence project (two closely related genes), and from Zea mays (maize), Oryza sativa (rice), Schistosoma mansoni, zebrafish, mouse and human EST projects. Multiple sequence (Clustal W) alignments have revealed two distinct domains of homology. On average, pairwise comparisons (Clustal W) within Region 1 show 32% amino acid identity and 53% similarity. The second homology domain (Region 2) was less well conserved, especially in the Arabidopsis sequence, where a putative divergent Region 2 was identified. On average, pairwise comparisons within Region 2 show 22% amino acid identity and 42% similarity, excluding those between the very well conserved vertebrate proteins and those involving the divergent Region 2 from Arabidopsis. The spacing between Regions 1 and 2 varies somewhat between the homologs. No significant similarities between the proteins in pairwise comparisons were detected outside the two conserved domains, nor were any similarities detected to any other proteins in the sequence databases. The second Drosophila homolog has a long C-terminal region that includes a leucine zipper motif. The Arabidopsis homolog appears to have an extended N-terminal region. However, since the Arabidopsis protein is based on GRAIL and GenScan predictions on genomic sequence, rather than on cDNA, it is not clear if these predicted exons are actually present in the mature transcript. All the database entries for the vertebrate sequences were derived from single sequencing runs on cDNAs, so the sequences of the entire transcription units are not available. However, given that both homology regions are present in the zebrafish predicted protein, the first homology region is expected to be found in the mouse and human homologs when full-length cDNA sequences become available (White-Cooper, 2000).

A strikingly conserved feature of all the homologs where sequence of the first conserved region was available is the presence of a nuclear localization signal (NLS) predicted by PSORT within this region. In aly, this fits the consensus for a bipartite NLS: two basic residues, a ten residue spacer, and another basic region consisting of at least three out of five basic residues. The predicted NLS of the other homologs fits the classic consensus of four residues, at least three basic, the other any of K, R, P or H. The missense allele aly3-1393 changes Val150 to glutamic acid within region 1 of aly. This residue is conserved as an aliphatic amino acid (I, L or V). V150 of aly falls within the ten amino acid spacer region of the bipartite NLS (White-Cooper, 2000).
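
The two NLS consensus patterns described above can be written as simple sequence scans. The sketch below follows the wording of the text, not the PSORT source code, and it treats only K and R as basic residues, which is an assumption. The function names and the toy sequences are hypothetical and are not the Aly protein sequence.

```python
BASIC = set("KR")  # assumption: "basic" means lysine or arginine only


def find_bipartite_nls(seq: str) -> list:
    """Return 0-based starts of windows matching: two basic residues,
    a ten-residue spacer, then five residues of which at least three are basic."""
    seq = seq.upper()
    hits = []
    window = 2 + 10 + 5
    for i in range(len(seq) - window + 1):
        head = seq[i:i + 2]
        tail = seq[i + 12:i + 17]
        if all(c in BASIC for c in head) and sum(c in BASIC for c in tail) >= 3:
            hits.append(i)
    return hits


def find_classic_nls(seq: str) -> list:
    """Return 0-based starts of four-residue windows with at least three
    basic residues, where every residue is one of K, R, P or H."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 3):
        window = seq[i:i + 4]
        n_basic = sum(c in BASIC for c in window)
        if n_basic >= 3 and all(c in "KRPH" for c in window):
            hits.append(i)
    return hits


# Toy sequences invented for illustration.
toy_bipartite = "MA" + "KR" + "A" * 10 + "KKAKA" + "GG"
toy_classic = "GGKRPKGG"

print(find_bipartite_nls(toy_bipartite))
print(find_classic_nls(toy_classic))
```

In such a scan the ten-residue spacer is unconstrained, so a substitution there, such as the Val150 to glutamic acid change in the missense allele, would not by itself alter the pattern match.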



date revised: 20 March 2003

Home page: The Interactive Fly © 1995, 1996 Thomas B. Brody, Ph.D.

The Interactive Fly resides on the
Society for Developmental Biology's Web server.