gastrulation-defective

gastrulation-defective: Biological Overview | Regulation | Developmental Biology | Effects of Mutation | References

Gene name - gastrulation-defective

Synonyms -

Cytological map position - 11A1--3

Function - protease

Keywords - dorsal/ventral polarity, Toll cascade

Symbol - gd

FlyBase ID: FBgn0000808

Genetic map position - 1-36.8

Classification - serine-type endopeptidase

Cellular location - secreted

NCBI link: Entrez Gene

gd orthologs: Biolitmine

BIOLOGICAL OVERVIEW

In the process of establishing dorsoventral polarity in the Drosophila embryo, cell fates along the dorsoventral axis are determined by a gradient of the extracellular morphogen Spätzle, which activates the receptor Toll. Spätzle, which conveys a 'ventralizing' signal necessary for ventral and lateral development, is activated by proteolytic processing, a reaction that apparently occurs shortly after fertilization and only on the ventral side of the embryo. This crucial processing event requires four serine proteases -- Nudel, Gastrulation defective (Gd), Snake, and Easter -- and a ventrally restricted factor provided during oogenesis by the pipe gene. An important question is how these components function to process Spätzle at the right time and place to establish embryonic dorsoventral polarity (LeMosy, 2001 and references therein).

Genetic and molecular studies have suggested that the proteases in the Toll signaling pathway function sequentially in a proteolytic cascade, as seen for mammalian blood clotting. In such a cascade, the proteases exist as zymogens that become activated by cleavage at a defined site between a prodomain and the catalytic domain, and one protease activates the next downstream protease in the cascade. By genetic criteria, Nudel is the most upstream protease in the Toll signaling pathway, followed by Gd, then Snake, and finally Easter, the protease that can process Spätzle to a biologically active form. The zymogen forms of Snake and Easter, as well as the Spätzle protein, appear to be freely diffusible in the extracellular perivitelline space surrounding the embryo. Thus, the activities of these proteases must be ventrally restricted to confine Toll ligand production to the ventral side of the embryo. Spatial control of the blood-clotting proteases, which are also diffusible as zymogens, is partly achieved through dependence of the first and all subsequent zymogen-activation steps on membrane-bound cofactors that are localized to the site of blood-vessel injury. By analogy, it has been proposed that the ventrally restricted factor provided by the pipe gene functions as a cofactor necessary for activation of the proteolytic cascade that produces the Toll ligand (LeMosy, 2001 and references therein).
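The genetic ordering described above can be written down as a small sketch. The Python below is illustrative only: the list PATHWAY encodes the order stated in the text, while the function names and the treatment of a "preactivated" component as a restart point are assumptions made for this example, not a model taken from LeMosy (2001). It also assumes a strictly linear pathway, which the text itself treats as a working hypothesis.

```python
# Genetic order of the pathway as stated in the text (most upstream first).
PATHWAY = ["Nudel", "Gd", "Snake", "Easter", "Spätzle", "Toll"]


def downstream_of(component):
    """Return every component that acts after `component` in the pathway."""
    idx = PATHWAY.index(component)
    return PATHWAY[idx + 1:]


def blocked_components(mutant, preactivated=None):
    """List components expected to stay inactive when `mutant` is lost.

    Simplifying assumption (hypothetical, for illustration): each step is
    activated only by the step directly upstream, and supplying a
    preactivated downstream component restores everything from it onward.
    """
    blocked = [mutant] + downstream_of(mutant)
    if preactivated is not None and preactivated in blocked:
        restart = PATHWAY.index(preactivated)
        blocked = [c for c in blocked if PATHWAY.index(c) < restart]
    return blocked


if __name__ == "__main__":
    print(downstream_of("Gd"))
    print(blocked_components("Gd"))
    print(blocked_components("Gd", preactivated="Easter"))
```

Under these assumptions, losing Gd blocks Snake, Easter, Spätzle processing, and Toll activation, whereas supplying preactivated Easter leaves only Gd and Snake inactive. This is the kind of epistasis reasoning that underlies the genetic ordering.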

As the earliest acting protease in the Toll signaling pathway, Nudel would be expected to play an important role in triggering Toll ligand production. The Nudel protease is autoactivated without requiring the activities of the other proteases at the beginning of embryogenesis, which is consistent with its proposed role as the initiator of a protease cascade in which it activates the next downstream protease, presumably Gd. However, Nudel protease activation does not seem to be ventrally restricted or regulated by pipe. The Nudel protease is also required for modification of the extracellular matrix, raising the possibility that Nudel acts indirectly in dorsoventral patterning (LeMosy, 2001 and references therein).

Is Gd the localized determinant responsible for the spatially restricted pattern of Toll activation? Probably not. The level of Gd determines the strength of the ventralizing signal, presumably by controlling the activation of the downstream proteases Snake and Easter. Gd at high levels can ectopically induce the ventralizing signal in the absence of nudel or pipe function. The Gd zymogen is not prelocalized to a ventral site, and Gd zymogen cleavage, although requiring the Nudel protease, does not depend on pipe. Surprisingly, these results suggest that the protease cascade producing the ventralizing signal is initiated by a spatially uniform cleavage of the Gd zymogen, with spatial regulation of a downstream step determining where signaling is to occur. If cleavage of Nudel and Gd, necessary to trigger the proteolytic cascade leading to the Toll ligand, is not ventrally localized, then how could Toll ligand production be restricted to the ventral side of the embryo? The last step in the cascade, processing of Spätzle to the active Toll ligand, is controlled by pipe (Morisato, 1994). Thus, spatial regulation could occur at an intermediate step, perhaps involving the proteolytic activation of Snake or Easter. For example, the Gd protease, although broadly activated, may activate Snake only when its substrate is associated with the ventral factor provided by pipe (LeMosy, 2001 and references therein).

Prevailing models based largely on genetic data have suggested that Gd, Snake, and Easter function in a cascade of sequential zymogen activation, like the mammalian proteases involved in blood clotting. The in vitro experiments provide biochemical evidence in support of this idea. Although the data strongly imply that Gd directly activates Snake and that Snake directly activates Easter, experiments with purified proteins will be necessary to obtain proof of an enzyme-substrate relationship between these proteases (LeMosy, 2001).

A surprising finding is that the cascade of Gd, Snake, and Easter is activated when all three proteases are coexpressed as zymogens. One possible explanation is that high concentrations of these proteases generated by overexpression in S2 cells promote protein-protein interactions that are important for cascade activation in vivo. For example, the proteases may normally be activated after being brought together in zymogen-activation complexes, as is the case for the proteases involved in blood clotting (LeMosy, 2001).

Gd is cleaved when it triggers the activation of Snake and Easter in vitro. It is thought that cleavage is important for activation of Gd's proteolytic activity. This conclusion is supported by the findings that a cleaved form of Gd, but not the zymogen, reacts with active-site inhibitors specific for active serine proteases (Han, 2000), and that two truncated forms of Gd are more active than the zymogen in processing Snake. Because Gd lacks a typical zymogen-activation site, as in Snake or Easter, it is unclear where Gd is cleaved (Konrad, 1998). The ability to generate more active forms of Gd by truncation at two different sites may indicate that Gd does not normally require cleavage at a single specific site to become activated (LeMosy, 2001).

Although present as a zymogen during oogenesis, Gd apparently is cleaved during early embryogenesis to a smaller polypeptide similar in size to the 46-kDa form generated during the activation of Snake and Easter in vitro. This cleavage must be significant for Gd function in vivo, since it relies on the activity of the Nudel protease which acts genetically upstream of Gd in the Toll signaling pathway. Additionally, the cleaved form of Gd is detectable throughout the first 3 h of embryonic development when the Toll ligand is produced: this is consistent with it being the functional Gd form that activates the downstream proteases. Although further studies are required to definitively demonstrate enzymatic activity, these findings strongly suggest that the smaller form of Gd present during early embryogenesis represents activation of the Gd zymogen (LeMosy, 2001).

How is the Gd zymogen normally cleaved? In using the coexpression assay, it was observed that the Nudel protease and preactivated forms of Gd and Snake are capable of inducing the cleavage of Gd to the 46-kDa form. However, only the Nudel protease seems to be essential for this cleavage in vivo, because mutagenesis of Gd's catalytic serine or a mutation in the snake gene does not block the appearance of the 46-kDa Gd polypeptide in the early embryo. A model consistent with all of the data is one in which the Nudel protease, detectable during the first 2 h of embryogenesis, directly cleaves the Gd zymogen. Once activated, Gd and perhaps even the downstream proteases could promote further Gd-zymogen cleavage through a positive feedback loop. In the coexpression experiments carried out in the absence of the Nudel protease, a protease normally present in the S2 cells may have cleaved a small amount of Gd sufficient to trigger the subsequent activation of Snake and Easter. This explanation is compatible with the idea that Gd can be activated by the cleavage of an unusually labile region rather than at a specific site in its primary structure (LeMosy, 2001).

Gd activity seems to be restricted to the ventral side of the embryo. One possible explanation is that Gd is prelocalized to a ventral site. On the contrary, the Gd zymogen is uniformly distributed at the oocyte surface, similar to Nudel. Alternatively, Gd could be proteolytically activated only on the ventral side of the embryo. Current results argue against this possibility, because Gd-zymogen cleavage is not regulated by pipe, the key gene that ventrally restricts Toll ligand production. Activation of the Nudel protease, required for Gd cleavage, is also not regulated by pipe (LeMosy, 2001).

Thus, the observation that processing of Spätzle to the active Toll ligand (the last step in the cascade) is controlled by pipe appears to explain the localized activation of the Toll pathway. Spatial regulation could occur at an intermediate step, perhaps involving the proteolytic activation of Snake or Easter. For example, the Gd protease, although broadly activated, may activate Snake only when its substrate is associated with the ventral factor provided by pipe (LeMosy, 2001).

The mammalian blood-clotting cascade normally generates a blood clot only at the site of tissue injury and thus is a signaling pathway that transmits spatial information. In contrast, these studies suggest the protease cascade involved in dorsoventral patterning initially transmits not spatial but temporal information, which is, perhaps, a cue that embryogenesis has begun. Spatial information in the form of the ventral cue is then integrated at a distinct downstream step in the cascade. As a consequence, the ventralizing signal is generated at the right time and place to pattern the embryo (LeMosy, 2001).
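The two-layer logic proposed here, a spatially uniform temporal trigger combined with a ventral cue at a downstream step, can be illustrated with a toy calculation. Everything in this Python sketch is hypothetical: the ten sampled positions, the choice of which positions count as ventral, and the function names are assumptions for illustration, and the simple logical AND stands in for a mechanism that the text describes only as a possibility.

```python
N_POSITIONS = 10          # hypothetical sampling of the dorsoventral axis
VENTRAL = range(0, 3)     # hypothetical: first three positions are ventral


def uniform_gd_cleavage(embryogenesis_started):
    """Temporal cue: Gd is cleaved everywhere once embryogenesis begins."""
    return [embryogenesis_started] * N_POSITIONS


def pipe_cue():
    """Spatial cue: the pipe-dependent factor is present only ventrally."""
    return [i in VENTRAL for i in range(N_POSITIONS)]


def ventralizing_signal(embryogenesis_started):
    """Signal arises only where cleaved Gd and the ventral cue coincide."""
    gd = uniform_gd_cleavage(embryogenesis_started)
    cue = pipe_cue()
    return [g and c for g, c in zip(gd, cue)]


if __name__ == "__main__":
    print(ventralizing_signal(False))
    print(ventralizing_signal(True))
```

In this toy model no signal is produced anywhere before the temporal trigger, and after it the signal appears only at the positions that carry the ventral cue.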

Feedback control also seems to be important for defining the temporal and spatial dimensions of signaling by the dorsoventral cascade, as has been shown for other signaling pathways involved in creating complex developmental patterns. Earlier studies have suggested that a negative feedback loop involving the most downstream component in the Toll signaling pathway inhibits activation of the Easter protease (Misra, 1998). The data also raise the possibility that both positive and negative feedback loops modulate the activation of proteases upstream of Easter. Such feedback could amplify a subtle asymmetry in protease activation level not detectable by the current methods. The integration of positive and negative feedback loops, and of temporal and spatial cues as outlined above, likely provides the precise control of signaling necessary to establish embryonic dorsoventral polarity (LeMosy, 2001).

Independence of chromatin conformation and gene regulation during Drosophila dorsoventral patterning

The relationship between chromatin organization and gene regulation remains unclear. While disruption of chromatin domains and domain boundaries can lead to misexpression of developmental genes, acute depletion of regulators of genome organization has a relatively small effect on gene expression. It is therefore uncertain whether gene expression and chromatin state drive chromatin organization or whether changes in chromatin organization facilitate cell-type-specific activation of gene expression. In this study, using the dorsoventral patterning of the Drosophila melanogaster embryo as a model system, evidence is provided for the independence of chromatin organization and dorsoventral gene expression. Tissue-specific enhancers are defined and linked to expression patterns using single-cell RNA-seq. Surprisingly, despite tissue-specific chromatin states and gene expression, chromatin organization is largely maintained across tissues. The results indicate that tissue-specific chromatin conformation is not necessary for tissue-specific gene expression but rather acts as a scaffold facilitating gene expression when enhancers become active (Ing-Simmons, 2021).

Previous studies produced conflicting results regarding the relationship between gene expression, chromatin state and 3D chromatin organization. This study set out to understand this relationship in the context of embryonic development in Drosophila. Using the well-studied dorsoventral patterning system, it was shown that, despite significant differences in chromatin state and gene expression between tissues along the dorsoventral axis of the embryo, chromatin conformation is largely maintained across tissues. This suggests that cell-type-specific gene regulation does not require cell-type-specific chromatin organization in this context. Nevertheless, developmentally regulated genes and enhancers are organized into chromatin domains. It is suggested that this organization plays a permissive role to facilitate the precise regulation of developmental genes (Ing-Simmons, 2021).

Use was made of maternal effect mutations in the Toll signaling pathway, which lead to embryos that lack the usual patterning of the dorsoventral axis and have long been used as a system to study the specification of mesoderm (Toll^10B), neuroectoderm (Toll^rm9/rm10) and dorsal ectoderm (gd⁷) cell fates as well as the regulation of tissue-specific gene expression. However, these embryos are still under the influence of anterior-posterior patterning signals and do not show completely uniform cell identities. This study sought to investigate heterogeneity of cell identity at the single-cell level by using single-cell gene expression profiling. This revealed that certain cell types are indeed maintained in all three Toll pathway mutants, including pole cells and other terminal region cell identities, hemocytes and trachea precursor cells. However, heterogeneity of gene expression is reduced in the mutants, as shown by the loss of cells assigned to mesoderm clusters in gd⁷ and Toll^10B embryos and the depletion of ectoderm subsets in each of the mutants. These datasets showcase the advantages of measuring cellular heterogeneity at the single-cell level and provide a useful resource for further characterization of these embryos and investigation of the regulation of dorsoventral patterning (Ing-Simmons, 2021).

Although the gd⁷, Toll^rm9/rm10 and Toll^10B embryos still have heterogeneous gene expression profiles, there are clear differences in chromatin state and overall gene expression between these embryos. This study expanded on previous studies by identifying putative enhancers specific to neuroectoderm in addition to dorsal ectoderm and mesoderm. This allowed the identification of tissue-specific putative enhancer-gene pairs, which correspond well with known dorsoventral patterning enhancers and genes that are differentially expressed (DE) across the dorsoventral axis. These regulatory elements and their target genes are located inside chromatin domains, distinct from the enrichment of housekeeping genes at domain boundaries. This is in line with previous results that suggest that 3D chromatin domains act as regulatory domains (Ing-Simmons, 2021).

This domain organization is maintained across tissues, even in cases in which there are significant changes in the local chromatin state and gene expression. This is consistent with earlier results from Hi-C experiments carried out in anterior and posterior embryo halves, which also showed no differences, and with previous studies in Drosophila cell lines and other systems, which suggested that domains are widely conserved across different tissues and even different species. To explain this maintenance of organization across cell lines, it was proposed that active chromatin, especially at broadly expressed genes, is responsible for partitioning the genome into domains. It has been proposed that compartmentalization of active and inactive chromatin, at the level of individual genes, underlies the formation of insulated chromatin domains. This model predicts that, when a developmentally regulated gene is active, its domain would merge with or have increased interactions with neighboring domains containing active genes, such as broadly expressed housekeeping genes. The results do not support this model, as this study found no evidence that differences in domain structure are driven by changes in chromatin state or by active expression of developmentally regulated genes. By contrast, this supports the idea that, similar to mammalian domain architecture, additional factors, such as insulator proteins, modulate domain organization in Drosophila. Therefore, based on current data, it is not believed that active transcription is the key determinant of 3D chromatin organization in this system (Ing-Simmons, 2021).

While overall and locus-specific chromatin organization are maintained across tissues, Hi-C and Micro-C analyses identify a small number of examples of regions that do have changes in organization. However, at these loci, there is no clear relationship between changes in organization and changes in chromatin state or expression, and the vast majority of developmentally regulated loci in this system do not have changes. It will be important for future studies to further investigate these loci to understand what drives these rare changes (Ing-Simmons, 2021).

This study also investigated chromatin organization at the level of enhancer-promoter interactions. Previous studies produced conflicting results about whether these interactions are correlated with tissue-specific activation of gene expression. No evidence was found for widespread enrichment of interactions between enhancers and their target promoters, including in tissues where they are active. This is in contrast with previous studies using 3C approaches that have found evidence of enriched enhancer-promoter interactions, which may precede or correlate with transcriptional activation. Notably, Ghavi-Helm (2019) found that a subset of Drosophila long-range enhancer-promoter pairs do form stable interactions that are enriched above local background. While these loops are visible in the dataset presented in this study, the results suggest that such loops are not likely to be the primary mechanism of promoter regulation during Drosophila development, perhaps because most enhancers are close to their target promoters. Many stable loops in the Drosophila genome are instead associated with polycomb-mediated repression (Ing-Simmons, 2021).

Hi-C provides information about the average conformation across a population of hundreds of thousands of nuclei, which contain dynamic ensembles of different 3D conformations. While the scRNA-seq results indicate that the mutant embryos contain a range of different cell types, the results suggest that the 3D chromatin structures in these cell types are drawn from the same population of possible conformations. This is supported by results from a recent study analyzing the structure of the Doc and sna loci in Drosophila embryos using Hi-M, a high-resolution single-cell imaging approach. Strikingly, this orthogonal technique also reveals chromatin organization that is consistent across different tissues in the embryo, despite differential expression of these genes. Imaging-based approaches directly measure spatial proximity between genomic loci, whereas Hi-C and Micro-C rely on cross-linking to detect chromatin interactions. Therefore, care must be taken when comparing these approaches. Nevertheless, both approaches indicate that genome organization is maintained across different tissues in this system (Ing-Simmons, 2021).

The results are consistent with several recent studies in mammals as well as in Drosophila, which provide evidence that stable enhancer-promoter contacts are not always required for gene activation. This is in line with models in which transient or indirect contacts with a regulatory element are sufficient to activate transcription, such as through the formation of nuclear microenvironments or phase-separated condensates (Ing-Simmons, 2021).

Together, the results indicate that differential chromatin organization is not a necessary feature of cell-type-specific gene expression. It is proposed that chromatin organization into domains instead provides a scaffold or framework for the regulation of developmental genes during and after the activation of zygotic gene expression. This may help render developmental enhancers 'poised' for timely regulation of target genes upon receipt of appropriate cellular signals. Other mechanisms of priming have been described, including paused polymerase (Pol) II at promoters and pioneer factors bound to poised enhancers. Feedback effects, such as downstream modification of chromatin state and additional mechanisms, including looping between polycomb-bound elements and segregation of active and inactive chromatin, may then act as layers on top of the initially established domain structure (Ing-Simmons, 2021).

GENE STRUCTURE

The cDNA sequence reveals a single long ORF beginning with the first ATG in the sequence. The cDNA contains 30 bp of 5' untranslated sequence that is located 241 bp downstream from the poly(A) site of the tsg gene. The ORF is followed by 247 bp of 3' untranslated sequence containing multiple stop codons in all frames and terminates in a poly(A) tail. The only consensus poly(A) addition signal (AATAAA) in the region is located 37 bp before the end of the gd cDNA. Two AACAAA motifs are located further downstream at bp 2626 and 3218, but there is no evidence from RNA blots that either of these is used as a polyadenylation site. Comparison of the cDNA and the genomic sequence reveals the presence of four introns. No evidence was found for alternative splicing (Konrad, 1998).
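Locating candidate polyadenylation signals of the kind mentioned above amounts to a simple motif scan. The Python sketch below is a minimal illustration: the function name find_motif and the 20-base test string are invented for this example and are not taken from the gd sequence or from Konrad (1998).

```python
import re


def find_motif(sequence, motif):
    """Return 1-based start positions of every (possibly overlapping) hit."""
    # A zero-width lookahead lets overlapping occurrences be reported.
    pattern = re.compile("(?=" + re.escape(motif.upper()) + ")")
    return [m.start() + 1 for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    # Hypothetical test sequence containing one AATAAA and one AACAAA.
    toy_sequence = "GGCAATAAAGTTAACAAACC"
    print(find_motif(toy_sequence, "AATAAA"))
    print(find_motif(toy_sequence, "AACAAA"))
```

A scan like this only lists candidate sites. As the text notes, whether a motif is actually used as a polyadenylation site has to be established experimentally, for example from RNA blots.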

cDNA clone length - 1894

Bases in 5' UTR - 62

Exons - 5

Bases in 3' UTR - 245

PROTEIN STRUCTURE

Amino Acids - 528

Structural Domains

The gd cDNA sequence predicts a 59-kDa polypeptide. This was confirmed by in vitro translation of transcripts from the cDNA clone, which yielded a protein of ~60 kDa with a pI of 6.94 as determined by two-dimensional PAGE. The amino-terminal residues are hydrophobic, suggesting a secretory leader sequence that would be cleaved after amino acid. Because Gd activity is not freely diffusible in the perivitelline (PV) fluid, the Gd protein sequence was examined for potential anchoring sites. Two hydrophobic domains are notable: one near the middle of the protein that could represent a membrane-spanning region, and a second, a hydrophobic tail, that could act as a membrane anchor. However, each of these domains contains charged amino acid residues. The protein contains three putative N-linked glycosylation sites (NXS/T). There are no RGD amino acid sequences that might interact with extracellular components nor any WIID or LDL repeats, as seen in the NDL protein (Konrad, 1998).
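Two of the sequence-based statements above can be checked with simple calculations: the approximate mass expected for a 528-residue protein, and the positions of N-X-S/T sequons. The Python sketch below is a rough illustration under stated assumptions. The figure of 110 Da per residue is a common rule of thumb rather than a value from Konrad (1998), the exclusion of proline at the X position follows the usual convention for N-linked sequons and is not stated in the text, and the test peptide is invented.

```python
import re

# Rule-of-thumb average mass of an amino acid residue (an approximation).
AVERAGE_RESIDUE_MASS_DA = 110.0


def approximate_mass_kda(n_residues):
    """Estimate protein mass in kDa from residue count alone."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


def find_nxst_sites(protein):
    """Return 1-based positions of N in every N-X-S/T sequon (X is not P)."""
    # Lookahead so that overlapping sequons are all reported.
    return [m.start() + 1 for m in re.finditer(r"(?=N[^P][ST])", protein.upper())]


if __name__ == "__main__":
    print(approximate_mass_kda(528))
    # Hypothetical peptide: NAS and NGT qualify, NPT does not.
    print(find_nxst_sites("MKNASLLNPTQNGT"))
```

The estimate for 528 residues comes out at roughly 58 kDa, in line with the predicted 59-kDa polypeptide. A sequon scan identifies only putative sites and does not show that a site is glycosylated.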

A BLAST search has suggested that Gd is related to the family of serine proteases. The most closely related proteins are factor IX of the mammalian clotting cascade, two crab coagulation factors, and a urokinase-type plasminogen activator. Sequence alignment of Gd with these proteins and chymotrypsin shows that Gd shares the 3 amino acids of the catalytic triad (H, D, S) and all but one of the cysteine bridges. In addition, several other structural features that identify serine proteases are present. However, Gd also has some features that are atypical of the basic serine protease family but that are seen in other related proteins. For example, it contains a putative activating cleavage site that is not typical, and the conserved D adjacent to the active site S is replaced by I and a short insert of amino acids immediately adjacent. There is a small acidic region N-terminal to the catalytic site, which is a feature shared by Snk but not Easter (Konrad, 1998).

The Gd protein possesses the catalytic triad (H, D, S) characteristic of serine proteases. Although the homology to serine proteases suggests a proteolytic role, several significant differences between the sequence of Gd and other members of the chymotrypsin family of serine proteases suggest that Gd may behave atypically. Most eukaryotic serine proteases undergo an activation process after zymogen cleavage in which the alpha amino group of the catalytic domain forms a salt bridge to an aspartate residue (D) located adjacent to the active site serine. In Gd, this aspartic acid residue is replaced by isoleucine, making it unlikely that the protease could be activated via the classical mechanism. However, several bacterial members of the alpha-lytic endopeptidase family (e.g., subtilisin) contain residues other than D (e.g., T) adjacent to the active serine, and complement component C2 contains E instead of D. Typically, activation involves cleavage after a positively charged amino acid (R or K) followed by two branched-chain amino acids and a highly conserved G. The N-terminal residue is most commonly I but may be L, V, or M. One putative cleavage site for Gd, based on position relative to conserved motifs (e.g., G-WPW and CGGTSLV), is S ITRG. Although this putative site has S instead of R or K and has G in the 4th position rather than the 3rd, cleavage at this site would produce a peptide of 30.5 kDa that could possibly correspond to the 30-kDa species observed on the Western blots. The extensive heteroallelic complementation exhibited by gd is consistent with the possibility that the Gd protein may function as a multimer, perhaps during an activation process (Konrad, 1998).
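The comparison between the putative Gd site and the classical activation pattern can be made explicit with a short check. The Python sketch below is hypothetical: the function name and the way the rule is encoded are assumptions for illustration. It tests only the features the text contrasts directly, namely the residue before the cleaved bond, the new N-terminal residue, and the position of the conserved G. The second residue is deliberately left unchecked, and the site R/IVGG is used as an idealized classical example.

```python
def check_activation_site(p1, new_n_terminus):
    """List deviations of a candidate site from the classical pattern.

    p1: residue immediately before the cleaved bond.
    new_n_terminus: residues immediately after the bond, in order.
    """
    deviations = []
    if p1 not in {"R", "K"}:
        deviations.append(f"residue before the bond is {p1}, not R or K")
    if new_n_terminus[:1] not in {"I", "L", "V", "M"}:
        deviations.append("new N-terminal residue is not I, L, V, or M")
    g_position = new_n_terminus.find("G") + 1  # 0 when no G is present
    if g_position != 3:
        deviations.append(f"first G is at position {g_position}, not 3")
    return deviations


if __name__ == "__main__":
    print(check_activation_site("R", "IVGG"))  # idealized classical site
    print(check_activation_site("S", "ITRG"))  # putative Gd site from the text
```

For the putative Gd site this reports the same two deviations named in the text: S in place of R or K, and G at the 4th rather than the 3rd position.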

The predicted protein encoded by gd was compared with members of the serine protease superfamily: it lacks sequence conservation within the activation peptide cleavage site. Examination of the protein data bank has revealed two serine proteases that also deviate from the consensus within the activation peptide region. The complement factors C2 and B are activated by cleavage between an arginine and a lysine residue, and a type A von Willebrand repeat motif is found ~12 amino acids after the cleavage site in both factors. The Gd amino acid sequence was compared with factors C2 and B, and an alignment revealed that Gd possesses an arginine-lysine pair as well as a type A von Willebrand motif in a conserved geometry. These features suggest that gd encodes a serine protease zymogen that is structurally similar to factors C2 and B and is therefore activated by cleavage between Arg128 and Lys129. Site-directed mutagenesis studies and the analysis of interallelic complementation at gd are consistent with this model (DeLotto, 2001 and references therein).


date revised: 5 August 2021

The Interactive Fly resides on the Society for Developmental Biology's Web server.