The information contained in the recently published genomic sequence of Drosophila melanogaster was used to identify
12 additional bHLH proteins. By sequence analysis these proteins have been assigned to families defined by Atonal, Hairy-Enhancer of
Split, Hand, p48, Mesp, MYC/USF, and the bHLH-Per, Arnt, Sim (PAS) domains. In addition, a single protein represents a unique
family of bHLH proteins. mRNA in situ analysis demonstrates that the genes encoding these proteins are expressed in several tissue
types but are particularly concentrated in the developing nervous system and mesoderm (Moore, 2000).
Two newly identified genes, CG8667 (Mistr) and CG5545 (Doli), both members of the Ato-related family, are expressed in the developing nervous system. CG5545 is closely related to
the vertebrate repressor Beta 3 protein (96% sequence identity between fly and vertebrate proteins in the bHLH domain). It is suggested that this protein should be named Doli (Drosophila Olig family); the Olig proteins are involved in oligodendrocyte precursor
formation. CG8667 has closest sequence identity to the vertebrate
Mist1 protein, a negative regulatory factor of MyoD activity (78% identical over the entire bHLH domain and 92% identical in the basic domain alone). It is
proposed that this protein should be named Mistr (Mist 1-related protein).
Sequence homology between species does not always imply
functional homology. For example, CG8667/Mistr is a
Drosophila sequence ortholog of the mammalian Mist1 protein.
It is expressed solely in the developing nervous system, whereas
Mist1 is expressed not in the nervous system but in gut,
pancreas, submandibular gland, lung, and skeletal muscle. In this
case, differences in the expression pattern of the genes encoding these
proteins argue against any conservation of developmental role (Moore, 2000).
Like the genes encoding the other proteins of the Ato-related family, both genes are expressed in the developing Drosophila nervous system. CG5545/doli is
expressed first in a subset of cells in both the ventral nerve cord (VNC) and the procephalic region at stage 9. The number of cells in these regions expressing the
gene increases to a peak at stage 11. By stage 14, levels of expression have fallen such that CG5545/doli is expressed only in a few cells per
hemisegment on the ventral surface of the VNC (Moore, 2000).
There is a strong maternal contribution of CG8667/mistr mRNA. Zygotic transcription is initiated at stage 14. The gene is expressed in bilateral domains in the cephalic
region, which, as development proceeds, fuse into a U shape forming part of the ring gland. Concomitant expression of CG8667/mistr also begins in the
CNS. By stage 17, CG8667/mistr is expressed in clusters of cells at the anterior and posterior of the VNC and bilaterally in two lateral cells per hemisegment in the VNC (Moore, 2000).
CG10066 (Fer1), CG5952 (Fer2), and CG6913 (Fer3) are related to mammalian p48. These three new bHLH proteins are most closely related to the bHLH
domain of the p48 subunit of PTF1, a pancreatic, exocrine cell-specific transcription factor in the mouse, and represent a new bHLH family in Drosophila. These proteins have been named Fer (for forty-eight related). In the bHLH region, CG10066/Fer1 is 88%, CG5952/Fer2 is 76%, and CG6913/Fer3 is 62% identical to p48 (Moore, 2000).
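Percent identity figures such as those quoted above are typically derived by counting matching residues across an alignment of the two domains. The source does not state the exact procedure used, so the following is only a minimal sketch under the assumption that identity means matches divided by the number of ungapped aligned columns. The function name percent_identity and the two toy fragments are hypothetical and are not real Fer or p48 sequences.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of ungapped alignment columns in which the residues match.

    Assumes the two sequences are already aligned (equal length, '-' for gaps).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    # Keep only columns where neither sequence carries a gap character.
    columns = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != "-" and b != "-"
    ]
    if not columns:
        raise ValueError("no ungapped columns to compare")
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


# Toy, made-up aligned fragments (hypothetical; not real protein sequences).
toy_a = "RRLKANERER"
toy_b = "RRQKANERDR"
print(f"{percent_identity(toy_a, toy_b):.0f}% identical")  # prints "80% identical"
```

Other conventions exist (for example, dividing by the length of the shorter sequence or by the full alignment length including gaps), which is one reason identity values are only comparable when the region and denominator are stated, as they are here for the bHLH region.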
CG10066/Fer1 is expressed in the epidermis at the stage when the epidermis begins to secrete cuticle and, therefore, may share a common function with p48 in active exocrine
cells. It is first transcribed in the epidermal pads adjacent to the posterior spiracles at stage 15. The expression of this gene quickly spreads over the entire epidermal
surface of the embryo and is strongest in epidermis underlying the forming denticle belts (Moore, 2000).
CG5952/Fer2 shows a strong maternal contribution of mRNA in the early embryo. Zygotic expression of this gene begins at stage 10 in an anterior-to-posterior
wave in the VNC and the brain. As development proceeds, the number of CG5952/Fer2-positive cells increases, so that by stage 12, the expression domain forms a
bilateral, dorsal-posterior, crescent-shaped structure (Moore, 2000).
CG6913/Fer3 is expressed at stage 11 in part of the posterior midgut primordia and at stage 12 in part of the anterior midgut primordia. At later stages, expression has been detected in several unidentified cells scattered throughout the embryo (Moore, 2000).
CG10446 (Side) and CG5927 (Her) are in the HES family. CG10446 is most closely related to Deadpan (76% identity in the basic domain and 62% in the entire bHLH
domain). This protein has been named Side (similar to Deadpan). CG5927 is most closely related to the proteins of the Enhancer of split [E(spl)] complex, such as HLHmgamma (76% identity in the basic domain and 51% identity in the entire bHLH domain). CG5927 has been named Her (HES-related). Hairy, Dpn, and the
proteins of the E(spl) complex have a WRPW motif at the extreme C terminus that mediates interaction with Groucho. CG5927/Her and CG10446/Side also end in this motif.
All HES family
proteins mediate transcriptional repression via their interaction
with Groucho. CG10446/Side and CG5927/Her have the WRPW motif
required for this interaction, implying that they are highly likely to
act via the same mechanism. CG10446/side is expressed
solely in the CNS at a stage at which cell differentiation is
occurring. It is hypothesized that it may play a role in antagonizing the
function of transcription factors involved in the later stages of CNS differentiation (Moore, 2000).
There is a strong maternal contribution of CG10446/side mRNA. Zygotic transcription of the gene begins at stage 12 in a subset of cells in the CNS.
CG5927/her has a low level of maternal mRNA contribution and then is expressed ubiquitously throughout embryogenesis (Moore, 2000).
CG12952 (Sage) is distantly related to the Mesp family and is expressed in the salivary gland. CG12952 represents a protein with little sequence
similarity to other known proteins. In the neighbor-joining tree, it is placed in the same family as the vertebrate Mesp proteins, which are necessary for mesoderm
segmentation initiation (53% identity in the bHLH domains). CG12952 has a strong maternal mRNA contribution in early embryogenesis. Its zygotic
expression begins in the salivary gland anlage at stage 10 and persists until stage 15. CG12952 has been named Sage (salivary gland-expressed bHLH) (Moore, 2000).
CG17592 (Dm Usf) is the ortholog of the mammalian USF proteins. CG17592 is the single Drosophila sequence homolog of the USF
proteins that are involved in cell proliferation control (92% identical in the basic domain). This protein has been named Dm Usf. Both vertebrate and
Drosophila USF are bHLH-zip proteins. The loop and second helix region of Dm Usf is rich in serines and has diverged greatly from that of mouse and human;
hence, Dm Usf may have lost its ability to dimerize. There is a weak maternal contribution of Dm usf mRNA. At stage 7, Dm usf is expressed in bilateral domains in
the ventral cephalic furrow. In later stages (15 onward) of development, Dm usf expression is confined to the proventriculus and a subset of cells in the CNS. This specific expression pattern differs from the ubiquitous USF expression pattern reported in vertebrates (Moore, 2000).
CG6211 (Gce) is closely related to the bHLH-PAS Rst(1)JH protein (78%
identity in the bHLH, 68% in the PAS-A, and 86% in the PAS-B domains). Rst(1)JH was originally isolated in a screen for Drosophila mutants resistant to the
juvenile hormone analog insecticide methoprene. The CG6211 transcript is expressed strongly as a maternally supplied message and later in a subset of the germ
cells of the developing embryo. It is suggested that this protein should be named Gce (germ cell-expressed bHLH-PAS) (Moore, 2000).
CG11450 (shout) is expressed during mesoderm formation and in myoblasts. CG11450 represents a member of a new bHLH family. It
is expressed first in the dorsal and ventral cellular blastoderm. In the ventral region of the embryo, the gene is expressed continually in the presumptive mesoderm
throughout gastrulation and then in a segmented pattern in the ventral mesoderm layer at the extended germ-band stage. It is expressed in the myoblast cells that then
migrate dorsally from this layer. The expression pattern of CG11450 overlaps with that of the bHLH transcription factor Twist, suggesting that it may be
playing a role in the same mesoderm specification and myogenic pathways; therefore, this gene has been termed shout after the song "Twist and Shout," famously recorded by the Beatles (Moore, 2000).
The expression domain of CG11450/shout overlaps with that of twist. twist and
CG11450/shout continue to be expressed in the presumptive
mesoderm during gastrulation. At the extended germ-band stage, both
twist and CG11450/shout are expressed in
alternating high and low levels along the length of the mesoderm. These
alternating expression levels of twist are required for the
specification of muscle derived from this tissue. The pattern of
CG11450/shout expression in the ventral mesoderm implies
that it could have a similar role to twist in specification
of mesoderm derivatives. In Drosophila, Twist activates
Snail and other downstream, mesoderm-specific regulators such as
Tinman, Bagpipe, and Mef2; all of these proteins have vertebrate orthologs
implicated in mesoderm development. Hence, CG11450/Shout
represents a good candidate for both sequence and function conservation
across species (Moore, 2000).
CG18144 (Dm Hand) is the Drosophila ortholog of the vertebrate Hand proteins. CG18144 is 76% identical to dHand and 69% identical to eHand in the bHLH domain; both vertebrate proteins are involved in heart formation. Dm hand expression begins at
stage 10 of embryonic development in bilateral stripes in the ventral mesoderm. It continues to be expressed in two tissues derived from this mesoderm, the dorsal
vessel (heart) and the circular visceral musculature. In addition, at stage 13 Dm hand mRNA appears in a small subset of cells in the CNS (Moore, 2000).