Clusters of bHLH genes and the roles of their proteins as transcriptional regulators.
a, b. Enhancer of split [E(spl)-C, a] and achaete-scute (AS-C, b) Complexes. Genes are depicted as arrows that indicate the direction of transcription. E(spl)-C spans ~50 kb on the 3rd chromosome; AS-C spans ~100 kb near the tip of the X chromosome. Asterisks mark genes whose products have a 'basic helix-loop-helix' (bHLH) motif. Oddly, all of these genes are devoid of introns. Within each complex, the bHLH genes exhibit partial functional redundancy [2548, 4674]. AS-C bHLH proteins are transcriptional activators, while E(spl)-C bHLH proteins are repressors. Half circles mark E(spl)-C genes that belong to a separate ('Bearded') gene family.
c. Amino acids in bHLH motifs of E(spl)-C genes and hairy (AS-C genes not shown; for code see App. 1). Dashes are inserted to aid alignment. Numbers are residues counted from the N-terminus. E(spl)-C bHLH proteins are ~200 a.a. long, with ~60 a.a. in their bHLH domain. The 'conserved' row pertains to E(spl)-C genes: filled circles (invariance); unfilled circles (>50% but <100% identity); +s or -s (charged residues). The many +s define the 'basic' domain, which, in this subfamily, has proline at position 6 and arginine at position 13.
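The circle symbols of the 'conserved' row can be reproduced from any gapped alignment. The following is only a minimal sketch of that thresholding rule, not the procedure used to prepare panel c. The function name, the ASCII symbols ('*' for a filled circle, 'o' for an unfilled circle, '.' for neither), the choice to count gaps in the denominator, and the four toy sequences are all assumptions made for illustration; the sequences are not real E(spl)-C residues. Charge marks are omitted because the legend does not say how they combine with the circles.

```python
from collections import Counter


def conservation_row(aligned_seqs, gap="-"):
    """Return one symbol per alignment column.

    '*' = the same residue in every sequence (invariance)
    'o' = most common residue in >50% but <100% of sequences
    '.' = anything else, including columns dominated by gaps
    Assumption: gaps count toward the column total.
    """
    symbols = []
    for column in zip(*aligned_seqs):
        residue, count = Counter(column).most_common(1)[0]
        fraction = count / len(column)
        if residue == gap:
            symbols.append(".")
        elif fraction == 1.0:
            symbols.append("*")
        elif fraction > 0.5:
            symbols.append("o")
        else:
            symbols.append(".")
    return "".join(symbols)


# Hypothetical, equal-length toy alignment (not real sequence data).
toy_alignment = ["RKPL", "RKAL", "RRAL", "RK-M"]
print(conservation_row(toy_alignment))
```

In the toy alignment the first column is invariant, the second and fourth are shared by three of four sequences, and the third has no residue above 50%.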
d, e. bHLH proteins dimerize via helical domains (striped), adopt a scissors shape, and bind DNA via basic domains (black). Within each subunit the upper and lower helices touch (unlike in this cartoon). d. E(spl)-C bHLH dimers bind an 'N box' consensus sequence 'CACnAG' ('n' is a dimer-specific nucleotide) [3171, 3759], which Hairy can also bind, though it prefers CACGCG [3179, 4451]. Their C-terminal 'tails' (top) end in 'WRPW' (as does Hairy's) [2274, 3697], which recruits Groucho (Gro, black rectangle). Gro is a co-repressor (X'd arrow) [2064, 3278] that in turn binds a chromatin-modifying histone deacetylase, though E(spl)-C proteins also have Gro-independent effects. The gro gene resides in the E(spl)-C (a). e. Most other bHLH dimers (including AS-C) bind an 'E box' hexamer 'CAnnTG' [348, 3375], do not end in WRPW, and activate transcription (cf. the bHLH-PAS subgroup [4022, 4612]).
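The three hexamers named in d and e can be told apart mechanically. The sketch below is illustrative only: the dictionary labels and function names are hypothetical, 'n' is treated as any of A, C, G, or T, and only the strand as written is checked (reverse complements are ignored).

```python
import re

# Consensus hexamers quoted in the legend; lowercase 'n' means any base.
MOTIFS = {
    "N box": "CACnAG",
    "E box": "CAnnTG",
    "Hairy-preferred site": "CACGCG",
}


def consensus_to_regex(consensus):
    """Turn a consensus such as 'CAnnTG' into a compiled regular expression."""
    pattern = "".join("[ACGT]" if ch == "n" else ch for ch in consensus)
    return re.compile(pattern)


def classify_hexamer(hexamer):
    """Return the names of all motifs that the 6-base sequence matches."""
    hexamer = hexamer.upper()
    return [
        name
        for name, consensus in MOTIFS.items()
        if consensus_to_regex(consensus).fullmatch(hexamer)
    ]


for site in ("CACGAG", "CAGCTG", "CACGCG"):
    print(site, classify_hexamer(site))
```

Because the N-box and E-box consensus sequences differ at their fifth base (A vs. T), no single hexamer read on one strand can satisfy both.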
f, g. Promoter region of the m8 [a.k.a. E(spl)] gene. f. Details. Binding sites [S = Su(H); E = E box; N = N box] are boxed (f) or shown as bars (g). "?" denotes a hexamer (function unknown) found between invertedly repeated 'S' sites [171, 3075], whose inter-repeat distance is constant among promoters. +s and -s signify stimulation vs. inhibition of transcription. g. 'Wide-field' view of the m8 cis-enhancer region. Dashed rectangle marks the section shown in f, viz., bases -133 to -211 relative to the transcription start site ('+1 bp'; right-angle arrow). Negative feedback of m8 onto its own N boxes may help stabilize output and minimize noise [262, 1383, 3347]. The crowding of binding sites suggests steric competition or 'quenching' that mediates 'either/or' (vs. 'both/and') logic. Additional N boxes and 'S' sites (of varying affinity) map in the 700 bp span (g) [2317, 3075], though it is not known whether all of them are needed. The base at -185 (f) may be T or C. Cis-regulatory regions for all E(spl)-C genes (except gro) have been analyzed similarly [871, 3075].
Maps of loci were compiled from Fig. 3 of  amended as per [2383, 4767] (a) and from [636, 1538] (b). Sequences in c are from . Panels d and e are based on [1152, 2634], and panels f and g are adapted from [171, 2246, 2317, 3171]. See  for an exegesis of E vs. N boxes and an exploration of target gene preferences for AS-C vs. E(spl)-C proteins. See also App. 7.