The Drosophila embryo as a tabula rasa for the epigenome

The control of gene expression in eukaryotes relies on how transcription factors and RNA polymerases manipulate the structure of chromatin. These interactions are especially important in development as gene expression programs change. Chromatin generally limits the accessibility of DNA, and thus exposing sequences at regulatory elements is critical for gene expression. However, it is challenging to understand how transcription factors manipulate chromatin structure and the sequence of regulatory events. The Drosophila embryo has provided a powerful setting to directly observe the establishment and elaboration of chromatin features and experimentally test the causality of transcriptional events that are shared among many metazoans. The large embryo is tractable by live imaging, and a variety of well-developed tools allow the manipulation of factors during early development. The early embryo develops as a syncytium with rapid nuclear divisions and no zygotic transcription, with largely featureless chromatin. Thus, studies in this system have revealed the progression of genome activation triggered by pioneer factors that initiate DNA exposure at regulatory elements and the establishment of chromatin domains, including heterochromatin, the nucleolus, and nuclear bodies. The de novo emergence of nuclear structures in the early embryo reveals features of chromatin dynamics that are likely to be central to transcriptional regulation in all cells.


Introduction
Development in multicellular organisms relies on differential gene expression to direct cell proliferation and differentiation. The identification of conserved transcription factors in genetic screens for embryonic defects in Drosophila in the 1980s reformulated our concept of development 1,2 . Later studies in other organisms revealed that the genetic control of development is conserved across eukaryotes, with common regulatory principles. A key issue is how transcription factors interact with the chromatin packaging of the genome for differential gene expression, even though the genome in different cell types is identical. Just as exceptional features of Drosophila embryogenesis provided a productive setting for identifying developmental regulators, studies in early embryos are now unraveling longstanding issues of fundamental mechanisms of the chromatin control of transcription.

The onset of chromatin organization in Drosophila embryos
The challenge in understanding transcriptional mechanisms is ordering molecular events and dependencies in living cells. Chromatin features have been correlated to gene expression and repression in many cell types, but it is more difficult to distinguish which features cause others and which actually regulate gene activity. Features of Drosophila embryogenesis are now being exploited to address this gap.
Like most animals, Drosophila females make a large egg that is loaded with proteins and untranslated mRNA. Fertilization by an incoming sperm triggers embryogenesis. A broad outline of chromatin changes in early embryogenesis is shown in Figure 1, where stages are labeled by the number of nuclear cycles (nc) since fertilization. The early stages of embryogenesis rely exclusively on maternally provided material, without any zygotic gene expression. The incoming sperm genome arrives packaged in protamines, which are removed and replaced with the H3.3 histone variant 3 . In its first hours, the embryo develops as a multinucleate syncytium with totipotent nuclei. Proliferative nuclear cycles occur synchronously and extremely fast, doubling nuclei every 10 minutes, so DNA replication, chromatin assembly, and mitosis occur at what must be close to maximal rates. These first rounds of nuclear replication and division use maternally supplied core histones and a variant histone H1 called bigH1 4 to duplicate chromatin. The rapidity of these divisions requires nearly simultaneous replication of the entire genome, with replication origins scattered at random roughly every 10 kb 5 . The speed of early divisions means that syncytial nuclei largely lack any of the localized chromatin features that characterize epigenomes in other cell types, but distinctions appear as nuclear cycles slow down, zygotic transcription begins, and the nuclei are partitioned to form the cellular blastoderm. In this way, the transition from the syncytial to the cellular embryo provides a tabula rasa for observing the onset and order of transcriptional programs and epigenomic specialization. Zygotic transcription of a few genes begins around nc8, reaching a notable level of transcription from about 100 sex determination, cellularization, and early patterning genes around nc11, a time referred to as the minor zygotic genome activation (minor ZGA) 6,7 . Widespread transcription of thousands of genes begins at nc14, referred to as the major ZGA, during the switch from maternal to zygotic control of morphogenesis. But how does zygotic transcription begin in the early embryo, and how does this alter the chromatin landscape?
Figure 1. (A) Nuclei are shown in black within the grey cytosol of the embryo. Stages are labeled as the number of nuclear cycles (nc) after fertilization. In nc1, protamines are removed from the male haploid pronucleus (blue) and repackaged with the H3.3 histone variant before DNA replication and mitosis. Fusion of the male and female pronuclei occurs in nc2. The first nuclear cycles up to nc13 occur in a syncytium with rapid progression through the S phase and mitosis without gap phases. (B) The timing of the appearance of chromatin features, including heterochromatin, Polycomb domains, and the histone locus bodies (HLBs), is shown. About 10 genes first activate in nc8, hundreds of genes activate in the minor zygotic genome activation (ZGA) around nc11, and thousands of genes activate in the major ZGA around nc14.
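The scale of the syncytial divisions described above can be made concrete with a little arithmetic. The following Python sketch is an idealization, not a measurement: it assumes every nucleus divides at every cycle, that every cycle takes the same 10 minutes (in the embryo, cycles slow down as the major ZGA approaches), and that origins are evenly spaced at 10 kb. The function names and the genome size of about 180 Mb used in the example are illustrative assumptions that are not taken from the text.

```python
def nuclei_at_cycle(nc: int) -> int:
    """Idealized number of nuclei during nuclear cycle nc (nc1 = 1 nucleus).

    Assumes perfect doubling at every cycle, with no nuclei lost.
    """
    if nc < 1:
        raise ValueError("nuclear cycles are numbered from 1")
    return 2 ** (nc - 1)


def elapsed_minutes(nc: int, minutes_per_cycle: float = 10.0) -> float:
    """Time to reach the start of cycle nc if all earlier cycles are equally long."""
    if nc < 1:
        raise ValueError("nuclear cycles are numbered from 1")
    return (nc - 1) * minutes_per_cycle


def origin_count(genome_bp: int, spacing_bp: int = 10_000) -> int:
    """Approximate number of replication origins for evenly spaced origins."""
    return genome_bp // spacing_bp


if __name__ == "__main__":
    # Cycles highlighted in the text: first transcription, minor ZGA, major ZGA.
    for nc in (8, 11, 14):
        print(f"nc{nc}: {nuclei_at_cycle(nc)} nuclei, "
              f"~{elapsed_minutes(nc):.0f} min of idealized cycling")
    # Assumed genome size (hypothetical round number for illustration).
    print("origins:", origin_count(180_000_000))
```

Under these assumptions, a single nucleus becomes thousands by nc14 in little more than two hours of idealized cycling, which illustrates why chromatin assembly must keep pace with replication at every cycle.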
Active promoters and enhancers in eukaryotes are characterized by a nucleosome-depleted region (NDR) where transcription factors bind. A small number of transcription-independent NDRs are detected in the nuclei of early syncytial embryos, probably due to intrinsic anti-nucleosomal sequence features 8 . At this time, developmentally regulated genes lack any nucleosome phasing around promoters or regulatory elements, but NDRs appear when transcription begins. At least three transcription factors have been implicated in driving zygotic transcriptional activation. The zinc-finger protein Zelda (Zld) is first translated beginning around nc8 from maternal mRNA and binds at thousands of sites 9-11 . These early sites include the promoters and enhancers of genes activated in the minor ZGA but also many regulatory elements of genes that will not be activated until the major ZGA. A second factor implicated in early embryo transcription is GAGA factor (GAF), whose binding first appears in nc9 12 . A second GAGA motif-binding protein called CLAMP is involved in the activation of some major ZGA genes 13 . These key transcription factors also cooperate with spatially localized transcription factors to establish the body plan of the embryo (for examples, see 14-18).

Pioneer factors… how do they work?
How do Zld and GAF bind chromatin in early nuclei? In many developmental systems, pioneer transcription factors are thought to convert inaccessible chromatin at binding sites into exposed DNA and thereby trigger new programs of gene expression 19 . For example, the mammalian pluripotency factors Sox2 and Oct4 bind DNA wrapped on the surface of a nucleosome and either shift its position or displace histones, thereby creating an NDR; exposure of this DNA then allows binding of secondary transcription factors. Thus, creating an NDR at binding sites is a key step for activating developmentally regulated enhancers and promoters, underlying the ability of these factors to reprogram cells from diverse types. These properties were first identified for the archetypal pioneer factor FoxA1, which initiates endoderm development 19 . Recent structural studies of pioneer factors docked onto nucleosomes in vitro have fleshed out how factors may bind specific DNA motifs even when wrapped around a histone octamer and how the binding of one factor may enhance the binding of a second 20 . However, the mechanism by which pioneer factors first bind sites in vivo has been more difficult to pin down. For example, a large fraction of binding sites for the FoxA1 factor depends on the cell type where it is expressed, implying that other bound transcription factors influence where FoxA1 can bind 21,22 . Close examination of FoxA1/2-binding sites in differentiating endoderm cells showed that these sites are partially accessible even before these factors are produced 23,24 . This makes it difficult to determine the relationship between site accessibility and pioneer factor activity, especially as the relevant cell stages in the early embryo cannot be easily observed.
Since the Drosophila embryo starts with a largely featureless epigenome, it provides insight into the chromatin sequence of pioneering. The early factors Zld, GAF, and CLAMP are pioneer factors in that each can bind nucleosomal DNA in vitro 13,25,26 . The creative use of mutations, engineered constructs, or microinjection into the syncytial embryo is deciphering the molecular activities of these pioneer factors (Box 1). Depletion of Zld, CLAMP, or GAF from early embryos has identified sets of binding sites where new accessibility depends exclusively on only one factor 9,12,13,18,27,28 and other sets that require a combination of Zld and CLAMP or of Zld and GAF. Thus, at some sites, a single factor suffices to drive accessibility; at others, multiple transcription factors cooperate. This resembles the interdependency of pioneer factors in mammalian systems that allow for context-specific binding.
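The logic of sorting binding sites by factor dependency can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis used in the cited studies: it assumes that, for each depleted factor, one already has the set of sites that lose accessibility, and it treats a site lost under two different depletions as requiring both factors. The function name `classify_sites` and the site labels are hypothetical.

```python
from typing import Dict, Set


def classify_sites(lost_on_depletion: Dict[str, Set[str]]) -> Dict[str, str]:
    """Label each site by the factor(s) whose depletion removes its accessibility.

    lost_on_depletion maps a factor name to the set of sites that lose
    accessibility when that factor is depleted (hypothetical input format).
    """
    factors_by_site: Dict[str, Set[str]] = {}
    for factor, sites in lost_on_depletion.items():
        for site in sites:
            factors_by_site.setdefault(site, set()).add(factor)
    # A single factor means that factor suffices; several means cooperation.
    return {
        site: " + ".join(sorted(factors))
        for site, factors in factors_by_site.items()
    }


if __name__ == "__main__":
    # Made-up site identifiers, for illustration only.
    example = {
        "Zld": {"site_a", "site_b", "site_c"},
        "GAF": {"site_b", "site_d"},
        "CLAMP": {"site_c"},
    }
    for site, label in sorted(classify_sites(example).items()):
        print(site, "->", label)
```

In this toy input, `site_a` and `site_d` each depend on a single factor, while `site_b` and `site_c` fall into the cooperative classes. Real data would also need a threshold for what counts as lost accessibility, which is left out here.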

Box 1. Systems to manipulate chromatin components in the early Drosophila embryo
In the Drosophila embryo, the first 13 nuclear divisions occur synchronously in a large syncytium with a shared cytoplasm and are accessible for both observing and manipulating chromatin. A wide range of methods and techniques have been used in the studies discussed here. Embryos can be precisely staged by staining fixed material with DNA dyes, and detailed pictures of gene activation and the formation of nuclear bodies have been obtained using antibody staining and DNA in situ hybridization combined with high-resolution microscopy. Other well-developed methods use maternal genotypes that load the egg with fluorescently labeled histones, chromatin proteins, or transcription factors for live imaging during embryogenesis. Fluorescent protein fusions with transcription factors and RNA polymerase II (RNAPII) subunits have been tracked in this way, and nascent transcripts have been detected with fluorescent MS2 coat protein (MCP) fusions that bind engineered hairpins in mRNAs 29 . The embryonic syncytium can be manually injected without adversely affecting development, and studies have injected mRNA for fluorescent protein fusions with DNA-binding proteins to image specific DNA sequences or fluorescently labeled antibody fragments to image specific histone modifications.
Genetics and injection provide strategies to test function in the early embryo. Compounds that interfere with transcription, DNA replication, or chromatin remodeling can be injected. Since the early embryo relies entirely on maternally provided products until the beginning of the cellular blastoderm stage, many components can be altered by modifying the genotype of the mother. A range of genetic strategies are also available. For factors that are required only in the embryo, mutant mothers will produce eggs that lack a specific factor. In other cases, an interesting factor may also be important in other stages of development. Here, mitotic recombination schemes have been designed that produce homozygous mutant germline clones from heterozygous mothers, thereby giving eggs that lack the factor. Finally, for factors that are essential in the female germline, double-stranded RNA targeting a specific factor can be loaded into the egg, and this can accomplish knockdown in the embryo 30 . More sophisticated strategies are now applied in the embryo using inducible degradation components maternally loaded into the egg 31 , or optogenetic inactivation 25,32 to trigger the elimination of factors during early embryogenesis. One particular method inspired by 'anchors away' technology in budding yeast uses the Drosophila Jabba protein to rapidly deplete GFP-labeled factors in the early embryo. Jabba normally localizes to lipid droplets in the early embryo, but a Jabba-anti-GFP nanobody fusion can sequester any target proteins outside of nuclei. Maternally loaded or injected mRNA encoding Jabba-anti-GFP nanobody has been used to acutely deplete histone methyltransferase proteins 33 , Zld, and Mxc 34 .
Live imaging has been used to follow the binding of Zld and transcription in the nuclei of early embryos. While earlier studies had shown that DNA replication initiates immediately after mitosis, and transcription slightly later, a recent preprint has detailed the link between replication and transcriptional initiation 34 . Zld accumulates in small foci in nuclei, and clusters of RNA polymerase II (RNAPII) and nascent gene transcripts appear adjacent to Zld foci 17,35 . The authors of the preprint used an innovative sequestration method to rapidly deplete Zld from the embryo. This system uses an anti-GFP nanobody fused to the Jabba protein, which normally localizes to cytosolic lipid droplets in the embryo. Expression of this "JabbaTrap" fusion in embryos with a target GFP fusion protein thus sequesters the target away from chromatin 33 . Injection of JabbaTrap mRNA into embryos with a Zld-GFP fusion showed that Zld is required for the formation of RNAPII clusters. Intriguingly, blocking DNA replication with multiple different inhibitors greatly reduces the formation of Zld clusters and of RNAPII clusters. Clusters that do form are delayed and smaller. Thus, DNA replication appears to enhance the activity of Zld in early embryos.
New nucleosomes behind DNA replication forks are assembled with acetylated H3 and H4 histones, and so while these modifications are enriched in early embryonic nuclei, most other histone modifications associated with transcriptional regulation are absent 36 . However, when Zld binds, nucleosomes around these sites acquire histone H3K27 acetylation 8 . CBP is the main H3K27 acetyltransferase, and eliminating the Drosophila homolog Nejire (Nej) in early embryos blocks RNAPII clustering and transcription 34 . This argues for a sequence of events as follows: first, DNA replication enhances the binding of Zld and clustering of RNAPII on chromatin; second, Zld stimulates histone acetylation by Nej; and, third, acetylation promotes RNAPII elongation 34 . But why would DNA replication enhance the activity of a nucleosome-binding pioneer factor? Perhaps the binding of Zld is enhanced by cell cycle-coupled protein modifications of the factor. Alternatively, early models for chromatin regulation in development proposed that the chromatin dynamics during DNA replication may provide a "window of opportunity" to bind transcription factors and start new transcriptional programs 37 . Nucleosome density and positioning are transiently reduced behind replication forks 38 . For some transcription factors, disorganized nucleosome positioning may delay binding, but others (like GAF) rapidly bind newly replicated DNA and position nucleosomes. Evidence that most Zld and RNAPII signals in early Drosophila embryos depend on DNA replication supports the idea of fast binding behind the replication fork. Importantly, a small fraction of Zld shows delayed binding in blocked embryos, and this might be due to the nucleosomal-binding activity of Zld. While this division between replication-dependent and nucleosomal-binding modes of Zld binding may be more extreme in the fast cycles of the early embryo, DNA replication may promote new transcriptional programming more generally.
The fast nuclear division cycle of early embryos has also revealed factor-specific distinctions for mitotic inheritance. While many transcription factors dissociate as mitotic chromatin condenses, a long-standing question in mammalian systems has been whether some factors persist in maintaining stable patterns of gene expression in cell lineages 39 . Indeed, both properties appear in the early Drosophila embryo. Live imaging of Zld protein shows that the factor releases from chromatin as mitosis initiates 40 , and the chromatin accessibility of Zld-bound sites is reduced as nuclei divide 28 . In contrast, GAF binds both interphase and mitotic chromatin, and mitotic GAF-bound sites remain accessible 41 . This difference appears to result from the longer residence time of GAF on chromatin that allows persistent binding to mitotic chromosomes. Thus, GAF fits the definition of a mitotic bookmarking transcription factor and appears to be functionally important, as imaging nascent transcription of a target promoter has shown GAF-dependent mitotically heritable expression 41 .
There are additional transcription factors that activate genes in the major ZGA 42-44 . These can be understood in the framework that pioneering activity is a matter of degree, as transcription factors vary in affinity from site to site and cooperatively interact with other chromatin proteins 45 . However, one distinct class comprises about 2,000 promoters for housekeeping genes that are activated in the major ZGA but lack binding by any known pioneer factors. Instead, these promoters are enriched for the H2AV histone variant (the Drosophila ortholog of H2A.Z and H2A.X variants) 46 . H2AV is deposited at these promoters well before these genes are zygotically expressed by a conserved SWR1-type chromatin remodeler called Domino. While the elimination of Domino from the early embryo prevents H2AV accumulation and gene activation, chromatin accessibility is not altered. It remains to be determined whether these sites are targeted by pioneering core promoter factors or whether intrinsic nucleosome-positioning features of their DNA sequences make them accessible.

Accretion of nuclear organization
The de novo establishment of nuclear organization in the early embryo includes the appearance of nuclear bodies and distinctive chromatin regions, including the histone locus body (HLB), the nucleolus, and heterochromatin. Observation of how these features appear reveals the steps in establishing phase-separated compartments within the nucleus. The HLB first appears in nc11 when the histone genes are transcribed from a gene array on chromosome 2, but DNA-binding factors, including the CLAMP pioneer factor and the NPAT homolog Mxc, first arrive at the histone genes one cycle earlier in nc10 47 . This nuclear body appears to be anchored by a DNA-bound seed, while processing factors may localize into the HLB through interactions with histone mRNA 48 . A similar sequence drives the formation of the nucleolus, where transcription of the rDNA genes from arrays on the X and Y chromosomes begins in nc11, and processing factors appear in nc13 49 (Figure 1). These studies have led to a model where the early nucleus is supersaturated for nuclear body components, and factor binding followed by transcription triggers reliable formation of structured bodies.
Observations in the early embryo have also detailed the temporal relationship between histone modifications and gene regulation. Chromatin in the early embryo is globally hyperacetylated because of replication-coupled nucleosome assembly, but active modifications, such as H3K4 and H3K36 methylations, first appear after gene transcription is initiated in the major ZGA 36 . Markers of Polycomb gene silencing, such as H3K27 methylation and Polycomb complexes, first accumulate in nc14 8,50 (although a conflicting report described localized H3K27 methylation at all cycles of the early embryo 51 ). In Drosophila, regulatory elements called Polycomb Response Elements (PREs) are required for gene silencing, and PREs first become accessible in nc11 52 . Many PREs are binding sites for GAF, so the general mechanism of pioneering activity initiating zygotic gene expression also applies to the establishment of developmental gene silencing.
The establishment of heterochromatin in the embryo may operate by a different mechanism. In most cell types, repetitive transposons and short satellite repeats replicate very late in the S phase and become wrapped in nucleosomes marked by histone H3K9 methylation, comprising a repressive heterochromatic state. In the fast cycles of the early embryo, all sequences replicate at the same time, but repetitive sequence blocks begin to show heterochromatic properties at different times in later stages. Some satellite blocks become compacted in nc8 and gain H3K9 methylation in nc12, which then accumulates to high levels in successive cycles as interphases get longer 33 . More widespread heterochromatic features, including delayed replication and then HP1 binding, are apparent in nc13 and nc14 53,54 . Intriguingly, the embryonic establishment of heterochromatin on a specific satellite block depends on its maternal presence, suggesting that maternally-loaded RNAs from this satellite are required to establish heterochromatin in the embryo 55 . Similarly, maternal PIWI-interacting RNAs (piRNAs) may work on other repetitive sequences in the nucleus 56 . These dependencies may be distinct from the activities that maintain mature heterochromatin in later cell types.
The power of observation in the early embryo is now being applied to deciphering how the 3D organization of the nucleus is established. Chromatin in the nucleus displays a hierarchy of interactions, where loops between regulatory elements organize genes within topologically associated domains (TADs), which are further aggregated into inactive and active compartments 57 . This organization is thought to result from both cohesin binding that constrains the chromatin fiber and from non-specific associations between decondensed transcribed chromatin and inactive chromatin. As the early Drosophila embryo is transcriptionally silent, this stage provides a natural setting to track the acquisition of nuclear organization. Mapping of 3D organization in Drosophila embryos revealed that only a few boundaries are present in very early embryos, and these coincide with constitutive H2AV-containing promoters 46 . Additional TADs and compartments appear in nc8 when Zld binds to chromatin 58,59 . This order of events demonstrates that factor binding alone can create domains and that transcription is not needed. Factor binding at TAD boundaries and at looping elements creates a framework of invariant cell type-independent domains in embryos, but only a subset of these interactions are productive in specific cell types 59-61 . Direct imaging of individual nuclei in the Drosophila embryo has further refined our understanding of nuclear organization. While contact maps have led to the impression of extensive 3D organization within nuclei, tracing chromatin paths 62-64 or imaging the juxtaposition of transcribing genes 35 shows that 3D organization in individual nuclei is very diverse, with structures detected by contact mapping being rare and transient. This places important limits on the functional significance of higher-order structures for gene expression, where 3D interactions appear to be a pre-condition for gene activation but do not determine it.

Conclusions
Whereas the core questions in chromatin biology were posed many years ago, the intricacy of many eukaryotic systems has complicated the answers. The blank slate of chromatin in the early embryo has provided a simple setting for probing the mechanics of how transcription factors first interact with chromatin, how phase-separated bodies within the nucleus are first established, and how 3D nuclear organization comes to be. The Drosophila embryo has revealed that the binding of a small number of key DNA-binding proteins can trigger the organization of much of the nucleus, which successive gene activity elaborates. The sequence by which nuclear features emerge implies that they are consequences of "sticky" weak electrostatic interactions. These issues even apply to universally accepted concepts, such as how enhancers act on promoters to activate transcription. While textbooks describe that promoters and enhancers loop together in 3D space to form an activating structure, such contacts are rare and transient in direct imaging. How can transient contacts promote the slower process of transcriptional activation? One class of models suggests that an enhancer leaves a long-lasting mark on promoters, perhaps as histone modifications or stabilizing bound factors, and the persistence of these effects at promoters stimulates transcription. More radical models propose that enhancers produce diffusible molecules that move across nuclear distances to activate promoters 65 . The ability to resolve the sequence of nuclear events in time and space in the early embryo holds promise in understanding these fundamental issues in gene regulation.