NSF Sponsored Research Program

Studying the Role of DNA Methylation During Seed Development

We profiled soybean and Arabidopsis methylomes from the globular stage through dormancy and germination. CHH methylation increases significantly during development throughout the entire seed, primarily targets transposable elements (TEs), is maintained during endoreduplication, and drops precipitously within the germinating seedling. By contrast, no significant global changes in CG- and CHG-context methylation occur during the same developmental period. Loss of CHH and CHG methylation in an Arabidopsis ddcc mutant does not affect seed development, germination, or major patterns of seed gene expression. These results suggest that CHH and CHG methylation do not play a significant role in seed development or in the regulation of seed gene activity, including that of genes encoding major storage proteins. However, over 100 TEs are de-repressed transcriptionally in ddcc seeds, suggesting that the increase in CHH methylation during seed development may be a failsafe mechanism that reinforces transposon silencing and prevents lethal mutations that could disrupt seed germination.

Similarity between soybean and Arabidopsis seed methylomes and loss of non-CG methylation does not affect seed development


Seed Genome DNA Methylation Valleys (DMVs) are Enriched For Transcription Factor Genes

The precise mechanisms that control gene activity during seed development remain largely unknown. In our seed methylome project we showed that several genes essential for seed development, including those encoding storage proteins, fatty acid biosynthesis enzymes, and transcriptional regulators (e.g., ABI3, FUS3), are located within hypomethylated regions of the soybean genome [Lin et al., Proc. Natl. Acad. Sci. USA 114, E9730-E9739 (2017)]. These hypomethylated regions are similar to the DNA methylation valleys (DMVs), or canyons, found in mammalian cells. In this project we asked to what extent DMVs are present within seed genomes and what role they might play in seed development. We scanned soybean and Arabidopsis seed genomes from postfertilization through dormancy and germination for regions that contain <5% or <0.4% bulk methylation in the CG, CHG, and CHH contexts across all developmental stages. We found that DMVs represent extensive portions of seed genomes, range in size from 5 to 76 kb, are scattered throughout all chromosomes, and are hypomethylated throughout the entire plant life cycle. Notably, DMVs are greatly enriched in transcription factor (TF) genes and other developmental genes that play critical roles in seed formation. Many DMV genes are regulated with respect to seed stage, region, and tissue, and contain H3K4me3, H3K27me3, or bivalent marks that fluctuate during development. Our results indicate that DMVs are a unique regulatory feature of both plant and animal genomes, and that a large number of seed genes are regulated in the absence of methylation changes during development, probably by the action of specific TFs and epigenetic events at the chromatin level.
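The scan described above can be illustrated with a minimal Python sketch. It is not the pipeline used in the project. It assumes the genome has already been divided into fixed-size bins and that, for each bin, methylated and total cytosine call counts are available for every combination of sequence context and developmental stage. The function name, the 1 kb bin size, and the 5 kb minimum length are illustrative assumptions; only the 5% and 0.4% thresholds come from the text.

```python
from typing import List, Sequence, Tuple

# (methylated cytosine calls, total cytosine calls) for one context at one stage
Counts = Tuple[int, int]


def find_low_methylation_regions(
    bins: Sequence[Sequence[Counts]],
    bin_size: int = 1000,      # assumed bin width in bp (hypothetical)
    threshold: float = 0.05,   # <5% bulk methylation; use 0.004 for <0.4%
    min_length: int = 5000,    # assumed minimum region length in bp (hypothetical)
) -> List[Tuple[int, int]]:
    """Return (start, end) coordinates of runs of consecutive low-methylation bins.

    A bin counts as low only if every (context, stage) sample in it has
    coverage and a bulk methylation level (methylated / total) below threshold.
    """
    regions: List[Tuple[int, int]] = []
    run_start = None
    # A trailing None sentinel closes a run that reaches the end of the sequence.
    for i, samples in enumerate(list(bins) + [None]):
        low = samples is not None and len(samples) > 0 and all(
            total > 0 and meth / total < threshold for meth, total in samples
        )
        if low and run_start is None:
            run_start = i
        elif not low and run_start is not None:
            start, end = run_start * bin_size, i * bin_size
            if end - start >= min_length:
                regions.append((start, end))
            run_start = None
    return regions


# Toy data: 3 contexts x 2 stages = 6 samples per bin.
low_bin = [(1, 100)] * 6                    # 1% methylation in every sample
high_bin = [(1, 100)] * 5 + [(30, 100)]     # one sample at 30% methylation
toy = [high_bin] * 2 + [low_bin] * 6 + [high_bin] + [low_bin] * 2 + [high_bin]

print(find_low_methylation_regions(toy))                   # 5% threshold
print(find_low_methylation_regions(toy, threshold=0.004))  # 0.4% threshold
```

Requiring every context and stage to fall below the threshold mirrors the "over all developmental stages" criterion in the text. Bins with no coverage are treated as not low, which is a conservative choice made for this sketch.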

Seed genome hypomethylated regions are enriched in transcription factor genes

Profiling the Transcriptomes of Every Seed Region, Subregion, and Tissue Throughout Soybean and Arabidopsis Seed Development — An Atlas of Seed Gene Activity

We used laser capture microdissection (LCM) to isolate specific soybean and Arabidopsis seed regions (e.g., seed coat, endosperm, and embryo), subregions (e.g., embryo proper and suspensor), and tissues (e.g., cotyledon adaxial and abaxial parenchyma) at different stages of development. RNA-Seq (soybean) and microarray (Arabidopsis) transcriptome profiling experiments were carried out to determine (1) the spectrum of genes that are active in different parts of the seed during development, (2) which transcription factors are localized in specific seed regions and subregions, and (3) the biological processes that are partitioned within a seed and may play important roles in seed differentiation and/or function. We profiled the mRNA sets present in 40 soybean and 42 Arabidopsis seed compartments captured using LCM, from shortly after fertilization through the early maturation stage of development. All transcriptome data have been deposited in GEO and on our interactive NSF Project Seed Database (seedgenenetwork.net). This database contains analysis tools that allow users to browse the database by gene identification, gene ontology, and gene function, as well as to compare gene activity in different seed regions during development. The soybean and Arabidopsis LCM transcriptome datasets (1) provide one of the most comprehensive descriptions of gene activity within seeds, (2) contain unique insights into the functions of understudied seed regions, such as the suspensor, and (3) are a unique resource for uncovering the gene networks controlling seed form and function. Functional knockout studies were carried out with 53 soybean region- and subregion-specific transcription factors (TFs) to determine what role they might play during seed development.

Using Genomics to Study Legume Seed Development

Comprehensive developmental profiles of gene activity in regions and subregions of the Arabidopsis seed

Down-Regulating the Expression of 53 Soybean Transcription Factor Genes Uncovers a Role for SPEECHLESS in Initiating Stomatal Cell Lineages during Embryo Development


Identifying Regulatory Networks That Control Seed Development

We have generated a regulatory atlas of soybean seeds by identifying transcription factor (TF) mRNAs that are specific for every seed region and subregion throughout development (see project outlined above). A series of ChIP-Seq experiments is being carried out to determine (1) which downstream genes these TFs regulate, (2) which DNA regulatory motifs they interact with, and (3) how region- and subregion-specific TFs are organized into regulatory networks that program the developmental events that give rise to a soybean seed.

Experiments carried out at UC Davis with our collaborator, John Harada, investigated target genes that interact with one or more soybean TFs that program major developmental and physiological events during different periods of seed development. These include the seed-specific TFs LEC1, AREB3, bZIP67, and ABI3. These experiments showed that LEC1 regulates distinct gene sets at different developmental stages and controls distinct biological processes by interacting with AREB3, bZIP67, and ABI3 in specific combinations. The DNA sites bound by these TFs are organized into cis-regulatory modules (CRMs) that lie in close proximity to their target genes and are enriched for DNA motifs known to bind LEC1, AREB3, bZIP67, and ABI3. Transcriptional assays in protoplasts from early maturation stage embryos validated that the CRMs are functional.

ChIP-Seq experiments carried out in our laboratory at UCLA identified target genes for TFs that are specific for different regions and subregions of soybean post-fertilization stage globular seeds. These include AGL62, YAB1, and WOX9 TFs that are specific for the endosperm, embryo proper, and suspensor, respectively. In addition, comparative genomic approaches are being used to identify embryo-proper- and suspensor-specific TFs present in seeds across the plant kingdom to uncover major regulatory pathways that program region-specific events during early seed development.

Combinatorial interactions of the LEC1 transcription factor specify diverse developmental programs during soybean seed development

Using Giant Bean Embryos To Dissect Early Embryo Development

The Scarlet Runner Bean (Phaseolus coccineus) provides a novel opportunity for dissecting the molecular processes controlling plant embryo development. At the globular stage, the Scarlet Runner Bean embryo is ~100 times larger than that of Arabidopsis, contains a highly polyploid suspensor of 200 cells, and can be isolated directly from developing seeds within the flower. Because of the embryo's large size, the embryo proper and suspensor regions can be separated from each other manually and used directly for biochemical and molecular studies. Almost 50 years ago, the late Ian Sussex and his collaborators pioneered the use of giant Scarlet Runner Bean embryos and provided the first insights into the molecular processes controlling early embryogenesis; for example, the suspensor produces signals that are required for embryo proper development. During this same period, others demonstrated that hormones, such as gibberellic acid, are synthesized within the giant Scarlet Runner Bean suspensor and contribute to embryo proper formation. The ability to manually isolate large numbers of globular stage embryo proper and suspensor regions, in addition to using state-of-the-art laser capture microdissection (LCM) techniques, provides a unique opportunity to use the Scarlet Runner Bean to gain entry into the earliest events in plant embryogenesis, complementing the elegant studies that can be carried out with Arabidopsis, maize, and rice. In addition, Scarlet Runner Bean is a close relative of the agronomically important Common Bean (Phaseolus vulgaris), which has a similarly large embryo and substantial genomic resources, including a genome sequence, that can be used as a surrogate for the Scarlet Runner Bean.

Our laboratory has resurrected the use of Scarlet Runner Bean for the study of early plant embryo development. We have sequenced thousands of expressed sequence tags (ESTs) from embryo proper and suspensor regions, used in situ hybridization and RNA-Seq to identify embryo proper- and suspensor-specific mRNAs, uncovered a suspensor cis-regulatory module that activates region-specific transcription of genes within the suspensor shortly after fertilization, and generated a rough draft of the Scarlet Runner Bean genome. Recently, we compared the transcriptomes of Scarlet Runner Bean with those of the Common Bean, soybean, and Arabidopsis. These experiments showed that giant bean suspensors carry out highly specialized metabolic functions, particularly those involved in hormone function, as compared with the less specialized soybean and Arabidopsis suspensors. In addition, we identified a set of embryo proper- and suspensor-specific transcription factors (TFs) that are shared by these species and might play a role in the specification processes controlling embryo proper and suspensor development shortly after fertilization. ChIP-Seq experiments are being carried out with these conserved TFs to uncover their target genes and construct regulatory networks controlling the earliest stages of embryo development.

Regional Localization of Suspensor mRNAs during Early Embryo Development

Using Genomics to Study Legume Seed Development

Identification of cis-regulatory sequences that activate transcription in the suspensor of plant embryos

Using giant scarlet runner bean embryos to uncover regulatory networks controlling suspensor gene activity

A cis-regulatory module activating transcription in the suspensor contains five cis-regulatory elements

A shared cis-regulatory module activates transcription in the suspensor of plant embryos

Using Giant Scarlet Runner Bean (Phaseolus coccineus) Embryos to Dissect the Early Events in Plant Embryogenesis

Scarlet Runner Bean EST Project Website