A fundamental question in biology is how the diverse cell types observed in a multicellular organism are encoded by a single genome sequence. However, the diversity and evolutionary dynamics of cell type identity programs remain largely unexplored beyond selected tissues in a few species. Similarly, little is known about the emergence of complex genome regulatory mechanisms that support cell type-specific programs and long-term cellular memory, for example genome spatial compartmentalization and repressive chromatin modifications.
In recent years, the development of advanced functional genomics technologies has revolutionized the study of cell types and genome regulation, even at single-cell resolution. This opens the way to the comparative analysis of genome regulation in species that represent diverse levels of biological complexity, ranging from unicellular temporal differentiation and simple multicellular behaviours (e.g. in some protists), through loosely integrated ensembles of cell types with limited diversity (e.g. in early-branching animals), to organisms with elaborate tissue and body plan organization (e.g. in bilaterian animals).
What are the genome regulatory mechanisms linked to the origin of cell type differentiation? When did major animal cell types such as neurons emerge and did they evolve more than once? And how do animal cell type gene regulatory networks evolve?
To answer these and related questions, we combine high-throughput chromatin profiling and single-cell genomics technologies with advanced computational methods in order to dissect and compare cell type programs and genome regulatory architectures in phylogenetically diverse systems.
1. Single-cell genomics of cell type diversity
Cell types are the basic functional units of multicellular organisms. Traditional cell type classification schemes relied on microscopy and, more recently, on molecular fingerprinting by in situ hybridization profiling. However, these approaches require a priori selection of gene markers; they are difficult to scale to many expressed genes simultaneously; and they are not readily applicable to all species or life stages, in particular adult specimens. As a consequence, we lack a systematic and phylogenetically inclusive understanding of the diversity of animal cell types.
In the lab, we apply and develop single-cell transcriptomics methods to characterize cell types in an unbiased manner in whole adult organisms from diverse animal lineages. We focus our single-cell sampling efforts mainly on non-bilaterian lineages: Porifera, Ctenophora, Placozoa and Cnidaria (Sebé-Pedrós et al. 2018, Levy, Elek et al. 2021, Najle, Grau-Bové et al. 2023).
The study of these early-branching animal lineages is essential to understand the evolutionary origins of major cell types (e.g. neurons, secretory cells, stem cells, muscle fibers, epithelial cells) and of genome regulatory mechanisms linked to animal multicellularity (see below).
(Figure sources: Nevill Willmer, 1960; Valentine JW, et al. 1994.)
2. Comparative modelling of cell type GRNs
A central question in evolutionary biology is how animal cell types originate and diversify. Cell types access distinct combinations of genetic elements (genes and cis-regulatory elements) and this process results in unique transcriptional programs that define the cellular phenotype. While scRNA-seq is a powerful tool to generate cellular taxonomies, understanding cell type evolution will require linking specific genomic changes affecting these transcriptional programs (gene family evolution, regulatory sequence dynamics, changes in transcription factor (TF) binding preferences, etc.) to changes in cellular phenotypes. In this direction, we can leverage single-cell data, together with genome-wide cis-regulatory element mapping and TF binding analysis, to start dissecting cell type Gene Regulatory Networks (GRNs) in non-model species (even if only rudimentarily, e.g. Sebé-Pedrós et al. 2018). Eventually, the comparison of whole-organism cell type GRNs at different phylogenetic timescales will allow us to understand how new cell identities emerge, as well as reveal the cis-regulatory logic of cell type specification in different lineages.
3. Origin and evolution of the animal regulatory genome
Despite the fact that all cells within a clonal multicellular organism share quasi-identical genetic information, metazoans show functionally specialized cell types. This cell-specific genome interpretation is mediated by diverse genome regulatory processes, ranging from chromatin chemical modifications to physical folding of the chromatin fiber. In fact, the very origin of multicellularity and cell differentiation has been hypothesized to be linked to the emergence of epigenomic mechanisms such as repressive chromatin states or chromosomal compartmentalization. We have shown, for example, that some of the closest unicellular relatives of animals lack distal regulatory elements (Sebé-Pedrós et al. 2016), while such elements exist in the earliest-branching metazoans (e.g. Sebé-Pedrós et al. 2018). In this context, we continue to investigate genome regulation in non-bilaterian animals and close unicellular relatives of animals in order to trace the regulatory genome changes linked to the emergence of multicellularity and stable cell differentiation.
4. Phylogenetics of eukaryotic chromatin
Access to eukaryotic genetic information is controlled by a complex nucleoprotein interface called chromatin. The evolution of chromatin represents a radical shift in genome function: from a largely accessible genome in prokaryotes to a repressive ground state in the eukaryotic genome, with restricted access to genetic information. The main components of eukaryotic chromatin are histone proteins and associated chaperones, remodellers and readers/writers/erasers of histone post-translational modifications (hPTMs), as well as sequence-specific transcription factors (TFs), proteins that mediate chromatin folding (e.g. CTCF and cohesins) and proteins that mediate diverse other interactions (e.g. Mediator). Together, chromatin processes play a crucial role in the establishment and maintenance of cell identities (whether stable cell types or temporal cellular states).
We have a long-standing interest in TF evolution (e.g. Sebé-Pedrós et al. 2013, de Mendoza & Sebé-Pedrós 2019), and we are expanding these phylogenetic analyses to other chromatin components, such as histone modifiers and readers. Similarly, we are using comparative proteomics to interrogate the diversity of hPTMs across eukaryotes (Sebé-Pedrós et al. 2016, Grau-Bové et al. 2022) and to characterize the molecular players involved in hPTM readout.
Ultimately, we believe that expanding our understanding of genome regulation from a phylogenetic perspective will not only reveal how these mechanisms evolved, but also identify shared fundamental principles of eukaryotic genome function.
Our overarching goal is to increase phylogenetic sampling for cell type programs and genome function in the most systematic and unbiased manner possible, thus enabling taxon-dense comparative analyses. However, we also rely on a set of model (or quasi-model) non-bilaterian species that we regularly culture in the lab. Our favourite organisms include Capsaspora owczarzaki, Salpingoeca rosetta, Ephydatia muelleri, Mnemiopsis leidyi, Trichoplax adhaerens and Nematostella vectensis. Among the advantages offered by these species are decent-quality genome assemblies and, in several cases, the availability of genetic tools. In addition to these six species, we culture a diverse zoo of free-living protists at key phylogenetic positions of the eukaryotic tree of life, allowing us to perform broad pan-eukaryotic comparative analyses.