Cornell University

3CPG

Cornell Center for Comparative and Population Genomics

Events

May 17, 2024

Dr. Adam Siepel, Professor and Chair, Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory

“Probabilistic and machine-learning methods for problems in population genomics and transcriptional regulation”

I will describe my research group’s recent progress in developing computational methods to address two mostly unrelated problems in genomics: inference of selective sweeps from population genomic data and characterization of the dynamics of transcription from nascent RNA sequencing data.

In the first part of the talk, I will describe our methods for inferring ancestral recombination graphs (ARGs) from sequence data, and then show how features from inferred ARGs can be used in a neural-network setting to improve not only the detection of selective sweeps but also estimation of selection coefficients and allele frequency trajectories. I will then present a new approach for mitigating the problem of “simulation misspecification” that arises when training neural networks of this kind, by framing it as a problem of “domain adaptation” and using a gradient reversal layer to improve generalization to real data.

In the second part of the talk, I will introduce a unified probabilistic model for the dynamics of transcription initiation, promoter-proximal pause escape, and elongation, and the generation of nascent RNA sequencing read counts under steady-state conditions. I will show using simulated data that the approach yields accurate estimation of key rate parameters and correctly identifies epigenomic and DNA-sequence covariates of local elongation rates. Then I will summarize analyses of several publicly available PRO-seq data sets, showing that pause-escape is often strongly rate-limiting, that steric hindrance in the promoter-proximal region can dramatically reduce initiation rates, and that reductions in local elongation rate are associated with cytosine nucleotides, DNA methylation, splice sites, RNA stem-loops, CTCF binding sites, and several histone marks. Finally, I will introduce a convolutional neural network that improves our predictions of local elongation rates.

Altogether, the talk will summarize several years of methods development in two important areas of genomics, and insights from applying these new methods to real genomic data.

May 9, 2024

May 8, 2024

The Weill Institute for Cell and Molecular Biology's Career Council invites trainees from across Cornell University to hear from Dr. Marcus Smolka (Professor, Molecular Biology & Genetics, and Interim Director, Weill Institute for Cell and Molecular Biology) as he reflects on his Journey Through Science and on the challenges and triumphs that led him to where he is today. There will be plenty of opportunity for Q&A, and refreshments will be provided to all attendees.

May 1, 2024

Dr. Magnus Nordborg, Scientific Director, Gregor Mendel Institute, Austrian Academy of Sciences

“Towards an unbiased characterization of genetic diversity”

Our view of genetic diversity is shaped by methods that provide an incomplete and highly biased picture, effectively limited to single-nucleotide polymorphism in conserved regions of the genome. Long-read sequencing technologies, which are starting to provide nearly complete genome sequences for population samples, should solve the problem—except that characterizing and making sense of non-SNP variation is difficult even with perfect sequence data. I will describe our attempts to investigate and address this problem using samples of genomes from Arabidopsis thaliana as an example. Our analyses reveal substantial and worrying biases in current data that affect everything from GWAS and functional genomics to population genetics and diversity studies. We also discover exciting new biology, especially when it comes to understanding the evolutionary dynamics of transposable elements. We demonstrate that existing genome annotation tools do not predict mobile elements well even in a model plant and present alternative algorithms.

Twenty-five years ago, technical developments that came out of the Human Genome Project ushered in the SNP era, leading to a revolution in population genetics, as predicted by Aravinda Chakravarti, who noted that we needed models to “make sense out of sequence”. I will argue that we are in an analogous position now, with technologies making it easy to generate complete genome sequences of almost any species at a population scale: an enormous breakthrough for anyone interested in the full diversity of life. However, to make sense of these data, we will need a modeling framework rooted in population genetics, but which also incorporates accurate mechanistic models of the mutational and recombination processes that ultimately generate genetic variation. This framework remains to be developed.

April 29, 2024

Seminar title: Jacks of all trades in the realm of plenty: Taking another look at Amazonian polyphagous herbivores.

Hosted by Andre Kessler

April 26, 2024

Moss functional traits and biogeochemical impacts of the bryosphere.

April 22, 2024

Environmental signals, phenotypic plasticity, and evolutionary change: insights from killifish and waterfleas.

Hosted by Swanne Gordon

April 19, 2024

The James B. Sumner Lecture was established to honor Professor Sumner and brings preeminent scientists to Cornell to speak about wide-ranging topics in biological and biomedical research.

Dr. Rapoport is a Professor at Harvard Medical School and a Howard Hughes Medical Institute Investigator. He is a member of the American and German National Academies of Sciences and has won numerous awards, including the 2004 Otto Warburg Medal, 2005 Max Delbrück Medal, 2007 Sir Hans Krebs Medal, and 2011 Schleiden Medal. His seminar wil

He is interested in the mechanisms by which proteins are transported across membranes, how misfolded proteins are degraded, and how organelles form and maintain their characteristic shapes. Most of the projects center around the endoplasmic reticulum (ER). One project concerns the molecular mechanism by which proteins are translocated across the ER membrane or across the plasma membrane in bacteria and archaea. Much of the current work deals with ERAD (ER-associated protein degradation), a process in which misfolded proteins are retro-translocated across the ER membrane into the cytosol. Major questions concern the mechanism by which proteins move across the membrane and are extracted by the Cdc48 ATPase.

April 15, 2024

Why can't we predict traits from the environment? Pondering persistent problems in plant functional ecology.

Hosted by Xu/Agrawal

April 12, 2024

Seminar title: The Everglades: degradation and challenges for restoration.