September 5th, Friday @ BROAD - Monadnock [ In Person Only ]
- Registration is required -
8:45AM - Breakfast / Welcome [Organizers]
9:00AM - 1) Agata Kilar : Rivas Lab
- Title: Enhanced sequence homology using evolutionary models
- Summary: Sequence homology detection is crucial for understanding evolutionary relationships, yet current methods like BLAST and profile hidden Markov models (HMM) use fixed evolutionary parameters, limiting their sensitivity in identifying remote homologs. BLAST uses fixed substitution matrices and gap penalties, while standard profile HMMs assume a constant evolutionary distance across sequences – “fixed evolutionary time”. Here, we present eHMMER, an enhancement of the widely-used profile HMM tool HMMER, which integrates time-dependent evolutionary models. eHMMER dynamically adjusts the evolutionary time parameter, effectively elongating or shortening evolutionary branches within the HMM profile. This adaptability improves the detection of both remote and closely related sequences. Controlled benchmarking demonstrates that incorporating this evolutionary “time slider” further enhances sensitivity in homology detection. In practical applications, eHMMER successfully identifies novel annotation candidates within Domains of Unknown Function (DUFs), which constitute nearly 25% of the Pfam protein domain database. Thus, eHMMER offers a significant advancement in sensitivity for protein annotation and evolutionary analysis, building upon the effectiveness of current methods.
9:15AM - 2) Chaitra Prabhakara & Madeline Ryan : Tabin Lab
- Title: Cross-Species Analysis of Proventriculus and Ceca Development: Evolutionary Implications for Avian Gut Organogenesis
- Summary: A key question in developmental biology is how organ systems, composed of individually regulated yet interconnected organs, develop and function cohesively. Understanding the robust development and integration of each organ is vital for achieving collective system-level functions. The vertebrate gut is an exemplary model for studying organ system development, offering insights into the mechanisms that ensure coordinated growth and functionality of the digestive system. Beginning as a simple tube, the vertebrate gut ultimately differentiates into three distinct compartments (foregut, midgut, and hindgut), each characterized by unique specialized functions. The gastrointestinal tract is enveloped by concentric and orthogonally aligned layers of smooth muscle that facilitate the movement of food. Notably, we observe that the position of the radial muscles within the mesenchyme differs between the foregut, midgut, and hindgut. This raises the question: how does smooth muscle specification vary across the continuous gut tube? To delve into these differences, we investigated the embryonic development of organs at the boundaries of the foregut/midgut and midgut/hindgut: the proventriculus and the ceca, respectively. We characterized the epithelial and mesenchymal development, including muscle specification, within these organs, and compared them to the tubular segments of the gut (the foregut, midgut, and hindgut). In both the proventriculus and ceca, the epithelial tube shows distinct branching patterns, with the proventriculus forming numerous branches and the ceca developing only two branches. We further examined the roles of localized chemo-attractants, cell division, and mechanical forces in facilitating this branching process. Mapping out these developmental dynamics not only elucidates the complexities of organ morphogenesis but also deepens our understanding of how organ systems develop and function in a coordinated manner.
9:30AM - 3) Prathitha Kar : Sunyaev lab
- Title: Expanded models of gene constraint improve selection estimates from population data
- Summary: Measurements of gene constraint are fundamental to identifying the roles of genes in diseases. A wide range of population-genetics methods have been developed to estimate the selection acting against heterozygous loss-of-function (LoF) variants of different genes (s_het). Methods to infer the degree of constraint can use different aspects of the distribution of LoF variants in the sample, ranging from the presence/absence of segregating variants in functional sites (e.g., LOEUF) to the entire site frequency spectrum (SFS). Recent theoretical advancements have made it possible to analytically estimate the SFS of rare variants for arbitrary demography while allowing for recurrent mutations. Using this model, we re-calibrated demographic models for all ancestry labels in the data, allowing us to use polymorphisms stratified by ancestry. We first apply this demography-based framework, which estimates per-gene selection, to the 1.46 million haploid exome sequences (gnomAD v4) from six ancestries, the most extensive exome sequencing dataset, and provide the best available estimates of s_het from human data. This approach also incorporates LoF misannotations, benefiting from LoF variants present at high frequencies in seemingly constrained genes. We estimated that ~3% of stop-gain LoFs are labeled incorrectly and that this fraction varies between genes. The resulting estimates outperform existing constraint scores on all tested benchmarks, including a list of validated genes implicated in pediatric disorders (area under the precision-recall curve: our method = 0.1961, GeneBayes = 0.157). We then combine large language models trained on phylogenetic data with our population genetics approach to obtain the distribution of fitness effects for missense variants of each gene. These estimates are obtained completely independently of the LoF s_het values, allowing us to study the relation between fitness effects of missense and LoF mutations across genes.
Finally, we extend our framework to identify genes that have distorted SFS due to elevated mutation rates at LoF sites. Strikingly, all of the top 10 genes that we predict to be hypermutable at LoF sites have been recently described as drivers of clonal expansions in spermatogonia. This advanced demography-based methodology translates into improved constraint estimates, facilitates comparison of the effects of missense mutations and LoFs, and could even identify genes under positive selection in sperm.
9:45AM - 4) Javier Maravall-López : Reich lab
- Title: Autoimmune trade-offs of adaptation to M. tuberculosis and favored biological processes revealed by genome-wide signatures of positive selection in ancient West Eurasians
- Summary: Ancient DNA is emerging as a promising tool to elucidate Holocene human adaptation, with a recent study (Akbari et al. 2024) identifying hundreds of loci with genome-wide significant evidence of selection. However, our understanding of these signals is still limited: characterization of both selection pressures and adaptive phenotypes remains incomplete. Here, we make major advances in both directions by integrating the Akbari et al. results with diverse GWAS, QTL, functional and pathway data. We first show that many selection loci colocalize with GWAS loci, including at autoimmune disease loci in which positively-selected alleles increase disease risk, consistent with widely-hypothesized antagonistic pleiotropy. Second, we find that selection signals are enriched in digestive tissue variants, and we use spatial transcriptomics data from the human gut to refine this observation to immune communities in the gut mucosa. Positively-selected alleles were associated with increased expression of key gut immune genes, like MUC2 (the main component of the digestive mucus) or GP2 (involved in immune surveillance against bacteria). This selection on enhanced gut immune response could have been driven by enteric pathogens (like S. enterica) or by the transition to new diets (for example, low-fiber diets are known to result in decreased mucus secretion). Third, by combining trans pQTL data with a formal test of causality on selection, we identified genes that converged onto adaptive biological processes, showing that selection acted to enhance inflammation, foreign body rejection, and blood clotting. Finally, three lines of evidence indicate selection driven by mycobacterial pathogens (with M. tuberculosis as the natural candidate). First, selection signals are significantly enriched for expertly-curated genes shown to be causal for Mendelian Susceptibility to Mycobacterial Disease (MSMD) (Bonferroni p < 0.017). 
Second, independent data sources (ATAC-seq, RNA-seq and candidate cis-regulatory elements) implicate cell types from the mononuclear phagocyte system, the key player in host defense against mycobacterial pathogens, in selection signals. Third, pQTL data, combined with a formal test for causality on selection, implicate genes converging onto the IL12 and IL23 signaling pathways, the molecular triggers of the immune response to mycobacteria mounted by that cell system. Positively-selected MSMD variants were also Inflammatory Bowel Disease (IBD) risk alleles (e.g., at the IRF1 and RORC loci). Combined, these results provide strong support for adaptation to M. tuberculosis having increased genetic risk for IBD during the West Eurasian Holocene.
10:00AM - 5) Tahlia Perry : Center for Zoonomics
- Title: The Genetic Legacy of Zoo-based Gorillas
- Summary: Population genomics across species can provide insights into diverse evolutionary processes. Despite many genomic resources being developed for humans and chimpanzees, surprisingly little is known about the genetic diversity of other great apes, hindered by the difficulty of obtaining high quality samples from wild animals. Zoo populations can serve as a valuable resource for population genomics, as the population founders were originally sourced from the wild. Additionally, zoos collect rich, individual-level longitudinal data, such as health data across the life course, that cannot be feasibly garnered from wild animals – providing the opportunity to study the genetic and environmental predictors of health-related traits. Western lowland gorillas (Gorilla gorilla gorilla) are one of the most carefully managed species in zoos across the globe, with health data collected in a standardized manner over the past 10 years. Yet, little is known about the genomic diversity of the zoo population and the species more broadly. Here, we generated high coverage genomes to create the largest genomic dataset of gorillas to date. In combination with previously published data, we are analyzing 111 gorilla genomes, representing 61 gorillas born in captivity within a documented, international pedigree of 2,407 gorillas, and 50 wild-born gorillas with unknown relatedness. Identity-by-descent analyses confirm expectations from the pedigree and reveal low levels of relatedness among wild-born founders. Surprisingly, despite the zoo population having gone through an extreme bottleneck during its founding, zoo-born gorillas show higher levels of heterozygosity and decreased rates of inbreeding relative to wild-born gorillas. These results shed light on the founding of the gorilla population in zoos and provide insights into the genetic diversity of wild gorillas.
10:15AM - 6) Rishabh Kapoor : Extavour Lab
- Title: Replaying the molecular tape: convergent re-evolution of cytoplasmic intermediate filaments in Panarthropoda
- Summary: In a famous thought experiment, Stephen Jay Gould proposed that replaying the tape of life’s history would result in radically different outcomes, favoring a primary role of historical contingency in shaping evolution. In contrast, other biologists have emphasized examples of both natural and experimental evolutionary convergence to argue for the predictability of evolution. Here, we explore the interplay between contingency and convergence by examining a case of a natural replay experiment: the ancestral loss of cytoplasmic intermediate filaments (cIFs) in Panarthropoda. Through a comprehensive bioinformatic screen for intermediate filament-family proteins across genomes and transcriptomes of 4145 Ecdysozoan species, we recover strong evidence for the prior hypothesis of cIF loss in the panarthropod common ancestor. Surprisingly, however, we find that 6.8% of panarthropod genomes encode putative cIF genes, a significant expansion upon the three previously reported cases. Molecular phylogenetic analyses suggest that these cIFs arose via 276 independent but analogous instances of duplication and neofunctionalization of lamin genes. This is analogous to the path traced by the original cIF family in the last common ancestor of Bilateria, providing evidence consistent with evolutionary repeatability within this family. Furthermore, using transgenic expression and domain predictions, we provide evidence that association with cell membranes and cell-cell junctions is a repeated outcome of cIF evolution. In favor of a role of historical contingency, however, we find that cIF evolution is not predictable from organismal ecology or morphology, and putative cIFs acquire divergent spatiotemporal expression patterns and domain architectures. This study of a natural replay experiment reveals that contingency and convergence interact to produce both familiar and novel trajectories in molecular evolution.
10:30AM - Coffee Break
10:50AM - *Keynote Speaker :: Molly Schumer / Associate Professor in Biology, Stanford ::
- Title: From molecular mechanism to evolution in nature: insights from swordtail fish
12:00PM - Lunch
1:00PM - 7) Annabel Perry : Reich lab
- Title: A Simulation Framework Validates Results from a Novel Method to Detect Selection in Ancient Humans
- Summary: Human evolutionary biologists face a challenge: evolutionary timescales are long, but data tracking humans across extensive periods is limited. Despite breakthroughs in ancient human DNA research, scientists have only uncovered a few clear examples of adaptation in recent human history. That’s where our lab stepped in. We developed a new method to track changes in the frequency of trait-related alleles over time. When tested on ancient human DNA, our method uncovered ten times more evolutionarily significant alleles than ever detected before. Of course, a big claim like this does not come without questions. Some evolutionary processes - like background selection, stabilizing selection, or population movement - can muddle the picture, making it look like a trait is changing when it is just staying steady. To dig deeper, we coded simulations using realistic genetic annotations, empirically backed parameters, and cutting-edge models of population history. Through these simulations, we demonstrate that our novel method detects signals of evolution only when the population truly experiences a shift in the most adaptive value of a trait. Other processes, like background selection or stabilizing selection, do not produce the same patterns. Thus, the putatively adaptive changes we observed in real ancient human DNA cannot be explained by these other processes – humans were actually adapting, even in the past few thousand years. With this simulation-backed method, scientists can now untangle the mysteries of recent human adaptation.
1:15PM - 8) Leah Darwin : Rand lab
- Title: Experimental analysis of mitonuclear epistasis, GxE, and coevolution in Drosophila
- Summary: Mitochondrial function, critical to the function of eukaryotic cells, requires the coordinated expression of 37 mitochondrial genes and hundreds of genes encoded by the nuclear genome. Disruption of the cooperation between these two genomes can influence adaptive evolution by introducing functional incompatibilities that impair cellular metabolism, reduce organismal fitness, and generate variation in complex traits. To advance our understanding of these complex genetic interactions, we created an experimental system in Drosophila melanogaster in which we generated novel combinations of mitochondrial and nuclear genotypes. In particular, we have constructed a panel of mitonuclear variation representative of both natural variation and deep divergence through the inclusion of 10 mtDNAs from Zimbabwe, 10 mtDNAs from Beijing, and two mtDNAs from entirely different species, D. simulans and D. yakuba. These mitochondrial haplotypes are placed onto two common nuclear backgrounds, allowing us to explicitly test for the presence of mitonuclear epistasis and its relationship to intraspecific and interspecific divergence. We measured four different quantitative traits across two different environments and found that genetic interactions involving intraspecific mitochondrial variation and interactions with the environment had a much larger effect than genetic interactions with the two outgroups. Additionally, results from experimental populations only differing in mitochondrial genotype suggest that these differences can influence selection of nuclear alleles. Here, we again find that variation from within the species D. melanogaster is more influential than deep interspecific divergence. This suggests that strong purifying selection acting on the mitochondria has preserved its function across millions of years of nuclear divergence, although certain mitonuclear combinations can still have a profound effect on more recent evolution.
1:30PM - 9) Sarah Perkins : Neafsey lab
- Title: Heterogeneous constraint and adaptation across the malaria parasite life cycle
- Summary: Evolutionary forces vary across genomes, creating disparities in how traits evolve. In organisms with complex life cycles, it is unclear how intrinsic differences among discrete life stages impact evolution. Here, we look for life history-driven patterns of adaptation in Plasmodium falciparum, a malaria-causing parasite with a multi-stage life cycle. We posit that notable differences across the P. falciparum life cycle—including cell ploidy, the extent of clonal competition, and the presence of transmission bottlenecks—alter the drift-selection balance acting at discrete life stages. Categorizing genes by their stages of expression, we compare patterns of between- and within-species diversity across stages. Most notably, we find signals of weaker negative selection in genes exclusively expressed in sporozoites. This matches theoretical expectations as sporozoites do not proliferate, show limited evidence of clonal competition, and pass through a strong bottleneck. We discuss how the timing of therapeutic interventions towards particular life stages might impact the rate at which parasite populations evolve resistance, and consider the functional, molecular, and population genetic factors that could contribute to these patterns.
1:45PM - 10) Qian Tang : Owens Lab
- Title: Insects' adaptation to brighter night skies
- Summary: We employed population genomic approaches to analyze samples collected over the course of a 27-year monitoring program of corn earworms using black light and pheromone traps. Since 2012, light traps have caught significantly fewer moths than pheromone traps. Our analyses suggest that the nocturnal insects' adaptation to a brighter night sky in natural conditions may be associated with protein kinase C, which adjusts the insects' neural responses to various light intensities.
2:00PM - 11) Samantha Petti : Petti lab
- Title: Estimating fitness landscape statistics with a Gaussian process kernel trick
- Summary: Understanding the structure of fitness landscapes is central to many questions in evolutionary biology. A common approach is to summarize a fitness landscape using epistatic coefficients, which quantify the influence of specific subsequences on fitness. We introduce a method for efficiently computing estimates and uncertainty bounds for various types of epistatic coefficients that does not require evaluation of the entire fitness landscape. More specifically, our method gives the posterior distribution of epistatic coefficients under a Gaussian process prior on the fitness landscape and only requires matrix-vector products with dimension equal to the number of training examples. We will discuss families of epistatic coefficients, describe Gaussian process priors on fitness landscapes, and introduce a “kernel trick” that enables efficient estimation of epistatic coefficients within this framework.
2:15PM - 12) Aoxing Liu : Daly lab
- Title: Revisiting Old-School Genetics with Deep Pedigrees in Seven Million Non-Genotyped and 500K Genotyped Finns
- Summary: More than a hundred years ago, Mendel grew peas to study the laws of genetics; around the same time, cattle breeders began recording production and familial data to artificially select cattle that produced more milk/meat. Fortunately for both Mendel and the breeders, the generation intervals of peas (about 1 year?) and cattle (<5 years) are relatively short - especially compared to humans (25 to 30 years) - allowing them to gather sufficient data (i.e., phenotypes and pedigrees) to study and apply genetic principles. Although it may seem less obvious, we - human geneticists - are also fortunate today! On one hand, nationwide health registers are becoming available in more nations - for example, they already exist in the Nordic countries, and efforts are underway in the UK (Our Future Health) and many others. Simply because of their nationwide coverage, these registers include many people who have not yet been genotyped but do have relatives. On the other hand, biobanks are expanding and may eventually reach nationwide coverage. This means - by their nature, biobanks (will) capture complex family structures (e.g., about 10% of Finns have been genotyped by FinnGen), which require more sophisticated modeling to better control for confounding but also offer great potential for smarter use in genetic and health research. I’m happy to share some of my explorations with deep pedigrees in the Finnish population (7 million non-genotyped and 500K genotyped individuals) and the Danish population (3 million non-genotyped and over 100K genotyped individuals). I will primarily focus on two topics: first, family-based heritability (specifically, the upper-bound heritability in the missing heritability problem) across hundreds of traits in the phenome; and second, a novel familial resemblance score at the individual level (aggregating phenotypic information from all genealogy-traceable relatives) and its relationship to the intensively/extensively studied polygenic scores.
2:30PM - 13) Noah Connally : O'Connor and Karlsson labs
- Title: Inability of gene expression to explain human GWAS is consistent with population history differences in a comparison of humans, cattle, and pigs
- Summary: Colocalization is a family of methods for connecting GWAS hits to molecular phenotypes. These methods have identified GWAS hits that alter the expression of nearby genes, which has been used to understand the molecular biology of traits and identify potential drug targets. However, the usefulness of colocalization methods has been limited by the fact that they fail to explain most GWAS hits, a problem we have referred to as “missing regulation,” and which may be due to limitations imposed by natural selection. We examine the role selection plays in colocalization by comparing humans to two mammalian species with different evolutionary histories: cattle and pigs. In cattle and pigs, colocalization succeeds at a high rate, which we show is due to the effect of strong artificial selection on eQTL mapping power. eQTLs mapped in cattle and pigs have a genetic architecture that is not observed in humans. This architecture is similar to that of human GWAS variants, cell type-specific eQTLs, and other molecular markers, suggesting that missing regulation in humans can be explained by eQTLs that selection has maintained at frequencies that cannot be detected in human bulk-tissue eQTL studies.
2:45PM - Closing Remarks [Organizers]
**Due to unforeseen circumstances, the schedule may change.