fitness landscapes
Recently Published Documents


TOTAL DOCUMENTS

555
(FIVE YEARS 98)

H-INDEX

50
(FIVE YEARS 6)

2021 ◽  
Author(s):  
Louisa Gonzalez Somermeyer ◽  
Aubin Fleiss ◽  
Alexander S Mishin ◽  
Nina G Bozhanova ◽  
Anna A. Igolkina ◽  
...  

Studies of protein fitness landscapes reveal biophysical constraints guiding protein evolution and empower prediction of functional proteins. However, generalisation of these findings is limited by the scarcity of systematic data on fitness landscapes of proteins with a defined evolutionary relationship. We characterized the fitness peaks of four orthologous fluorescent proteins with a broad range of sequence divergence. While two of the four studied fitness peaks were sharp, the other two were considerably flatter, being almost entirely free of epistatic interactions. Counterintuitively, mutationally robust proteins, characterized by a flat fitness peak, were not optimal templates for machine-learning-driven protein design – instead, predictions were more accurate for fragile proteins with epistatic landscapes. Our work provides insights for the practical application of fitness landscape heterogeneity in protein engineering.


2021 ◽  
Author(s):  
Chia-Hung Yang ◽  
Samuel V. Scarpino

For over 100 years, fitness landscapes have been a powerful metaphor for understanding the evolution of biological systems. These landscapes describe how genotypes are connected to each other and are related according to relative fitness. Despite the high dimensionality of such real-world landscapes, empirical studies are often limited in their ability to quantify the fitness of different genotypes beyond point mutations, while theoretical works use statistical or mechanistic models to reason about the overall landscape structure. However, most classical fitness landscape models overlook an intuitive constraint, namely that genotypes leading to the same phenotype almost certainly share the same fitness value, because information about the genotype-phenotype mapping is rarely incorporated. Here, we investigate fitness landscape models through the lens of Gene Regulatory Networks (GRNs), where the regulatory products are computed from multiple genes and collectively treated as the phenotypes. Assuming that regulatory mediators/products exhibit binary states, we prove that topographical features of GRN fitness landscape models, such as accessibility and connectivity, are insensitive to the choice of the fitness function. Furthermore, using graph theory, we deduce a mesoscopic structure underlying GRN fitness landscape models that retains the information necessary for evolutionary dynamics with minimal complexity. We also propose an algorithm to construct such a mesoscopic backbone that is more efficient than the brute-force approach. Combined, this work provides mathematical implications for fitness landscape models with high-dimensional genotype-phenotype mapping, yielding the potential to elucidate empirical landscapes and their resulting evolutionary processes in a manner complementary to existing computational studies.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Francisco McGee ◽  
Sandro Hauri ◽  
Quentin Novinger ◽  
Slobodan Vucetic ◽  
Ronald M. Levy ◽  
...  

Potts models and variational autoencoders (VAEs) have recently gained popularity as generative protein sequence models (GPSMs) to explore fitness landscapes and predict mutation effects. Despite encouraging results, current model evaluation metrics leave unclear whether GPSMs faithfully reproduce the complex multi-residue mutational patterns observed in natural sequences due to epistasis. Here, we develop a set of sequence statistics to assess the “generative capacity” of three current GPSMs: the pairwise Potts Hamiltonian, the VAE, and the site-independent model. We show that the Potts model’s generative capacity is largest, as the higher-order mutational statistics generated by the model agree with those observed for natural sequences, while the VAE’s lies between those of the Potts and site-independent models. Importantly, our work provides a new framework for evaluating and interpreting GPSM accuracy which emphasizes the role of higher-order covariation and epistasis, with broader implications for probabilistic sequence models in general.
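
As a rough illustration of the difference between the site-independent model and the pairwise Potts Hamiltonian named in this abstract, the sketch below scores a sequence with per-site terms only and then with additional pairwise couplings. The sign convention (lower energy means more favourable), the function names and the toy parameters are assumptions made for illustration; they are not taken from the paper.

```python
from itertools import combinations

def site_independent_energy(seq, fields):
    """Energy with per-site terms only: E(s) = -sum_i h_i(s_i)."""
    return -sum(fields[i][a] for i, a in enumerate(seq))

def potts_energy(seq, fields, couplings):
    """Pairwise Potts energy: E(s) = -sum_i h_i(s_i) - sum_{i<j} J_ij(s_i, s_j)."""
    energy = site_independent_energy(seq, fields)
    for i, j in combinations(range(len(seq)), 2):
        # Missing couplings are treated as zero (an assumption of this sketch).
        energy -= couplings.get((i, j), {}).get((seq[i], seq[j]), 0.0)
    return energy

# Toy two-site, two-letter example; all numbers are made up.
fields = [{"A": 0.5, "B": 0.0}, {"A": 0.0, "B": 0.5}]
couplings = {(0, 1): {("A", "B"): 1.0}}

print(site_independent_energy("AB", fields))
print(potts_energy("AB", fields, couplings))
```

The coupling term is what lets the Potts model favour particular pairs of residues jointly, which a site-independent model cannot represent.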


Cell Systems ◽  
2021 ◽  
Vol 12 (11) ◽  
pp. 1019-1020
Author(s):  
Neil Thomas ◽  
Lucy J. Colwell

2021 ◽  
Author(s):  
Bill Yang ◽  
Goksel Misirli ◽  
Anil Wipat ◽  
Jennifer Hallinan

2021 ◽  
Author(s):  
Sam F. Greenbury ◽  
Ard A. Louis ◽  
Sebastian E. Ahnert

Fitness landscapes are often described in terms of ‘peaks’ and ‘valleys’, implying an intuitive low-dimensional landscape of the kind encountered in everyday experience. The space of genotypes, however, is extremely high-dimensional, which results in counter-intuitive properties of genotype-phenotype maps, such as the close proximity of one phenotype to many others. Here we investigate how common structural properties of high-dimensional genotype-phenotype maps, such as the presence of neutral networks, affect the navigability of fitness landscapes. For three biologically realistic genotype-phenotype map models—RNA secondary structure, protein tertiary structure and protein complexes—we find that, even under random fitness assignment, fitness maxima can be reached from almost any other phenotype without passing through a fitness valley. This in turn implies that true fitness valleys are very rare. By considering evolutionary simulations between pairs of real examples of functional RNA sequences, we show that accessible paths are also likely to be utilised under evolutionary dynamics.
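
The notion of reaching a fitness maximum "without passing through a fitness valley" can be made concrete with a small sketch. The code below is a plain genotype-level baseline on binary strings with random fitness, not one of the genotype-phenotype map models studied in the paper: it has no neutral networks, so it only shows what an accessibility check looks like. The names `neighbours` and `can_reach_peak`, the sequence length and the random seed are all illustrative assumptions.

```python
import random
from itertools import product

def neighbours(genotype):
    """Yield all genotypes that differ from `genotype` by one bit flip."""
    for i in range(len(genotype)):
        yield genotype[:i] + (1 - genotype[i],) + genotype[i + 1:]

def can_reach_peak(start, peak, fitness):
    """True if `peak` is reachable from `start` by steps that never decrease fitness."""
    seen, stack = {start}, [start]
    while stack:
        g = stack.pop()
        if g == peak:
            return True
        for n in neighbours(g):
            if n not in seen and fitness[n] >= fitness[g]:
                seen.add(n)
                stack.append(n)
    return False

random.seed(0)
L = 6  # genotype length (arbitrary choice for the sketch)
genotypes = list(product((0, 1), repeat=L))
fitness = {g: random.random() for g in genotypes}
peak = max(genotypes, key=fitness.get)

reachable = sum(can_reach_peak(g, peak, fitness) for g in genotypes)
print(f"{reachable} of {len(genotypes)} genotypes can reach the global peak")
```

Steps of equal fitness are allowed here, which is the genotype-level analogue of moving along a neutral network.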


2021 ◽  
Vol 22 (20) ◽  
pp. 10908
Author(s):  
Luca Sesta ◽  
Guido Uguzzoni ◽  
Jorge Fernandez-de-Cossio-Diaz ◽  
Andrea Pagnani

We present Annealed Mutational approximated Landscape (AMaLa), a new method to infer fitness landscapes from sequencing data of Directed Evolution experiments. Such experiments typically start from a single wild-type sequence, which undergoes Darwinian in vitro evolution via multiple rounds of mutation and selection for a target phenotype. In recent years, thanks to high-throughput screening and sequencing, Directed Evolution has emerged as a powerful instrument to probe fitness landscapes under controlled experimental conditions and as a relevant testing ground to develop accurate statistical models and inference algorithms. Fitness landscape modeling either uses the enrichment of variant abundances as input, thus requiring the observation of the same variants at different rounds, or assumes that the last sequenced round is sampled from an equilibrium distribution. AMaLa aims at effectively leveraging the information encoded in the whole time evolution. To do so, while assuming statistical sampling independence between sequenced rounds, the possible trajectories in sequence space are gauged with a time-dependent statistical weight consisting of two contributions: (i) an energy term accounting for the selection process and (ii) a generalized Jukes–Cantor model for the purely mutational step. This simple scheme makes it possible to accurately describe the Directed Evolution dynamics and to infer a fitness landscape that correctly reproduces the measurements of the phenotype under selection (e.g., antibiotic drug resistance), notably outperforming widely used inference strategies. In addition, we assess the reliability of AMaLa by showing how the inferred statistical model could be used to predict relevant structural properties of the wild-type sequence.
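
For readers unfamiliar with the Jukes–Cantor model mentioned in the abstract, the sketch below gives the textbook q-state form, in which every substitution between distinct states occurs at the same rate `mu`. It is not the generalized variant used in AMaLa, whose details are not given here; the function name and parameters are illustrative assumptions.

```python
import math

def jukes_cantor_prob(a, b, t, mu, q=4):
    """P(state b at time t | state a at time 0) for a q-state Jukes-Cantor model.

    Assumes each substitution a -> b (a != b) occurs at the same rate mu,
    so the only non-zero eigenvalue of the rate matrix is -q * mu.
    """
    decay = math.exp(-q * mu * t)
    if a == b:
        return 1.0 / q + (1.0 - 1.0 / q) * decay
    return (1.0 - decay) / q

# Probability that a nucleotide is unchanged, or changed to one specific
# other nucleotide, after t = 1 with mu = 0.1 (arbitrary toy values).
print(jukes_cantor_prob("A", "A", t=1.0, mu=0.1))
print(jukes_cantor_prob("A", "C", t=1.0, mu=0.1))
```

At long times both probabilities tend to 1/q, the uniform stationary distribution of the model.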


2021 ◽  
Author(s):  
Christopher W Bakerlee ◽  
Alex N. Nguyen Ba ◽  
Yekaterina Shulgina ◽  
Jose I. Rojas-Echenique ◽  
Michael M. Desai

Epistasis can dramatically affect evolutionary trajectories. In recent decades, protein-level fitness landscapes have revealed pervasive idiosyncratic epistasis among specific mutations. In contrast, other work has found ubiquitous and apparently non-specific patterns of global diminishing-returns and increasing-costs epistasis among mutations across the genome. Here, we use a hierarchical CRISPR gene drive system to construct all combinations of 10 missense mutations from across the genome in budding yeast, and measure their fitness in six environments. We show that the resulting fitness landscapes exhibit global fitness-correlated trends, but that these trends emerge from specific idiosyncratic interactions. This provides the first experimental validation of recent theoretical work that has argued that fitness-correlated trends are the generic consequence of idiosyncratic epistasis.
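
To give a sense of scale, all combinations of 10 biallelic mutations form a landscape of 2^10 = 1024 genotypes. The sketch below enumerates such a landscape and computes the standard pairwise epistasis coefficient, f(11) - f(10) - f(01) + f(00), for two sites in a chosen genetic background. The toy fitness function and the helper names are assumptions for illustration only and do not reproduce the study's measurements or analysis.

```python
from itertools import product

N_MUTATIONS = 10
genotypes = list(product((0, 1), repeat=N_MUTATIONS))  # 2**10 = 1024 genotypes

def with_alleles(background, i, j, a, b):
    """Copy `background` with site i set to a and site j set to b."""
    g = list(background)
    g[i], g[j] = a, b
    return tuple(g)

def pairwise_epistasis(fitness, background, i, j):
    """Deviation from additivity for sites i and j in the given background."""
    f = lambda a, b: fitness[with_alleles(background, i, j, a, b)]
    return f(1, 1) - f(1, 0) - f(0, 1) + f(0, 0)

# Toy landscape: additive effects plus one interaction between sites 0 and 1.
fitness = {g: sum(g) + 2 * g[0] * g[1] for g in genotypes}

wild_type = (0,) * N_MUTATIONS
print(len(genotypes))
print(pairwise_epistasis(fitness, wild_type, 0, 1))
print(pairwise_epistasis(fitness, wild_type, 2, 3))
```

Evaluating the same pair of sites across different backgrounds is one simple way to see whether an interaction is background-specific or uniform.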


2021 ◽  
Vol 47 (3) ◽  
pp. 215-235
Author(s):  
Noelle Driver ◽  
Michael Frame
