Origins of Life Outweigh Selection as Primary Constraint on Protein Evolution, Revealing Vast Uncharted Territory for AI Drug Design

OIST study finds ancestral "starting points" limit protein diversity, suggesting AI models like AlphaFold are trained on a narrow, non-representative dataset.

By: AXL Media

Published: Mar 31, 2026, 3:51 AM EDT

Source: Okinawa Institute of Science and Technology (OIST) Graduate University

The Finite Scope of the Protein Universe

While the theoretical number of possible protein sequences is astronomically large, the proteins that actually exist in biology occupy only a tiny fraction of that space. Modern biochemistry relies on a specific set of amino acid chains that have successfully folded into functional 3D shapes over billions of years. However, new research from the Okinawa Institute of Science and Technology (OIST) suggests that the proteins we know today are not necessarily the "best" possible designs, but rather the ones that happened to descend from a few specific ancestral origins. This "point-of-origin" effect acts as an invisible fence, keeping protein evolution tethered to its ancient beginnings and leaving vast swaths of potentially functional sequence space unexplored.
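To put a rough number on that theoretical sequence space: a chain of N residues drawn from the 20 standard amino acids has 20^N possible sequences. The short Python sketch below works this out for a few chain lengths. The lengths are illustrative assumptions, not figures from the OIST study.

```python
import math

AMINO_ACIDS = 20  # the 20 standard proteinogenic amino acids


def log10_sequence_count(length: int) -> float:
    """Return log10 of the number of possible sequences of the given length."""
    return length * math.log10(AMINO_ACIDS)


if __name__ == "__main__":
    # Illustrative chain lengths (assumed for this example only).
    for length in (50, 100, 300):
        exponent = log10_sequence_count(length)
        print(f"length {length}: roughly 10^{exponent:.0f} possible sequences")
```

Even at a modest 100 residues the count is on the order of 10^130, far beyond the commonly cited estimate of roughly 10^80 atoms in the observable universe, which is why only a sliver of the space can ever have been sampled by evolution.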

Mathematical Models vs. Biological Reality

To quantify these evolutionary boundaries, the research team built a mathematical model of the sequence space of known protein families. By simulating mutations, natural selection, and genetic interactions, they predicted how much diversity a given biological function should be able to support. When they compared these predictions to the proteins actually found in nature, they found a large discrepancy: the known proteins were far less diverse than they would be if evolution were driven solely by fitness. This indicates that the "starting point" of a protein family is the dominant factor in determining its modern form, far outweighing the pressures of day-to-day survival.
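The comparison described above can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not the OIST team's model: it assumes each site tolerates a fixed, randomly chosen set of residues (no epistasis), lets several lineages diverge from one ancestor for a limited number of mutation attempts, and compares their diversity with that of sequences sampled freely from everything the constraints allow. All names and parameter values (`LENGTH`, `ALLOWED_PER_SITE`, `simulate`, and so on) are hypothetical.

```python
import random
from itertools import combinations

ALPHABET_SIZE = 20      # number of amino acids
LENGTH = 60             # hypothetical protein length
ALLOWED_PER_SITE = 5    # hypothetical: residues tolerated at each site


def make_constraints(rng):
    """Pick, for every site, the residues that keep the protein functional."""
    return [rng.sample(range(ALPHABET_SIZE), ALLOWED_PER_SITE)
            for _ in range(LENGTH)]


def random_viable(constraints, rng):
    """Sample a functional sequence with no memory of any ancestor."""
    return [rng.choice(site) for site in constraints]


def evolve(ancestor, constraints, steps, rng):
    """Propose random point mutations; selection rejects non-viable ones."""
    seq = list(ancestor)
    for _ in range(steps):
        pos = rng.randrange(LENGTH)
        residue = rng.randrange(ALPHABET_SIZE)
        if residue in constraints[pos]:
            seq[pos] = residue
    return seq


def mean_pairwise_distance(seqs):
    """Average fraction of sites that differ between pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return total / (len(pairs) * LENGTH)


def simulate(steps, n_lineages=30, seed=0):
    """Return (diversity of the evolved family, diversity selection permits)."""
    rng = random.Random(seed)
    constraints = make_constraints(rng)
    ancestor = random_viable(constraints, rng)
    family = [evolve(ancestor, constraints, steps, rng)
              for _ in range(n_lineages)]
    free = [random_viable(constraints, rng) for _ in range(n_lineages)]
    return mean_pairwise_distance(family), mean_pairwise_distance(free)


if __name__ == "__main__":
    observed, permitted = simulate(steps=40)
    print(f"family diversity:    {observed:.2f}")
    print(f"permitted diversity: {permitted:.2f}")
```

In this toy setting the family that descends from a single ancestor stays well below the diversity that selection alone would permit, simply because it has not had time to wander far from where it started. Raising `steps` lets the two values converge, which is one way to see how the starting point, rather than the fitness constraint, can set the observed diversity.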

Reassessing the Role of Selection and Epistasis

In classical evolutionary biology, natural selection and epistasis (the way the effect of one mutation depends on the others present) are considered the primary architects of biological form. However, lead author Lada Isakova noted that in the team's simulations these factors had a surprisingly small impact on the overall diversity of protein families. Instead, the "historical accident" of where a protein started its journey in sequence space largely dictated its future trajectory. The finding suggests that much of the protein world is a product of historical contingency rather than optimized design, and it offers a new perspective on how "frozen" certain biological pathways are.
