Westlake University Breaks Scalability Barrier in Genomics with PIGA Method to Construct World’s Largest Human Pangenome
Westlake University researchers develop the PIGA method to assemble over 1,100 human genomes affordably, revealing millions of previously "missing" DNA sequences.
By: AXL Media
Published: Apr 3, 2026, 10:44 AM EDT
Source: Information for this report was sourced from Westlake University

Moving Beyond the Single Reference Genome
For over two decades, the linear reference genome that grew out of the Human Genome Project, most recently in the form of GRCh38, has served as the "gold standard" for biomedical research. However, a single reference cannot capture the genetic diversity of the global population. The result is "reference bias": complex structural variants (SVs) and tandem repeats (TRs) carried by individuals who differ from the reference are often missed. To address this, scientists proposed the pangenome, a comprehensive map of the genetic variation within a population. Long-read sequencing has made pangenomes feasible, but its high cost has so far restricted them to small groups of a few dozen people. Researchers at Westlake University now report overcoming this limitation with an affordable, large-scale assembly method designed to scale to thousands of participants.
The PIGA Workflow: A Hybrid Sequencing Breakthrough
The team’s innovation lies in the pangenome-informed genome assembly (PIGA) workflow. Unlike traditional "de novo" assembly, which builds each individual genome from scratch, PIGA uses a pangenome-guided framework to integrate sequence information across an entire cohort. This enables a cost-effective hybrid strategy that combines modest-coverage Illumina short-read data with PacBio long-read sequencing. By using the collective information of the group to fill in the gaps of individual samples, the PIGA method substantially reduces sequencing costs while maintaining high accuracy. According to the team, this approach makes population-scale pangenomes practical for a much wider range of research institutions.
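The idea of borrowing information from the group to fill gaps in individual samples can be illustrated with a deliberately simple toy. The sketch below is not the PIGA algorithm, whose details are not given in this report. It assumes short, pre-aligned sequences of equal length, marks missing bases with "N", and fills each gap with the most common base the rest of the cohort shows at that position. The function name `fill_gaps_from_cohort` and the sample names are hypothetical.

```python
from collections import Counter


def fill_gaps_from_cohort(samples, gap="N"):
    """Toy illustration only (NOT the PIGA method).

    Replaces each gap character in a sample with the most common
    non-gap base observed at the same position across the cohort.
    Assumes all sequences are pre-aligned and of equal length.
    """
    sequences = list(samples.values())
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("toy example assumes equal-length, aligned sequences")

    # Build a per-position consensus from every non-gap observation.
    consensus = []
    for i in range(length):
        counts = Counter(seq[i] for seq in sequences if seq[i] != gap)
        consensus.append(counts.most_common(1)[0][0] if counts else gap)

    # Keep each sample's own bases; use the consensus only where it has a gap.
    return {
        name: "".join(base if base != gap else consensus[i]
                      for i, base in enumerate(seq))
        for name, seq in samples.items()
    }


# Hypothetical cohort: two samples have gaps that the others cover.
cohort = {
    "sample_a": "ACGTNNGT",
    "sample_b": "ACGTACGT",
    "sample_c": "ACNTACGT",
}
filled = fill_gaps_from_cohort(cohort)
print(filled["sample_a"])  # ACGTACGT
print(filled["sample_c"])  # ACGTACGT
```

Real genomes differ between individuals, so a production workflow cannot simply copy a majority base. The toy only shows why a cohort can compensate for thin data in any one sample.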
A New Catalog of "Hidden" Genetic Sequences
Applying the PIGA method to a cohort of 1,116 individuals, the team constructed what it describes as the world’s largest diploid pangenome to date. The study identified 405.3 megabases (Mb) of non-reference sequence, genetic material that is absent from current standard references. Notably, 26.2 Mb of this "new" DNA, about 6.5 percent, was annotated as functional genic and regulatory elements, suggesting that functionally relevant parts of the human genome have been missing from standard references until now. This expanded catalog provides a foundation for studying how these previously invisible sequences contribute to human health and disease.
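As a quick check on scale, the share of the newly catalogued sequence that was annotated as functional can be worked out from the two figures reported above. The variable names are illustrative, and the only inputs are the numbers stated in this report.

```python
# Figures as reported in the article, expressed in base pairs.
NON_REFERENCE_BP = 405_300_000   # 405.3 Mb of non-reference sequence
FUNCTIONAL_BP = 26_200_000       # 26.2 Mb annotated as genic or regulatory

# Fraction of the non-reference sequence with a functional annotation.
functional_share = FUNCTIONAL_BP / NON_REFERENCE_BP
print(f"Functional share of non-reference sequence: {functional_share:.1%}")
# Functional share of non-reference sequence: 6.5%
```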