New Machine Learning Framework Categorizes Metabolic Subgroups To Predict Three-Year Diabetes And Heart Disease Risk In Chinese Adults

New data-driven study identifies three metabolic subgroups in diabetes-free adults to predict the short-term risk of diabetes, stroke, and heart disease.

By: AXL Media

Published: Mar 21, 2026, 7:55 AM EDT

Source: Information for this report was sourced from Chinese Medical Journals Publishing House Co., Ltd.


Mapping Metabolic Heterogeneity Before Disease Onset

Researchers from Beijing Hospital and Peking University have developed a machine learning framework that identifies latent metabolic phenotypes in adults who do not yet have diabetes. Published in the Chinese Medical Journal, the study shifts the focus from single glycemic markers to a multidimensional clinical profile. Analyzing electronic health record (EHR) data from more than 51,400 participants, the team reported that substantial metabolic differences exist long before a clinical diagnosis and that these differences are the primary drivers of future chronic disease trajectories.

A Two-Stage, Data-Driven Methodology

The research team used what the report describes as a "physics-informed" analytical approach, intended to keep the resulting subgroups clinically relevant. In the first stage, they applied ensemble clustering to outcome indicators, such as heart disease and stroke, to derive "pseudo labels." In the second stage, these labels were combined with thirteen common clinical markers, including BMI, waist circumference, lipid profiles, and kidney function markers, to train a weighted naive Bayesian classifier. The classifier sorted individuals into three distinct risk clusters, and the model was subsequently verified in independent validation cohorts.
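The two stages can be illustrated with a minimal Python sketch. This is not the study's implementation: the majority-vote k-means ensemble, the Gaussian likelihoods, the per-marker weights, and every function name (kmeans_ranked, pseudo_labels, fit_weighted_nb, predict) are assumptions made for illustration, and the sample values are invented.

```python
import math
import random
from collections import Counter


def kmeans_ranked(rows, k, seed, iters=20):
    """Plain k-means; clusters are renumbered so 0 has the lowest outcome burden."""
    rng = random.Random(seed)
    distinct = sorted(set(map(tuple, rows)))
    # Start from distinct rows so no two initial centers coincide.
    centers = [list(r) for r in rng.sample(distinct, min(k, len(distinct)))]
    labels = [0] * len(rows)
    for _ in range(iters):
        labels = [
            min(range(len(centers)),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(row, centers[c])))
            for row in rows
        ]
        for c in range(len(centers)):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    # Rank clusters by summed center coordinates so labels agree across runs.
    order = sorted(range(len(centers)), key=lambda c: sum(centers[c]))
    rank = {c: i for i, c in enumerate(order)}
    return [rank[lab] for lab in labels]


def pseudo_labels(outcomes, k=3, runs=5):
    """Stage 1 (assumed form): majority vote over several k-means runs."""
    votes = [kmeans_ranked(outcomes, k, seed) for seed in range(runs)]
    return [Counter(v).most_common(1)[0][0] for v in zip(*votes)]


def fit_weighted_nb(X, y):
    """Stage 2: per-class prior plus (mean, variance) for every marker."""
    model = {}
    for c in sorted(set(y)):
        members = [x for x, lab in zip(X, y) if lab == c]
        stats = []
        for col in zip(*members):
            mu = sum(col) / len(col)
            var = sum((v - mu) ** 2 for v in col) / len(col) + 1e-6  # smoothing
            stats.append((mu, var))
        model[c] = (len(members) / len(X), stats)
    return model


def predict(model, weights, x):
    """Pick the class with the highest weighted Gaussian log-posterior."""
    def score(c):
        prior, stats = model[c]
        loglik = sum(
            w * (-0.5 * math.log(2 * math.pi * var) - (v - mu) ** 2 / (2 * var))
            for w, v, (mu, var) in zip(weights, x, stats)
        )
        return math.log(prior) + loglik
    return max(model, key=score)


if __name__ == "__main__":
    # Invented data: outcome flags (heart disease, stroke) and markers (BMI, waist).
    outcomes = [[0, 0], [0, 0], [1, 0], [1, 0], [1, 1], [1, 1]]
    markers = [[22, 80], [23, 82], [27, 95], [28, 97], [31, 110], [32, 112]]
    labels = pseudo_labels(outcomes)
    model = fit_weighted_nb(markers, labels)
    print(predict(model, [1.0, 1.0], [31.5, 111]))
```

In this sketch the weights raise or lower each marker's influence on the final score; setting all of them to 1.0 recovers an ordinary Gaussian naive Bayes classifier. How the study actually chose its weights is not stated in the report.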

Defining the Three Metabolic Subgroups

The study identified three archetypes within the diabetes-free population, each with a distinct risk profile. The first is a low-risk subgroup characterized by favorable metabolic profiles and minimal incidence of future disease. The second, a high-risk subgroup, showed poor glycemic and lipid control and had the highest three-year risk of incident diabetes and fatty liver disease. The third subgroup carried an intermediate diabetes risk but was composed primarily of older individuals with elevated blood pressure and obesity, and it had the highest risks of cardiovascular disease and stroke.
