Establishing causal relationships in nutrition research is challenging because dietary exposures are typically complex, intercorrelated, and difficult to measure over long timeframes. Although randomised controlled trials (RCTs) are the gold standard for causal inference, long-term dietary intervention studies that investigate disease endpoints are often impractical due to cost, duration, and ethical constraints. Shorter dietary RCTs, however, can still provide valuable insight by identifying intermediate biomarkers that may lie on the pathway from a dietary intervention to a long-term health outcome.
Mendelian randomisation (MR), a widely used epidemiological method for assessing causality between exposures and outcomes, uses genetic variants as proxies for modifiable exposures, thereby reducing confounding and reverse causation. However, complex dietary patterns are extremely difficult to proxy with genetic variants. To address this, we introduce a “two-step” framework that combines evidence on intermediate biomarkers from dietary RCTs with MR to infer the potential long-term effects of dietary interventions. In step 1, we use RCT data to identify intermediates that change in response to the intervention. In step 2, we apply MR to test whether these intermediates have a causal effect on a long-term clinical outcome (Figure 1).
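The two steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis pipeline used in this work: the function names `paired_t` and `wald_ratio` are hypothetical, step 1 is represented by a simple paired t statistic on pre- and post-intervention biomarker levels, and step 2 by the single-instrument Wald ratio with a first-order standard error that ignores uncertainty in the variant-exposure association.

```python
import math
from statistics import mean, stdev


def paired_t(baseline, follow_up):
    """Step 1 (illustrative): paired t statistic for the change in a
    biomarker between baseline and follow-up in the same participants."""
    diffs = [f - b for b, f in zip(baseline, follow_up)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / math.sqrt(n))


def wald_ratio(beta_gx, beta_gy, se_gy):
    """Step 2 (illustrative): Wald ratio MR estimate for one variant.

    beta_gx: variant-biomarker association (exposure)
    beta_gy: variant-outcome association
    se_gy:   standard error of beta_gy
    The standard error is a first-order approximation that treats
    beta_gx as known without error (an assumption of this sketch).
    """
    if beta_gx == 0:
        raise ValueError("variant must be associated with the exposure")
    estimate = beta_gy / beta_gx
    se = abs(se_gy / beta_gx)
    return estimate, se
```

In practice, biomarkers passing a multiple-testing threshold in step 1 would be carried forward to step 2, where multi-variant estimators and sensitivity analyses are usually preferred over a single Wald ratio.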
We demonstrate the robustness of the framework using data from the DiRECT trial (Lean et al., 2018), which evaluated the effect of a structured diet programme on type 2 diabetes (T2D) remission. First, blinded to remission outcomes, we identified 216 circulating proteins that changed following the dietary intervention (step 1). We then applied MR to estimate the causal relevance of each protein for T2D risk, identifying 10 proteins with evidence of a causal effect (step 2).
After unblinding, we compared these MR-derived estimates with the observed associations between protein levels and T2D remission in DiRECT. Seven of the ten MR-identified proteins showed strong alignment with the trial’s observational findings, with consistent directions of effect and a high degree of correlation (r ≈ -0.60, R² = 0.37, Figure 2). Proteins predicted by MR to increase T2D risk were lower in individuals who achieved remission, and vice versa. This mirroring of effects supports the robustness of the framework and suggests that intermediate molecular responses to diet can indicate longer-term clinical outcomes.
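The comparison described above can be illustrated with a short sketch. It assumes two equal-length lists of per-protein effect estimates, one from MR (effect on T2D risk) and one from the trial (association with remission); the function names are hypothetical and no trial data are reproduced here. Because remission is the opposite of disease risk, agreement shows up as a negative correlation and as opposite signs within each pair.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of estimates."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mirrored_fraction(mr_effects, remission_assoc):
    """Share of proteins whose MR effect on T2D risk has the opposite
    sign to its observed association with remission."""
    pairs = list(zip(mr_effects, remission_assoc))
    return sum(1 for m, r in pairs if m * r < 0) / len(pairs)
```

For a simple linear fit of one set of estimates on the other, R² equals the square of the Pearson correlation, which is how the two summary statistics reported above relate to each other.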
Overall, this two-step framework extends the utility of both dietary RCTs and MR in nutrition research. By linking short-term biological changes to long-term disease outcomes, it offers a novel approach to causal inference in settings where long-term dietary RCTs are not feasible.