Reproducible Evidence Synthesis for NAD Precursors Reveals Method-Sensitive Blood Pressure Signals in Public Randomized Trials
Abstract
Do NAD+ precursors (NMN and NR) lower blood pressure? The answer depends on how two to three small randomized trials are analyzed. Under DerSimonian-Laird random effects, the pooled SBP mean difference is -4.53 mmHg (95% CI [-8.73, -0.32]; 2 studies, n=58), which is nominally significant. Under the Hartung-Knapp correction recommended for k<5 studies, the same data yield a 95% CI of [-22.99, +13.94], which is non-significant. A Bayesian meta-analysis re-expresses this disagreement as posterior probabilities: the probability of any blood pressure reduction is ~98%, but the probability of a clinically meaningful reduction (>5 mmHg) is only ~40%. The DBP prediction interval for the next RCT spans [-48.7, +39.8] mmHg, so the evidence constrains almost nothing about what a new trial will find. Power analysis suggests that 8-9 more RCTs (~540 participants) would be needed for 80% power under HKSJ. This reproducible evidence synthesis covers both NMN and NR precursors across a primary analysis (placebo-controlled only) and a sensitivity analysis (including lifestyle comparators), with conservative extraction rules that exclude figure-only values and surface registry-paper conflicts rather than hiding them. Journal-article references carry DOIs; web/API documentation references are cited by URL.
Methods
The governing review question comes from a PROSPERO-registered protocol (CRD420261334086) covering adults aged 18 years or older, oral NMN or NR, randomized study designs, and a blood-pressure-first endpoint hierarchy. The protocol names SBP and DBP as primary outcomes and places arterial stiffness, PWV, endothelial function, flow-mediated dilation (FMD), mean arterial pressure, and adverse events in secondary or additional positions. It also fixes the main extraction timepoint at end-of-intervention and creates an important comparator tension: formal eligibility centers on placebo or inactive control, while the planned sensitivity framework contemplates removing a lifestyle-comparator study. The pipeline keeps that tension rather than silently harmonizing it. Published NMN-only blood-pressure reviews do not remove the value of this setup; they shift the novelty claim from first-review status to reproducible, comparator-sensitive NMN+NR evidence synthesis.
Statistical Methods
The effect measure for all blood-pressure endpoints is the unstandardized mean difference (MD) in mmHg. For parallel-group RCTs, MD was computed from final or change values with standard error SE = sqrt(SD_t^2/n_t + SD_c^2/n_c). For crossover RCTs, the published between-group MD was used directly, with SE derived from the reported confidence interval. Random-effects pooling used the DerSimonian-Laird (DL) estimator: Q = sum(w_i * (y_i - y_FE)^2), tau^2 = max(0, (Q - (k-1)) / C) where C = sum(w_i) - sum(w_i^2)/sum(w_i), and random-effects weights w_i* = 1/(v_i + tau^2) [12]. Because k = 2-3, we additionally applied the Hartung-Knapp-Sidik-Jonkman (HKSJ) correction with REML tau^2: let q = sum(w_i* * (y_i - y_RE)^2); the HKSJ standard error is SE_HKSJ = sqrt(q / ((k-1) * sum(w_i*))), and the 95% CI is y_RE +/- t_{0.975, k-1} * SE_HKSJ [13]. Heterogeneity: I^2 = max(0, (Q - (k-1)) / Q) [14]. Baseline comparability between arms was assessed using the standardized mean difference (SMD), with a threshold of 0.5 flagging potential concern.
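The formulas above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the archived pipeline's code: the function names and the two-study input are hypothetical, and for brevity the HKSJ interval here reuses the DL tau^2 rather than the REML tau^2 described in the text.

```python
import math

# Two-sided 95% t critical values for small degrees of freedom (df = k - 1).
T_975 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776}
Z_975 = 1.959964


def se_parallel(sd_t, n_t, sd_c, n_c):
    """SE of a mean difference between two independent arms."""
    return math.sqrt(sd_t**2 / n_t + sd_c**2 / n_c)


def se_from_ci(lower, upper):
    """Back out an SE from a reported 95% CI, assuming a normal-based interval."""
    return (upper - lower) / (2 * Z_975)


def pool_dl_hksj(y, se):
    """DL random-effects pooling plus an HKSJ interval (DL tau^2 used for both)."""
    k = len(y)
    v = [s**2 for s in se]
    w = [1 / vi for vi in v]                      # fixed-effect weights
    sw = sum(w)
    y_fe = sum(wi * yi for wi, yi in zip(w, y)) / sw
    q = sum(wi * (yi - y_fe) ** 2 for wi, yi in zip(w, y))
    c = sw - sum(wi**2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)
    w_re = [1 / (vi + tau2) for vi in v]          # random-effects weights
    sw_re = sum(w_re)
    y_re = sum(wi * yi for wi, yi in zip(w_re, y)) / sw_re
    se_dl = math.sqrt(1 / sw_re)
    q_re = sum(wi * (yi - y_re) ** 2 for wi, yi in zip(w_re, y))
    se_hksj = math.sqrt(q_re / ((k - 1) * sw_re))
    t_crit = T_975[k - 1]
    i2 = max(0.0, (q - (k - 1)) / q) if q > 0 else 0.0
    return {
        "y_re": y_re,
        "tau2": tau2,
        "i2": i2,
        "se_hksj": se_hksj,
        "ci_dl": (y_re - Z_975 * se_dl, y_re + Z_975 * se_dl),
        "ci_hksj": (y_re - t_crit * se_hksj, y_re + t_crit * se_hksj),
    }


# Hypothetical two-study input (mmHg), chosen only to illustrate the mechanics.
result = pool_dl_hksj([-6.0, -3.0], [3.0, 3.0])
print(result["ci_dl"], result["ci_hksj"])
```

With k = 2 the HKSJ interval uses a t critical value of about 12.7 on one degree of freedom, which is why the same point estimate can exclude zero under DL and include it under HKSJ.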
Search, Screening, and Extraction
Discovery and metadata resolution were constrained to official machine-readable sources named in the repository design: PubMed E-utilities, the PMC ID Converter, ClinicalTrials.gov API v2, and Europe PMC. The submission package is built from an archived analysis dataset rather than a live scrape. The only claims included here are those represented in the locked outputs: the report, primary and sensitivity meta-analysis JSON files, the study manifest, and the unresolved-records log. Methods language is therefore tied to the saved run: a verification step re-executes reporting and analysis from persisted intermediates without new network calls, and unresolved records stay unresolved.
The pipeline separates report identity from study identity. Candidate documents are report-level objects; pooled analyses are study-level objects. This prevents one trial from leaking into a meta-analysis multiple times through a paper, supplement, abstract, or registry record. The primary analysis (placebo-controlled) admits only placebo or inactive comparators. The sensitivity analysis (including lifestyle comparators) adds clearly labeled lifestyle-comparator studies. The practical consequence is that inclusion is not merely a literature-screening question; it is an explicit analytic choice encoded in the pipeline. A protocol-only record, a registry-only record, or a report lacking a usable endpoint contrast can still be discovered and linked, but it cannot silently contaminate the pooled estimate. In this revision, Qiu 2023 and NCT04903210 are linked to one study identity. The published paper is the primary numeric source, whereas the registry is retained for comparator and provenance context only because it still lacks posted results.
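The separation of report identity from study identity can be illustrated with a small sketch. The class, field names, and identifiers other than `ctgov_NCT04903210` are hypothetical and do not come from the archived code; the preference for a paper over other report types mirrors the Qiu 2023 adjudication described above.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Report:
    report_id: str
    study_id: str               # the study identity this report is linked to
    source_type: str            # e.g. "paper", "registry", "protocol"
    has_numeric_contrast: bool  # usable endpoint contrast available?


def numeric_source_per_study(reports):
    """Pick at most one numeric source per study; others stay provenance-only."""
    by_study = defaultdict(list)
    for report in reports:
        by_study[report.study_id].append(report)

    chosen = {}
    for study_id, linked in by_study.items():
        usable = [r for r in linked if r.has_numeric_contrast]
        # Prefer a paper over any other report type, then break ties by id.
        usable.sort(key=lambda r: (r.source_type != "paper", r.report_id))
        chosen[study_id] = usable[0].report_id if usable else None
    return chosen


reports = [
    Report("paper_qiu_2023", "qiu_2023", "paper", True),
    Report("ctgov_NCT04903210", "qiu_2023", "registry", False),
    Report("protocol_example", "protocol_only_example", "protocol", False),
]
chosen = numeric_source_per_study(reports)
print(chosen)
```

A study whose linked reports carry no usable contrast maps to `None`: it remains discoverable and linked but contributes nothing to a pooled estimate.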
The extraction and analysis rules are intentionally conservative. Every numeric datum that contributes to a pooled estimate must be traceable to a public report location and must pass the study-level gating rules. The pre-specified timepoint is end-of-intervention. Within an endpoint family, a study may contribute only once per analysis stratum. Where multiple arms or multiple reports exist, the pipeline resolves them in derived layers rather than letting them duplicate influence. Unresolved values are conservatively excluded: figure-only values remain out of the primary pool by default, ambiguous comparators remain outside the primary analysis, unresolved crossover math stays in manual review, and endpoint families with insufficient compatibility fall back to narrative treatment. That is why PWV is handled construct-first and why Qiu 2023 enters overall sensitivity SBP/DBP analyses but not duration subgrouping: the paper methods say 30 days, the results text says 6 weeks, and the registry says up to 2 months. The same logic keeps PWV, FMD, MAP, and adverse events out of pooled analysis unless they are fully traceable and methodologically clean.
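A minimal sketch of how such gating rules might be encoded follows. The field names, stratum labels, and reason strings are assumptions made for illustration; the archived pipeline's actual schema is not shown in this paper.

```python
from dataclasses import dataclass

PRIMARY_COMPARATORS = {"placebo", "inactive"}
SENSITIVITY_COMPARATORS = PRIMARY_COMPARATORS | {"lifestyle"}


@dataclass(frozen=True)
class Contrast:
    study_id: str
    endpoint: str                   # e.g. "SBP", "DBP"
    comparator: str                 # "placebo", "inactive", "lifestyle", "ambiguous"
    figure_only: bool               # value readable only from a figure
    crossover_resolved: bool = True


def gate(contrast, stratum):
    """Return (admitted, reason) for one contrast in one analysis stratum."""
    allowed = PRIMARY_COMPARATORS if stratum == "primary" else SENSITIVITY_COMPARATORS
    if contrast.figure_only:
        return False, "figure-only value"
    if not contrast.crossover_resolved:
        return False, "unresolved crossover math (manual review)"
    if contrast.comparator not in allowed:
        return False, "comparator not admitted in this stratum"
    return True, "admitted"


def admit_once(contrasts, stratum):
    """Apply the gate and let each study contribute once per endpoint."""
    seen, admitted = set(), []
    for contrast in contrasts:
        ok, _reason = gate(contrast, stratum)
        key = (contrast.study_id, contrast.endpoint)
        if ok and key not in seen:
            seen.add(key)
            admitted.append(contrast)
    return admitted


candidates = [
    Contrast("s1", "SBP", "placebo", False),
    Contrast("s1", "SBP", "placebo", False),   # second report of the same study
    Contrast("s2", "SBP", "lifestyle", False),
    Contrast("s3", "SBP", "placebo", True),    # figure-only, excluded
]
primary_ids = [c.study_id for c in admit_once(candidates, "primary")]
sensitivity_ids = [c.study_id for c in admit_once(candidates, "sensitivity")]
print(primary_ids, sensitivity_ids)
```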
Related Work
Standard tools for systematic reviews include RevMan (Cochrane Collaboration), Covidence for screening, and the metafor package in R for meta-analytic pooling [12]. Our contribution is not a replacement for these tools but a code-first, fully reproducible implementation that locks the analysis dataset and re-executes the entire pipeline from frozen inputs, making every eligibility decision, extracted contrast, and pooled estimate auditable from a single archived run.
Results
The archived v2 dataset reports 55 discovered reports linked to 8 studies. After study-level screening and comparator gating, 3 studies entered the primary analysis and 5 entered the sensitivity analysis. The evidentiary story is therefore not a claim about the entire public literature in the abstract; it is a claim about what remains after the pipeline de-duplicates reports, applies comparator rules, enforces one-study-one-endpoint contribution, and excludes unresolved values.
The primary analysis studies were the Martens 2018 NR crossover trial, the Katayoshi et al. 2023 NMN arterial-stiffness trial, and the Bhasin et al. 2023 MIB-626 physiologic study (listed in the references under first author Pencina).
In the primary analysis (placebo-controlled), the pipeline recovered pooled blood-pressure estimates from real public randomized-trial data. For SBP, the pooled mean difference was -4.53 mmHg (95% DL CI, -8.73 to -0.32; I^2 = 0%) from 2 studies with 58 total participants. For DBP, the pooled mean difference was -4.46 mmHg (95% DL CI, -8.75 to -0.17; I^2 = 48%) from 3 studies with 88 total participants. Under the HKSJ correction, which uses a t-distribution appropriate for small k, these intervals widen markedly: SBP HKSJ 95% CI [-22.99, +13.94] and DBP HKSJ 95% CI [-13.17, +4.17]. Both HKSJ intervals cross zero, rendering the pooled estimates non-significant. The DL random-effects confidence intervals are known to be anti-conservative when the number of studies is small, so the HKSJ result should be considered the more reliable inference for this evidence base.
The sensitivity analysis (including lifestyle comparators) illustrates why reproducible evidence synthesis matters. In this revision, it includes both the previously admitted Lin 2025 exercise-comparator trial and the adjudicated Qiu 2023/NCT04903210 study, counted once at the study level with the paper as the numeric source. Even with that addition, the pooled signal is weaker than in the primary analysis and remains nonconclusive. The sensitivity SBP mean difference is -2.30 mmHg (95% DL CI, -7.63 to 3.03) from 4 studies, and the sensitivity DBP mean difference is -2.50 mmHg (95% DL CI, -5.38 to 0.38) from 5 studies. Under HKSJ correction, these also remain non-significant: SBP HKSJ 95% CI [-12.31, +7.81] and DBP HKSJ 95% CI [-7.11, +2.08]. This does not negate the direction of effect in the primary analysis; it shows that the answer is comparator-sensitive and method-sensitive. The methodology is not window dressing around a clinical conclusion. The methodology determines which conclusion is supportable.
PWV remained excluded from pooled analysis in the frozen run because fewer than two construct-compatible studies were available after provenance and compatibility checks. The unresolved-records log turns that limitation into an output rather than a hidden drafting compromise. The v2 queue still includes PWV unit ambiguity in the public NR crossover record, baseline-imbalance concerns for one baPWV report, and the newly adjudicated Qiu FMD and baPWV rows that remain narrative-only pending endpoint review and construct-specific handling. Importantly, the MIB-626 record still contributes primary-analysis DBP evidence, but not primary-analysis SBP evidence, because the primary pool only admits contrasts that remained fully public and traceable in the frozen package. Likewise, protocol-only or registry-linked records may explain why a study exists in discovery, but they do not count as pooled evidence unless the locked manifest says they do.
The practical output is a compact evidence package. The pipeline shows which reports were discovered, which study identities survived, which endpoint contrasts were pooled, which ones stayed narrative-only, and why. That is the paper's main claim. The cardiovascular signal is the worked example that demonstrates the package is capable of returning a nontrivial answer from real public data while exposing the boundary of that answer.
Bayesian Posterior Probabilities
The DL-vs-HKSJ disagreement reflects a frequentist method choice, not a biological reality. A Bayesian normal-normal conjugate meta-analysis with a weakly informative prior (mu ~ N(0, 20²) mmHg) resolves this by computing posterior probabilities directly:
| Question | SBP (k=2) | DBP (k=3) |
|---|---|---|
| P(any BP reduction) | 98.2% | 97.9% |
| P(reduction > 2 mmHg) | 87.7% | 86.6% |
| P(reduction > 5 mmHg) | 40.3% | 39.3% |
| P(reduction > 10 mmHg) | 0.5% | 0.5% |
| P(BP increase / harm) | 1.8% | 2.1% |
Under a skeptical prior (SD=5 mmHg, appropriate for supplements), P(reduction > 2 mmHg) drops to 82% for SBP and 81% for DBP. Under an enthusiast prior (SD=50 mmHg), the probabilities are nearly identical to the weakly informative case, indicating that the data dominate the prior.
The Bayesian interpretation is more informative than either frequentist method: NAD precursors probably reduce blood pressure somewhat (~98%), but the probability that the reduction exceeds 5 mmHg is only ~40%, far below the certainty needed for clinical recommendations. This does not remove inferential uncertainty; it re-expresses it, and clinical meaningfulness remains uncertain. The posterior summaries are SBP -4.47 mmHg (posterior SD 2.13; 95% CrI [-8.66, -0.29]) and DBP -4.41 mmHg (posterior SD 2.17; 95% CrI [-8.67, -0.15]).
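The conjugate update behind these probabilities can be sketched as follows. This assumes the likelihood is a single normal centred on the DL pooled SBP estimate, with its standard error backed out of the reported 95% CI; that is an assumption about how the analysis was set up, and the archived pipeline may differ in detail. Under that assumption the results land close to the tabulated SBP values.

```python
import math


def normal_cdf(x, mean, sd):
    """P(X <= x) for X ~ N(mean, sd^2)."""
    return 0.5 * (1 + math.erf((x - mean) / (sd * math.sqrt(2))))


def conjugate_posterior(y, se, prior_mean=0.0, prior_sd=20.0):
    """Normal prior times normal likelihood gives a normal posterior."""
    prior_prec = 1 / prior_sd**2
    data_prec = 1 / se**2
    post_var = 1 / (prior_prec + data_prec)
    post_mean = post_var * (prior_prec * prior_mean + data_prec * y)
    return post_mean, math.sqrt(post_var)


# SBP inputs taken from the reported DL result: -4.53 mmHg, 95% CI [-8.73, -0.32].
se_sbp = (-0.32 - (-8.73)) / (2 * 1.959964)
post_mean, post_sd = conjugate_posterior(-4.53, se_sbp)

p_any = normal_cdf(0.0, post_mean, post_sd)    # P(any reduction)
p_gt5 = normal_cdf(-5.0, post_mean, post_sd)   # P(reduction > 5 mmHg)
print(round(post_mean, 2), round(post_sd, 2), round(p_any, 3), round(p_gt5, 3))
```

Changing `prior_sd` to 5.0 or 50.0 gives the skeptical and enthusiast variants discussed above.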
Prediction Interval and Required Sample Size
The DBP prediction interval (k=3, tau²=7.36) is [-48.7, +39.8] mmHg. It is so wide that the next RCT of NAD precursors could plausibly find any effect from a 49-point drop to a 40-point increase in diastolic blood pressure. This interval, not the confidence interval, is the appropriate measure of what future studies should expect. The SBP prediction interval is not estimable with only k=2 studies, because the interval uses a t-distribution with k-2 degrees of freedom.
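A short sketch shows where the width comes from. It uses the commonly used formula mu +/- t_{0.975, k-2} * sqrt(tau^2 + SE^2), with the DBP standard error backed out of the reported DL interval; whether the archived pipeline computes it exactly this way is an assumption. With k=3 there is a single degree of freedom, so the critical value is about 12.7.

```python
import math

# Two-sided 95% t critical values for df = k - 2.
T_975 = {1: 12.706, 2: 4.303, 3: 3.182}


def prediction_interval(mu, se_mu, tau2, k):
    """95% prediction interval for the effect in a new study."""
    if k < 3:
        raise ValueError("prediction interval needs k >= 3 (df = k - 2)")
    t_crit = T_975[k - 2]
    half_width = t_crit * math.sqrt(tau2 + se_mu**2)
    return mu - half_width, mu + half_width


# DBP inputs from the reported DL result: -4.46 mmHg, 95% CI [-8.75, -0.17].
se_dbp = (-0.17 - (-8.75)) / (2 * 1.959964)
pi_low, pi_high = prediction_interval(-4.46, se_dbp, tau2=7.36, k=3)
print(round(pi_low, 1), round(pi_high, 1))
```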
Power analysis assuming the observed effect size (-4.5 mmHg), within-study SD of 10 mmHg, and tau²=7.0 suggests that about 8-9 additional RCTs with 30 participants per arm (~540 total) would be needed for 80% power under HKSJ. If the true effect is closer to -3.0 mmHg, about 13-15 studies (900-1300 participants) would be required. The primary (strict-core) analysis includes both NMN (Katayoshi 2023, Bhasin 2023) and NR (Martens 2018). With k=1-2 per precursor subgroup, formal NMN vs NR comparison is not yet possible.
Limitations
Several important limitations constrain the conclusions that can be drawn from this evidence base.
First, the primary analysis pools only k=2 studies for SBP and k=3 studies for DBP, with 58 and 88 total participants respectively. This is insufficient for reliable estimation of between-study heterogeneity, and the I^2 values (0% for SBP, 48% for DBP) should be interpreted with extreme caution at these sample sizes.
Second, the HKSJ correction renders all pooled endpoints non-significant across both the primary and sensitivity analyses. DerSimonian-Laird random-effects confidence intervals are known to produce anti-conservative (too narrow) intervals when the number of pooled studies is small, and should not be relied upon as the sole inferential method at k<5.
Third, leave-one-out sensitivity analysis in the DBP primary pool shows that removing any single study shifts the pooled estimate substantially, and in each case the resulting confidence interval crosses zero. This fragility is expected at k=3 but underscores that no individual study is dispensable.
Fourth, this synthesis is descriptive. It characterizes the available evidence from public randomized trials; it does not constitute confirmatory evidence of NAD-precursor efficacy for blood pressure reduction. A confirmatory conclusion would require larger, pre-registered trials powered for blood-pressure endpoints.
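The leave-one-out check named in the third limitation can be sketched as below. The three study values are hypothetical placeholders, not the extracted trial data, and the interval is the normal-based DL interval for brevity.

```python
import math

Z_975 = 1.959964


def pool_dl(y, se):
    """DerSimonian-Laird pooled estimate with a normal-based 95% CI."""
    k = len(y)
    v = [s**2 for s in se]
    w = [1 / vi for vi in v]
    sw = sum(w)
    y_fe = sum(wi * yi for wi, yi in zip(w, y)) / sw
    q = sum(wi * (yi - y_fe) ** 2 for wi, yi in zip(w, y))
    c = sw - sum(wi**2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if k > 1 else 0.0
    w_re = [1 / (vi + tau2) for vi in v]
    est = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se_est = math.sqrt(1 / sum(w_re))
    return est, est - Z_975 * se_est, est + Z_975 * se_est


def leave_one_out(studies):
    """Re-pool after omitting each study in turn; keys name the omitted study."""
    results = {}
    for omitted in studies:
        rest = [vals for name, vals in studies.items() if name != omitted]
        results[omitted] = pool_dl([y for y, _ in rest], [se for _, se in rest])
    return results


# Hypothetical (mean difference, SE) pairs in mmHg.
studies = {"study_A": (-8.0, 3.0), "study_B": (-2.0, 2.5), "study_C": (-5.0, 4.0)}
loo = leave_one_out(studies)
for name, (est, low, high) in loo.items():
    print(f"omit {name}: {est:.2f} [{low:.2f}, {high:.2f}]")
```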
Conclusion
The primary finding of this synthesis is that the apparent blood pressure effect of NAD precursors is method-sensitive: significant under DerSimonian-Laird pooling but non-significant under the Hartung-Knapp-Sidik-Jonkman correction recommended for small meta-analyses. The direction of effect consistently favors the intervention across all primary analysis studies, but with k=2-3 and 58-88 participants, this evidence base is too small for confirmatory conclusions.
When broader lifestyle-comparator studies are admitted, the pooled blood-pressure signal weakens further and remains nonconclusive even after adding the adjudicated Qiu 2023/NCT04903210 study. PWV and endothelial-function conclusions remain unresolved for pooled analysis because the frozen package still does not contain enough construct-compatible, fully traceable contrasts to support pooled claims. The contribution is not a clinical verdict. It is a reproducible evidence synthesis pipeline that makes its own eligibility rules, pooled contrasts, linked reports, and unresolved boundaries auditable, and whose primary output is a transparent demonstration that the answer to this clinical question depends on which statistical method is applied to a very small evidence base.
References
- Martens CR, Denman BA, Mazzo MR, et al. Chronic nicotinamide riboside supplementation is well-tolerated and elevates NAD(+) in healthy middle-aged and older adults. Nature Communications. 2018;9:1286. doi:10.1038/s41467-018-03421-7
- Katayoshi T, Uehata S, Nakashima N, et al. Nicotinamide adenine dinucleotide metabolism and arterial stiffness after long-term nicotinamide mononucleotide supplementation: a randomized, double-blind, placebo-controlled trial. Scientific Reports. 2023;13:2786. doi:10.1038/s41598-023-29787-3
- Pencina KM, Valderrabano R, Wipper B, et al. Nicotinamide Adenine Dinucleotide Augmentation in Overweight or Obese Middle-Aged and Older Adults: A Physiologic Study. Journal of Clinical Endocrinology & Metabolism. 2023;108(8):1968-1980. doi:10.1210/clinem/dgad027
- Lin Y, Zeidan RS, Lapierre-Nguyen S, et al. Nicotinamide riboside combined with exercise to treat hypertension in middle-aged and older adults: a pilot randomized clinical trial. GeroScience. 2025. doi:10.1007/s11357-025-01815-2
- Qiu Y, Xu S, Chen X, et al. NAD(+) exhaustion by CD38 upregulation contributes to blood pressure elevation and vascular damage in hypertension. Signal Transduction and Targeted Therapy. 2023;8:353. doi:10.1038/s41392-023-01577-3
- Zhang M, Chen Y, Jiang N, et al. Effects of Nicotinamide Mononucleotide Supplementation on Blood Pressure: A Systematic Review and Meta-Analysis of Randomized Controlled Trials. Nutrients. 2026;18(6):890. doi:10.3390/nu18060890
- NCBI Developer Resources. APIs. Available at: https://www.ncbi.nlm.nih.gov/home/develop/api/
- PMC. PMCID, PMID, NIHMSID Converter API. Available at: https://pmc.ncbi.nlm.nih.gov/tools/id-converter-api/
- ClinicalTrials.gov. About the API. Available at: https://clinicaltrials.gov/data-api/about-api
- Europe PMC. Developers. Available at: https://europepmc.org/developers
- Borenstein M, Hedges LV, Higgins JPT, Rothstein HR. Introduction to Meta-Analysis. Chichester, UK: John Wiley & Sons; 2009. doi:10.1002/9780470743386
- IntHout J, Ioannidis JPA, Borm GF. The Hartung-Knapp-Sidik-Jonkman method for random effects meta-analysis is straightforward and considerably outperforms the standard DerSimonian-Laird method. BMC Medical Research Methodology. 2014;14:25. doi:10.1186/1471-2288-14-25
- Higgins JPT, Thompson SG. Quantifying heterogeneity in a meta-analysis. Statistics in Medicine. 2002;21:1539-1558. doi:10.1002/sim.1186
Reproducibility: Skill File
Use this skill file to reproduce the research with an AI agent.
---
name: nad-vascular-evidence-pipeline
description: Execute the evidence-tight NAD precursor evidence synthesis pipeline for adult blood-pressure and vascular-function randomized trials, with primary analysis (placebo-controlled) SBP/DBP outputs and a sensitivity analysis (including lifestyle comparators) that adjudicates Qiu 2023 / NCT04903210.
allowed-tools: Bash(uv *, python *, ls *, test *, shasum *)
requires_python: "3.12.x"
package_manager: uv
repo_root: .
canonical_output_dir: outputs/v2_evidence_tight
---

# NAD Vascular Evidence Pipeline

This pipeline executes the submission path only. The success condition is a reproducible evidence-tight v2 package with primary analysis pooled `SBP` and `DBP` results from real public data, plus a sensitivity analysis that links Qiu 2023 and `NCT04903210` as one study without using the registry as numeric evidence.

## Runtime Expectations

- Platform: CPU-only
- Python: `3.12.x`
- Package manager: `uv`
- Pre-specified search results: `data/raw/canonical`
- Submission curation inputs: `data/interim/v2_screening.csv`, `data/interim/v2_extractions.csv`, `data/interim/v2_rob2_draft.csv`

## Step 1: Confirm Canonical Protocol Inputs

```bash
test -f data/raw/protocol/prospero_record.pdf
test -f data/raw/protocol/search_strategy.pdf
shasum -a 256 data/raw/protocol/prospero_record.pdf
shasum -a 256 data/raw/protocol/search_strategy.pdf
```

## Step 2: Install the Locked Environment

```bash
uv sync --frozen
```

## Step 3: Run the Submission Pipeline

```bash
uv run --frozen --no-sync nad-vascular-review-skill run --config config/v2_review.yaml --out-root outputs/v2_evidence_tight
```

Success condition:

- `outputs/v2_evidence_tight/results/meta_strict_core.json` exists
- `outputs/v2_evidence_tight/results/meta_expanded_sensitivity.json` exists
- `outputs/v2_evidence_tight/data/derived/study_manifest.csv` exists

## Step 4: Verify the Run

```bash
uv run --frozen --no-sync nad-vascular-review-skill verify --run-dir outputs/v2_evidence_tight
```

Success condition:

- exit code is `0`
- `outputs/v2_evidence_tight/verification.json` exists
- verification status is `passed`

## Step 5: Confirm Required Outputs

Required files:

- `outputs/v2_evidence_tight/data/derived/study_manifest.csv`
- `outputs/v2_evidence_tight/data/derived/study_manifest.json`
- `outputs/v2_evidence_tight/data/derived/report_manifest.csv`
- `outputs/v2_evidence_tight/data/derived/report_manifest.json`
- `outputs/v2_evidence_tight/data/derived/study_report_links.csv`
- `outputs/v2_evidence_tight/data/derived/effect_sizes.csv`
- `outputs/v2_evidence_tight/data/derived/rob2_draft.csv`
- `outputs/v2_evidence_tight/data/extracted/outcomes_long.csv`
- `outputs/v2_evidence_tight/data/extracted/provenance.jsonl`
- `outputs/v2_evidence_tight/results/meta_strict_core.json`
- `outputs/v2_evidence_tight/results/meta_expanded_sensitivity.json`
- `outputs/v2_evidence_tight/results/report.md`
- `outputs/v2_evidence_tight/audit/exclusion_log.md`
- `outputs/v2_evidence_tight/audit/data_gaps.md`
- `outputs/v2_evidence_tight/audit/manual_review_queue.csv`
- `outputs/v2_evidence_tight/audit/repro_manifest.json`
- `outputs/v2_evidence_tight/audit/run_log.md`
- `outputs/v2_evidence_tight/manifest.json`
- `outputs/v2_evidence_tight/verification.json`

## Step 6: Success Criteria

The submission path is successful only if:

- the archived protocol and pre-specified search results are present
- the run command finishes successfully
- the verify command exits `0`
- all required files are present and nonempty
- `meta_strict_core.json` contains pooled `SBP` and `DBP` with both DL and HKSJ intervals
- no primary analysis pooled comparison uses a non-placebo or non-inactive comparator
- `pmid_37718359` contributes only to the sensitivity analysis
- `ctgov_NCT04903210` is linked for provenance but does not contribute numeric evidence
- `PWV` may remain narrative-only without failing the run
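The "present and nonempty" check in Steps 5 and 6 could be scripted roughly as follows. This is an illustrative helper, not part of the `nad-vascular-review-skill` CLI; the function name is hypothetical, and only a subset of the required paths is listed to keep the sketch short.

```python
import tempfile
from pathlib import Path

# Subset of the required outputs, relative to the run directory.
REQUIRED_SUBSET = [
    "results/meta_strict_core.json",
    "results/meta_expanded_sensitivity.json",
    "data/derived/study_manifest.csv",
    "verification.json",
]


def missing_or_empty(run_dir, required=REQUIRED_SUBSET):
    """Return the required relative paths that are absent or zero bytes."""
    root = Path(run_dir)
    problems = []
    for rel in required:
        path = root / rel
        if not path.is_file() or path.stat().st_size == 0:
            problems.append(rel)
    return problems


# Demo against a throwaway directory: one valid file, one empty file, two absent.
with tempfile.TemporaryDirectory() as tmp:
    present = Path(tmp) / "results" / "meta_strict_core.json"
    present.parent.mkdir(parents=True)
    present.write_text("{}")
    (Path(tmp) / "verification.json").touch()
    demo_problems = missing_or_empty(tmp)

print(demo_problems)
```

Against a real run, `missing_or_empty("outputs/v2_evidence_tight")` returning an empty list would correspond to the file-presence criterion only; the content-level criteria in Step 6 still need the `verify` command.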