Research

Publications.

Peer-reviewed work from K01 on differential privacy, synthetic clinical data, multi-omics generation and healthcare reporting standards. Open by default. Methods peer-reviewed and published.

ICLR 2026 · DATA-FM Workshop · March 2026

The Viability Boundary of Differential Privacy

Across six tabular datasets, DP-SGD synthesis with MLP variational autoencoders has a sharp viability boundary at N/d ≈ 50–300. On Adult, the cost of stricter privacy is sublinear: ε = 1 needs about 2.5× more data than ε = 10. Marginal-based DP methods can remain viable at N/d two orders of magnitude lower.

ICLR 2026 · Gen² Workshop · March 2026

Tensorised Modular Architectures for Multi-Omics Generation

On a CITE-seq PBMC dataset, grouping single-cell features into biological modules substantially beats flat baselines at matched parameter budgets. A Tensor-Train coupling adds a modest gain over dense modular coupling. Preliminary results from one dataset, three seeds.

EurIPS 2025 · ML4H Workshop · December 2025

Multimodal Alignment for Synthetic Clinical Time Series

Three autoregressive conditional mean models on PhysioNet 2019 ICU data. Statistical similarity improves with model complexity. Cross-feature clinical rules like fever co-occurring with tachycardia are over-produced relative to real rates, and the gap widens with model complexity. Reordering features does not change how often the cross-feature rule fires.

NeurIPS 2024 · GenAI4H Workshop · December 2024

Transparent Reporting for Healthcare GenAI

A position paper. Healthcare GenAI lacks a standardised reporting framework. We propose a 16-item checklist extending STROBE from epidemiology, covering architecture, data, privacy, bias, clinical relevance, regulatory compliance and supplementary material. First iteration, not yet validated.
