ICML 2026 PMLR 306

Support-Proximity Augmented Diffusion Estimation

A calibrated conditional diffusion surrogate with support-proximity regularization for robust offline black-box optimization under OOD risk.

Yonghan Yang*, Ye Yuan*, Zipeng Sun, Linfeng Du, Bowei He, Haolun Wu, Can Chen, Xue Liu

MBZUAI · McGill University · Mila - Quebec AI Institute · Amazon AGI

* Equal contribution. Correspondence: ye.yuan3@mail.mcgill.ca

Overview

Forward diffusion made conservative for offline design

SPADE turns forward surrogate modeling into a calibrated conditional diffusion problem and injects a kNN support prior to prevent offline optimizers from exploiting unsupported regions.

Offline black-box optimization searches for high-scoring designs from a fixed dataset without new oracle calls. This setting appears in materials, robot morphology, DNA sequence design, and LLM data-mixture optimization, where each real evaluation can be expensive or unavailable.

SPADE models \(p_\theta(y|x)\) with a conditional diffusion surrogate, calibrates its moments and rankings for optimization, and regularizes predictions according to data support. Candidates far from the dataset receive lower means and higher uncertainty, which makes LCB-based acquisition search more conservative.

Motivation

OOD exploitation is the central failure mode

Surrogate optimizers actively search for high predicted scores. Without a notion of support, the search can amplify surrogate overestimation in regions where historical data provides little evidence.

Figure: Supported and unsupported OOD regions for surrogate optimization. The optimizer should prefer designs that are both high-scoring and supported by the offline dataset.

Method

SPADE: calibrated diffusion plus support proximity

SPADE surrogate training and optimization pipeline

Conditional Diffusion Surrogate

Models \(p_\theta(y|x)\) and estimates predictive mean and uncertainty from Monte Carlo samples.

Calibrated Diffusion Estimation

Adds moment matching and pairwise rank consistency so generated score distributions remain useful for optimization.
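The two calibration terms can be illustrated with a small NumPy sketch. It assumes a squared-error match on the batch mean and standard deviation, and a hinge penalty on mis-ordered pairs; both forms are stand-ins chosen for illustration, and the function names `moment_matching_loss` and `pairwise_rank_loss` are hypothetical rather than part of the released package.

```python
import numpy as np

def moment_matching_loss(pred, y_true):
    """Squared gap between the batch mean/std of predictions and of targets (assumed form)."""
    mean_gap = (pred.mean() - y_true.mean()) ** 2
    std_gap = (pred.std() - y_true.std()) ** 2
    return float(mean_gap + std_gap)

def pairwise_rank_loss(pred, y_true, margin=0.0):
    """Hinge penalty on pairs whose predicted order disagrees with the true order (assumed form)."""
    diff_pred = pred[:, None] - pred[None, :]
    sign_true = np.sign(y_true[:, None] - y_true[None, :])
    hinge = np.maximum(0.0, margin - sign_true * diff_pred)
    mask = sign_true != 0  # ignore ties and the diagonal
    return float(hinge[mask].mean()) if mask.any() else 0.0

y = np.array([0.1, 0.5, 0.9])
perfect = y.copy()
flipped = y[::-1].copy()  # same moments, reversed ranking

mm_perfect, rank_perfect = moment_matching_loss(perfect, y), pairwise_rank_loss(perfect, y)
mm_flipped, rank_flipped = moment_matching_loss(flipped, y), pairwise_rank_loss(flipped, y)
```

The flipped predictions have the same mean and spread as the targets, so the moment term alone cannot tell them apart from the perfect ones. Only the rank term penalizes them, which is one reason to use both terms together.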

Support-Proximity Regularization

Uses kNN distance as a support proxy; low-support candidates receive mean shrinkage and variance inflation.
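A minimal sketch of this idea, assuming a mean kNN Euclidean distance as the support proxy, a linear mean penalty, and a multiplicative variance inflation. The functional forms, the parameters `scale`, `lam`, and `gamma`, and the function names are illustrative assumptions, not the paper's exact regularizer.

```python
import numpy as np

def knn_distance(x_query, x_data, k=5):
    """Mean Euclidean distance from each query to its k nearest dataset points."""
    d = np.linalg.norm(x_query[:, None, :] - x_data[None, :, :], axis=-1)
    k = min(k, x_data.shape[0])
    nearest = np.partition(d, k - 1, axis=1)[:, :k]  # k smallest distances per row
    return nearest.mean(axis=1)

def support_adjust(mu, sigma, dist, scale=1.0, lam=0.5, gamma=1.0):
    """Lower the mean and inflate the std as normalized kNN distance grows (assumed form)."""
    r = dist / scale                       # distance in units of a typical in-data distance
    mu_adj = mu - lam * r                  # hypothetical linear mean penalty
    sigma_adj = sigma * (1.0 + gamma * r)  # hypothetical multiplicative inflation
    return mu_adj, sigma_adj

x_data = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
x_query = np.array([[0.5, 0.5], [5.0, 5.0]])  # in-support vs. far from the data
mu = np.array([1.0, 1.0])
sigma = np.array([0.1, 0.1])

dist = knn_distance(x_query, x_data, k=2)
mu_adj, sigma_adj = support_adjust(mu, sigma, dist)
```

Both candidates start with identical predictions, but the far one ends up with a lower mean and a larger standard deviation. In practice `scale` would likely be set from typical kNN distances inside the dataset so that in-support candidates are barely affected.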

Optimization

Search for high utility without leaving data support

\[p(x | y) \propto p(y | x)p(x)\]

\[\mathrm{LCB}(x) = \hat{\mu}_\theta(x) - \beta \hat{\sigma}_\theta(x)\]

\[\hat{A}(x) \approx A(\hat{\mu}_\theta, \hat{\sigma}_\theta) + \kappa \log \hat{p}_{\mathrm{kNN}}(x)\]
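The acquisition above can be sketched in NumPy under a few stated assumptions: the base acquisition \(A\) is taken to be the LCB, the moments come from Monte Carlo samples of the surrogate, and \(\hat{p}_{\mathrm{kNN}}\) is the classical kNN density estimate \(k / (n V_d r_k^d)\), where \(r_k\) is the distance to the k-th neighbor and \(V_d\) is the unit-ball volume. The function names and default values of `beta` and `kappa` are hypothetical.

```python
import math
import numpy as np

def mc_moments(samples):
    """Predictive mean and std from samples of shape (n_samples, n_candidates)."""
    return samples.mean(axis=0), samples.std(axis=0, ddof=1)

def lcb(mu, sigma, beta=1.0):
    """Lower confidence bound: mean minus beta times std."""
    return mu - beta * sigma

def log_knn_density(x_query, x_data, k=5):
    """Log of the kNN density estimate k / (n * V_d * r_k^d); requires k <= n."""
    n, d = x_data.shape
    dist = np.linalg.norm(x_query[:, None, :] - x_data[None, :, :], axis=-1)
    r_k = np.partition(dist, k - 1, axis=1)[:, k - 1]  # distance to the k-th neighbor
    log_unit_ball = (d / 2) * math.log(math.pi) - math.lgamma(d / 2 + 1)
    return math.log(k) - math.log(n) - log_unit_ball - d * np.log(r_k + 1e-12)

def acquisition(mu, sigma, log_p, beta=1.0, kappa=0.1):
    """LCB plus a kappa-weighted log-density bonus (A = LCB is an assumption here)."""
    return lcb(mu, sigma, beta) + kappa * log_p

x_data = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
x_query = np.array([[0.5, 0.5], [5.0, 5.0]])  # in-support vs. far from the data
samples = np.array([[1.0, 1.0], [1.2, 1.2], [0.8, 0.8]])  # toy surrogate samples

mu, sigma = mc_moments(samples)
log_p = log_knn_density(x_query, x_data, k=2)
scores = acquisition(mu, sigma, log_p)
```

Both candidates share the same LCB, so the ordering in `scores` is decided entirely by the density term, which favors the candidate inside the data's support.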

Results

State-of-the-art offline BBO performance

SPADE achieves the best overall ranking across six benchmark tasks and places in the top two on five of six normalized maximum-score tasks.

Mean Rank: 2.8 / 24 (normalized maximum score)

Median Rank: 1.5 / 24 (normalized maximum score)

Top-2 Tasks: 5 / 6 (across benchmark tasks)

Median-score Mean Rank: 1.7 / 24 (candidate-distribution robustness)

Table of SPADE benchmark results: normalized maximum scores among K = 128 candidates, averaged over 8 random seeds.

Ablation

Both calibration and support proximity matter

Task	Base	w/o Prox	w/o Calib	Full SPADE
SuperC	0.519	0.538	0.542	0.546
Ant	0.932	0.952	0.963	0.978
D’Kitty	0.962	0.972	0.975	0.981
LLM-DM	0.957	0.979	0.998	1.019
TF8	0.890	0.912	0.897	0.923
TF10	0.870	0.895	0.882	0.915

Full SPADE scores highest on all six tasks, and each single-component variant improves on Base on every task, indicating that calibrated diffusion and support-proximity regularization are complementary.
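The ablation numbers can be checked directly. The short script below copies the scores from the table and verifies that Full SPADE is the best variant per task. It also computes the mean gain over Base and the strongest partial variant for each task. The variable names are illustrative only.

```python
# Ablation scores copied from the table: (Base, w/o Prox, w/o Calib, Full SPADE).
variants = ("Base", "w/o Prox", "w/o Calib", "Full SPADE")
ablation = {
    "SuperC": (0.519, 0.538, 0.542, 0.546),
    "Ant": (0.932, 0.952, 0.963, 0.978),
    "D'Kitty": (0.962, 0.972, 0.975, 0.981),
    "LLM-DM": (0.957, 0.979, 0.998, 1.019),
    "TF8": (0.890, 0.912, 0.897, 0.923),
    "TF10": (0.870, 0.895, 0.882, 0.915),
}

# Full SPADE (last column) should hold the maximum in every row.
full_is_best = all(max(s) == s[3] for s in ablation.values())

# Average improvement of Full SPADE over the Base surrogate.
mean_gain = sum(s[3] - s[0] for s in ablation.values()) / len(ablation)

# Best of the three non-full variants for each task.
runner_up = {
    task: variants[max(range(3), key=lambda i: s[i])]
    for task, s in ablation.items()
}

print(full_is_best, round(mean_gain, 3), runner_up)
```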

Materials

Poster and presentation

The ICML poster and presentation deck are included in this repository for the project page.

Code

Compact PyTorch implementation

The package exposes dataset loading, configuration, surrogate training, and acquisition optimization through a small public API.

git clone https://github.com/HarryYoung2018/spade.git
cd spade
conda create -n spade python=3.10 -y
conda activate spade
pip install -r requirements.txt
pip install -e .

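import torch
from spade import Dataset, SpadeConfig, train_spade, optimize_spade

# Load the offline dataset from an NPZ file.
data = Dataset.from_npz("dataset.npz")
# Start from the default configuration.
cfg = SpadeConfig()
# Use the GPU when one is available, otherwise fall back to CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Train the surrogate, then run acquisition optimization with it.
model = train_spade(data, cfg, device=device)
result = optimize_spade(model, data, cfg, device=device)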

Reproducibility

Current public release surface

This release includes the method implementation, an NPZ quickstart runner, and smoke tests. It does not include the full benchmark preprocessing and evaluation stack for Design-Bench, TFBind, or LLM-DM.

Citation

Cite SPADE

@inproceedings{yang2026spade,
  title     = {Support-Proximity Augmented Diffusion Estimation for Offline Black-Box Optimization},
  author    = {Yang, Yonghan and Yuan, Ye and Sun, Zipeng and Du, Linfeng and He, Bowei and Wu, Haolun and Chen, Can and Liu, Xue},
  booktitle = {Proceedings of the 43rd International Conference on Machine Learning},
  series    = {Proceedings of Machine Learning Research},
  volume    = {306},
  address   = {Seoul, South Korea},
  publisher = {PMLR},
  year      = {2026}
}

@article{yang2026support,
  title   = {Support-Proximity Augmented Diffusion Estimation for Offline Black-Box Optimization},
  author  = {Yang, Yonghan and Yuan, Ye and Sun, Zipeng and Du, Linfeng and He, Bowei and Wu, Haolun and Chen, Can and Liu, Xue},
  journal = {arXiv preprint arXiv:2605.11246},
  year    = {2026}
}