BRIDGE: Bootstrapping Text to Control Time-Series Generation via Multi-Agent Iterative Optimization and Diffusion Modeling
Source
- Raw Markdown: paper_bridge-2025.md
- PDF: paper_bridge-2025.pdf
- Preprint: arXiv 2503.02445
- Official code: microsoft/TimeCraft/BRIDGE
Status And Credibility
BRIDGE was first posted to arXiv on 2025-03-04; the arXiv metadata inspected for this note reports version 7, updated 2025-09-05. The arXiv comment field lists the venue as ICML 2025 Main Conference. The official code lives in the Microsoft TimeCraft repository.
Core Claim
BRIDGE defines text-controlled time-series generation and trains a diffusion generator conditioned on both natural-language descriptions and semantic prototypes. It also uses a multi-agent LLM workflow to create and refine text-time-series pairs when human descriptions are scarce.
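As a rough illustration of what conditioning on both a text description and semantic prototypes can look like, the sketch below builds one conditioning vector from a text embedding and a small prototype bank. It is not the paper's architecture: `hybrid_condition`, the dot-product scoring, the softmax weighting, and the concatenation are all assumptions chosen to keep the example minimal.

```python
import math
from typing import List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def hybrid_condition(
    text_emb: List[float], prototypes: List[List[float]]
) -> Tuple[List[float], List[float]]:
    """Hypothetical helper: combine a text embedding with a prototype bank.

    Assumption: prototypes are weighted by softmax of their dot product
    with the text embedding, and the weighted mix is concatenated to the
    text embedding. The paper may combine the two signals differently.
    """
    scores = [sum(t * p for t, p in zip(text_emb, proto)) for proto in prototypes]
    weights = softmax(scores)
    mixed = [
        sum(w * proto[i] for w, proto in zip(weights, prototypes))
        for i in range(len(text_emb))
    ]
    return text_emb + mixed, weights


# Toy inputs: a 2-d text embedding and two 2-d prototypes.
text_emb = [1.0, 0.0]
prototypes = [[1.0, 0.0], [0.0, 1.0]]
condition, weights = hybrid_condition(text_emb, prototypes)
```

In a real diffusion model, a vector like `condition` would be passed to the denoiser at every step; the sketch only shows how the two conditioning signals might be merged.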
Key Contributions
- Introduces a multi-agent framework for collecting templates, evaluating generated descriptions, and iteratively refining text descriptions.
- Builds a text-controlled generation framework that combines text embeddings with TimeDP-like semantic prototypes.
- Evaluates fidelity, text controllability, human preference, and downstream augmentation behavior.
- Reports that text and prototype conditioning both matter in ablations.
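The evaluate-and-refine loop from the first contribution can be sketched as a small control loop. This is a minimal sketch under stated assumptions: `refine_description`, the `evaluate` and `rewrite` callables, the score threshold, and the round limit are hypothetical names and parameters, and the toy functions stand in for LLM agents that the paper would use.

```python
from typing import Callable, List, Tuple

Evaluator = Callable[[List[float], str], Tuple[float, str]]
Rewriter = Callable[[str, str], str]


def refine_description(
    series: List[float],
    draft: str,
    evaluate: Evaluator,
    rewrite: Rewriter,
    max_rounds: int = 3,
    target_score: float = 0.9,
) -> Tuple[str, float]:
    """Keep rewriting a description while the evaluator's score improves."""
    best_text = draft
    best_score, feedback = evaluate(series, draft)
    for _ in range(max_rounds):
        if best_score >= target_score:
            break
        candidate = rewrite(best_text, feedback)
        score, new_feedback = evaluate(series, candidate)
        if score > best_score:  # accept only strict improvements
            best_text, best_score, feedback = candidate, score, new_feedback
    return best_text, best_score


def toy_evaluate(series: List[float], text: str) -> Tuple[float, str]:
    """Stand-in evaluator: checks that the text mentions trend and peak."""
    facts = {
        "trend": "upward" if series[-1] > series[0] else "downward",
        "peak": f"peak {max(series):g}",
    }
    missing = [key for key, phrase in facts.items() if phrase not in text]
    score = 1.0 - len(missing) / len(facts)
    feedback = ";".join(f"{key}={facts[key]}" for key in missing)
    return score, feedback


def toy_rewrite(text: str, feedback: str) -> str:
    """Stand-in refiner: appends the first missing fact from the feedback."""
    if not feedback:
        return text
    first_fact = feedback.split(";")[0].split("=", 1)[1]
    return f"{text} {first_fact}."


final_text, final_score = refine_description(
    [1, 2, 3, 5], "A daily series.", toy_evaluate, toy_rewrite
)
```

Accepting only strict improvements keeps the loop from drifting when a rewrite makes the description worse, which is one simple way to bound the effect of a noisy evaluator.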
Evidence And Results
The paper evaluates on 12 datasets and reports improved fidelity and controllability over TimeGAN, GT-GAN, TimeVAE, TimeVQVAE, and ablated BRIDGE variants. It uses J-FTSD and human evaluation to measure text-to-series alignment, and it reports downstream forecasting augmentation experiments.
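To make the alignment metric more concrete, the sketch below assumes J-FTSD follows the usual Fréchet-distance recipe applied to joint text and series embeddings. For Gaussians fitted to real and generated embeddings, the squared Fréchet distance is $\lVert \mu_r - \mu_g \rVert^2 + \operatorname{Tr}\big(\Sigma_r + \Sigma_g - 2(\Sigma_r \Sigma_g)^{1/2}\big)$. The code uses a diagonal-covariance simplification, for which the trace term reduces to $\sum_d (\sigma_{r,d} - \sigma_{g,d})^2$. The function names, the concatenation used for the joint embedding, and the diagonal simplification are assumptions, not the paper's implementation.

```python
import math
from statistics import fmean, pvariance
from typing import List, Sequence


def joint_embedding(series_emb: Sequence[float], text_emb: Sequence[float]) -> List[float]:
    """Assumed joint representation: concatenate series and text embeddings."""
    return list(series_emb) + list(text_emb)


def diagonal_frechet_distance(
    real: Sequence[Sequence[float]], generated: Sequence[Sequence[float]]
) -> float:
    """Squared Frechet distance between two diagonal Gaussians.

    Simplification: off-diagonal covariance is ignored, so each dimension
    contributes (mean gap)^2 + (std gap)^2.
    """
    total = 0.0
    for d in range(len(real[0])):
        real_col = [row[d] for row in real]
        gen_col = [row[d] for row in generated]
        mean_gap = fmean(real_col) - fmean(gen_col)
        std_gap = math.sqrt(pvariance(real_col)) - math.sqrt(pvariance(gen_col))
        total += mean_gap ** 2 + std_gap ** 2
    return total


# Toy embeddings: generated pairs are shifted by 1 in both dimensions.
real_pairs = [joint_embedding([0.0], [0.0]), joint_embedding([2.0], [2.0])]
gen_pairs = [joint_embedding([1.0], [1.0]), joint_embedding([3.0], [3.0])]
same_distance = diagonal_frechet_distance(real_pairs, real_pairs)
shifted_distance = diagonal_frechet_distance(real_pairs, gen_pairs)
```

A full-covariance version would need a matrix square root; the diagonal form is shown only because it stays within the standard library and makes the mean and spread terms easy to read.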
Limitations
- Text descriptions are generated and refined by an LLM pipeline, so generated language artifacts can become part of the data distribution.
- The text mostly describes morphology, statistics, and background; it does not necessarily capture operational context, exogenous-variable history, or action history.
- Human evaluation and J-FTSD measure text-to-series alignment, not necessarily downstream utility or causal validity.
- BRIDGE is text-conditioned synthetic generation, not action-conditioned world modeling.
Foundation TSFM Relevance
| Agenda slot | Verdict | Evidence | Missing pieces |
|---|---|---|---|
| Context interface | partially closes | Natural-language descriptions condition generated numeric time series. | Needs real operational context and systematic misinterpretation tests. |
| Time-series generation and editing | partially closes | Hybrid text/prototype diffusion generator supports controllable sample generation. | No constrained editing of observed histories and no action channel. |
| Benchmark hygiene | warning | Uses generated text labels, J-FTSD, and human ranking. | Needs audits for caption leakage, evaluator bias, and downstream utility. |