Time-Series Generation

Summary

Time-series generation is not one task. The current TimeCraft batch separates at least seven interfaces:

| Interface | Representative sources | Generated object | Main conditioning signal | Local interpretation |
|---|---|---|---|---|
| Cross-domain synthetic generation | TimeDP, TimeCraft | Fixed-window time-series samples | Few target-domain examples converted into prototype weights | Useful for low-resource synthetic data, but not action-conditioned. |
| Text-controlled generation | BRIDGE, T2S | Time-series samples | Natural-language descriptions, sometimes plus prototypes | A context interface for generation; evaluation must distinguish caption alignment from numeric utility. |
| Target-aware augmentation | TarDiff, OATS | Synthetic training samples | Downstream loss, influence scores, or valuable training samples | Shifts the objective from realism to downstream utility; needs leakage and overfitting audits. |
| Causal/interventional generation | CaTSG | Observational, interventional, and counterfactual samples | Causal conditions plus latent environment estimates | Closest TimeCraft branch to counterfactual modeling, but real-world counterfactual validation remains weak. |
| Irregular/continuous generation | Diff-MN | Continuous-time trajectories from irregular observations | Irregular observation context plus generated MoE-NCDE dynamics weights | Directly relevant to continuous latent-state modeling and arbitrary-time generation. |
| Forecast-generation via diffusion | MG-TSD, Sundial | Forecast sample paths | Numeric history plus denoising or flow objectives | Probabilistic forecasting, not unconditional synthetic data generation. |
| Financial market simulation | DiGA, MarS | Order-flow or market trajectories | Scenario targets, injected orders, matching rules, market state | World-model-adjacent because generated futures are used for what-if analysis and agent training. |

The important axis is the conditioning contract. Generators conditioned on text, examples, downstream gradients, causal interventions, irregular observations, or candidate orders solve different problems and should not be evaluated as if they solved the same one.
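One way to make the conditioning contract explicit is to record it next to each generator and refuse head-to-head comparison when the contracts differ. The sketch below is illustrative only: `Conditioning`, `GeneratorSpec`, and `comparable` are hypothetical names, and the contract assigned to each source is a simplified reading of the table above, not a description of any repository API.

```python
from dataclasses import dataclass
from enum import Enum


class Conditioning(Enum):
    # One member per conditioning signal named in the paragraph above.
    TEXT = "text"
    EXAMPLES = "examples"
    DOWNSTREAM_GRADIENTS = "downstream_gradients"
    CAUSAL_INTERVENTIONS = "causal_interventions"
    IRREGULAR_OBSERVATIONS = "irregular_observations"
    CANDIDATE_ORDERS = "candidate_orders"


@dataclass(frozen=True)
class GeneratorSpec:
    """Hypothetical record pairing a generator name with its conditioning contract."""
    name: str
    conditioning: frozenset  # frozenset of Conditioning members


def comparable(a: GeneratorSpec, b: GeneratorSpec) -> bool:
    """Treat two generators as comparable only when their contracts match exactly."""
    return a.conditioning == b.conditioning


# Simplified, assumed contracts for illustration.
bridge = GeneratorSpec("BRIDGE", frozenset({Conditioning.TEXT, Conditioning.EXAMPLES}))
timedp = GeneratorSpec("TimeDP", frozenset({Conditioning.EXAMPLES}))
catsg = GeneratorSpec("CaTSG", frozenset({Conditioning.CAUSAL_INTERVENTIONS}))

for left, right in [(bridge, timedp), (timedp, catsg)]:
    print(left.name, right.name, comparable(left, right))
```

Exact set equality is a deliberately strict choice; a looser rule (for example, shared subset of signals) would be a different design with its own trade-offs.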

TimeCraft Lineage

TimeCraft is best read as a Microsoft Research framework and repository that packages several related generation lines:

  • TimeDP supplies the prototype/domain-prompt branch.
  • BRIDGE adds text-to-series data preparation and hybrid text/prototype conditioning.
  • TarDiff adds task-aware diffusion guidance through influence functions.
  • CaTSG adds observational, interventional, and counterfactual time-series generation.
  • OATS makes synthetic generation part of the TSFM training loop.
  • Diff-MN targets irregular-to-continuous generation through diffusion-parameterized MoE-NCDE dynamics.

That lineage matters because it moves from "generate realistic samples" toward "generate samples for a purpose": match a target domain, satisfy a text description, improve a downstream model, respect causal interventions, support TSFM pretraining, or produce a continuous trajectory.

Evaluation Boundary

Generation papers often report MMD, KL, discriminative score, predictive score, J-FTSD, human preference, downstream AUROC/AUPRC, or trading-agent utility. These metrics answer different questions:

  • Fidelity metrics test whether generated samples resemble a reference distribution.
  • Text-alignment and human-ranking metrics test whether generated samples match a condition.
  • Downstream utility metrics test whether synthetic samples improve another model.
  • Causal metrics test interventional or counterfactual behavior, but real-world counterfactual labels are usually absent.
  • Market-simulation metrics test stylized facts, market impact, and agent-training usefulness.
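To make the fidelity category concrete, here is a minimal NumPy sketch of a biased squared-MMD estimate with an RBF kernel over flattened windows. It is not the evaluation code of any paper listed here; the kernel choice, the `bandwidth` value, and the function names are assumptions for illustration.

```python
import numpy as np


def rbf_kernel(x, y, bandwidth):
    """RBF kernel matrix between rows of x and rows of y."""
    # Pairwise squared Euclidean distances, clipped at zero for numerical safety.
    sq = np.sum(x**2, axis=1)[:, None] + np.sum(y**2, axis=1)[None, :] - 2.0 * (x @ y.T)
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))


def mmd2_biased(real, fake, bandwidth=1.0):
    """Biased (V-statistic) estimate of squared MMD.

    real, fake: arrays of shape (n_samples, window_length).
    """
    k_rr = rbf_kernel(real, real, bandwidth)
    k_ff = rbf_kernel(fake, fake, bandwidth)
    k_rf = rbf_kernel(real, fake, bandwidth)
    return float(k_rr.mean() + k_ff.mean() - 2.0 * k_rf.mean())


# Toy usage with synthetic Gaussian windows (not real benchmark data).
rng = np.random.default_rng(0)
real = rng.normal(size=(200, 24))
same = rng.normal(size=(200, 24))
shifted = rng.normal(loc=1.0, size=(200, 24))
print("same distribution:   ", mmd2_biased(real, same, bandwidth=5.0))
print("shifted distribution:", mmd2_biased(real, shifted, bandwidth=5.0))
```

A low value here only says the two sample sets look alike under this kernel. It says nothing about condition alignment, downstream utility, or counterfactual validity, which is why the categories above need separate metrics.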

For this wiki, a time-series generator becomes world-model-relevant only when the generated future remains conditioned on state, context, and explicit actions, control inputs, interventions, or candidate orders. Most TimeCraft branches are still passive or condition-controlled generators rather than full action-conditioned world models.
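A rough way to probe whether a generator "remains conditioned" on actions is to hold state and context fixed, vary only the action, and measure how much the rollouts move. The sketch below assumes a hypothetical `generate(state, context, action, rng)` interface and two toy generators; none of these names come from TimeCraft or the cited papers.

```python
import numpy as np


def action_sensitivity(generate, state, context, action_a, action_b, n_samples=64, seed=0):
    """Mean absolute gap between average rollouts under two actions.

    State and context are held fixed. Both runs reuse the same seed
    (common random numbers) so that noise does not mask the comparison.
    """
    rng_a = np.random.default_rng(seed)
    rng_b = np.random.default_rng(seed)
    paths_a = np.stack([generate(state, context, action_a, rng_a) for _ in range(n_samples)])
    paths_b = np.stack([generate(state, context, action_b, rng_b) for _ in range(n_samples)])
    return float(np.mean(np.abs(paths_a.mean(axis=0) - paths_b.mean(axis=0))))


def passive_generator(state, context, action, rng, horizon=16):
    # Toy passive generator: a random walk from the last state that ignores the action.
    return state[-1] + np.cumsum(rng.normal(scale=0.1, size=horizon))


def controlled_generator(state, context, action, rng, horizon=16):
    # Toy action-conditioned generator: the action acts as a per-step drift.
    return state[-1] + np.cumsum(action + rng.normal(scale=0.1, size=horizon))


state = np.array([0.0, 0.1, 0.2])
context = "toy context"
print("passive:   ", action_sensitivity(passive_generator, state, context, 0.0, 1.0))
print("controlled:", action_sensitivity(controlled_generator, state, context, 0.0, 1.0))
```

A nonzero sensitivity is necessary but not sufficient: it shows the output depends on the action, not that the dependence is correct.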

Irregular and continuous generation should report whether irregularity is naturally observed or simulated by dropping points from regular series. Diff-MN tests random dropping at several observation rates, so transfer to real sampling policies remains open.
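For reference, simulated irregularity by random dropping can be sketched as below. This is a generic illustration, not Diff-MN's preprocessing code; the function name and the observation rates in the usage loop are placeholder assumptions.

```python
import numpy as np


def drop_randomly(series, observation_rate, rng):
    """Keep each time step independently with probability observation_rate.

    Returns the kept timestamps (integer indices) and their values.
    """
    if not 0.0 < observation_rate <= 1.0:
        raise ValueError("observation_rate must be in (0, 1]")
    timestamps = np.arange(len(series))
    keep = rng.random(len(series)) < observation_rate
    return timestamps[keep], series[keep]


# Toy regular series; the rates below are illustrative placeholders.
rng = np.random.default_rng(0)
series = np.sin(np.linspace(0.0, 6.0, 100))
for rate in (0.3, 0.5, 0.7):
    t_obs, x_obs = drop_randomly(series, rate, rng)
    print(rate, len(t_obs))
```

Independent per-step dropping produces gaps that are unrelated to the signal. Real sampling policies (event-triggered sensors, clinically motivated measurements) often correlate missingness with the values themselves, which is the transfer gap noted above.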

Bias and spurious-correlation metrics should be separated from fidelity metrics. InvDiff is mostly text-to-image evidence, with a limited AusElec/TimeGrad OOD forecasting experiment; use it as a shortcut-auditing pattern rather than broad time-series generation evidence.

Relation To Foundation TSFM Agenda

Time-series generation maps most directly to the generation/editing, context interface, dense numeric fidelity, causal/counterfactual, and benchmark slots in the Foundation Time-Series Model Research Agenda. The TimeCraft batch strengthens the generation/editing branch, but it also shows why the agenda must separate observational generation, text-controlled generation, utility-guided augmentation, and intervention-aware rollout.

Open Questions

  • Which synthetic time-series generators improve downstream models under strict train/validation/test separation rather than by tuning to the evaluation set?
  • Can text-controlled generation use operationally meaningful context such as incidents, exogenous variables, and constraints rather than only morphology captions?
  • Can CaTSG-style causal generation scale beyond predefined SCMs and synthetic counterfactual labels?
  • Can Diff-MN-style continuous generation become a reusable latent-state interface for irregular clinical, industrial, or observability data?
  • Which generation metrics predict utility for forecasting, anomaly detection, representation learning, and action-conditioned planning?
  • For market simulators, which combination of stylized-fact fidelity, scenario-control error, market-impact validity, and downstream trading-agent transfer predicts real utility?
  • How should time-series generators define invariant temporal features so debiasing removes shortcut dependence without erasing rare regimes or meaningful domain shifts?