Synthetic Data For Time Series
Summary
Synthetic data is used in this corpus for several different jobs: scaling pretraining volume, creating label coverage for classification, simulating causal or template structure, teaching covariate and grouped-forecasting behavior, fitting learned inference priors, bootstrapping annotation layers over real data, and aligning models to reasoning or generation tasks. These are related, but they are not one method.
What The Wiki Currently Believes
Data-Volume Scaling
TimesFM, Timer, MOMENT, Sundial, Kairos, TiRex, Tiny Time Mixers, and Reverso all make pretraining-corpus design central to zero-shot or few-shot transfer. In this use, synthetic data usually fills underrepresented frequencies, seasonalities, irregularities, spikes, or long-horizon regimes that are sparse in public real data.
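As a rough illustration of the gap-filling role, the sketch below builds additive toy series (trend, sinusoidal seasonality, rare spikes, noise) at several seasonal periods. It is not the generator of any model named above; the function name `synth_series`, its parameters, and the chosen periods are assumptions made for illustration.

```python
import math
import random

def synth_series(n, period, trend_slope=0.0, season_amp=1.0,
                 spike_prob=0.0, spike_scale=5.0, noise_std=0.1, seed=0):
    """Additive toy series: linear trend + sinusoidal seasonality + rare spikes + noise."""
    rng = random.Random(seed)
    out = []
    for t in range(n):
        value = trend_slope * t
        value += season_amp * math.sin(2.0 * math.pi * t / period)
        if rng.random() < spike_prob:
            # Rare spike with a random sign.
            value += spike_scale * rng.choice((-1.0, 1.0))
        value += rng.gauss(0.0, noise_std)
        out.append(value)
    return out

# Cover several seasonal periods that may be sparse in a real corpus
# (the periods 7, 24, 96 are illustrative, not taken from any paper).
filler = {p: synth_series(n=256, period=p, spike_prob=0.02, seed=p) for p in (7, 24, 96)}
```

In a mixed corpus, series like these would typically be sampled at a small, explicit ratio so that the synthetic share can be reported alongside the real data.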
For Sundial, the synthetic component is only 0.05% of TimeBench and is described as pattern-diversity support; the main claim rests on the scale of a trillion-point, mixed real-world corpus, not on synthetic-primary pretraining.
Label And Classification Coverage
CauKer uses synthetic causally coherent time series for classification TSFM pretraining, and MantisV2 uses CauKer-style synthetic classification data plus test-time strategies to close zero-shot gaps. This is a label-coverage story: the synthetic process supplies many labeled classification tasks that real archives cannot provide at the same scale.
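The label-coverage idea can be sketched with a toy generator in which the class label is a parameter of the generating process, so every sampled series arrives labeled for free. This is not CauKer's generator; the sinusoid families, the function name `make_classification_task`, and all parameter values are assumptions.

```python
import math
import random

def make_classification_task(n_classes=3, per_class=20, length=64, seed=0):
    """Return (series_list, labels); each class is a sinusoid family with its own frequency."""
    rng = random.Random(seed)
    series, labels = [], []
    for label in range(n_classes):
        cycles = label + 1  # class-defining parameter: full cycles inside the window
        for _ in range(per_class):
            # Nuisance parameters vary within a class and carry no label information.
            phase = rng.uniform(0.0, 2.0 * math.pi)
            amp = rng.uniform(0.5, 1.5)
            x = [amp * math.sin(2.0 * math.pi * cycles * t / length + phase)
                 + rng.gauss(0.0, 0.1) for t in range(length)]
            series.append(x)
            labels.append(label)
    return series, labels

X, y = make_classification_task()
```

Calling the function with different seeds and class definitions yields many distinct labeled tasks, which is the property the paragraph above relies on.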
Iterative Label Bootstrapping
Florence-2 is not a time-series paper, but it is an important data-engine analogy. It pairs real observations with generated annotations: start with a first annotation pass, train a model, use the model to improve and extend the annotations, filter the results, and repeat. For time series, this pattern targets the labeled-dataset bottleneck directly: real multivariate time series and event streams may be abundant, while labels for regimes, anomalies, events, or temporal segments are scarce.
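The annotate, train, extend, filter, repeat loop can be sketched in a few lines. The version below is a deliberately small stand-in, not the Florence-2 data engine: each window is reduced to one scalar feature (assumed here to be something like a window mean), the "model" is a nearest-centroid classifier, and the filter is a distance margin. All names and thresholds are hypothetical.

```python
def fit_centroids(features, labels):
    """Train step: one centroid per label over a scalar feature."""
    sums, counts = {}, {}
    for f, y in zip(features, labels):
        sums[y] = sums.get(y, 0.0) + f
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def predict_with_margin(centroids, f):
    """Return (nearest label, gap between second-nearest and nearest distance)."""
    ranked = sorted(centroids, key=lambda y: abs(f - centroids[y]))
    best = ranked[0]
    if len(ranked) == 1:
        return best, float("inf")
    margin = abs(f - centroids[ranked[1]]) - abs(f - centroids[best])
    return best, margin

def bootstrap_labels(features, seed_labels, min_margin=1.0, rounds=3):
    """seed_labels: dict index -> label from the first annotation pass."""
    labels = dict(seed_labels)
    for _ in range(rounds):
        idx = sorted(labels)
        centroids = fit_centroids([features[i] for i in idx], [labels[i] for i in idx])
        added = 0
        for i, f in enumerate(features):
            if i in labels:
                continue
            y, margin = predict_with_margin(centroids, f)
            if margin >= min_margin:  # filter: keep only confident annotations
                labels[i] = y
                added += 1
        if added == 0:  # nothing new passed the filter, so stop early
            break
    return labels

features = [0.0, 0.2, 0.4, 5.0, 5.2, 4.8, 2.5]
result = bootstrap_labels(features, {0: "low", 3: "high"})
```

Here the ambiguous window at index 6 never passes the margin filter and stays unlabeled, which is the intended behavior: the filter trades coverage for label quality.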
Causal And Template Generation
Causal or template generators appear when a paper wants controllable structure rather than only more samples. CauKer composes Gaussian-process kernels and structural causal mechanisms; TempoPFN combines ForecastPFN-style components, KernelSynth, CauKer-style causal structure, spike and regime-switching generators, and augmentation cascades; Reverso uses Gaussian-process, spike, trapezoidal, trend, seasonality, and irregularity sequences.
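A minimal sketch of the two ingredients named for CauKer, kernel composition and a structural causal mechanism, follows. It is an illustration under stated assumptions rather than the published generator: the specific kernels, length scales, the `tanh` mechanism, and the pure-Python Cholesky sampler are all choices made here to keep the example dependency-free.

```python
import math
import random

def rbf(t1, t2, length_scale=10.0):
    return math.exp(-0.5 * ((t1 - t2) / length_scale) ** 2)

def periodic(t1, t2, period=12.0, length_scale=1.0):
    return math.exp(-2.0 * math.sin(math.pi * abs(t1 - t2) / period) ** 2 / length_scale ** 2)

def composed_kernel(t1, t2):
    # Sums and products of valid kernels are again valid kernels.
    return rbf(t1, t2) + rbf(t1, t2, length_scale=30.0) * periodic(t1, t2)

def cholesky(a):
    """Lower-triangular factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low

def sample_gp(n, kernel, rng, jitter=1e-5):
    """Draw one zero-mean GP path on the integer grid 0..n-1."""
    # Jitter on the diagonal keeps the covariance numerically positive definite.
    cov = [[kernel(i, j) + (jitter if i == j else 0.0) for j in range(n)] for i in range(n)]
    low = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return [sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(n)]

def sample_cause_effect(n=48, seed=0):
    rng = random.Random(seed)
    parent = sample_gp(n, composed_kernel, rng)
    # Structural mechanism: child = nonlinear function of parent + independent noise.
    child = [math.tanh(p) + rng.gauss(0.0, 0.05) for p in parent]
    return parent, child

parent, child = sample_cause_effect()
```

The controllable part is explicit: changing `composed_kernel` changes the temporal structure of the root variable, and changing the mechanism changes how that structure propagates to its descendant.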
Covariate And Grouped Forecasting Behavior
Chronos-2 uses synthetic multivariate and covariate-informed examples to teach grouped forecasting behavior across related series, targets, past-only covariates, known future covariates, and categorical covariates. This is different from pure univariate data scaling because the synthetic data must teach how variables relate inside a forecasting context.
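To make the distinction between covariate roles concrete, the sketch below builds one synthetic forecasting group in which related targets share a latent factor and depend on a known future covariate, a past-only covariate, and a categorical covariate. It is not Chronos-2's generator; the dictionary keys, the `promo` and `sensor` variables, and every coefficient are hypothetical.

```python
import math
import random

def make_group(n_series=3, context=48, horizon=12, seed=0):
    rng = random.Random(seed)
    total = context + horizon
    # Known future covariate: a calendar-like flag available over context and horizon.
    promo = [1.0 if t % 7 == 0 else 0.0 for t in range(total)]
    # Past-only covariate: observed only inside the context window.
    sensor = [rng.gauss(0.0, 1.0) for _ in range(context)]
    # Shared latent factor that couples the related targets.
    latent = [math.sin(2.0 * math.pi * t / 24.0) for t in range(total)]
    level_by_category = {"a": 0.0, "b": 2.0, "c": 4.0}
    categories, targets = [], []
    for _ in range(n_series):
        cat = rng.choice(sorted(level_by_category))
        loading = rng.uniform(0.5, 1.5)
        lift = rng.uniform(1.0, 3.0)
        y = []
        for t in range(total):
            # The past-only covariate acts with a one-step lag while it is observed.
            lagged = 0.5 * sensor[t - 1] if 1 <= t <= context else 0.0
            y.append(level_by_category[cat] + loading * latent[t]
                     + lift * promo[t] + lagged + rng.gauss(0.0, 0.1))
        categories.append(cat)
        targets.append(y)
    return {
        "past_targets": [y[:context] for y in targets],
        "future_targets": [y[context:] for y in targets],
        "known_future_covariate": promo,
        "past_only_covariate": sensor,
        "categorical_covariate": categories,
    }

group = make_group()
```

A model trained on groups like this can only forecast well by reading the covariates and the sibling series, which is what separates this use from univariate data scaling.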
PFN-Style Learned Inference Priors
TabPFN-v2, TabPFN-3, and TabICL are static tabular-data references, but they are important because they learn in-context inference from synthetic structural-causal tabular tasks. TabPFN-3 also reports TabPFN-TS-3, a specialized time-series checkpoint trained through the TabPFN ecosystem and evaluated on fev-bench. TempoPFN is the local open time-series analogue: it asks whether a learned inference prior trained only on synthetic temporal generators can become a zero-shot forecaster. See Tabular Foundation Models for the static-tabular side.
Observability And Benchmark Hygiene
Toto 2.0 trains on observability and synthetic time-series data while excluding public forecasting datasets during pretraining. That makes it a useful synthetic-data reference even though the source is an announcement article: it explicitly connects synthetic data to benchmark-leakage control, scaling, and observability-domain coverage.
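One simple way to enforce this kind of exclusion is to filter the pretraining corpus by dataset name and by an exact-content fingerprint. The sketch below is an assumption about how such a filter could look, not a description of Toto's pipeline; the record schema (`dataset`, `values`) and both function names are hypothetical.

```python
import hashlib

def series_fingerprint(values, decimals=6):
    """Hash of the rounded values; detects exact copies only."""
    text = ",".join(f"{v:.{decimals}f}" for v in values)
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def filter_pretraining(corpus, benchmark_names, benchmark_series):
    """corpus: list of dicts with 'dataset' and 'values' keys (hypothetical schema)."""
    blocked_names = {name.lower() for name in benchmark_names}
    blocked_hashes = {series_fingerprint(s) for s in benchmark_series}
    kept = []
    for item in corpus:
        if item["dataset"].lower() in blocked_names:
            continue  # excluded by dataset name
        if series_fingerprint(item["values"]) in blocked_hashes:
            continue  # excluded because the values match a benchmark series
        kept.append(item)
    return kept

corpus = [
    {"dataset": "internal_metrics", "values": [1.0, 2.0, 3.0]},
    {"dataset": "Public_Bench", "values": [9.0, 9.0]},
    {"dataset": "mirror_copy", "values": [4.0, 5.0, 6.0]},
]
kept = filter_pretraining(corpus, ["public_bench"], [[4.0, 5.0, 6.0]])
```

Exact-match hashing misses rescaled, resampled, or windowed copies of benchmark series, so a filter like this is a lower bound on leakage control rather than a guarantee.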
Reasoning And Alignment Data
ChatTS uses synthetic time-series attributes and Q&A generation for time-series/LLM alignment. TimeOmni-1 combines curated reasoning samples with TSR-Suite-style reasoning tasks, and TimeOmni-VL builds time-series understanding and generation data around TS-image representations and CoT-conditioned generation. In this use, the bottleneck is not only numeric realism; it is whether annotations and prompts represent the reasoning behavior the model should learn.
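Attribute-driven Q&A generation can be sketched as follows: sample the attributes first, render a series from them, then template questions whose answers are read directly from the attributes. This is a toy in the spirit of the description above, not ChatTS's pipeline; the attribute set, the question templates, and the name `make_qa_example` are assumptions.

```python
import random

def make_qa_example(length=64, seed=0):
    rng = random.Random(seed)
    # Attributes are sampled first, so the answers are known by construction.
    attrs = {
        "trend": rng.choice(["increasing", "decreasing", "flat"]),
        "spike_index": rng.randrange(5, length - 5),
    }
    slope = {"increasing": 0.1, "decreasing": -0.1, "flat": 0.0}[attrs["trend"]]
    values = [slope * t + rng.gauss(0.0, 0.05) for t in range(length)]
    values[attrs["spike_index"]] += 10.0  # inject one large upward spike
    qa = [
        ("What is the overall trend?", attrs["trend"]),
        ("At which index does the largest spike occur?", str(attrs["spike_index"])),
    ]
    return values, attrs, qa

values, attrs, qa = make_qa_example()
```

Because answers come from the generator rather than from a labeler, their correctness depends entirely on whether the rendered series actually expresses the sampled attributes, which is the audit this kind of data needs.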
Natural language guidance of high-fidelity TTS is an audio source rather than a time-series forecasting source, but it is a useful synthetic-annotation pattern: derive structured labels from real temporal data, convert them into natural-language descriptions, and use a small high-fidelity slice to steer generation quality.
T2S is the time-series counterpart to that pattern. It segments real time series into fragments, generates natural-language captions for local morphology, filters candidate captions with embedding similarity, and trains a text-to-series diffusion model on the resulting TSFragment-600K dataset. This is not synthetic data in the “simulate from a prior” sense; it is a synthetic annotation and conditional-generation loop over real temporal fragments.
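The first two steps of that loop, segmentation and local-morphology captioning, can be sketched as below. A rule on net change stands in for the caption model, and the embedding-similarity filter and diffusion training are omitted; none of this reflects T2S's actual implementation, and the function names and tolerance are hypothetical.

```python
def segment(values, fragment_len):
    """Split a series into non-overlapping fragments, dropping any short remainder."""
    return [values[i:i + fragment_len]
            for i in range(0, len(values) - fragment_len + 1, fragment_len)]

def caption_fragment(fragment, flat_tol=0.05):
    """Rule-based stand-in for a caption model: describe net change over the fragment."""
    change = fragment[-1] - fragment[0]
    if abs(change) <= flat_tol:
        return "stays roughly flat"
    return "rises" if change > 0 else "falls"

def build_pairs(values, fragment_len=4):
    """Return (caption, fragment) pairs: real values, generated text."""
    return [(caption_fragment(f), f) for f in segment(values, fragment_len)]

series = [0.0, 1.0, 2.0, 3.0, 3.0, 3.0, 3.0, 3.0, 3.0, 2.0, 1.0, 0.0, 5.0]
pairs = build_pairs(series)
```

The numeric fragments remain real observations throughout; only the text side is generated, which is why this counts as synthetic annotation rather than simulation from a prior.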
Evidence
Synthetic data recurs across the corpus, but each use responds to a different bottleneck. Data-volume scaling, label generation, iterative label bootstrapping, causal/template coverage, covariate behavior, PFN-style inference priors, language alignment, and reasoning supervision therefore each need their own audit.
Risks And Caveats
- Synthetic templates can create unrealistic coupling, overly clean seasonality, or artifacts that a model memorizes as shortcuts.
- Model-generated annotation loops can amplify seed-model mistakes, especially for rare regimes or minority event classes.
- Text-to-series generators can learn caption artifacts or generic morphology words rather than operationally meaningful regimes, events, or interventions.
- Pretraining corpora can leak public benchmark train or test structure even when the paper labels an evaluation as zero-shot.
- Covariates in synthetic forecasting data are usually exogenous variables or known future features; they should not be described as actions, control inputs, or interventions unless the generator and evaluation actually encode controllable decisions.
- Synthetic causal structure is not enough by itself for counterfactual validity on real systems with confounding, delayed effects, missingness, or policy-driven interventions.
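As one example of what an artifact audit might look like, the sketch below flags series whose autocorrelation at the seasonal lag is suspiciously high, a crude proxy for the overly clean seasonality named in the first risk. The threshold of 0.9 and the function names are assumptions, and the check addresses only that one risk.

```python
import math
import random

def autocorr(values, lag):
    """Biased sample autocorrelation at the given lag (normalized by total variance)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values)
    if var == 0.0:
        return 0.0
    cov = sum((values[t] - mean) * (values[t + lag] - mean) for t in range(n - lag))
    return cov / var

def too_clean(values, period, threshold=0.9):
    """Flag a series whose seasonal-lag autocorrelation meets the threshold."""
    return autocorr(values, period) >= threshold

rng = random.Random(0)
clean = [math.sin(2.0 * math.pi * t / 12.0) for t in range(240)]
noisy = [v + rng.gauss(0.0, 1.0) for v in clean]
flags = (too_clean(clean, 12), too_clean(noisy, 12))
```

Comparing the distribution of this statistic between synthetic and real series from the target domain is likely more informative than any fixed threshold.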
Relation To Foundation TSFM Agenda
Synthetic data maps to several slots in the Foundation Time-Series Model Research Agenda, but mostly as support rather than direct closure. It can improve data diversity, rare-regime coverage, context/generation alignment, and causal-template coverage. It becomes a warning when synthetic artifacts, decorative captions, benchmark leakage, or exogenous covariates are mistaken for real state, context, or controllable interventions.
Open Questions
- Which synthetic-generation assumptions survive transfer to real-world temporal domains?
- How should synthetic data be audited for causal and numerical artifacts?
- When do synthetic covariates remain exogenous variables, and when should they be modeled as actions, control inputs, or interventions?
- How should benchmark reports separate synthetic-only pretraining, mixed real/synthetic pretraining, fine-tuning, and ensemble entries?
- Which generator families best transfer to high-cardinality observability metrics and event streams?
- Which temporal labels should be bootstrapped from real data instead of generated by a simulator?
- When should text-to-series generation be evaluated by downstream utility rather than only reconstruction, retrieval, or caption-alignment metrics?