TimeOmni-1: Incentivizing Complex Reasoning With Time Series In Large Language Models

Source

Core Claim

TimeOmni-1 argues that time-series models need explicit reasoning tasks and introduces TSR-Suite plus a unified reasoning model for perception, extrapolation, and decision-making.

Key Contributions

  • Formalizes four atomic tasks spanning scenario understanding, causality discovery, event-aware forecasting, and decision-making.
  • Builds TSR-Suite with more than 23K samples and 2.3K human-guided annotations.
  • Trains TimeOmni-1 with scenario mixtures, reward functions, and task-specific optimizations.
  • Reports strong out-of-distribution generalization and causality-discovery gains.
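The reward functions mentioned above are not specified in this note. As a purely illustrative sketch, one common pattern for rewarding reasoning models combines a format check with an answer-correctness check. Everything here is an assumption rather than the paper's method: the `<answer>` tag convention, the `format_weight` value, and the function names are all hypothetical.

```python
import re
from typing import Optional

# Hypothetical convention: the final answer is wrapped in <answer>...</answer>.
ANSWER_PATTERN = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)


def extract_answer(response: str) -> Optional[str]:
    """Return the text inside the first <answer> tag, or None if absent."""
    match = ANSWER_PATTERN.search(response)
    if match is None:
        return None
    return match.group(1).strip()


def composite_reward(response: str, gold: str, format_weight: float = 0.2) -> float:
    """Illustrative reward: partial credit for a parseable answer, full credit if correct.

    This is an assumed design, not the reward used to train TimeOmni-1.
    """
    answer = extract_answer(response)
    if answer is None:
        # Unparseable responses earn nothing.
        return 0.0
    correct = 1.0 if answer.lower() == gold.strip().lower() else 0.0
    return format_weight + (1.0 - format_weight) * correct
```

Under this sketch, a well-formed but wrong answer still earns `format_weight`, which is one way a valid-response rate can rise independently of accuracy.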

Method Notes

TimeOmni-1 is central to Time-Series Foundation Models, Causal Time Series, and Synthetic Data For Time Series.

Evidence And Results

The abstract reports causality-discovery accuracy of 64.0% versus 35.9% for GPT-4.1 (a gap of 28.1 percentage points) and more than 6% valid-response-rate improvement on event-aware forecasting.

Limitations

The approach is reasoning-centered. It should be compared with the forecasting-centered Eidos and the high-fidelity, generation-centered TimeOmni-VL.

Foundation TSFM Relevance

| Agenda slot | Verdict | Evidence | Missing pieces |
| --- | --- | --- | --- |
| Time-series reasoning interface | partially closes | TSR-Suite covers scenario understanding, causality discovery, event-aware forecasting, and decision-making with human-guided CoT annotations. | QA-style reasoning does not guarantee calibrated numeric forecasts or dense state representations. |
| Causal structure | partially closes | The raw paper reports causal-discovery tasks and a 64.0% causality score versus 35.9% for GPT-4.1. | Needs grounding in multivariate intervention data rather than answer selection alone. |
| Control utility | adjacent | The decision-making task uses environments such as CityLearn-style charge/discharge choices. | It is not a full action-conditioned world model with rollout likelihoods under candidate controls. |
| Benchmark validity | warning | Rewarded reasoning and valid-response metrics can improve answer format without proving robust temporal dynamics. | Needs held-out numerical rollouts and failure analysis by task type. |
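The benchmark-validity caveat above can be made concrete with a small sketch that scores validity and correctness separately. The data below is a toy example invented for illustration, not results from the paper, and the function names are hypothetical; a `None` entry stands for a response from which no answer could be parsed.

```python
from typing import Optional, Sequence


def valid_response_rate(parsed: Sequence[Optional[str]]) -> float:
    """Fraction of responses from which an answer could be parsed."""
    if not parsed:
        raise ValueError("parsed must be non-empty")
    return sum(p is not None for p in parsed) / len(parsed)


def accuracy(parsed: Sequence[Optional[str]], gold: Sequence[str]) -> float:
    """Fraction of responses whose parsed answer matches the gold label."""
    if len(parsed) != len(gold) or not gold:
        raise ValueError("parsed and gold must be non-empty and equal length")
    return sum(p == g for p, g in zip(parsed, gold)) / len(gold)


# Toy data (assumed): the "after" model always emits a parseable answer,
# but its newly valid answers are wrong.
gold = ["A", "B", "C", "B"]
before = ["A", None, None, "B"]
after = ["A", "C", "D", "B"]

before_scores = (valid_response_rate(before), accuracy(before, gold))
after_scores = (valid_response_rate(after), accuracy(after, gold))
```

In this toy case the valid-response rate doubles while accuracy stays flat, which is why reporting both numbers per task type is more informative than either alone.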

Open Questions

  • Which tasks in TSR-Suite genuinely require reasoning rather than pattern matching?
  • Can TimeOmni-1-style reasoning improve numerical forecasting fidelity?