TimeOmni-1: Incentivizing Complex Reasoning With Time Series In Large Language Models
Source
- Raw Markdown: paper_timeomni-1-2026.md
- PDF: paper_timeomni-1-2026.pdf
Core Claim
TimeOmni-1 argues that time-series models need explicit reasoning tasks and introduces TSR-Suite plus a unified reasoning model for perception, extrapolation, and decision-making.
Key Contributions
- Formalizes four atomic tasks spanning scenario understanding, causality discovery, event-aware forecasting, and decision-making.
- Builds TSR-Suite with more than 23K samples and 2.3K human-guided annotations.
- Trains TimeOmni-1 with scenario mixtures, reward functions, and task-specific optimizations.
- Reports strong out-of-distribution generalization and causality-discovery gains.
Method Notes
Within this wiki, TimeOmni-1 is a central reference for the Time-Series Foundation Models, Causal Time Series, and Synthetic Data For Time Series pages.
Evidence And Results
The abstract reports a causality-discovery accuracy of 64.0% for TimeOmni-1 versus 35.9% for GPT-4.1, and an improvement of more than 6% in valid response rate on event-aware forecasting.
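To put the two causality-discovery figures side by side, the sketch below computes the absolute gap in percentage points and the relative gain over the baseline. Only the two accuracies (64.0 and 35.9) come from the abstract; the helper name `accuracy_gap` is hypothetical and used here for illustration.

```python
def accuracy_gap(model_acc: float, baseline_acc: float) -> tuple[float, float]:
    """Return (absolute gap in percentage points, relative gain in percent).

    Hypothetical helper for illustration; not taken from the paper.
    """
    absolute = model_acc - baseline_acc
    relative = 100.0 * absolute / baseline_acc
    return absolute, relative


# Accuracies as reported in the abstract (in percent).
abs_gap, rel_gain = accuracy_gap(64.0, 35.9)
print(f"{abs_gap:.1f} points absolute, {rel_gain:.1f}% relative")
# prints "28.1 points absolute, 78.3% relative"
```

The absolute gap is the safer number to quote, since the relative gain depends heavily on how weak the baseline is.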
Limitations
The approach is reasoning-centered, so it should be compared against the forecasting-centered Eidos and the high-fidelity, generation-centered TimeOmni-VL.
Foundation TSFM Relevance
| Agenda slot | Verdict | Evidence | Missing pieces |
|---|---|---|---|
| Time-series reasoning interface | partially closes | TSR-Suite covers scenario understanding, causality discovery, event-aware forecasting, and decision-making with human-guided CoT annotations. | QA-style reasoning does not guarantee calibrated numeric forecasts or dense state representations. |
| Causal structure | partially closes | The paper includes causality-discovery tasks and reports 64.0% causality-discovery accuracy versus 35.9% for GPT-4.1. | Needs grounding in multivariate intervention data rather than answer selection alone. |
| Control utility | adjacent | The decision-making task uses environments such as CityLearn-style charge/discharge choices. | It is not a full action-conditioned world model with rollout likelihoods under candidate controls. |
| Benchmark validity | warning | Rewarded reasoning and valid-response metrics can improve answer format without proving robust temporal dynamics. | Needs held-out numerical rollouts and failure analysis by task type. |
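The benchmark-validity warning in the table can be made concrete with a small sketch. It assumes a simple setup that is not taken from the paper: each response either parses into an answer or is marked `None`, the valid-response rate is the share of parseable responses, and accuracy counts unparseable responses as wrong. The function names and the toy data are hypothetical.

```python
from typing import Optional, Sequence


def valid_response_rate(preds: Sequence[Optional[str]]) -> float:
    """Share of responses that parsed into an answer (None = unparseable)."""
    return sum(p is not None for p in preds) / len(preds)


def accuracy(preds: Sequence[Optional[str]], labels: Sequence[str]) -> float:
    """Share of all samples answered correctly; unparseable responses count as wrong."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Toy data, invented for illustration only.
labels = ["A", "B", "C", "D"]
before = ["A", None, None, "D"]  # two unparseable responses
after = ["A", "C", "B", "D"]     # all parseable, same two correct

for name, preds in [("before", before), ("after", after)]:
    print(name, valid_response_rate(preds), accuracy(preds, labels))
```

In this toy case the valid-response rate rises from 0.5 to 1.0 while accuracy stays at 0.5, which is why the two metrics are best reported together and broken down by task type.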
Links Into The Wiki
- TimeOmni-1
- Foundation Time-Series Model Research Agenda
- Time-Series Foundation Models
- Causal Time Series
- Synthetic Data For Time Series
Open Questions
- Which tasks in TSR-Suite genuinely require reasoning rather than pattern matching?
- Can TimeOmni-1-style reasoning improve numerical forecasting fidelity?