TimeOmni-1: Incentivizing Complex Reasoning With Time Series In Large Language Models

Source

Raw Markdown: paper_timeomni-1-2026.md
PDF: paper_timeomni-1-2026.pdf

Core Claim

The paper argues that time-series models need explicit reasoning tasks. It introduces TSR-Suite plus TimeOmni-1, a unified reasoning model for perception, extrapolation, and decision-making.

Key Contributions

Formalizes four atomic tasks spanning scenario understanding, causality discovery, event-aware forecasting, and decision-making.
Builds TSR-Suite with more than 23K samples and 2.3K human-guided annotations.
Trains TimeOmni-1 with scenario mixtures, reward functions, and task-specific optimizations.
Reports strong out-of-distribution generalization and causality-discovery gains.
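
The notes above mention reward functions without spelling them out. As a rough illustration only, the sketch below shows one common way rule-based rewards are built for answer-selection and forecasting tasks: a small bonus for a parseable answer plus a correctness or error-based term. The `<answer>` tag, the function names, the `format_bonus` value, and the `1 / (1 + MAE)` shaping are all assumptions made for this example and are not taken from the paper.

```python
import re
from typing import Optional, Sequence

# Hypothetical output convention: the final answer is wrapped in <answer> tags.
ANSWER_RE = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)


def extract_answer(response: str) -> Optional[str]:
    """Return the text inside the first <answer> tag, or None if there is none."""
    match = ANSWER_RE.search(response)
    return match.group(1).strip() if match else None


def choice_reward(response: str, gold: str, format_bonus: float = 0.1) -> float:
    """Multiple-choice reward: format bonus if parseable, plus 1.0 if correct."""
    answer = extract_answer(response)
    if answer is None:
        return 0.0  # unparseable responses earn nothing
    correct = answer.upper() == gold.upper()
    return format_bonus + (1.0 if correct else 0.0)


def forecast_reward(
    response: str, target: Sequence[float], format_bonus: float = 0.1
) -> float:
    """Forecast reward: format bonus plus a term that shrinks as MAE grows."""
    answer = extract_answer(response)
    if answer is None:
        return 0.0
    try:
        pred = [float(x) for x in answer.split(",")]
    except ValueError:
        return 0.0  # non-numeric content counts as an invalid response
    if len(pred) != len(target):
        return 0.0  # wrong horizon length counts as invalid
    mae = sum(abs(p - t) for p, t in zip(pred, target)) / len(target)
    return format_bonus + 1.0 / (1.0 + mae)
```

A design like this makes the tension flagged later in these notes concrete: the format term can be maximized without improving the correctness term, so the two are worth tracking separately.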

Method Notes

TimeOmni-1 is central to Time-Series Foundation Models, Causal Time Series, and Synthetic Data For Time Series.

Evidence And Results

The abstract reports a causality-discovery accuracy of 64.0% for TimeOmni-1 versus 35.9% for GPT-4.1, and an improvement of more than 6% in valid response rate on event-aware forecasting.
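
To put the causality-discovery numbers in perspective, the short calculation below turns the two reported accuracies into an absolute gap (in percentage points) and a relative gain over the baseline. Only the 64.0 and 35.9 figures come from the notes above; the helper name is made up for this example. The valid-response-rate figure is left out because the summary does not say whether "more than 6%" is absolute or relative.

```python
from typing import Tuple

# Causality-discovery accuracies (%) as quoted from the abstract.
TIMEOMNI_ACC = 64.0
GPT41_ACC = 35.9


def accuracy_gap(model_acc: float, baseline_acc: float) -> Tuple[float, float]:
    """Return (absolute gap in percentage points, relative gain in percent)."""
    abs_gap = model_acc - baseline_acc
    rel_gain = 100.0 * abs_gap / baseline_acc
    return abs_gap, rel_gain


abs_gap, rel_gain = accuracy_gap(TIMEOMNI_ACC, GPT41_ACC)
print(f"absolute gap: {abs_gap:.1f} points, relative gain: {rel_gain:.1f}%")
```

The gap works out to about 28.1 percentage points, or roughly a 78% relative gain over the GPT-4.1 figure.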

Limitations

The approach is reasoning-centered. It should be compared with forecasting-centered Eidos and high-fidelity generation-centered TimeOmni-VL.

Foundation TSFM Relevance

Agenda slot	Verdict	Evidence	Missing pieces
Time-series reasoning interface	partially closes	TSR-Suite covers scenario understanding, causality discovery, event-aware forecasting, and decision-making with human-guided CoT annotations.	QA-style reasoning does not guarantee calibrated numeric forecasts or dense state representations.
Causal structure	partially closes	The raw paper reports causal-discovery tasks and a causality-discovery accuracy of 64.0% versus 35.9% for GPT-4.1.	Needs grounding in multivariate intervention data rather than answer selection alone.
Control utility	adjacent	The decision-making task uses environments such as CityLearn-style charge/discharge choices.	It is not a full action-conditioned world model with rollout likelihoods under candidate controls.
Benchmark validity	warning	Rewarded reasoning and valid-response metrics can improve answer format without proving robust temporal dynamics.	Needs held-out numerical rollouts and failure analysis by task type.
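
The benchmark-validity row asks for failure analysis by task type. One minimal way to start, sketched below, is to report the valid-response rate separately from accuracy among valid responses for each task, so that a gain in answer formatting is not mistaken for a gain in correctness. The `EvalRecord` fields and the function name are hypothetical and do not correspond to any released TSR-Suite tooling.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class EvalRecord:
    """One evaluated response (hypothetical schema for this sketch)."""
    task: str      # e.g. "causality_discovery"
    valid: bool    # response could be parsed into an answer
    correct: bool  # parsed answer matched the reference


def breakdown_by_task(
    records: Iterable[EvalRecord],
) -> Dict[str, Dict[str, Optional[float]]]:
    """Per task: valid-response rate and accuracy among valid responses."""
    counts = defaultdict(lambda: {"n": 0, "valid": 0, "correct": 0})
    for rec in records:
        c = counts[rec.task]
        c["n"] += 1
        if rec.valid:
            c["valid"] += 1
            if rec.correct:
                c["correct"] += 1
    report: Dict[str, Dict[str, Optional[float]]] = {}
    for task, c in counts.items():
        report[task] = {
            "valid_rate": c["valid"] / c["n"],
            # None when no response was valid, to avoid dividing by zero.
            "accuracy_given_valid": (
                c["correct"] / c["valid"] if c["valid"] else None
            ),
        }
    return report
```

If a model's valid-response rate rises while accuracy among valid responses stays flat, the improvement is mostly about output format, which is the concern the warning row raises.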

Links Into The Wiki

Open Questions

Which tasks in TSR-Suite genuinely require reasoning rather than pattern matching?
Can TimeOmni-1-style reasoning improve numerical forecasting fidelity?
How does TimeOmni-1 compare against reasoning-specific RL/post-training approaches such as ExpRL-style dense reward priming, especially on tasks where references or partial-progress rewards are available?
