Awesome Agentic Time Series
Source
- Raw Markdown snapshot: paper_awesome-agentic-time-series-2026.md
- Official repository: https://github.com/TROUBADOUR000/Awesome-Agentic-Time-Series
- Official survey PDF in repository: https://github.com/TROUBADOUR000/Awesome-Agentic-Time-Series/blob/main/The%20Landscape%20of%20Agentic%20Time%20Series%20Systems.pdf
- Local repository metadata snapshot: papers/awesome-agentic-time-series-2026/source_repo_metadata.json
- Local survey PDF/text artifacts: papers/awesome-agentic-time-series-2026/source_survey.pdf, papers/awesome-agentic-time-series-2026/source_survey_pdftotext.txt
Status And Credibility
This is a June 2026 public GitHub repository and survey snapshot. The inspected main commit is 1dc5e3c366be82f930619ce0801c810fbcfe7060, dated 2026-06-13, and the included survey PDF metadata was created on 2026-06-12. The README is MIT-licensed and lists 239 dated paper entries across surveys, benchmarks, time-series foundation models, LLM4TS, agentic systems, and reliability.
The source is credible as a current field map because it is maintained by a broad author group affiliated with Tsinghua University, UIC, CUHK, UTS, CMU, Ohio State, USC, NUS, Dartmouth, Peking University, Shenzhen University, Tongji, Northwestern, and Griffith, and because the repository itself preserves the paper list and survey artifact. It is not peer-reviewed evidence for every listed method, benchmark, or claim. Treat it as a survey and bibliography source: useful for taxonomy, gap finding, and candidate discovery, but not a substitute for ingesting primary papers before making strong technical claims.
Core Claim
The repository and survey frame agentic time series as a shift from model-centric prediction toward closed-loop systems that observe temporal evidence, reason over evolving state, choose tools or actions, receive feedback, update memory, and eventually simulate future temporal environments.
For this wiki, the important distinction is interface-level: a time-series agent is not just an LLM wrapper around forecasts. It is a temporal decision system with observations, context, tools or actions, feedback, state updates, and reliability constraints.
Repository Scope
The README organizes the field into six high-level source groups:
| Group | README entries in snapshot | Local interpretation |
|---|---|---|
| Surveys and position papers | 8 | Useful for terminology and field boundaries, but primary sources still need individual ingestion. |
| Benchmarks and datasets | 50 | Shows the shift from forecasting leaderboards toward reasoning, QA, engineering, decision, and future-prediction benchmarks. |
| Time-series foundation models | 31 | Mostly passive forecasting, representation, or universal-model sources; not automatically agentic. |
| LLM4TS | 64 | Translation, alignment, temporal reasoning, and LLM-mediated analysis sources. |
| Agentic time-series systems | 79 | Perception, reasoning, planning/action, memory, knowledge, world-model, and data-agent systems. |
| Reliability, safety, and trustworthiness | 7 | Early explicit reliability layer for forecasting agents and temporal decision systems. |
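As a quick consistency check, the per-group counts in the table can be tallied against the 239 dated entries reported for the README. The snippet below only transcribes the table; the dictionary and variable names are illustrative and not part of the repository.

```python
# Per-group entry counts transcribed from the table above.
entries_per_group = {
    "Surveys and position papers": 8,
    "Benchmarks and datasets": 50,
    "Time-series foundation models": 31,
    "LLM4TS": 64,
    "Agentic time-series systems": 79,
    "Reliability, safety, and trustworthiness": 7,
}

# Sum the groups and compute each group's share of the snapshot.
total = sum(entries_per_group.values())
shares = {group: count / total for group, count in entries_per_group.items()}

print(total)  # 239, matching the README count cited under Status And Credibility
for group, share in shares.items():
    print(f"{group}: {share:.1%}")
```

The agentic-systems group is the largest single block, at roughly a third of the list.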
The repository’s paper-list taxonomy is useful because it separates the field by system role rather than by one benchmark score. It also exposes a practical ingest queue: many relevant 2025-2026 papers are not yet local source pages.
Survey Notes
The included survey defines a time-series agent as a closed-loop system operating in a temporal environment. The central loop is:
temporal evidence + context + current state
-> perception / reasoning / planning
-> tool call, query, or action
-> feedback
-> updated state or memory
The survey’s five capability layers are:
- Time-series perception: turn raw numeric observations, diagnostic tool outputs, symbolic summaries, structure, or multimodal context into evidence.
- Time-series reasoning: infer patterns, causal hypotheses, anomalies, uncertainty, and future dynamics.
- Planning and action: route tools, acquire evidence, orchestrate model/data/code workflows, coordinate agents, or take external decisions.
- Memory and knowledge: store temporal cases, regimes, procedures, failures, confidence signals, and domain knowledge across sessions.
- Temporal world models: simulate plausible futures, interventions, and counterfactual alternatives.
Reliability and trustworthiness sit across the layers rather than at the end. The survey names forecasting quality, reasoning faithfulness, tool-use reliability, hallucination and grounding, robustness, decision safety, human alignment, auditability, and reproducibility as system-level checks.
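The loop and layers above can be made concrete with a toy sketch. This is a minimal illustration under stated assumptions, not the survey's implementation: every name here (AgentState, perceive, reason, plan, step, TOOLS) is hypothetical, the anomaly rule is a simple threshold, and the temporal world-model layer is omitted. The audit log stands in for the idea that reliability checks sit across the layers rather than at the end.

```python
from dataclasses import dataclass, field


@dataclass
class AgentState:
    """Hypothetical persistent state carried across steps."""
    level: float = 0.0                              # running baseline estimate
    memory: list = field(default_factory=list)      # memory and knowledge layer
    audit_log: list = field(default_factory=list)   # cross-layer reliability record


def perceive(window):
    """Perception: turn raw numeric observations into simple evidence."""
    return {"mean": sum(window) / len(window), "last": window[-1]}


def reason(evidence, state, threshold):
    """Reasoning: compare the latest value with the remembered baseline.

    On the first step there is no memory yet, so the window mean is used.
    """
    baseline = state.level if state.memory else evidence["mean"]
    deviation = abs(evidence["last"] - baseline)
    return {"anomaly": deviation > threshold, "deviation": deviation}


def plan(hypothesis):
    """Planning and action: route to a tool by name."""
    return "raise_alert" if hypothesis["anomaly"] else "forecast_naive"


# Toy tools (assumptions): a naive last-value forecast and a text alert.
TOOLS = {
    "forecast_naive": lambda window: window[-1],
    "raise_alert": lambda window: f"alert: last value {window[-1]}",
}


def step(window, state, tools, threshold=2.0):
    """One pass of the closed loop: evidence -> reasoning -> action -> feedback -> state."""
    evidence = perceive(window)
    hypothesis = reason(evidence, state, threshold)
    action = plan(hypothesis)
    feedback = tools[action](window)
    # State and memory update, plus an audit record spanning every layer.
    state.level = evidence["mean"]
    state.memory.append((action, feedback))
    state.audit_log.append(
        {"evidence": evidence, "hypothesis": hypothesis,
         "action": action, "feedback": feedback}
    )
    return action, feedback


state = AgentState()
first = step([1.0, 1.0, 1.0, 1.0], state, TOOLS)
second = step([1.0, 1.0, 1.0, 9.0], state, TOOLS)
print(first, second)
```

In this sketch the second step raises an alert only because the first step left a baseline in state. That is the difference between a stateful loop and a stateless forecast wrapper.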
Foundation TSFM Relevance
| Agenda slot | Verdict | Evidence | Missing pieces |
|---|---|---|---|
| Benchmarks and evaluation protocol | adjacent | The source maps forecasting, reasoning, QA, engineering, tool-use, decision, and future-prediction benchmarks into an agentic evaluation landscape. | Needs primary benchmark ingests and normalized protocols before the wiki can compare results. |
| Context interface | adjacent | The survey treats temporal agents as systems that combine numeric observations with textual, structural, tool, memory, and environmental context. | Does not define a concrete reusable schema for channels, topology, exogenous variables, action history, or deployment context. |
| Control and counterfactuals | adjacent | The planning/action and temporal-world-model layers explicitly discuss tools, actions, feedback, interventions, and counterfactual simulation. | Survey-level taxonomy only; no telemetry-native action-conditioned benchmark with typed operator actions and outcomes. |
| Streaming state and memory | adjacent | Memory and knowledge are treated as persistent temporal experience rather than a passive chat transcript. | No standardized benchmark for state-update cost, memory auditability, stale-memory failures, or long-horizon regime retention. |
| Reliability and benchmark hygiene | warning | The source warns that deployed agentic systems fail through interactions among perception, reasoning, tools, memory, actions, and feedback rather than through final forecast error alone. | Needs primary-source evidence and reproducible evaluation bundles for each failure mode. |
Links Into The Wiki
- Time-Series Foundation Models
- Foundation Time-Series Model Research Agenda
- Time-Series Benchmark Hygiene
- World Models
- Digital World Models
- Action-Conditioned Time-Series Datasets
- Streaming Latent-State Updates
- Agentic World Modeling
- Toto 2.0 TSALM Workshop Presentation
- TimeOmni-1
- TimeOmni-VL
Candidate Follow-Up Ingests
The source is mainly valuable as a candidate queue. The strongest first follow-up candidates are the entries that most directly touch Alex’s agenda:
- "Position: Beyond Model-Centric Prediction - Agentic Time Series Forecasting" for the explicit agentic-forecasting position.
- TemporalBench and TimeSage-MT for agentic/time-series reasoning benchmarks.
- TFRBench, ARFBench, and TimeSeriesGym for reasoning, incident-response, and engineering-agent evaluation.
- KairosAgent, TimeART, Cast-R1, Nexus, and MoiraiAgent for tool-augmented or planning-oriented time-series agents.
- MemCast, TS-Memory, and MEMTS for memory interfaces.
- Chronicle, AgriWorld, and Sonar-TS for world-model, tool, or query interfaces over temporal environments.
These should not be cited as local evidence until each primary paper or artifact is checked and ingested.
Open Questions
- Which listed sources are primary evidence for action-conditioned temporal world models rather than LLM-mediated analysis pipelines?
- Which benchmark entries actually test closed-loop decisions, feedback, and action consequences instead of static QA or forecasting?
- What minimum reliability protocol should apply before calling a time-series agent deployable: calibration, grounding, tool-use logs, memory audit, action safety, replay, and cost?
- Which memory papers distinguish durable temporal state from context-window summarization?
- Can the survey’s five-layer architecture be translated into a concrete data contract for observability or industrial control: observations, context, event streams, actions, outcomes, and safety constraints?