ChronoGraph: A Real-World Graph-Based Multivariate Time Series Dataset

Source

ChronoGraph is a graph-structured multivariate time-series forecasting dataset collected from production microservices and annotated with incident labels.

The dataset covers 708 services and 1529 directed service-to-service edges.
Each service has five temporal node features, and each edge has eight temporal interaction features.
The paper reports six months of telemetry, 8005 aligned time steps, and 17 expert-labeled anomaly segments.
The official data layout is graph-native: edges.csv, node_features.json, and edge_features_part{i}.json.
The main tasks are service-level forecasting, anomaly detection, and incident-aware evaluation.
Reported baselines include Prophet, Chronos-Bolt Base, TabPFN-TS, Autoencoder, Isolation Forest, One-Class SVM, and a Prophet/Isolation Forest/Autoencoder ensemble.
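
To make the scale of these counts concrete, the following minimal sketch computes the number of channels and values the reported sizes imply. The dense (time, entity, feature) layout and the helper names are assumptions for illustration only; the official JSON files may store features differently and may contain gaps.

```python
# Sizes as reported in the summary above.
NUM_SERVICES = 708
NUM_EDGES = 1529
NODE_FEATURES = 5
EDGE_FEATURES = 8
TIME_STEPS = 8005


def num_values(shape):
    """Multiply the dimensions of a shape tuple."""
    total = 1
    for dim in shape:
        total *= dim
    return total


# Hypothetical dense layout: (time, entity, feature). This is an assumption,
# not the on-disk format of node_features.json or edge_features_part{i}.json.
node_shape = (TIME_STEPS, NUM_SERVICES, NODE_FEATURES)
edge_shape = (TIME_STEPS, NUM_EDGES, EDGE_FEATURES)

node_values = num_values(node_shape)
edge_values = num_values(edge_shape)

# Total univariate channels if every node and edge feature is its own series.
channels = NUM_SERVICES * NODE_FEATURES + NUM_EDGES * EDGE_FEATURES

print("node tensor shape:", node_shape, "values:", node_values)
print("edge tensor shape:", edge_shape, "values:", edge_values)
print("total channels:", channels)
```

Under these assumptions the dataset has 3,540 node channels and 12,232 edge channels, 15,772 in total, so edge telemetry dominates the channel count.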

ChronoGraph is clearly a time-series dataset, but the source does not expose controllable actions or interventions as a channel.
Incident windows act as labels or exogenous shocks, not as operator actions.
It is therefore a near-miss: suitable for passive world models, but not a primary action-conditioned dataset.
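
Because incident windows are labels and not actions, the natural way to use them is as an evaluation mask. The sketch below shows one way to do that, assuming segments are given as half-open `(start, end)` index pairs on the aligned time axis. That representation and the function names are hypothetical, and this is not the paper's evaluation protocol.

```python
def segments_to_mask(segments, num_steps):
    """Return a per-step boolean mask, True inside any [start, end) segment."""
    mask = [False] * num_steps
    for start, end in segments:
        # Clip each segment to the valid index range.
        for t in range(max(start, 0), min(end, num_steps)):
            mask[t] = True
    return mask


def split_mae(y_true, y_pred, mask):
    """Mean absolute error inside and outside the masked steps.

    Returns (inside_mae, outside_mae); a side with no steps yields None.
    """
    inside = [abs(a - b) for a, b, m in zip(y_true, y_pred, mask) if m]
    outside = [abs(a - b) for a, b, m in zip(y_true, y_pred, mask) if not m]

    def mean(xs):
        return sum(xs) / len(xs) if xs else None

    return mean(inside), mean(outside)


# Toy series: a forecast that misses a spike during a labeled window.
mask = segments_to_mask([(2, 4)], num_steps=6)
y_true = [1.0, 1.0, 5.0, 5.0, 1.0, 1.0]
y_pred = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
inside_mae, outside_mae = split_mae(y_true, y_pred, mask)
print(inside_mae, outside_mae)
```

Reporting the two errors separately keeps a model that is accurate only during normal operation from hiding its incident-time failures in an aggregate score.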

The evaluated baseline suite is mostly topology-agnostic, despite the dataset itself being graph-native.
ChronoGraph is the closest public match here to “whole graph plus temporal node/edge features”, but it is still passive telemetry rather than an intervention log.
The public repository is licensed under Apache-2.0; this knowledge base does not mirror the dataset payloads.
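
As an illustration of what a topology-aware baseline would use and a topology-agnostic one ignores, the sketch below averages one feature over each service's in-neighbors at a single time step. The `(source, target)` edge representation, the toy service names, and the function name are assumptions; the actual column names in edges.csv are not specified here.

```python
from collections import defaultdict


def in_neighbor_mean(edges, node_values):
    """Average a feature over each node's in-neighbors at one time step.

    edges: iterable of (source, target) pairs, directed source -> target.
    node_values: dict mapping node id -> float for one feature.
    Nodes with no incoming edges map to None.
    """
    incoming = defaultdict(list)
    for source, target in edges:
        incoming[target].append(node_values[source])

    result = {}
    for node in node_values:
        values = incoming.get(node)
        result[node] = sum(values) / len(values) if values else None
    return result


# Toy graph with hypothetical services: a -> c, b -> c, c -> d.
toy_edges = [("a", "c"), ("b", "c"), ("c", "d")]
toy_values = {"a": 1.0, "b": 3.0, "c": 5.0, "d": 7.0}
aggregated = in_neighbor_mean(toy_edges, toy_values)
print(aggregated)
```

A per-service forecaster sees only its own entry in `toy_values`; a graph-aware model could additionally condition on aggregates like these, or on the edge features themselves.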

Agenda slot	Verdict	Evidence	Missing pieces
Native multivariate encoding and high-channel scaling	partially closes	Provides 708 services, 1529 directed edges, node metrics, edge metrics, and incident labels from production microservices.	The baseline suite is mostly topology-agnostic; no foundation model demonstrates graph-aware scaling here.
Benchmarks: what level of modeling is tested?	partially closes	Tests forecasting and anomaly detection during incidents and shows long-horizon and anomaly-detection failures.	Does not test causal/counterfactual reasoning or action-conditioned control utility.
Causal structure, counterfactuals, and control	insufficient evidence	The service graph plus telemetry is close to an observability state substrate for digital-world agents.	No operator action, deployment, rollback, or autoscaling intervention channel is logged.