OpenRCA

Source

Dataset metadata snapshot: openrca-2025
OpenReview: https://openreview.net/forum?id=M4qNIzQYpd
Official GitHub: https://github.com/microsoft/OpenRCA

Core Claim

OpenRCA reframes root cause analysis (RCA) for microservice and software systems as a benchmark for LLMs and agents. A model receives a natural-language query and must inspect telemetry to produce the root-cause datetime, component, and reason.
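As a rough illustration of that output contract, the three elements can be held in a small record. This is only a sketch: the class name, field names, and example values are hypothetical and are not taken from the OpenRCA repository.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class RootCauseAnswer:
    """Hypothetical container for the three root-cause elements."""
    occurred_at: Optional[datetime] = None  # root-cause datetime
    component: Optional[str] = None         # root-cause component
    reason: Optional[str] = None            # root-cause reason

    def missing(self) -> list:
        """Names of elements that have not been filled in."""
        return [name for name, value in vars(self).items() if value is None]


# Made-up values, purely for illustration.
answer = RootCauseAnswer(
    occurred_at=datetime(2025, 1, 1, 12, 30),
    component="checkout-service",
)
print(answer.missing())  # ['reason']
```

Keeping unanswered elements as `None` makes it easy to see which parts of a query a model has not yet addressed.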

Dataset Notes

The OpenReview paper reports 335 failures from three enterprise software systems and more than 68 GB of telemetry.
The GitHub README names the systems as Telecom, Bank, and Market.
The telemetry directory contains logs, metrics, and traces under date-stamped folders.
Inputs include KPI time series, dependency trace graphs, semi-structured logs, and natural-language queries.

Reported Baselines

OpenRCA introduces RCA-agent, which uses Python retrieval and analysis to avoid forcing all telemetry into the LLM context. The repository also includes standard, balanced, and oracle-style evaluation scripts.
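The idea of retrieving and analyzing telemetry in Python, rather than pasting it into the prompt, can be sketched as follows. This is not RCA-agent's code: the function name, the `(timestamp, component, value)` row layout, and the sample data are all assumptions made for illustration.

```python
from datetime import datetime
from statistics import mean


def summarize_window(rows, start, end):
    """Reduce raw metric rows to one short line per component.

    rows: iterable of (timestamp, component, value) tuples (assumed layout).
    Only rows with start <= timestamp <= end are kept.
    """
    by_component = {}
    for ts, component, value in rows:
        if start <= ts <= end:
            by_component.setdefault(component, []).append(value)
    return [
        f"{name}: n={len(vals)} mean={mean(vals):.2f} max={max(vals):.2f}"
        for name, vals in sorted(by_component.items())
    ]


# Made-up metric rows, purely for illustration.
rows = [
    (datetime(2025, 1, 1, 12, 0), "db", 0.20),
    (datetime(2025, 1, 1, 12, 5), "db", 0.90),
    (datetime(2025, 1, 1, 12, 5), "api", 0.40),
    (datetime(2025, 1, 1, 13, 0), "api", 0.10),  # outside the window
]
summary = summarize_window(
    rows, datetime(2025, 1, 1, 12, 0), datetime(2025, 1, 1, 12, 30)
)
print("\n".join(summary))
```

Only the short summary lines would be handed to the model, so the prompt size depends on the number of components in the window and not on the volume of raw telemetry.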

Why It Matters

Within this group, OpenRCA is the best fit for evaluating how LLM agents investigate large volumes of telemetry. It is not designed as a training corpus for a purely numeric graph time-series model.

Gotchas

The scoring is strict: the answer must match all required root-cause elements.
The README links telemetry through Google Drive and does not state a separate telemetry dataset license.
OpenRCA is diagnostic; it does not provide operator remediation actions.
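The all-or-nothing scoring mentioned above can be illustrated with a minimal sketch. It assumes exact equality on every required element and a plain dictionary representation; the function name and field names are hypothetical, and the official evaluation scripts may compare elements differently.

```python
def strict_score(predicted, expected):
    """Return 1 only if every element in `expected` is matched exactly.

    `expected` holds just the elements the query requires (an assumption);
    extra keys in `predicted` are ignored.
    """
    return int(all(predicted.get(key) == value for key, value in expected.items()))


# Made-up ground truth and predictions, purely for illustration.
expected = {"component": "db", "reason": "high CPU usage"}
full_match = {"component": "db", "reason": "high CPU usage"}
partial_match = {"component": "db", "reason": "network delay"}

print(strict_score(full_match, expected))     # 1
print(strict_score(partial_match, expected))  # 0
```

Under this rule a prediction that gets the component right but the reason wrong earns nothing, which is why partial progress by an agent is invisible in the headline score.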

Foundation TSFM Relevance

Agenda slot	Verdict	Evidence	Missing pieces
Benchmarks: what level of modeling is tested?	partially closes	Benchmarks LLM/agent RCA over logs, metrics, traces, dependency graphs, and natural-language queries for enterprise systems, close to the observability slice of the digital-world robot north star.	It is diagnostic and answer-scored, not a training corpus for action-conditioned system control.
Context interface	partially closes	Combines natural-language incident queries with KPI time series, trace graphs, semi-structured logs, and record metadata.	Context is consumed through an agent workflow, not a standardized TSFM input schema.
Control and counterfactuals	warning	Root-cause outputs identify datetime, component, and reason.	No logged remediation actions, intervention choices, or counterfactual rollout labels.
