Graph Reinforcement Learning for Power Grids: A Comprehensive Survey

Source

Publication And Credibility

  • Paper date: arXiv published 2024-07-05; v4 updated 2026-01-07.
  • Venue/status: Published in Energy and AI, DOI 10.1016/j.egyai.2025.100671; an arXiv version is also available.
  • Credibility: Current peer-reviewed survey with multiple authors from Fraunhofer, the University of Kassel, and the University of Greifswald. Useful for mapping the literature, but a secondary source relative to the individual method papers it summarizes.

Core Claim

The survey maps graph reinforcement learning for transmission and distribution grids, emphasizing graph representation, GNN architecture, RL method, benchmark limitations, and real-world deployment gaps.

L2RPN / Grid2Op Notes

The survey explicitly discusses Grid2Op as the common simulator for many transmission-grid control papers and highlights comparability issues around grid size, horizons, stochasticity, seeds, and action spaces.
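The comparability dimensions listed above can be made concrete as a small record attached to each reported result. The following is a minimal sketch, not something taken from the survey or from the Grid2Op API: the class name `EvalProtocol`, its fields, and the helper `differing_fields` are all hypothetical and exist only to show how two evaluation setups could be checked for mismatches before their scores are compared.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class EvalProtocol:
    """Hypothetical record of the settings that affect comparability."""
    grid_name: str       # illustrative identifier, not a real environment name
    n_substations: int   # grid size
    horizon_steps: int   # episode length in time steps
    stochastic: bool     # whether scenarios include random disturbances
    seeds: tuple         # random seeds used for evaluation
    action_space: str    # e.g. "topology" or "redispatch" (illustrative labels)


def differing_fields(a: EvalProtocol, b: EvalProtocol) -> list:
    """Return the names of the fields on which two protocols differ."""
    da, db = asdict(a), asdict(b)
    return [name for name in da if da[name] != db[name]]


# Two illustrative setups that differ only in episode horizon.
paper_a = EvalProtocol("example_grid", 14, 2016, True, (0, 1, 2), "topology")
paper_b = EvalProtocol("example_grid", 14, 8064, True, (0, 1, 2), "topology")
print(differing_fields(paper_a, paper_b))
```

An empty list would indicate that the two setups match on every recorded dimension; any non-empty result names the settings that make a direct score comparison questionable.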

Action-Time-Series / World-Model Notes

It contains the clearest secondary-source statement that Taha et al. use a GCN as a learned physics model for planning: predicting line loading for different actions and feeding those predictions to MCTS. That is the main reason this survey is linked into world-model pages.
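The idea of using a graph model as a surrogate for planning can be illustrated with a toy sketch. This is not the method of Taha et al.: the single mean-aggregation graph step below uses hand-set weights instead of a trained GCN, the action encoding (`action_effect`, an additive change per line) is a hypothetical simplification, and a one-step greedy choice stands in for MCTS. It only shows the shape of the loop: predict line loading per candidate action, then choose using those predictions.

```python
def predict_line_loading(adj, loading, action_effect, w_self=0.8, w_neigh=0.2):
    """Toy surrogate: one mean-aggregation graph step over lines.

    adj maps each line index to its neighbouring line indices.
    Weights are hand-set placeholders for learned parameters (assumption).
    """
    # Apply the hypothetical action encoding to the current loadings.
    x = [l + e for l, e in zip(loading, action_effect)]
    predicted = []
    for i, xi in enumerate(x):
        neigh = adj[i]
        neigh_mean = sum(x[j] for j in neigh) / len(neigh) if neigh else 0.0
        predicted.append(w_self * xi + w_neigh * neigh_mean)
    return predicted


def pick_action(adj, loading, candidates):
    """Greedy stand-in for tree search: minimise the worst predicted loading."""
    return min(
        candidates,
        key=lambda name: max(predict_line_loading(adj, loading, candidates[name])),
    )


# Three lines arranged as a path graph; loadings are fractions of capacity.
adj = {0: [1], 1: [0, 2], 2: [1]}
loading = [0.9, 0.5, 0.4]
candidates = {
    "do_nothing": [0.0, 0.0, 0.0],
    "relieve_line_0": [-0.3, 0.2, 0.0],  # illustrative effect, not real physics
}
best = pick_action(adj, loading, candidates)
print(best)
```

In a tree-search setting, the surrogate's predictions would be queried at each expanded node rather than once per candidate, so the cost of each prediction largely determines how deep the search can go.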

Foundation TSFM Relevance

| Agenda slot | Verdict | Evidence | Missing pieces |
| --- | --- | --- | --- |
| Benchmark hygiene | partially closes | Summarizes Grid2Op evaluation pitfalls and recommends standardized protocols. | As a survey, it does not provide a new benchmark artifact. |
| Context interface | partially closes | Graph representation choices are central. | Needs TSFM-specific schemas and non-grid analogs. |
| Causal structure, counterfactuals, and control | adjacent | Reviews action-conditioned graph RL and learned surrogate planning. | Individual primary papers still need separate verification. |