Learning Topology Actions for Power Grid Control: A Graph-Based Soft-Label Imitation Learning Approach

Source

Publication And Credibility

  • Paper date: arXiv published 2025-03-19; v2 updated 2025-06-19.
  • Venue/status: ECML PKDD 2025 ADS track, DOI 10.1007/978-3-032-06129-4_8.
  • Credibility: Peer-reviewed venue; the authors are affiliated with Fraunhofer, the University of Kassel, and TenneT. This is one of the strongest recent practical Grid2Op method papers.

Core Claim

Soft-label imitation learning trains a GNN policy on distributions over viable topology actions derived from simulated action outcomes, rather than forcing one hard expert action per state.
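To make the soft-label idea concrete, here is a minimal sketch of how simulated action outcomes could be turned into a target distribution and used in a cross-entropy loss. It is not the paper's exact recipe. It assumes each candidate action is scored by its simulated maximum line loading (`max_rho`), that actions at or above a loading limit count as non-viable, and that a temperature-scaled softmax spreads probability over the viable ones. The function names, the `temperature` default, and the `rho_limit` threshold are all illustrative assumptions.

```python
import numpy as np

def soft_labels(max_rho, temperature=0.1, rho_limit=1.0):
    """Turn simulated per-action outcomes into a soft target distribution.

    max_rho: simulated maximum line loading for each candidate action
             (lower is better). Assumed scoring rule, not the paper's.
    """
    max_rho = np.asarray(max_rho, dtype=float)
    viable = max_rho < rho_limit
    if not viable.any():
        # No action passes the limit: fall back to ranking all of them.
        viable = np.ones_like(viable, dtype=bool)
    # Non-viable actions get -inf so their probability is exactly zero.
    logits = np.where(viable, -max_rho / temperature, -np.inf)
    logits = logits - logits[viable].max()  # numerical stability
    weights = np.exp(logits)
    return weights / weights.sum()

def soft_cross_entropy(policy_logits, target):
    """Cross-entropy between a soft target and the policy's softmax output."""
    policy_logits = np.asarray(policy_logits, dtype=float)
    shifted = policy_logits - policy_logits.max()
    log_probs = shifted - np.log(np.exp(shifted).sum())
    mask = target > 0  # skip zero-probability targets
    return float(-(target[mask] * log_probs[mask]).sum())

# Three candidate actions; the third would overload a line in simulation.
target = soft_labels([0.80, 0.85, 1.20])
loss = soft_cross_entropy(np.zeros(3), target)
```

A hard-label baseline would instead put all probability mass on the single best action (index 0 here). The soft target keeps the near-equivalent second action in the training signal, and the temperature controls how sharply the distribution concentrates on the best one.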

L2RPN / Grid2Op Notes

The experiments use the WCCI 2022 Grid2Op environment and report a 17 percent performance improvement over the greedy expert used to produce the imitation targets, plus stronger performance than hard-label and DRL baselines.

Action-Time-Series / World-Model Notes

This is not a learned transition model, but it is directly relevant to action-conditioned world-model design: simulator-generated counterfactual action outcomes are distilled into a reusable action-ranker that preserves multiple viable actions per state.

Limitations / Gotchas

  • The best reported variants still rely on simulator or feasibility checks after neural action ranking; the GNN is not a stand-alone safety guarantee.
  • The paper does not provide a learned long-horizon transition model; it ranks or proposes actions from simulated outcomes.
  • Temperature sensitivity and the scaling behavior across larger or different grid topologies remain open limitations.
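The first limitation above describes a rank-then-verify pattern, sketched below under stated assumptions. The policy's scores only order the candidates; a simulator check decides which one is actually played. `simulate_max_rho` is a hypothetical callback standing in for whatever simulator call is available (in Grid2Op this role would typically be filled by the observation's `simulate` method). The `top_k`, `rho_limit`, and `fallback` parameters are illustrative choices, not values from the paper.

```python
from typing import Callable, Sequence

def select_verified_action(
    scores: Sequence[float],
    simulate_max_rho: Callable[[int], float],  # hypothetical simulator hook
    top_k: int = 3,
    rho_limit: float = 1.0,
    fallback: int = 0,
) -> int:
    """Return the highest-ranked action that passes the simulator check.

    scores: policy scores per action index (higher is better).
    fallback: action index to use when none of the top-k passes,
              e.g. a do-nothing action (assumed to be index 0 here).
    """
    ranked = sorted(range(len(scores)), key=lambda a: scores[a], reverse=True)
    for action in ranked[:top_k]:
        # Only simulate the top-k candidates to bound the simulation cost.
        if simulate_max_rho(action) < rho_limit:
            return action
    return fallback

# Toy example: the top-ranked action (index 1) would overload a line,
# so the second-ranked action (index 2) is selected instead.
simulated = {0: 0.90, 1: 1.10, 2: 0.95}
chosen = select_verified_action([0.1, 0.7, 0.2], simulated.__getitem__)
```

Restricting verification to the top-k candidates is one way to trade simulation cost against the risk of missing a feasible action. The network narrows the search, but the safety decision stays with the simulator.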

Foundation TSFM Relevance

| Agenda slot | Verdict | Evidence | Missing pieces |
|---|---|---|---|
| Causal structure, counterfactuals, and control | partially closes | Labels come from simulated candidate-action outcomes, so supervision is counterfactual/action-conditioned. | Does not model full multi-step trajectories or uncertainty over future events. |
| Context interface | partially closes | GNN encodes grid topology for action choice. | Needs a generic graph-context schema outside power grids. |
| Benchmark hygiene | partially closes | Compares soft labels, hard labels, expert, and DRL baselines. | Still tied to WCCI 2022 action space and generated expert distribution. |