Physics Informed Reinforcement Learning with Gibbs Priors for Topology Control in Power Grids

Source

Raw Markdown: gibbs-priors-topology-control-2026
Rendered / retrieved PDF: paper_gibbs-priors-topology-control-2026.pdf
External source: https://arxiv.org/abs/2604.01830

Publication And Credibility

Paper date: published on arXiv 2026-04-02.
Venue/status: arXiv preprint.
Credibility: Very recent preprint. It matters because it directly targets the Grid2Op case14, case36, and case118 grids and compares against PPO, plus a strong topology-only LJN baseline on case118. It needs independent replication before being treated as settled state of the art.

Core Claim

A semi-Markov RL agent acts only in hazardous regimes and uses a GNN surrogate to predict post-action overload risk; those predictions form a physics-informed Gibbs prior that selects a small candidate set and reweights policy logits.
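The reweighting step can be sketched as follows. This is a minimal illustration, not the paper's implementation: these notes do not give the exact energy function, temperature, or combination rule, so the inverse temperature `beta`, the candidate-set size `top_k`, the additive log-prior, and the function name `gibbs_reweight` are all assumptions. The prior is taken to be p(a) proportional to exp(-beta * risk(a)), where risk(a) stands in for the GNN surrogate's predicted overload risk.

```python
import math

def gibbs_reweight(policy_logits, predicted_risk, beta=5.0, top_k=3):
    """Return (candidate indices, probabilities over those candidates).

    Hypothetical helper: beta and top_k are illustrative assumptions.
    """
    # Gibbs prior in log space: log p(a) = -beta * risk(a) + const.
    log_prior = [-beta * r for r in predicted_risk]
    # Candidate set: the top_k actions with the highest prior (lowest risk).
    candidates = sorted(
        range(len(log_prior)), key=lambda i: log_prior[i], reverse=True
    )[:top_k]
    # Reweight the policy by adding the log-prior to its logits,
    # restricted to the candidate set.
    scores = [policy_logits[i] + log_prior[i] for i in candidates]
    # Numerically stable softmax over the candidates only.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return candidates, [e / z for e in exps]
```

With uniform policy logits, the result is driven entirely by the prior, so lower-risk actions get more probability mass. As `beta` approaches zero the prior flattens and the policy's own logits dominate within the candidate set.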

L2RPN / Grid2Op Notes

This is one of the closest current Grid2Op papers to an action-conditioned learned world-model component. It reports reward and survival gains over PPO. On case14 and case36 it reports near-oracle tradeoffs at much lower decision time than the Greedy oracle. On case118 it reports gains over PPO while remaining below, but running faster than, the topology-only LJN baseline.

Action-Time-Series / World-Model Notes

The learned object is a one-step mapping, (current graph state, feasible topology action) -> next-step overload risk, not a full multi-step latent dynamics model. That makes it an action-conditioned risk surrogate suitable for pruning and ranking candidate actions before expensive simulation.
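The prune-then-simulate pattern described above can be sketched like this. It is an illustration under stated assumptions: `surrogate_risk` and `simulate` are hypothetical callables standing in for the learned one-step risk predictor and an expensive simulator call, and `top_k` is an arbitrary shortlist size, none of which come from the paper.

```python
def shortlist_then_simulate(actions, surrogate_risk, simulate, top_k=2):
    """Rank actions with a cheap surrogate, simulate only the shortlist.

    surrogate_risk(action) -> float  (hypothetical cheap risk estimate)
    simulate(action) -> float        (hypothetical expensive simulated risk)
    """
    # Cheap pass: order every feasible action by predicted overload risk.
    ranked = sorted(actions, key=surrogate_risk)
    # Keep only the lowest-risk candidates for the expensive pass.
    shortlist = ranked[:top_k]
    # Expensive pass: one simulator call per shortlisted action.
    simulated = {a: simulate(a) for a in shortlist}
    # Pick the shortlisted action with the lowest simulated risk.
    best = min(simulated, key=simulated.get)
    return best, simulated
```

The simulator is called `top_k` times rather than once per feasible action, which is where the decision-time saving comes from. The cost is that an action the surrogate ranks poorly is never simulated, even if it would have been the best choice.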

Foundation TSFM Relevance

Agenda slot	Verdict	Evidence	Missing pieces
Causal structure, counterfactuals, and control	partially closes	Trains a one-step action-conditioned risk predictor from simulator outcomes.	Does not roll out future graph states over multi-step action sequences.
Safety and rare events	partially closes	Focuses intervention on hazardous regimes and overload risk.	Needs uncertainty and calibration for operational deployment.
Context interface	partially closes	Graph encoder and action embedding join topology context with control inputs.	Needs transfer tests across grids and non-grid systems.
