NeoRL-2

Summary

NeoRL-2 is a near-real-world offline reinforcement-learning benchmark that pairs datasets with evaluation simulators. It provides non-vision, action-conditioned trajectories from control tasks designed to expose delays, exogenous factors, safety constraints, rule-based policies, and limited-data regimes.

Official Artifacts

Official GitHub: https://github.com/polixir/NeoRL2
Official Hugging Face dataset: https://huggingface.co/datasets/polixirai/NeoRL2
arXiv preprint: https://arxiv.org/abs/2503.19267
Dataset metadata snapshot: neorl2-2025

Dataset Shape

The core paper/GitHub tasks are Pipeline, Simglucose, RocketRecovery, RandomFrictionHopper, DMSD, Fusion, and SafetyHalfCheetah. Each task exposes observations, continuous actions, rewards, next observations, and termination flags.
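As a rough illustration of the per-step fields listed above, the sketch below models one transition and groups a flat list of transitions into episodes using the termination flag. The field names (`obs`, `action`, `reward`, `next_obs`, `done`), the `split_episodes` helper, and the toy numbers are assumptions made for this example; they are not the keys or loaders used by the official NeoRL-2 artifacts.

```python
from typing import List, NamedTuple, Sequence


class Transition(NamedTuple):
    """One offline RL step. Field names are illustrative, not NeoRL-2's keys."""
    obs: Sequence[float]
    action: Sequence[float]   # continuous action vector
    reward: float
    next_obs: Sequence[float]
    done: bool                # termination flag


def split_episodes(transitions: Sequence[Transition]) -> List[List[Transition]]:
    """Group a flat transition list into episodes, cutting after each done flag."""
    episodes: List[List[Transition]] = []
    current: List[Transition] = []
    for t in transitions:
        current.append(t)
        if t.done:
            episodes.append(current)
            current = []
    if current:  # trailing steps with no termination flag form a final partial episode
        episodes.append(current)
    return episodes


# Toy data with made-up numbers (1-D observation, 1-D action).
toy = [
    Transition([0.0], [0.5], 1.0, [0.1], False),
    Transition([0.1], [0.4], 0.5, [0.2], True),
    Transition([0.0], [-0.3], 0.2, [0.05], False),
]
episodes = split_episodes(toy)
episode_returns = [sum(t.reward for t in ep) for ep in episodes]
```

Here `episodes` holds one two-step episode and one trailing partial episode, and `episode_returns` gives the undiscounted reward sum of each.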

Role In The Wiki

NeoRL-2 belongs in the Tier 1 non-vision action-conditioned world-model dataset bucket. It is a stronger fit than passive forecasting, anomaly-only, or contextual bandit datasets because it exposes transition tuples and rewards.
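To make the contrast with passive forecasting concrete, the following minimal sketch turns transition tuples into supervised pairs for a one-step, action-conditioned dynamics model: the input concatenates observation and action, and the target is the observation change plus the reward. The `make_dynamics_pairs` helper, the tuple layout, and the choice to predict deltas rather than raw next observations are assumptions for illustration, not something prescribed by NeoRL-2.

```python
from typing import List, Sequence, Tuple

# (obs, action, reward, next_obs); a hypothetical layout for this example only.
Step = Tuple[Sequence[float], Sequence[float], float, Sequence[float]]


def make_dynamics_pairs(steps: Sequence[Step]) -> Tuple[List[List[float]], List[List[float]]]:
    """Build (input, target) pairs for a one-step action-conditioned model."""
    inputs: List[List[float]] = []
    targets: List[List[float]] = []
    for obs, action, reward, next_obs in steps:
        # The action is part of the input; a passive forecaster would see obs only.
        inputs.append(list(obs) + list(action))
        # Target: per-dimension observation change, followed by the reward.
        delta = [n - o for n, o in zip(next_obs, obs)]
        targets.append(delta + [reward])
    return inputs, targets


# Toy steps with made-up numbers (2-D observation, 1-D action).
steps = [
    ([1.0, 2.0], [0.5], 1.0, [1.5, 2.0]),
    ([1.5, 2.0], [-0.5], 0.0, [1.0, 2.5]),
]
inputs, targets = make_dynamics_pairs(steps)
```

A dataset without actions or rewards could not supply these inputs and targets, which is the sense in which transition tuples make a benchmark a stronger fit for this bucket.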

Relation To Foundation TSFM Agenda

Use the source-level agenda mapping in neorl2-2025 rather than duplicating verdict rows here.

At the entity level, NeoRL-2 is useful as a compact benchmark for action-conditioned dynamics under practical constraints. Its main caveats are the realism of its simulators, its preprint status, and the need to pin artifact versions and license terms.

Alex Open Research Wiki
