Fixed-Point Reasoners: Stable and Adaptive Deep Looped Transformers
Source
- Raw Markdown: paper_fprm-2026.md
- PDF: paper_fprm-2026.pdf
- Preprint: arXiv 2606.18206
- Workshop version: ICML 2026 AdaptFM OpenReview poster
- Official code: nilskiKonjIzDunava/fprm
- Official checkpoints: fixed-point-reasoners/fprm
- Official X thread: Sajad Movahedi on FPRM
- Gonzo ML discussion: Telegram post 5602
- Review: ArXivIQ summary
Local social/context snapshots are stored as raw provenance under papers/fprm-2026/telegram-post-gonzo_ML-5602.md, papers/fprm-2026/x_thread_sajad_movahedi_2069070293696405680.md, and papers/fprm-2026/x_thread_sajad_movahedi_2069070293696405680.json.
Credibility And Status
This is a recent source: arXiv v1 was submitted on 2026-06-16, and OpenReview lists it as a published ICML 2026 AdaptFM workshop poster. The authors are affiliated with ELLIS Institute Tübingen, Max Planck Institute for Intelligent Systems, ETH Zurich, Swiss Institute of Bioinformatics, Université Paris Cité, and Liquid AI. The GitHub repository describes itself as the official implementation, and the official Hugging Face organization publishes experiment checkpoints under an MIT-licensed model card.
Core Claim
FPRM is a single-loop Transformer reasoning model that uses convergence toward a fixed point as the halting signal. It replaces the post-norm convention in looped reasoning models with pre-norm plus learned layer-wise and iteration-wise residual scaling, then uses a damped fixed-point optimizer at inference to reduce oscillation around the fixed point.
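The halting rule and damped update described above can be illustrated on a toy problem. This is a minimal sketch, not the paper's optimizer: it uses a generic damped (averaged) fixed-point iteration, and `toy_step`, `damping`, `tol`, and `max_iters` are hypothetical names and values chosen for the example. The toy map oscillates around its fixed point, which is the situation damping is meant to help with.

```python
import math


def damped_fixed_point(step, h0, damping=0.5, tol=1e-6, max_iters=200):
    """Iterate h <- (1 - damping) * h + damping * step(h).

    Halts when the relative residual ||step(h) - h|| / ||h|| drops below tol.
    Returns (state, iterations_used, converged).
    """
    h = list(h0)
    for k in range(1, max_iters + 1):
        f = step(h)
        residual = math.sqrt(sum((fi - hi) ** 2 for fi, hi in zip(f, h)))
        scale = math.sqrt(sum(hi * hi for hi in h)) + 1e-12
        if residual / scale < tol:
            return h, k, True  # fixed-point residual is small: stop looping
        h = [(1 - damping) * hi + damping * fi for hi, fi in zip(h, f)]
    return h, max_iters, False  # compute budget exhausted without convergence


def toy_step(h):
    # Hypothetical stand-in for one pass of a looped block.
    # Each coordinate has fixed point 1 / 1.9 and overshoots it on every step.
    return [-0.9 * x + 1.0 for x in h]


h_damped, k_damped, ok_damped = damped_fixed_point(toy_step, [1.0, 1.0], damping=0.5)
h_plain, k_plain, ok_plain = damped_fixed_point(toy_step, [1.0, 1.0], damping=1.0)
print("damped iterations:", k_damped, "undamped iterations:", k_plain)
```

On this toy map both runs converge, but the damped run halts in far fewer iterations because averaging cancels most of the overshoot. Whether the same holds for a trained looped Transformer depends on its learned dynamics.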
What It Adds
- Fixed-point halting: the model loops until the hidden-state residual is small, so the stopping rule is part of the latent dynamics instead of a separately trained ACT head.
- Signal propagation fix: pre-norm improves gradient and representation flow through large effective depth; residual scaling recovers boundedness that post-norm previously supplied.
- Non-hierarchical recursion: the reported 7M-parameter FPRM outperforms or matches recursive puzzle baselines on Sudoku-Extreme, Maze-Hard, ARC-AGI, and synthetic state tracking without HRM/TRM-style fast/slow hierarchy.
- Released artifacts: the GitHub README links official Hugging Face checkpoints for Maze-Hard, Sudoku-Extreme, ARC-1, and ARC-2. This supersedes the Gonzo ML post’s initial “Model: N/A” field.
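The boundedness point in the signal-propagation bullet can be shown with a toy calculation. The sketch below is an assumption-laden illustration, not the FPRM parameterization: `skip_scale` and `branch_scale` are fixed constants standing in for the learned layer-wise and iteration-wise scalars, and `sublayer` is a hypothetical bounded map standing in for attention or MLP. It only shows why a pre-norm residual stream can grow under looping and how scaling the skip path caps it.

```python
import math


def rms_norm(h, eps=1e-6):
    """Normalize a vector to unit root-mean-square (pre-norm applied to the branch input)."""
    rms = math.sqrt(sum(x * x for x in h) / len(h) + eps)
    return [x / rms for x in h]


def sublayer(x):
    # Hypothetical stand-in for attention/MLP; outputs lie in (-1, 1).
    return [math.tanh(2.0 * xi + 0.5) for xi in x]


def looped_prenorm(h0, n_iters, skip_scale, branch_scale):
    """Apply h <- skip_scale * h + branch_scale * sublayer(rms_norm(h)) n_iters times."""
    h = list(h0)
    for _ in range(n_iters):
        update = sublayer(rms_norm(h))
        h = [skip_scale * hi + branch_scale * ui for hi, ui in zip(h, update)]
    return h


def l2_norm(h):
    return math.sqrt(sum(x * x for x in h))


h0 = [1.0, -0.5, 0.25]
unscaled = looped_prenorm(h0, n_iters=200, skip_scale=1.0, branch_scale=1.0)
scaled = looped_prenorm(h0, n_iters=200, skip_scale=0.9, branch_scale=0.1)
print("unscaled norm:", l2_norm(unscaled))
print("scaled norm:", l2_norm(scaled))
```

With `skip_scale < 1` and a bounded branch, each coordinate is capped by `branch_scale / (1 - skip_scale)` once it is inside that range, so the state stays bounded however many loops run. With an unscaled skip path the same branch keeps accumulating.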
Relevance To This Wiki
FPRM is an important update to the looped-depth and recursive-reasoning branch. It narrows the design question from “do recursive models need a fast/slow hierarchy?” to “what state-stability, signal-propagation, and stopping-rule contracts make looped depth usable under a declared compute budget?”
For time-series and world-model work, the transferable mechanism is the convergence-based control interface: a latent-state update could spend more recurrent depth on hard windows, rare regimes, or action-conditioned transitions, then stop when the update stabilizes. The current evidence is still puzzle and algorithmic-reasoning evidence, so it should be treated as adjacent dynamic-compute machinery rather than direct TSFM evidence.
Foundation TSFM Relevance
| Agenda slot | Verdict | Evidence | Missing pieces |
|---|---|---|---|
| Dynamic compute / fixed-FLOPs hierarchy | adjacent | Fixed-point residuals provide an input-dependent halting signal for looped latent computation. | Needs matched wall-clock, memory-bandwidth, and expected-FLOPs comparisons against unique-depth, wider, recurrent-state, and explicit-memory baselines. |
| Latent-state modeling | adjacent | The model treats hidden-state convergence as useful computation rather than only a diagnostic. | No multivariate time-series, event-stream, or action-conditioned state-transition benchmarks. |
| Benchmark hygiene | warning | Strong puzzle and ARC-style results are useful but concentrated in fully observed symbolic domains. | Needs no-loop/no-fixed-point ablations, calibrated stopping tests, and transfer to numeric or operational trajectories before TSFM claims. |
Limitations
- The evidence is centered on Sudoku, Maze, ARC-AGI, and synthetic state-tracking tasks, not continuous numeric time series or action-conditioned world models.
- The fixed-point residual is a halting statistic, not automatically calibrated uncertainty; downstream systems would need calibration and failure-detection probes.
- The paper’s hierarchy critique should not be overread: FPRM shows that HRM/TRM-style hierarchy is not necessary for the reported benchmarks, but it does not prove that hierarchy is unnecessary for long-horizon, partially observed, multimodal, or control-heavy settings.
- Reported test-time compute can reach very high effective depth, so practical serving claims need realized latency, batching, kernel, memory, and early-exit accounting rather than only nominal loop count.
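The gap between nominal loop count and realized cost in the last bullet can be made concrete with simple arithmetic. The halting steps below are made-up numbers, and `loop_accounting` is a hypothetical helper; the sketch only contrasts ideal per-example early exit with a batch that runs in lockstep until its slowest example halts.

```python
def loop_accounting(halting_steps, max_loops):
    """Compare mean loop counts under three accounting conventions."""
    n = len(halting_steps)
    per_example = sum(halting_steps)   # ideal early exit: each example pays only its own loops
    lockstep = max(halting_steps) * n  # batch keeps looping until the slowest example halts
    nominal = max_loops * n            # declared budget if nothing halts early
    return {
        "mean_loops_early_exit": per_example / n,
        "mean_loops_lockstep": lockstep / n,
        "mean_loops_nominal": nominal / n,
        "lockstep_overhead": lockstep / per_example,
    }


# Hypothetical batch: three easy examples and one hard example.
stats = loop_accounting([4, 6, 5, 64], max_loops=128)
for name, value in stats.items():
    print(name, round(value, 2))
```

In this made-up batch a single hard example makes lockstep execution cost more than three times the ideal early-exit cost, which is why serving claims need the batching and early-exit policy stated alongside the loop count.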
Links Into The Wiki
- FPRM
- Hierarchical Reasoning Model
- Tiny Recursive Model
- Looped Transformers And Test-Time Memory
- Efficient Recurrent Sequence Models
- Time-Series Scaling And Efficiency
- Hierarchical Modeling with a Fixed FLOPs Budget
- Foundation Time-Series Model Research Agenda
Open Questions
- Can fixed-point residuals become calibrated uncertainty or anomaly signals for time-series windows, or are they only optimization diagnostics?
- Does pre-norm plus residual scaling still stabilize looped depth when the latent state is driven by noisy observations, missing data, exogenous variables, and control inputs?
- Under a fixed serving budget, when is fixed-point looping better than a unique-depth Transformer, compact recurrent state, explicit memory tokens, depth-KV retrieval, or parallel latent trajectories?
- Can the fixed-point halting criterion avoid shortcut collapse on tasks where multiple candidate futures or hidden regimes must remain alive until late evidence arrives?
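One simple way to start probing the calibration question above is to check whether the error rate rises with the final residual. The sketch below is a hypothetical probe on synthetic data, not an experiment from the paper: `error_rate_by_residual_bin` and the example values are illustrative, and a monotone trend would be a necessary sign of a useful signal rather than proof of calibration.

```python
def error_rate_by_residual_bin(residuals, is_wrong, n_bins=4):
    """Sort examples by final residual, split into near-equal bins, return error rate per bin."""
    pairs = sorted(zip(residuals, is_wrong), key=lambda p: p[0])
    n = len(pairs)
    rates = []
    for b in range(n_bins):
        lo, hi = b * n // n_bins, (b + 1) * n // n_bins
        chunk = pairs[lo:hi]
        if chunk:  # skip empty bins when there are fewer examples than bins
            rates.append(sum(w for _, w in chunk) / len(chunk))
    return rates


def is_non_decreasing(values):
    return all(a <= b for a, b in zip(values, values[1:]))


# Synthetic example: final residual per input and whether the prediction was wrong (1) or right (0).
residuals = [1e-6, 2e-6, 1e-5, 3e-5, 1e-4, 2e-4, 1e-3, 5e-3]
is_wrong = [0, 0, 0, 0, 0, 1, 1, 1]

rates = error_rate_by_residual_bin(residuals, is_wrong, n_bins=4)
print("error rate per residual bin (low to high):", rates)
print("monotone:", is_non_decreasing(rates))
```

On real checkpoints the same table would be computed from held-out predictions, and a flat or non-monotone profile would suggest the residual is closer to an optimization diagnostic than an uncertainty signal.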