ParaRNN

Summary

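ParaRNN is Apple’s framework for training nonlinear recurrent neural networks in parallel. It treats the whole hidden-state trajectory as one nonlinear system of equations, solves that system with Newton iterations, and uses a parallel reduction over the sequence to solve the linear system in each iteration.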
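The Newton-plus-reduction idea can be written out as a short derivation. This is a sketch in generic notation, not taken from the paper: the symbols f, h_t, x_t, J_t and the iteration index k are assumptions introduced here for illustration.

```latex
% Nonlinear recurrence over a sequence of length T, with h_0 given:
h_t = f(h_{t-1}, x_t), \qquad t = 1, \dots, T

% Stack all steps into one nonlinear system in the unknowns h_1, \dots, h_T:
F_t(h) = h_t - f(h_{t-1}, x_t) = 0

% Newton step k -> k+1, with J_t^{(k)} = \partial f / \partial h
% evaluated at (h_{t-1}^{(k)}, x_t) and h_0^{(k+1)} = h_0:
h_t^{(k+1)} = J_t^{(k)} \, h_{t-1}^{(k+1)}
            + \bigl[ f(h_{t-1}^{(k)}, x_t) - J_t^{(k)} \, h_{t-1}^{(k)} \bigr]

% Each step is an affine map h \mapsto A h + b, and affine maps compose
% associatively, which is what allows a parallel reduction over t:
(A_2, b_2) \circ (A_1, b_1) = (A_2 A_1, \; A_2 b_1 + b_2)
```

Each Newton step is therefore a linear recurrence in the new iterate, so the sequential dependence on t is confined to a scan over affine maps rather than to evaluations of the nonlinear cell.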

Role In The Wiki

ParaRNN anchors the nonlinear branch of efficient recurrent sequence models. Where Mamba-style SSMs preserve parallel training by keeping hidden-state updates linear, ParaRNN shows that adapted GRU and LSTM cells can be trained at billion-parameter language-model scale with parallelized nonlinear state updates.
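The contrast between a linear recurrence and a parallelized nonlinear one can be made concrete with a toy example. The sketch below is not ParaRNN's implementation: it assumes a scalar hidden state, a hypothetical `tanh` cell with fixed weights `w` and `u`, and a pure-Python scan whose rounds are merely parallelizable in principle. All function names are illustrative.

```python
import math

def combine(left, right):
    """Compose affine maps h -> a*h + b: apply `left` first, then `right`."""
    a1, b1 = left
    a2, b2 = right
    return (a2 * a1, a2 * b1 + b2)

def affine_scan(pairs):
    """Inclusive scan over affine maps in about log2(T) rounds (Hillis-Steele).

    Entries within a round are independent of each other, so a real
    implementation could compute them in parallel."""
    out = list(pairs)
    step = 1
    while step < len(out):
        out = [out[i] if i < step else combine(out[i - step], out[i])
               for i in range(len(out))]
        step *= 2
    return out

def cell(h_prev, x, w=0.9, u=0.5):
    """Toy nonlinear cell (an assumption, not a ParaRNN cell)."""
    return math.tanh(w * h_prev + u * x)

def cell_jac(h_prev, x, w=0.9, u=0.5):
    """Derivative of `cell` with respect to h_prev."""
    return w * (1.0 - math.tanh(w * h_prev + u * x) ** 2)

def sequential(xs, h0=0.0):
    """Reference: the usual step-by-step recurrence."""
    hs, h = [], h0
    for x in xs:
        h = cell(h, x)
        hs.append(h)
    return hs

def newton_parallel(xs, h0=0.0, tol=1e-12):
    """Solve for the whole trajectory with Newton iterations.

    Each iteration linearizes the cell around the current guess and solves
    the resulting linear recurrence with `affine_scan`."""
    hs = [0.0] * len(xs)          # initial guess for h_1..h_T
    n_iters = 0
    for _ in range(len(xs)):      # at most T iterations are ever needed
        prev = [h0] + hs[:-1]     # current guess for h_{t-1}
        a = [cell_jac(p, x) for p, x in zip(prev, xs)]
        b = [cell(p, x) - j * p for p, x, j in zip(prev, xs, a)]
        new_hs = [A * h0 + B for A, B in affine_scan(list(zip(a, b)))]
        delta = max(abs(n - o) for n, o in zip(new_hs, hs))
        hs, n_iters = new_hs, n_iters + 1
        if delta < tol:
            break
    return hs, n_iters
```

For a linear recurrence, a single call to `affine_scan` already yields every state, which is the property linear SSMs rely on. For the nonlinear cell, `newton_parallel` reuses the same scan once per Newton iteration, and its result can be compared against `sequential` on any input list to check agreement.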

Evidence

ParaRNN: Unlocking Parallel Training of Nonlinear RNNs for Large Language Models

Relation To Foundation TSFM Agenda

Use the source-level agenda mapping in pararnn-2025 rather than duplicating verdict rows here.

At the entity level, ParaRNN's role is the one described under Role In The Wiki: it anchors the nonlinear branch of efficient recurrent sequence models. This page should stay as the object card; source pages carry slot-level verdicts, evidence, and missing pieces.

Overlap Notes

ParaRNN overlaps with Mamba on the serving goal of a compact recurrent state, but differs by allowing nonlinear recurrent cells, at the cost of an iterative Newton-style solve over the hidden trajectory during training. It overlaps with RMT only at the level of “state carried across sequence”: RMT exposes state as memory tokens, while ParaRNN keeps it as recurrent hidden dynamics. It also contrasts with Supervised Memory Training: SMT/DMT avoids BPTT during pretraining by using Transformer-generated predictive memory labels, while ParaRNN keeps end-to-end recurrent training and parallelizes the nonlinear trajectory solve.
