Parcae: Scaling Laws For Stable Looped Language Models

Source

Core Claim

Parcae recasts looping as a time-varying dynamical system over the residual stream and constrains the injection parameters so that looped language models remain stable as they scale.
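
To make the stability idea concrete, here is a minimal toy sketch, not the paper's actual parameterization. It assumes a scalar linear recurrence h <- decay * h + injection * e, which omits the Transformer block update entirely. The names `stable_decay` and `run_loop`, and the exp(-softplus(raw)) mapping, are illustrative assumptions chosen only because they keep the decay strictly inside (0, 1).

```python
import math


def softplus(x):
    # Smooth, strictly positive function of any real input.
    return math.log1p(math.exp(x))


def stable_decay(raw):
    # Hypothetical constraint: map an unconstrained parameter to (0, 1)
    # so the linear part of the recurrence is contractive.
    return math.exp(-softplus(raw))


def run_loop(decay, injection, e, steps, h0=0.0):
    # Toy looped update over a scalar "residual stream":
    #   h <- decay * h + injection * e
    # where e stands in for the re-injected input embedding.
    h = h0
    trajectory = []
    for _ in range(steps):
        h = decay * h + injection * e
        trajectory.append(h)
    return trajectory


# Constrained decay (0.5 here): the state settles at injection * e / (1 - decay).
constrained = run_loop(stable_decay(0.0), injection=1.0, e=1.0, steps=50)

# Unconstrained decay above 1: the state grows geometrically with loop count.
unconstrained = run_loop(1.1, injection=1.0, e=1.0, steps=50)

print(constrained[-1], unconstrained[-1])
```

In this toy, the constrained run converges to a fixed point while the unconstrained run diverges as loops are added, which is the failure mode that constraining the recurrence is meant to rule out.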

Relevance To This Wiki

It is the scaling-law and stability source for looped language models: recurrence is not only an architectural idea but a trainable scaling path when residual dynamics are controlled.

Limitations

The reported laws are for looped language models, not numeric time-series models. The inference-time quality curve saturates, so extra loops have diminishing returns.

Foundation TSFM Relevance

Useful background for fixed-FLOPs dynamic compute: loop count, data, and parameter memory become coupled scaling knobs.
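
The coupling between these knobs can be sketched with back-of-the-envelope arithmetic. This sketch assumes the common rule of thumb of roughly 6 training FLOPs per parameter per token for a dense Transformer, and further assumes that every parameter sits in the looped block so each extra loop costs one more full pass. Neither assumption is taken from the source, and `tokens_under_budget` is a hypothetical helper.

```python
import math


def tokens_under_budget(budget_flops, shared_params, loops):
    # Assumed cost model: ~6 FLOPs per parameter per token per loop pass.
    # Returns how many training tokens fit in the budget.
    return budget_flops / (6 * shared_params * loops)


budget = 6e18          # illustrative fixed training budget in FLOPs
params = 100_000_000   # illustrative shared (looped) parameter count

# Parameter memory stays constant; more loops means fewer tokens at fixed FLOPs.
for loops in (1, 2, 4, 8):
    tokens = tokens_under_budget(budget, params, loops)
    print(f"loops={loops}  tokens={tokens:.3e}")
```

Under these assumptions, doubling the loop count at fixed FLOPs halves the token budget while parameter memory is unchanged, so loop count trades directly against data.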

Open Questions

  • What matched-budget baseline should this source be compared against: unique-depth Transformer layers, recurrent state, explicit memory, or extra inference steps?
  • Which claims transfer from token-sequence reasoning to multivariate time-series state tracking, event streams, or action-conditioned world models?
  • Does the saturating inference-time loop curve cap useful test-time compute before hard time-series windows are solved?