NextLat

Summary

NextLat is the next-latent-prediction method introduced in the paper "Next-Latent Prediction Transformers Learn Compact World Models". It trains an autoregressive Transformer with ordinary next-token prediction plus an auxiliary objective that predicts the model’s own next hidden state.

The method matters because it makes the Transformer’s hidden state an explicitly supervised latent state with a learned transition model. The base Transformer and its ordinary autoregressive inference path remain unchanged; the lightweight latent dynamics model is used during training and, optionally, for self-speculative decoding.

Method Contract

Base model: decoder-only autoregressive Transformer.
Main objective: next-token cross-entropy.
Auxiliary objective: predict $h_{t + 1}$ from $(h_{t}, X_{t + 1})$ with a latent dynamics model.
Target handling: detached hidden-state targets and stop-gradient choices are used to avoid collapse and reduce extra backward cost.
Optional semantic alignment: KL matching between token distributions from true and predicted hidden states.
Claimed latent semantics: optimized hidden states become belief states, i.e. compact sufficient statistics for predicting future observations.
Serving hook: recursively rolling the latent dynamics model can draft variable-length continuations for self-speculative decoding.
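
One way to write the contract above as a single training objective is sketched below. The notation is assumed here for illustration and is not taken from the paper: $\psi$ is the latent dynamics model, $p_\theta(\cdot \mid h)$ is the next-token head applied to a hidden state, $\operatorname{sg}$ is stop-gradient, $d$ is an unspecified regression distance, and $\lambda_h$, $\lambda_{\mathrm{KL}}$ are loss weights. The paper's exact distance, weighting, KL direction, and stop-gradient placement may differ.

$$
\begin{aligned}
\hat h_{t+1} &= \psi(h_t, X_{t+1}) \\
\mathcal{L}_{\text{token}} &= -\sum_t \log p_\theta(X_{t+1} \mid h_t) \\
\mathcal{L}_{\text{latent}} &= \sum_t d\big(\hat h_{t+1}, \operatorname{sg}(h_{t+1})\big) \\
\mathcal{L}_{\text{KL}} &= \sum_t \mathrm{KL}\big(\operatorname{sg}\big(p_\theta(\cdot \mid h_{t+1})\big) \,\big\|\, p_\theta(\cdot \mid \hat h_{t+1})\big) \\
\mathcal{L} &= \mathcal{L}_{\text{token}} + \lambda_h \mathcal{L}_{\text{latent}} + \lambda_{\mathrm{KL}} \mathcal{L}_{\text{KL}}
\end{aligned}
$$

Setting $\lambda_{\mathrm{KL}} = 0$ recovers the version without the optional semantic alignment term.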

flowchart LR
  Prefix[token history] --> Tr[Transformer]
  Tr --> Ht[h_t]
  Ht --> LM[next-token head]
  LM --> Xt[next token]
  Ht --> Psi[latent dynamics]
  Xt --> Psi
  Psi --> Hhat[h_hat_t+1]
  Hnext[h_t+1 target] -. detached .-> Loss[NextLat loss]
  Hhat --> Loss
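
The data flow in the diagram can be traced numerically for a single position. The sketch below is a toy in plain Python, not the official implementation: the linear `latent_dynamics`, the mean-squared-error distance, the KL direction, and all weights and numbers are assumptions chosen for illustration. Plain Python has no autograd, so "detached" here only means that `h_next` is used as a fixed target.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def mse(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def kl(p, q):
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def latent_dynamics(h, token_emb, A, B):
    """Hypothetical linear stand-in for the latent dynamics model."""
    return [a + b for a, b in zip(matvec(A, h), matvec(B, token_emb))]

def nextlat_losses(h_t, h_next, next_token, head, emb, A, B):
    """Return (token_loss, latent_loss, kl_loss) for one position.

    h_next plays the role of the detached target: it is only read, never updated.
    """
    # Main objective: next-token cross-entropy from h_t.
    token_loss = -math.log(softmax(matvec(head, h_t))[next_token])
    # Auxiliary objective: predict h_{t+1} from (h_t, X_{t+1}).
    h_hat = latent_dynamics(h_t, emb[next_token], A, B)
    latent_loss = mse(h_hat, h_next)
    # Optional alignment: token distribution from true vs predicted hidden state.
    kl_loss = kl(softmax(matvec(head, h_next)), softmax(matvec(head, h_hat)))
    return token_loss, latent_loss, kl_loss

# Toy usage: 2-dim hidden state, 3-token vocabulary, arbitrary numbers.
head = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
emb = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
A = [[0.9, 0.0], [0.0, 0.9]]
B = [[0.5, 0.0], [0.0, 0.5]]
token_loss, latent_loss, kl_loss = nextlat_losses(
    [0.5, -0.5], [0.4, -0.2], 1, head, emb, A, B
)
total = token_loss + 1.0 * latent_loss + 1.0 * kl_loss  # weights are placeholders
print(total)
```

In a real training loop the same three terms would be computed over all positions with an autograd framework, with the target hidden states explicitly detached.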

Official Artifacts

Preprint: arXiv 2511.05963
OpenReview: PLAN-FM Bridge @ AAAI 2026
Official blog: Next-Latent Prediction Transformers
Official code: JaydenTeoh/NextLat
Official X thread: Jayden Teoh announcement
Local code README snapshot: papers/nextlat-2026/github-readme-nextlat.md

The repository includes NextLat plus GPT, MTP, JTP, and BST baselines, training/evaluation scripts, configs, and data instructions. Released code does not by itself mean the paper’s claims have been independently replicated.

Relevance To This Wiki

NextLat belongs on the latent-space predictive learning, JEPA-adjacent, and world-model branches. It is not a pure JEPA system because it keeps next-token prediction and uses the Transformer’s own hidden states as targets rather than a separate target encoder. It is also not a complete action-conditioned world model because the transition is over hidden state plus next token, not over typed external actions or interventions.

It should also be read as a close neighbor of Alex’s LeNEPA idea: LeNEPA asks whether NEPA-style next-embedding prediction plus LeJEPA-style distribution regularization should use external embeddings, own hidden states, or both. NextLat supplies the own-hidden-state side of that comparison.

For time-series and operational world-model work, the useful transfer is the pressure toward compact belief states and the evaluation lesson: next-observation accuracy is not enough. A TSFM analogue should check whether latent states preserve regimes, rare events, channel dependencies, exogenous variables, and action history, not only whether forecast loss improves.
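
One simple way to run such a check is a probe that tries to recover a regime label from frozen latent states. The sketch below uses a nearest-centroid probe in plain Python; the probe choice, the `(latent_vector, regime_label)` data format, and the example labels are assumptions for illustration, not something the paper evaluates.

```python
from collections import defaultdict

def centroid_probe_accuracy(train, test):
    """Fit class centroids on train latents, report label accuracy on test.

    train, test: lists of (latent_vector, regime_label) pairs.
    """
    sums = {}
    counts = defaultdict(int)
    for vec, label in train:
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        sums[label] = [s + x for s, x in zip(sums[label], vec)]
        counts[label] += 1
    centroids = {
        label: [s / counts[label] for s in total] for label, total in sums.items()
    }

    def predict(vec):
        # Assign the label whose centroid is closest in squared Euclidean distance.
        return min(
            centroids,
            key=lambda label: sum((x - c) ** 2 for x, c in zip(vec, centroids[label])),
        )

    correct = sum(predict(vec) == label for vec, label in test)
    return correct / len(test)

# Toy usage with made-up 2-dim latents and two hypothetical regimes.
train = [([0.0, 0.0], "calm"), ([0.2, 0.0], "calm"),
         ([1.0, 1.0], "storm"), ([1.2, 1.0], "storm")]
test = [([0.1, 0.1], "calm"), ([0.9, 1.1], "storm")]
print(centroid_probe_accuracy(train, test))
```

Low probe accuracy on rare regimes alongside good forecast loss would be one sign that the latent state discards information the forecast metric does not reward.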

Caveats

Evidence comes from language modeling and synthetic sequence world-model tasks, not from numeric time series.
The idealized theorem assumes successful optimization and does not settle empirical questions about target and loss design.
The latent dynamics model is simple and underexplored.
Self-speculative decoding is promising but currently evaluated with fixed draft-length sweeps rather than learned adaptive budgets.
The official README records reproducibility caveats around torch.compile(), Triton/Liger kernels, and hardware-specific throughput measurement.
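
To make the fixed draft-length caveat concrete, the sketch below runs a generic greedy draft-and-verify loop and sweeps the draft length. It is a toy under stated assumptions: `base_next` and `draft_next` are hypothetical token-level stand-ins (in NextLat the drafter would roll the latent dynamics model and decode each predicted hidden state with the next-token head), and verification is written as a sequential loop where a real system would use one batched base-model pass.

```python
def speculative_generate(base_next, draft_next, prompt, num_new, draft_len):
    """Greedy draft-and-verify. Returns (tokens, number of verification passes)."""
    tokens = list(prompt)
    target_len = len(prompt) + num_new
    passes = 0
    while len(tokens) < target_len:
        # Draft: roll the cheap drafter forward draft_len steps.
        drafted = []
        for _ in range(draft_len):
            drafted.append(draft_next(tokens + drafted))
        # Verify: conceptually one base-model pass over all drafted positions.
        passes += 1
        accepted = []
        for tok in drafted:
            expected = base_next(tokens + accepted)
            if tok != expected:
                accepted.append(expected)  # first mismatch: keep the base token
                break
            accepted.append(tok)
        else:
            # Every draft accepted: the same pass also yields one extra base token.
            accepted.append(base_next(tokens + accepted))
        tokens.extend(accepted)
    return tokens[:target_len], passes

# Hypothetical toy models: the base counts upward mod 10; the drafter agrees
# except after tokens where token % 3 == 2, where it wrongly drafts 0.
def base_next(tokens):
    return (tokens[-1] + 1) % 10

def draft_next(tokens):
    return 0 if tokens[-1] % 3 == 2 else (tokens[-1] + 1) % 10

# Fixed draft-length sweep: same output every time, different pass counts.
for k in (0, 1, 2, 4):
    out, passes = speculative_generate(base_next, draft_next, [0], 8, k)
    print(k, passes, out)
```

Because mismatches are always replaced by the base model's token, the output matches plain greedy decoding for every draft length; only the number of verification passes changes. An adaptive budget would choose `draft_len` per step rather than fixing it for the whole run.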

Alex Open Research Wiki
