The Illusion of Superposition? A Principled Analysis of Latent Thinking in Language Models

Source

Raw Markdown: paper_illusion-of-superposition-2026.md
PDF: paper_illusion-of-superposition-2026.pdf
Preprint: arXiv 2604.06374v1
Related OpenReview workshop version: The Illusion of Superposition in Latent CoT via Soft Thinking, published at LIT Workshop @ ICLR 2026
Gonzo ML discussion: Telegram post 5511 (local extract stored at papers/illusion-of-superposition-2026/telegram-post-gonzo-ml-5511.md)
Gonzo-linked review: ArXivIQ review
Podcast pointer: Gonzo ML Podcasts 3957
Local artifact metadata: papers/illusion-of-superposition-2026/official_artifacts_metadata.json

Status And Credibility

arXiv lists the paper as cs.CL with cs.LG, version v1, submitted on 2026-04-07. The paper is by Michael Rizvi-Martel, Guillaume Rabusseau, and Marius Mosbach from Mila, Université de Montréal, and McGill University. The arXiv license is CC BY 4.0 and the listed length is 9 pages.

Credibility is sufficient to ingest this as an important warning source: the authors are from credible Canadian ML/NLP groups, the paper directly tests a widely repeated latent-CoT claim, it includes concrete internal-state probes, and a related, narrower version was published at the LIT Workshop @ ICLR 2026. The caveats matter: the arXiv paper is still a preprint at ingest time, the workshop version is not a main-conference acceptance of the full paper, no official code or model release was found, and the experiments are small and mostly synthetic or probe-oriented.

Core Claim

Latent chain-of-thought does not automatically mean that a model is exploring multiple reasoning paths in parallel. The paper tests the common superposition hypothesis across three regimes:

Training-free Soft Thinking: construct a soft reasoning token as a convex combination of token embeddings.
Fine-tuned Coconut: adapt a pretrained model to feed hidden states back as latent thoughts.
From-scratch Coconut: train a model entirely with latent thoughts on a synthetic task designed to reward multi-path reasoning.
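
The difference between a soft reasoning token and its discrete counterpart can be sketched in a few lines. This is a minimal illustration, not the paper's code: the three-token vocabulary, the embedding values, and the probabilities are all made up, and `soft_token` and `argmax_token` are hypothetical helper names.

```python
# Toy illustration only: the vocabulary, embedding values, and probabilities
# below are made up and do not come from the paper.

def soft_token(probs, embeddings):
    """Return the convex combination sum_i probs[i] * embeddings[i]."""
    if any(p < 0 for p in probs) or abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probs must be non-negative and sum to 1")
    dim = len(embeddings[0])
    return [sum(p * row[d] for p, row in zip(probs, embeddings)) for d in range(dim)]


def argmax_token(probs, embeddings):
    """Return the embedding of the single most probable token (discrete baseline)."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    return list(embeddings[best])


# Hypothetical 3-token vocabulary with 2-dimensional embeddings.
EMBEDDINGS = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
PROBS = [0.5, 0.25, 0.25]

soft = soft_token(PROBS, EMBEDDINGS)    # mixes all three embedding rows
hard = argmax_token(PROBS, EMBEDDINGS)  # keeps only the most probable row
print(soft, hard)
```

The soft vector carries information about every candidate token, while the argmax vector discards all but one. Whether a model actually uses that extra information downstream is the question the paper probes.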

The main finding is asymmetric: off-the-shelf and fine-tuned models usually collapse or shortcut, while small from-scratch models can show signs of superposition under constrained capacity.

flowchart TD
  Claim["continuous latent reasoning should support parallel paths"]
  Soft["Soft Thinking / forced token-mixture embeddings"]
  Finetuned["fine-tuned Coconut / pretrained GPT-2"]
  Scratch["from-scratch Coconut / small symbolic task"]
  Probes["Logit Lens + entity-level probing"]
  Collapse["collapse to near-discrete token behavior"]
  Shortcut["shortcut: answer available without latent steps"]
  Limited["limited superposition in shallow from-scratch models"]
  Warning["continuous latents need probe, ablation, and capacity checks"]

  Claim --> Soft
  Claim --> Finetuned
  Claim --> Scratch
  Soft --> Probes --> Collapse
  Finetuned --> Probes --> Shortcut
  Scratch --> Probes --> Limited
  Collapse --> Warning
  Shortcut --> Warning
  Limited --> Warning

Method Notes

The paper defines superposition as a hidden state encoding a distribution over multiple candidate continuations in some basis of computation. It separates:

forced superposition: explicitly introduced at the input as a convex combination of token embeddings;
learned superposition: emergent through training on a task that rewards parallel exploration of reasoning paths.

The diagnostic stack is important:

Logit Lens projects intermediate hidden states through the unembedding matrix to inspect whether entropy and token beliefs remain multi-valued across layers.
Token-level interventions compare soft-token processing with the argmax discrete-token replacement.
Entity-level probes track whether a graph-reasoning model’s probability mass moves through correct intermediate entities or jumps directly to the final answer.
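
The first two diagnostics reduce to a handful of standard quantities, which the sketch below computes on toy data. It assumes a 3-token vocabulary and 2-dimensional hidden states with made-up values, and it omits the final layer norm that a real Logit Lens usually applies before the unembedding. The function names are illustrative, not taken from the paper.

```python
import math

# Toy illustration only: all matrix and vector values are made up.


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logit_lens(hidden, unembedding):
    """Project one hidden state through the unembedding matrix (vocab x dim)."""
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in unembedding]
    return softmax(logits)


def entropy(probs):
    """Shannon entropy in nats; higher means belief is spread over more tokens."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def kl_divergence(p, q):
    """KL(p || q) in nats; assumes q is strictly positive wherever p is."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


UNEMBEDDING = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # hypothetical 3-token vocab

h_soft = [2.0, 0.1]  # hypothetical hidden state after a soft token
h_hard = [2.0, 0.0]  # hypothetical hidden state after the argmax replacement

p_soft = logit_lens(h_soft, UNEMBEDDING)
p_hard = logit_lens(h_hard, UNEMBEDDING)

print("entropy soft/hard:", entropy(p_soft), entropy(p_hard))
print("KL(soft || hard):", kl_divergence(p_soft, p_hard))
print("cosine(h_soft, h_hard):", cosine_similarity(h_soft, h_hard))
```

In this framing, near-identical entropies, a KL near zero, and a cosine similarity close to 1 across layers would indicate that the soft token is being processed almost like its discrete replacement.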

Evidence

Evidence thread	Reported result	Local interpretation
Off-the-shelf Soft Thinking on QwQ-32B, Qwen2-1.5B, and DeepSeek-R1-Distill-Llama-70B appendices	Entropy profiles for soft and discrete CoT become nearly identical; under token-level intervention, the paper's conclusion reports KL near zero and cosine similarity above 0.99	Soft token mixtures are processed almost like discrete tokens, so forced superposition is not preserved as parallel reasoning.
Fine-tuned Coconut on ProsQA with GPT-2 124M	Coconut with 6 latent tokens reports 99.0% accuracy, but the same model without latent tokens still reports 96.6%; CoT baseline reports 85.3%	The fine-tuned model mostly learns a shortcut or answer-copying path rather than using latent steps for multi-hop reasoning.
Entity-level probing of fine-tuned Coconut	The target entity dominates from step 0 rather than emerging only after the intermediate correct-next entities	Accuracy alone is a misleading signal; the internal-state trajectory matters.
From-scratch Coconut on simplified ProsQA	2-layer and 4-layer models drop from 94.5%/96.2% accuracy with latent tokens to 13.8%/16.0% without latent tokens	Under constrained capacity and task design, latent steps can become necessary and show superposition-like belief evolution.
Capacity ablation	8-layer and 12-layer from-scratch models improve no-latent accuracy to 62.8%/63.0% and show shortcut signs	Even when trained from scratch, too much capacity can make shortcut solutions easier than maintaining superposition.
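
The entity-level probing contrast in the table can be illustrated with a small sketch. Everything here is hypothetical: the category names, the per-step probability masses, and the helper `first_step_target_dominates` are invented for illustration and are not values or code from the paper.

```python
# Toy illustration only: category names and probability masses are
# hypothetical and are not values reported in the paper.

def first_step_target_dominates(trajectory):
    """Return the first latent step where 'target' holds the largest mass, else None."""
    for step, masses in enumerate(trajectory):
        others = [v for k, v in masses.items() if k != "target"]
        if masses["target"] > max(others):
            return step
    return None


# Probability mass per entity category at each latent step (made-up numbers).
shortcut_like = [
    {"target": 0.7, "correct_next": 0.2, "other": 0.1},
    {"target": 0.8, "correct_next": 0.1, "other": 0.1},
    {"target": 0.9, "correct_next": 0.05, "other": 0.05},
]
stepwise_like = [
    {"target": 0.1, "correct_next": 0.6, "other": 0.3},
    {"target": 0.3, "correct_next": 0.5, "other": 0.2},
    {"target": 0.8, "correct_next": 0.1, "other": 0.1},
]

print(first_step_target_dominates(shortcut_like))
print(first_step_target_dominates(stepwise_like))
```

A trajectory where the target dominates at step 0 matches the shortcut pattern described for fine-tuned Coconut. A trajectory where mass first sits on correct-next entities and reaches the target only at the end is closer to the stepwise behavior the latent steps are meant to support.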

Relevance To This Wiki

This is not a time-series foundation-model paper. Its value is a warning about latent reasoning claims: continuous hidden-state interfaces, latent thoughts, soft tokens, and recurrent depth should not be assumed to preserve multiple futures or alternative plans unless the source actually probes them.

For foundation time-series models, the transferable test is direct: if a TSFM uses latent steps, soft tokens, recurrent loops, latent trajectories, or hidden candidate futures, it should report whether those states preserve uncertainty over regimes, channels, events, and candidate actions, or whether the model commits early to a single shortcut future.

The paper also strengthens the benchmark-hygiene rule for dynamic compute. A method can score well with latent tokens and still fail the intended mechanism if a no-latent, no-loop, shallow, or direct-answer ablation remains strong.
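
One way to make this rule operational is to report the gap between the with-latent and no-latent scores alongside the headline number. The sketch below uses the accuracies quoted in the Evidence table on this page. The 10-point threshold and the function names are arbitrary illustrative choices, not something the paper prescribes.

```python
# Accuracies (%) are the ones quoted in the Evidence table on this page.
# The 10-point threshold is an arbitrary illustrative choice, not from the paper.

REPORTED = {
    "fine-tuned Coconut, GPT-2 124M": (99.0, 96.6),
    "from-scratch Coconut, 2-layer": (94.5, 13.8),
    "from-scratch Coconut, 4-layer": (96.2, 16.0),
}


def latent_gap(with_latent, without_latent):
    """Accuracy points lost when latent tokens are removed."""
    return round(with_latent - without_latent, 1)


def looks_like_shortcut(with_latent, without_latent, min_gap=10.0):
    """Flag a result whose no-latent ablation stays close to the full score."""
    return latent_gap(with_latent, without_latent) < min_gap


for name, (with_latent, without_latent) in REPORTED.items():
    gap = latent_gap(with_latent, without_latent)
    flag = looks_like_shortcut(with_latent, without_latent)
    print(f"{name}: gap={gap} points, shortcut suspected={flag}")
```

With these numbers the fine-tuned model loses only 2.4 points without latent tokens, while the 2-layer and 4-layer from-scratch models lose about 80 points. That gap is what separates a score that depends on latent steps from one that does not.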

Foundation TSFM Relevance

Agenda slot	Verdict	Evidence	Missing pieces
Dynamic compute allocation	warning	Extra latent steps in fine-tuned Coconut add little beyond a no-latent shortcut on ProsQA, while Soft Thinking collapses toward discrete token behavior.	Needs TSFM-specific no-loop/no-latent ablations and matched-latency tests on numeric windows, event streams, and candidate futures.
Representation quality: semantic state vs dense detail	warning	Logit Lens and entity probes show that a non-text latent interface can still collapse or encode the final answer shortcut.	Needs probes for whether latent state preserves rare regimes, cross-channel state, timing, and action-relevant variables.
Multi-modal future distributions and generation	adjacent	Superposition is framed as maintaining multiple candidate continuations inside one representation.	No time-series sample paths, calibrated future distributions, or multimodal trajectory evidence.
Action-conditioned world models	insufficient evidence	ProsQA and ProntoQA are synthetic reasoning tasks, not action-conditioned trajectories.	Needs actions, control inputs, interventions, and rollout utility tests.
Benchmark hygiene	warning	Final accuracy hides whether latent computation is causally used; no-latent and probing ablations expose shortcuts.	TSFM papers should pair scores with mechanism ablations and internal-state diagnostics.

Limitations

The arXiv version is a preprint and not a main-conference accepted paper at ingest time.
No official code, checkpoints, or reproduction package was found.
Soft Thinking probes use a small selected set of math reasoning problems plus appendix sweeps, not a broad benchmark study.
Coconut experiments are synthetic ProsQA/ProntoQA-style reasoning tasks, not open-ended planning or time-series decision tasks.
The paper is mostly a mechanistic/interpretability analysis; it does not propose a new high-performing latent reasoning model.
The definition of superposition is useful but still operationalized through specific probes and entity categories.
Negative results for token-level superposition do not rule out higher-level latent search over plans, programs, constraints, or candidate trajectories.

Links Into The Wiki

Open Questions

Which latent reasoning objectives explicitly reward preserving multiple candidate continuations rather than shortcutting to the final answer?
What is the right abstraction level for superposition: token embeddings, entities, constraints, plans, programs, latent states, or candidate trajectories?
Can a pretrained model be made to preserve useful uncertainty through post-training, or is from-scratch training with constrained capacity required?
Which no-latent, no-loop, no-memory, direct-answer, or low-capacity ablations should become mandatory for latent-reasoning claims?
What is the TSFM analogue of entity-level probing for latent state: regime labels, channel dependencies, event clusters, candidate futures, or intervention effects?
