LLMs as Noisy Channels

Summary

LLMs as Noisy Channels is an ICML 2026 / arXiv scaling-law study that models LLM training through Shannon-Hartley channel capacity. It maps model size to bandwidth, training tokens to signal power, and perturbations such as data noise, model interaction, supervised fine-tuning, and quantization to noise terms.

Interface

Main object: Shannon Scaling Law for LLM capacity.
Variables: model size $N$, training tokens $D$, fitted signal exponent $β$, bandwidth exponent $α$, model-interaction noise exponent $γ$, data-noise exponent $δ$, and fitted constants $a$, $b$, $c$, $d$, $e$.
Core formula: $C_{\text{LLM}} = a N^{α} \log_{2}\left(1 + \frac{b D^{β}}{c (D N)^{γ} + d D^{δ} + e}\right)$.
Empirical substrate: Pythia and OLMo2 checkpoints under Gaussian noise, SFT learning-rate sweeps, and GPTQ quantization.
Released artifacts: arXiv paper and ICML poster page. No official code or model checkpoints were found at ingest time.
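
The core formula can be evaluated directly. The following is a minimal Python sketch, not the authors' code (none was found at ingest time). The function name `shannon_capacity` and the `PLACEHOLDER` coefficients are hypothetical and chosen only for illustration; they are not fitted values from the paper.

```python
import math


def shannon_capacity(N, D, *, a, b, c, d, e, alpha, beta, gamma, delta):
    """Evaluate C = a * N^alpha * log2(1 + SNR) from the core formula.

    SNR = b * D^beta / (c * (D * N)^gamma + d * D^delta + e)
    """
    signal = b * D ** beta                              # signal power term
    noise = c * (D * N) ** gamma + d * D ** delta + e   # summed noise terms
    return a * N ** alpha * math.log2(1.0 + signal / noise)


# Placeholder coefficients for illustration only (assumption, NOT fitted values).
# The data-noise exponent delta is deliberately set larger than beta.
PLACEHOLDER = dict(a=1.0, b=1.0, c=0.0, d=1e-9, e=1.0,
                   alpha=0.1, beta=0.5, gamma=0.0, delta=1.0)

if __name__ == "__main__":
    # Sweep the token count at a fixed model size and print the capacity.
    for D in (1e6, 1e9, 1e12):
        print(f"D={D:.0e}  C={shannon_capacity(1e8, D, **PLACEHOLDER):.3f}")
```

With these placeholder settings the data-noise term grows faster in $D$ than the signal term, so the printed capacity rises and then falls as $D$ increases. That shape follows from the chosen exponents, not from any measured result.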

Role In The Wiki

This entity is the local object card for SNR-aware LLM scaling. Use it when a page needs the caveat that monotonic power-law scaling can be a high-SNR special case rather than a universal rule.
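
A short worked limit shows how the caveat can arise from the core formula alone. It assumes all fitted constants and exponents are positive, which is an assumption made here for illustration and not a result quoted from the paper. Write the signal-to-noise ratio as

$$\mathrm{SNR}(N, D) = \frac{b D^{β}}{c (D N)^{γ} + d D^{δ} + e}.$$

When the constant floor $e$ dominates the denominator,

$$\mathrm{SNR}(N, D) \approx \frac{b}{e} D^{β},$$

which increases monotonically in $D$, so $C_{\text{LLM}} = a N^{α} \log_{2}(1 + \mathrm{SNR})$ increases with both $N$ and $D$. When the $D$-dependent noise terms dominate at large $D$ and fixed $N$,

$$\mathrm{SNR}(N, D) \sim D^{\,β - \max(γ, δ)},$$

so if $\max(γ, δ) > β$ the SNR, and with it the capacity, eventually decreases as $D$ grows.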

For time-series and world-model work, this is upstream language-model evidence. Its transfer value is the design question: a TSFM scaling law should include the relevant information-density and noise variables, not only parameters, samples, or FLOPs. Candidate TSFM noise variables include corrupt observations, missingness, channel interference, long-horizon rollout error, quantized latent state, post-training drift, context noise, and action/intervention ambiguity.

Evidence
