LeJEPA: Provable And Scalable Self-Supervised Learning Without The Heuristics

Source

Raw Markdown: paper_lejepa-2025.md
PDF: paper_lejepa-2025.pdf

Core Claim

LeJEPA argues that JEPA embeddings should follow an isotropic Gaussian distribution and introduces SIGReg to enforce that distribution efficiently.

Key Contributions

Provides a theory for the optimal embedding distribution for downstream prediction risk.
Introduces Sketched Isotropic Gaussian Regularization (SIGReg).
Combines JEPA predictive loss with SIGReg to reduce reliance on stop-gradient, EMA, teacher-student, and scheduler heuristics.
Validates across many datasets, architectures, and domains.
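
The second and third contributions can be sketched roughly as follows. This is a minimal NumPy illustration, not the paper's implementation: it assumes the "sketching" is done with random one-dimensional projections, that each projection is compared to a standard normal through its empirical characteristic function, and that the two loss terms are mixed by a single weight `lam`. The function names, the Gaussian weighting window, the frequency grid, and the `(1 - lam)` / `lam` weighting are all assumptions made for illustration.

```python
import numpy as np


def sketched_gaussian_penalty(z, num_directions=64, num_freqs=17, t_max=3.0, seed=0):
    """Hypothetical SIGReg-style penalty (illustrative, not the paper's code).

    Projects embeddings z of shape (n, d) onto random unit directions and
    measures how far each 1D projection is from N(0, 1) by comparing
    characteristic functions on a small frequency grid.
    """
    rng = np.random.default_rng(seed)
    n, d = z.shape
    # Sketching step: random unit directions, one per column.
    directions = rng.standard_normal((d, num_directions))
    directions /= np.linalg.norm(directions, axis=0, keepdims=True)
    proj = z @ directions                        # (n, num_directions)
    t = np.linspace(-t_max, t_max, num_freqs)    # frequencies to compare at
    angles = proj[:, :, None] * t                # (n, num_directions, num_freqs)
    ecf_real = np.cos(angles).mean(axis=0)       # real part of empirical CF
    ecf_imag = np.sin(angles).mean(axis=0)       # imaginary part of empirical CF
    target = np.exp(-0.5 * t**2)                 # CF of N(0, 1), purely real
    window = np.exp(-0.5 * t**2)                 # assumed weighting over frequencies
    sq_err = (ecf_real - target) ** 2 + ecf_imag ** 2
    return float((sq_err * window).mean())


def lejepa_style_loss(pred, target, embeddings, lam=0.05):
    """Assumed combination: predictive loss plus the sketched penalty."""
    pred_loss = float(np.mean((pred - target) ** 2))
    reg = sketched_gaussian_penalty(embeddings)
    return (1.0 - lam) * pred_loss + lam * reg


rng = np.random.default_rng(1)
gaussian = rng.standard_normal((2000, 16))   # already isotropic Gaussian
collapsed = np.zeros((2000, 16))             # complete collapse to one point
print("gaussian penalty: ", sketched_gaussian_penalty(gaussian))
print("collapsed penalty:", sketched_gaussian_penalty(collapsed))
```

In this sketch the collapsed batch receives a clearly larger penalty than the Gaussian one, which is the behavior the regularizer is meant to provide without a stop-gradient or teacher network. A real training setup would compute the same quantities in an autodiff framework so the penalty can be backpropagated through the encoder.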

Method Notes

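LeJEPA is a central reference for the JEPA, Representation Collapse, and Self-Supervised Representation Learning pages. Its method adds SIGReg to the JEPA predictive loss, pushing embeddings toward an isotropic Gaussian instead of relying on training heuristics to prevent collapse.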

Evidence And Results

The source reports broad empirical validation, stable training across architectures and domains, and ImageNet-1k linear evaluation examples for large ViT models.

Limitations

The paper’s strongest claim is generality. The wiki should test that claim against multimodal and control-specific sources such as VL-JEPA and LeWorldModel.

Foundation TSFM Relevance

Agenda slot	Verdict	Evidence	Missing pieces
Anti-collapse regularization	partially closes	Combines JEPA prediction with SIGReg toward isotropic Gaussian embeddings to avoid complete and dimensional collapse.	Evidence is outside time series and does not test rare regimes or cross-channel deviations.
Representation quality	adjacent	Gives a principled target distribution for representations optimized for downstream prediction.	Does not show preservation of dense numeric detail for forecasting, generation, or editing.
Augmentation-free self-supervision	adjacent	JEPA-style latent prediction reduces reliance on handcrafted positive/negative augmentation pairs.	Needs time-series objectives that respect irregular sampling, channel identity, and event semantics.
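
The distinction between complete and dimensional collapse in the first row can be made concrete with a simple diagnostic. The sketch below is not from the paper: it assumes an entropy-based effective rank of the embedding covariance spectrum as the measure, and `collapse_diagnostics` is a hypothetical helper name. Complete collapse shows up as near-zero total variance, while dimensional collapse shows up as an effective rank well below the embedding dimension.

```python
import numpy as np


def collapse_diagnostics(z, eps=1e-12):
    """Illustrative collapse check for embeddings z of shape (n, d)."""
    z = z - z.mean(axis=0, keepdims=True)
    cov = z.T @ z / max(len(z) - 1, 1)
    # Covariance is symmetric PSD; clip tiny negative values from round-off.
    eigvals = np.clip(np.linalg.eigvalsh(cov), 0.0, None)
    total = eigvals.sum()
    if total < eps:
        # Complete collapse: every embedding is (almost) the same point.
        return {"total_variance": float(total), "effective_rank": 0.0}
    p = eigvals / total
    nonzero = p[p > 0]
    entropy = -np.sum(nonzero * np.log(nonzero))
    # exp(entropy) equals d for a perfectly isotropic spectrum.
    return {"total_variance": float(total), "effective_rank": float(np.exp(entropy))}


rng = np.random.default_rng(0)
isotropic = rng.standard_normal((5000, 8))
low_rank = rng.standard_normal((5000, 2)) @ rng.standard_normal((2, 8))
constant = np.ones((5000, 8))

for name, batch in [("isotropic", isotropic), ("low_rank", low_rank), ("constant", constant)]:
    print(name, collapse_diagnostics(batch))
```

For a time-series agenda, the same check could be run per regime or per channel group, which is where the "missing pieces" column suggests the current evidence is thin.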

Links Into The Wiki

Open Questions

Can SIGReg remain sufficient at frontier multimodal scale?
Is the isotropic Gaussian target universally optimal or domain-dependent?
