It’s All Connected: A Journey Through Test-Time Memorization, Attentional Bias, Retention, and Online Optimization
Source
- Raw Markdown: paper_miras-2025.md
- PDF: paper_miras-2025.pdf
- Preprint: arXiv:2504.13173
Core Claim
MIRAS reframes Transformers, Titans, and linear recurrent models as associative memory modules, each defined by four design choices: memory architecture, attentional-bias objective, retention gate, and learning algorithm.
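As a rough illustration of the four choices named above, the following minimal sketch assumes the simplest instance of each: a linear (matrix) memory, an L2 regression loss as the attentional bias, a scalar decay as the retention gate, and a single gradient-descent step as the learning algorithm. The names `read`, `write`, `lr`, and `retain` are hypothetical and not taken from the paper; this is one possible instantiation under those assumptions, not the paper's implementation.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def read(memory: Matrix, key: Vector) -> Vector:
    """Retrieve the value a linear (matrix) memory associates with `key`: M @ k."""
    return [sum(m_ij * k_j for m_ij, k_j in zip(row, key)) for row in memory]


def write(memory: Matrix, key: Vector, value: Vector,
          lr: float = 1.0, retain: float = 1.0) -> Matrix:
    """One online update of the memory.

    Attentional bias (assumed): 0.5 * ||M k - v||^2, whose gradient is (M k - v) k^T.
    Retention gate (assumed): scalar decay `retain` applied to the old memory.
    Learning algorithm (assumed): one plain gradient-descent step of size `lr`.
    """
    error = [p - v for p, v in zip(read(memory, key), value)]
    return [
        [retain * m_ij - lr * e_i * k_j for m_ij, k_j in zip(row, key)]
        for row, e_i in zip(memory, error)
    ]


# Toy usage with orthonormal keys (hypothetical data).
memory = [[0.0, 0.0], [0.0, 0.0]]
memory = write(memory, key=[1.0, 0.0], value=[3.0, -2.0])
memory = write(memory, key=[0.0, 1.0], value=[5.0, 7.0], retain=0.5)
print(read(memory, [1.0, 0.0]))  # older association, halved by the retention gate
print(read(memory, [0.0, 1.0]))  # newest association
```

In this toy case the second write uses `retain=0.5`, so the first pair is read back at half strength while the newest pair is recovered in full. Swapping the loss, the decay rule, or the update step changes one of the four choices while leaving the read/write interface unchanged.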
Relevance To This Wiki
This is a unifying theory-side source for the test-time memorization branch, useful for comparing attention, recurrent memory, and online optimization as variants of the same memory interface.
Limitations
MIRAS is a broad architecture framework, so each concrete claim needs to be checked against task-specific baselines before it is promoted as time-series evidence.
Foundation TSFM Relevance
Potentially useful for choosing memory objectives and retention mechanisms for multivariate time-series latent state, but currently adjacent to foundation time-series modeling rather than central to it.
Links Into The Wiki
- MIRAS
- Looped Transformers And Test-Time Memory
- Efficient Recurrent Sequence Models
- Time-Series Scaling And Efficiency
- Foundation Time-Series Model Research Agenda
Open Questions
- What matched-budget baseline should this source be compared against: unique-depth Transformer layers, recurrent state, explicit memory, or extra inference steps?
- Which claims transfer from token-sequence reasoning to multivariate time-series state tracking, event streams, or action-conditioned world models?