Entity Pages

Files

action100m-2026.md - Hierarchical HowTo100M action/caption annotation dataset with a public 120k-video preview.
aionoscope-2026.md - Process-to-View synthetic time-series diagnostic benchmark for latent-state accessibility.
act-2023.md - Action Chunking with Transformers method for continuous robot action chunks.
adajepa-2026.md - Adaptive JEPA latent world model for closed-loop MPC with test-time self-supervised updates.
agentic-automata-learning-2026.md - Controlled benchmark for testing whether tool-calling LLM agents can infer hidden DFA world models through interaction.
anomod-2026.md - Multimodal microservice anomaly-detection and root-cause-analysis dataset.
armt-2024.md - Associative Recurrent Memory Transformer method extending RMT with layerwise associative memory.
atlas-2025.md - ATLAS test-time memory module and DeepTransformers family.
audio-interaction-2026.md - Caveated Audio-Interaction streaming audio-language model for always-on silence/response decisions, with heavy curated-data dependence and no bounded-memory solution.
boom-2025.md - Datadog observability metrics forecasting benchmark.
bolmo-2025.md - Fully open byte-level LM family produced by byteifying subword LMs.
bridge-2025.md - Text-controlled time-series generation method using description refinement and diffusion/prototype conditioning.
bittokens-2025.md - Bit-level single-token number encoding method based on IEEE 754 features.
catsg-2025.md - Causal time-series generation method for observational, interventional, and counterfactual samples.
charm-2025.md - Channel-aware JEPA embedding model for multivariate time series.
chatts-2024.md - Time-series multimodal LLM trained from synthetic time-series/text data.
chronograph-2025.md - Graph-native multivariate microservice time-series benchmark with temporal node and edge features.
citylearn-2020.md - CityLearn building-energy control environment and dataset schema.
compress-attend-transformer-2025.md - Chunk-compressive Transformer architecture with a test-time quality/efficiency knob.
compute-optimal-tokenization-2026.md - Tokenization-aware scaling-law study centered on bytes per parameter and compute-optimal compression rate.
context-is-key-2024.md - ServiceNow benchmark for probabilistic forecasting with essential natural-language context.
cwm-2025.md - Meta FAIR Code World Model for code generation through computational-environment action-observation trajectories.
diff-mn-2026.md - Diffusion-parameterized MoE-NCDE method for irregular-to-continuous time-series generation.
diga-2026.md - Diffusion-guided meta-agent method for controllable financial market generation.
diffusionblocks-2026.md - Block-wise training framework that reinterprets residual-network depth as diffusion-style denoising blocks.
diffusion-policy-2023.md - Visuomotor diffusion policy over future continuous action trajectories.
dmax-2026.md - DMax diffusion-language-model method family for on-policy self-correction and soft parallel decoding.
dinov3-2025.md - Self-supervised vision foundation model suite.
dragon-hatchling-2025.md - Pathway BDH / Dragon Hatchling recurrent attention/state-space architecture.
ebt-2025.md - Energy-Based Transformer method for candidate-prediction scoring, gradient-based refinement, and dynamic compute.
eggroll-2025.md - Low-rank perturbation method for hyperscale evolution strategies.
fade-2026.md - Adaptive per-parameter weight-decay method for controlled forgetting in continual learning.
eidos-2026.md - Time-series foundation model family based on latent-space predictive learning.
elt-2026.md - Elastic Looped Transformer architecture for any-time visual generation.
flowstate-2025.md - SSM-based time-series foundation model with continuous functional-basis decoding.
florence-2-2023.md - Microsoft prompt-based vision foundation model built around the FLD-5B iterative data engine.
fone-2025.md - Fourier Number Embedding method for single-token number representations.
fprm-2026.md - Fixed-Point Reasoning Model looped Transformer with fixed-point halting.
frm-2026.md - Flow Reasoning Model framework for self-conditioned denoising, internal stability verification, and FlowDPO.
gaia-micross-2021.md - GAIA AIOps collection and MicroSS microservice telemetry subset.
gated-deltanet-2025.md - Linear recurrent attention method combining scalar gating with the delta rule for fixed-size associative-memory updates.
gated-deltanet-2-2026.md - Linear recurrent attention method that decouples key-side erase and value-side write gates for fixed-size memory editing.
fast-2025.md - Frequency-space tokenizer for robot action chunks.
gemma-4-12b-2026.md - Google DeepMind encoder-free multimodal 12B open-weight model.
gemini-robotics-1-5-2025.md - Google DeepMind embodied-reasoning and VLA robot model family.
genie-2024.md - Google DeepMind generative interactive environment model with learned latent actions from unlabeled video.
gift-eval-2024.md - General-purpose time-series forecasting benchmark and leaderboard.
gr00t-n1-2025.md - NVIDIA open humanoid VLA/action model.
gram-2026.md - GRAM probabilistic recursive-reasoning framework for stochastic multi-trajectory latent computation.
h-net-2025.md - Hierarchical sequence model with learned dynamic chunking.
helix-2025.md - Figure AI upper-body humanoid VLA with fast/slow control.
helix-02-2026.md - Figure AI full-body humanoid VLA/controller stack.
hierarchical-reasoning-model-2025.md - HRM fast/slow recurrent reasoning architecture.
hyperloop-transformers-2026.md - Looped language-model architecture using loop-level hyper-connections.
hybrid-associative-memories-2026.md - Hybrid recurrent-state and selective KV-cache layer for long-context sequence modeling.
hola-2026.md - Hippocampal Linear Attention model combining Gated DeltaNet recurrent state with a bounded exact KV cache selected by delta-rule update magnitude.
illada-2026.md - Improved LLaDA 8B masked diffusion language-model family with public base and instruct checkpoints.
justgrpo-2026.md - Diffusion-LM RL method that uses left-to-right rollouts during training while retaining parallel diffusion decoding at inference.
illusion-of-superposition-2026.md - Latent-CoT superposition analysis and shortcut/collapse warning.
implicit-curriculum-hypothesis-2026.md - Diagnostic hypothesis and ElementalTask suite for ordered LLM skill emergence during pretraining.
huginn-2025.md - Recurrent-depth language model using latent-space loops as test-time compute.
invdiff-2025.md - Invariant-guided diffusion debiasing method for unknown biases, primarily text-to-image, with limited OOD time-series forecasting evidence.
latent-context-language-models-2026.md - Encoder-decoder soft-token context-compression family for long-context language models.
latent-thought-flow-2026.md - LTF method for reward-proportional variable-length continuous latent reasoning trajectories.
latent-thoughts-2025.md - Looped Transformer framing for latent thought steps.
lemma-rca-2024.md - Large multi-modal multi-domain root-cause-analysis dataset collection.
lenepa-2026.md - No-augmentation next-latent-token prediction method for time-series representation learning.
levljepa-2026.md - Non-contrastive end-to-end vision-language pretraining method with cross-modal prediction and per-modality SIGReg.
l2rpn-grid2op-2020.md - L2RPN / Grid2Op benchmark ecosystem for action-conditioned graph time-series power-grid operation.
llm-emu-2026.md - Serving-native vLLM emulator for wall-clock online LLM inference experiments.
llm-sleep-2026.md - Sleep-time memory-consolidation method for SSM-attention hybrid language models.
llms-noisy-channels-2026.md - SNR-aware Shannon Scaling Law study for LLM capacity, overtraining, SFT, and quantization degradation.
lejepa-2025.md - JEPA objective combining predictive alignment with SIGReg.
leautoencoder-2026.md - Self-teaching autoencoder prototype using transformed latent consistency instead of direct image-space reconstruction loss.
loopformer-2026.md - Elastic-depth looped Transformer for budget-conditioned latent reasoning.
looped-world-models-2026.md - LoopWM action-conditioned world-model architecture with parameter-shared recurrent Transformer depth and deferred decoding.
lt2-2026.md - Linear-Time Looped Transformers method family replacing full attention inside looped Transformers with linear, sparse, or hybrid mixers.
mamba-2023.md - Selective state space model architecture for efficient recurrent sequence modeling.
mamba-2-2024.md - Structured state space duality architecture and SSD algorithm.
mamba-3-2026.md - Mamba-family architecture with richer discretization, complex state, and MIMO inference updates.
mantis-2025.md - Time-series classification foundation-model lineage covering Mantis, MantisV2, and UTICA.
mars-2025.md - Financial market simulation engine based on order-level generative foundation models.
mesanet-2025.md - Sequence model with locally optimal test-time training.
mhc-2025.md - Manifold-Constrained Hyper-Connections method for stable matrix-valued residual streams.
minimax-m3-2026.md - MiniMax open-weight native multimodal MoE model powered by MiniMax Sparse Attention.
minimax-sparse-attention-2026.md - Learned block-sparse GQA attention method for million-token contexts.
mg-tsd-2024.md - Multi-granularity diffusion method for probabilistic time-series forecasting.
mira-2025.md - Microsoft medical time-series foundation model for irregular clinical forecasting.
miras-2025.md - Associative-memory framework for attentional bias, retention, and online optimization.
moda-2026.md - Mixture-of-Depths Attention method for content-based inter-layer depth retrieval.
moirai-2024.md - Salesforce Uni2TS forecasting family covering Moirai 1.x, Moirai-MoE, and Moirai 2.0.
moshi-2024.md - Kyutai full-duplex speech-to-speech dialogue model with continuous audio streams and low-latency serving.
motive-2026.md - Query-conditioned motion-gradient data-attribution and fine-tuning-data selection method for video generators.
neorl2-2025.md - NeoRL-2 near-real-world offline reinforcement-learning benchmark.
nanotabpfn-looped-2026.md - Looped one-block nanoTabPFN variant from the One Layer Enough TFM inference-dynamics study.
nextlat-2026.md - Next-latent-prediction method that trains an autoregressive Transformer to predict its own next hidden state as a compact belief-state/world-model objective.
octo-2024.md - Open-source generalist robot policy.
olmo-hybrid-2026.md - Ai2 Transformer/linear-RNN language-model family used here as upstream evidence for hybrid attention/recurrent-state tradeoffs.
oryx-2026.md - Sequence-axis hybrid model that switches between attention and linear recurrent mixers while sharing most representations.
openvla-2024.md - Open 7B VLA model using discretized action tokens.
openrca-2025.md - LLM-agent root-cause-analysis benchmark over large software telemetry.
ops-lite-2026.md - Compact RCA evaluation set with per-case causal graph ground truth.
oats-2026.md - Online TSFM pretraining augmentation method using influence scores and guided diffusion.
otf-lam-2026.md - Latent action model family built on factorized observed-transition primitives.
pdr-rtv-2026.md - Agentic-coding test-time scaling recipe using structured summaries, recursive tournament voting, and refinement.
parallel-samplers-recurrent-depth-2025.md - Parallel inference method for recurrent-depth models.
param-decomp-2026.md - Goodfire parameter-decomposition codebase for SPD/VPD-style weight-space interpretability and component-basis model editing.
pararnn-2025.md - Parallel nonlinear RNN training framework from Apple.
parcae-2026.md - Stable looped language-model architecture with scaling-law analysis.
perception-encoder-2025.md - Meta vision-encoder family whose strongest general features are often internal before alignment tuning.
pi0-2024.md - Physical Intelligence VLA flow model for general robot control.
pi0-7-2026.md - Steerable generalist VLA model with metadata, subgoal images, and flow action expert.
probabilistic-tiny-recursive-model-2026.md - PTRM inference framework for recurrent Gaussian perturbations, parallel latent trajectories, and Q-head candidate selection.
raev2-2026.md - Multi-layer representation-autoencoder recipe for generation and navigation world-model rollouts.
rcaeval-2025.md - Microservice root-cause-analysis benchmark and evaluation framework.
rdt-1b-2024.md - Robotics Diffusion Transformer for bimanual manipulation.
rate-2023.md - Recurrent Action Transformer with Memory offline RL policy architecture.
recurrent-transformer-2026.md - Transformer variant with layerwise recurrent memory.
rmt-2022.md - Recurrent Memory Transformer segment-level memory-token method.
reinpatch-2026.md - Learned adaptive patching method for time-series forecasting.
rt-2-2023.md - Vision-language-action model using action-as-language tokens.
rwkv-ts-2024.md - RWKV-style recurrent sequence model adapted to time-series tasks.
sensorimotor-world-models-2026.md - Inverse-dynamics-regularized JEPA world-model method for action-aligned latent dynamics.
sensorfm-2026.md - Google wearable-health foundation model for passive multivariate sensor representations, imputation, and downstream health prediction.
simmtm-2023.md - Masked time-series modeling framework based on multi-neighbor reconstruction.
skyjepa-2026.md - JEPA-style latent dynamics world model for real-time quadrotor control.
sparse-layers-looped-language-models-2026.md - Sparse MoE looped language-model branch.
supervised-memory-training-2026.md - SMT/DMT method for predictive-memory pretraining of nonlinear RNNs.
stable-worldmodel-2026.md - Reproducible world-model research platform with trajectory data handling, planning solvers, baselines, and factor-of-variation evaluation.
sundial-2025.md - THUML flow-matching time-series forecasting foundation-model family.
t2s-2025.md - Text-to-time-series generation model with LA-VAE and flow-matching Diffusion Transformer.
tabm-2024.md - MLP-based tabular deep-learning model with parameter-efficient ensembling.
telecomts-2025.md - Multimodal 5G observability benchmark for anomaly detection, root-cause analysis, forecasting, and time-series/text Q&A.
temporal-straightening-2026.md - Latent-trajectory curvature regularizer for planner-facing world-model geometry.
tennessee-eastman-process-2017.md - Tennessee Eastman Process simulation data for industrial anomaly detection and fault diagnosis.
thinking-pixel-2026.md - Recursive Sparse Reasoning method for multimodal diffusion latents.
time-2026.md - Contamination-resistant zero-shot forecasting benchmark.
time-hd-2025.md - High-dimensional time-series forecasting benchmark introduced with U-Cast.
time-series-library-2024.md - Time-series benchmark collection and LSF/LTSF dataset handle.
timecraft-2025.md - Microsoft framework and repository for prototype, text, and target-aware time-series generation.
timedp-2025.md - Prototype/domain-prompt diffusion method for cross-domain synthetic time-series generation.
tardiff-2025.md - Target-oriented diffusion method for downstream-useful synthetic EHR time-series generation.
timeraf-2025.md - Retrieval-augmented zero-shot time-series forecasting method.
timeomni-1-2026.md - Time-series reasoning model and TSR-Suite.
timeomni-vl-2026.md - Vision-centric time-series understanding/generation framework.
timesfm-2023.md - Google decoder-only time-series forecasting foundation model.
tiny-recursive-model-2025.md - TRM tiny recursive reasoning model.
titans-2025.md - Neural long-term memory sequence-model family.
titans-revisited-2025.md - Titans reimplementation and critical-analysis object.
tiny-time-mixers-2024.md - IBM Granite compact pretrained mixer-style forecasting model family.
toto-2025.md - Datadog observability-oriented forecasting family covering Toto 1.0 and Toto 2.0.
ts2vec-2021.md - Hierarchical contrastive time-series representation model.
tsmixer-2023.md - All-MLP time-series forecasting architecture.
turbo-gnn-2026.md - IO-aware GNN layer implementation package for fused graph attention, degree-aware reductions, and cached sparse aggregation baselines.
turboquant-2025.md - Online vector quantization method for KV-cache and vector-search state, with vLLM caveats around FP8, latency, throughput, and memory pressure.
tuna-2-2026.md - Pixel-space unified multimodal model.
units-2024.md - Unified multi-task time-series model.
universal-reasoning-model-2025.md - UT-derived recursive reasoning model.
universal-transformers-2018.md - Root recurrent-depth self-attention model.
universal-transformers-need-memory-2026.md - Study of UT memory tokens and adaptive-depth tradeoffs.
variable-width-transformers-2026.md - ><former variable-width Transformer architecture with a static bowtie hidden-width schedule and carry-forward residual stream.
visreg-2026.md - VISReg visual SSL regularizer that decouples scale and Sliced-Wasserstein shape matching for JEPA training.
vjepa-2026.md - Variational JEPA model object for explicit predictive distributions over future latent states and its BJEPA modular-prior extension.
vla-jepa-2026.md - Vision-language-action model using JEPA-style latent world-model pretraining and a flow-matching action head.
vl-jepa-2025.md - Vision-language JEPA system.
vlwm-2025.md - Language-state world model for direct and critic-ranked high-level procedural planning.
