Toto

Summary

Toto is Datadog’s line of time-series forecasting foundation models for observability data. In this wiki, the entry covers Toto-Open-Base-1.0 and the Toto 2.0 scaling family.

Lineage

  • Toto 1.0 introduces a 151M-parameter open-weights observability forecaster, the BOOM benchmark, factorized time-variate attention, patch-based causal instance normalization, and a Student-T mixture forecasting head.
  • Toto 2.0 extends the line into an open-weights scaling family from 4M to 2.5B parameters, uses contiguous patch masking, and reports strong BOOM, GIFT-Eval, and TIME results.
  • The Toto 2.0 TSALM presentation records the workshop framing around scaling, data mix, UMuP/REX transfer, NormMuon optimization, ARFBench, Toto-1.0-QA-Experimental, and the planned multimodal observability world-model direction.
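
To make "patch-based causal instance normalization" from the Toto 1.0 entry concrete, the sketch below shows one way such a scheme could work: each patch is scaled with running statistics that include only values up to the end of that patch, so later observations never influence earlier outputs. This is an illustrative assumption, not Toto's implementation; the function name causal_patch_normalize and the parameters patch_size and eps are hypothetical.

```python
import math


def causal_patch_normalize(series, patch_size, eps=1e-6):
    """Scale each patch using statistics of values up to the end of that patch.

    Hypothetical illustration only; not taken from the Toto code base.
    """
    if patch_size <= 0:
        raise ValueError("patch_size must be positive")

    normalized = []
    count = 0        # number of values seen so far
    total = 0.0      # running sum of values
    total_sq = 0.0   # running sum of squared values

    for start in range(0, len(series), patch_size):
        patch = series[start:start + patch_size]

        # Fold this patch into the running sums; later patches are never read.
        for value in patch:
            count += 1
            total += value
            total_sq += value * value

        mean = total / count
        # Naive variance formula, clamped at zero against rounding error.
        variance = max(total_sq / count - mean * mean, 0.0)
        std = math.sqrt(variance) + eps

        normalized.extend((value - mean) / std for value in patch)

    return normalized


if __name__ == "__main__":
    history = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
    print(causal_patch_normalize(history, patch_size=4))
```

The property to check is causality at patch granularity: changing values in a later patch leaves the normalized values of every earlier patch unchanged, which a whole-series instance normalization would not guarantee.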

Official Artifacts

Role In The Wiki

Toto anchors the observability time-series branch. It is a strong passive forecasting line, but it is not yet an action-conditioned world model because deployments, rollbacks, autoscaling, remediation, and other operator actions are not first-class forecast-conditioning channels in the current sources.

Relation To Foundation TSFM Agenda

Use the source-level agenda mappings rather than duplicating verdict rows here:

At the entity level, Toto is a passive forecasting line rather than an action-conditioned world model, for the reasons given under Role In The Wiki. This page should stay as the object card; source pages carry slot-level verdicts, evidence, and missing pieces.