Moshi

Summary

Moshi is Kyutai’s 2024 full-duplex speech-to-speech dialogue model. It listens to the user audio stream while generating its own text and audio streams, including silence, without requiring explicit speaker turns.
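As a rough illustration of what "no explicit speaker turns" means operationally, the sketch below runs a frame-synchronous loop in which every step consumes one observed user frame and emits exactly one generated frame, which may be silence. The names `DuplexSession`, `toy_policy`, and `SILENCE` are hypothetical and the policy is a hand-written toy rule; none of this is Moshi's actual API or architecture.

```python
from dataclasses import dataclass, field

SILENCE = "<sil>"  # hypothetical marker for a generated or observed silent frame


@dataclass
class DuplexSession:
    """Keeps the observed (user) and generated (model) streams separate."""
    observed: list = field(default_factory=list)
    generated: list = field(default_factory=list)

    def step(self, user_frame, policy):
        # Every step ingests one user frame and emits one model frame,
        # so both streams advance in lockstep with no turn boundary.
        self.observed.append(user_frame)
        out = policy(self.observed, self.generated)
        self.generated.append(out)
        return out


def toy_policy(observed, generated):
    # Toy rule (an assumption, not a learned model): speak only after
    # the user has been silent for two consecutive frames.
    if len(observed) >= 2 and observed[-1] == SILENCE and observed[-2] == SILENCE:
        return "speech"
    return SILENCE


session = DuplexSession()
user_stream = ["speech", "speech", SILENCE, SILENCE, SILENCE]
outputs = [session.step(frame, toy_policy) for frame in user_stream]
```

The point of the design is that silence is an ordinary output of `step`, not the absence of a call: the generated stream always has the same length as the observed stream.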

Role In The Wiki

Moshi is an engineering example for continuous streaming data, low-latency serving, stream separation, and temporal artifact metrics. It is not a numeric time-series foundation model and should not be treated as an action-conditioned world model.

Its main value for metrics work is the combination of:

low-latency stream processing;
separate observed and generated streams;
silence as first-class generated behavior;
evaluation of turn-taking and dialogue timing;
token-entropy diagnostics for artifact patterns over generated time.
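One way to picture the last item is a sliding-window entropy over the generated token IDs. The sketch below is a generic diagnostic under stated assumptions, not the procedure from the Moshi report: the function names, window sizes, and thresholds are all hypothetical, and it uses empirical token counts rather than model probabilities.

```python
import math
from collections import Counter


def window_entropy(tokens, window, hop):
    """Empirical Shannon entropy (bits) of token IDs in each sliding window.

    Returns a list of (start_index, entropy) pairs.
    """
    results = []
    for start in range(0, len(tokens) - window + 1, hop):
        counts = Counter(tokens[start:start + window])
        h = -sum((c / window) * math.log2(c / window) for c in counts.values())
        results.append((start, h))
    return results


def flag_windows(entropies, low, high):
    """Keep windows whose entropy falls outside the [low, high] band.

    The thresholds are assumptions to be tuned per codebook and window size.
    """
    return [(start, h) for start, h in entropies if h < low or h > high]


# A constant run followed by a maximally varied run of 8 distinct tokens.
tokens = [7] * 8 + [0, 1, 2, 3, 4, 5, 6, 7]
entropies = window_entropy(tokens, window=8, hop=8)
suspicious = flag_windows(entropies, low=0.5, high=4.0)
```

Here the first window has zero entropy (a single repeated token) and is flagged, which is the kind of pattern a repetitive artifact would produce; the second window reaches the 3-bit maximum for 8 distinct tokens and passes.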

Official Artifacts

Preprint: arXiv 2410.00037
Official technical report PDF: Moshi.pdf
Official launch blog: Meet Moshi, the first real-time voice AI
Official open-source release: Moshi open-source release: run Moshi locally!
Official code: kyutai-labs/moshi
Official demo: moshi.chat
Official Hugging Face collection: Moshi v0.1 Release
Official Mimi codec: kyutai/mimi
Official X thread: Kyutai release thread

Evidence

Moshi: a speech-text foundation model for real-time dialogue

Relation To Foundation TSFM Agenda

Use the source-level agenda mapping in moshi-2024 rather than duplicating verdict rows here.

At the entity level, Moshi is useful as a full-duplex audio event-stream analogue: it shows how an always-on model can maintain stream context, handle generated no-op/silence behavior, and expose temporal artifact metrics. It does not provide numeric observations, graph time series, topology, typed control inputs, interventions, or counterfactual next-state rollouts.
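To make the idea of temporal metrics over two parallel streams concrete, the following sketch computes an overlap fraction and response gaps from per-frame activity flags. The function names are hypothetical, the activity flags are assumed to come from some upstream voice-activity step, and the 80 ms frame duration is an assumption chosen for illustration.

```python
def overlap_fraction(user_active, model_active):
    """Fraction of frames in which both streams are active at once."""
    n = min(len(user_active), len(model_active))
    both = sum(1 for t in range(n) if user_active[t] and model_active[t])
    return both / n


def response_gaps(user_active, model_active, frame_s):
    """Seconds from the end of each user turn to the next model-active frame.

    A gap of 0.0 means the model was already active when the user stopped.
    """
    n = min(len(user_active), len(model_active))
    gaps = []
    for t in range(1, n):
        user_turn_ended = user_active[t - 1] and not user_active[t]
        if not user_turn_ended:
            continue
        for k in range(t, n):
            if model_active[k]:
                gaps.append((k - t) * frame_s)
                break
    return gaps


FRAME_S = 0.08  # assumed frame duration in seconds
user = [True, True, False, False, False, False]
model = [False, False, False, False, True, True]
overlap = overlap_fraction(user, model)
gaps = response_gaps(user, model, FRAME_S)
```

In this toy trace the user stops at frame 2 and the model starts at frame 4, giving a single gap of two frames and no overlap.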

Alex Open Research Wiki
