RT-2
Summary
RT-2 is a vision-language-action (VLA) robot policy that encodes robot control commands as text-like action tokens, which a vision-language model (VLM) emits as ordinary output tokens.
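To make the action-as-language idea concrete, here is a minimal sketch of one way continuous actions can be mapped to a text-like token string and back. It assumes uniform binning into 256 bins per action dimension, a normalized range of [-1, 1], and space-separated integer tokens. The names `NUM_BINS`, `action_to_text`, and `text_to_action` are hypothetical and do not come from any RT-2 codebase; the sketch illustrates the encoding pattern only, not the model's actual tokenizer.

```python
from typing import List, Sequence

NUM_BINS = 256  # assumption: number of uniform bins per action dimension


def action_to_text(action: Sequence[float], low: float = -1.0, high: float = 1.0) -> str:
    """Discretize each continuous action dimension and join the bin indices as text."""
    tokens = []
    for value in action:
        clipped = min(max(value, low), high)           # keep the value inside the range
        fraction = (clipped - low) / (high - low)      # rescale to [0, 1]
        index = min(int(fraction * NUM_BINS), NUM_BINS - 1)  # top edge falls in the last bin
        tokens.append(str(index))
    return " ".join(tokens)


def text_to_action(text: str, low: float = -1.0, high: float = 1.0) -> List[float]:
    """Decode a token string back to continuous values, using each bin's center."""
    width = (high - low) / NUM_BINS
    return [low + (int(token) + 0.5) * width for token in text.split()]


if __name__ == "__main__":
    encoded = action_to_text([-1.0, 0.0, 1.0])
    print(encoded)
    print(text_to_action(encoded))
```

Decoding returns bin centers, so a round trip loses at most half a bin width per dimension. That quantization error is the price of letting a language model emit actions through its ordinary token vocabulary.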
Role In The Wiki
RT-2 anchors the action-as-language branch of modern robotics models. It is a counterexample to the idea that all fast, modern robot action heads are diffusion- or flow-based.
Evidence
Relation To Foundation TSFM Agenda
Use the source-level agenda mapping in rt-2-2023 rather than duplicating verdict rows here.
At the entity level, this page should stay as the object card for RT-2's role as the anchor of the action-as-language branch; source pages carry slot-level verdicts, evidence, and missing pieces.