Introducing Helix 02: Full-Body Autonomy

Source

Core Claim

Helix 02 extends Figure’s Helix hierarchy from upper-body control to full-body humanoid loco-manipulation. The official writeup describes a three-system stack that runs from semantic context (System 2) through visuomotor joint targets (System 1) to high-rate balance and contact execution (System 0).

Method Notes

  • System 2 handles scene understanding, language, goals, and behavior sequencing.
  • System 1 remains a Transformer conditioned on System 2 latents and outputs full-body joint targets at 200 Hz.
  • System 0 is a learned whole-body controller, described as a 10M-parameter neural network that outputs joint-level actuator commands at 1 kHz.
  • Observations include head cameras, palm cameras, fingertip tactile sensors, and full-body proprioception; outputs cover legs, torso, head, arms, wrists, and fingers.
  • The official source does not state that Helix 02 uses diffusion, flow matching, or a regression loss, so those labels should not be inferred.
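
The rate relationship in the notes above can be sketched as a single loop on a 1 kHz base clock, with the slower systems firing on integer divisors. This is only an illustration of multi-rate scheduling, not Figure's implementation: the 200 Hz and 1 kHz rates come from the writeup, while the System 2 rate, the function names, and the placeholder tuples standing in for latents, joint targets, and actuator commands are all assumptions made for this note.

```python
from dataclasses import dataclass

S0_HZ = 1000  # from the writeup: actuator commands at 1 kHz
S1_HZ = 200   # from the writeup: full-body joint targets at 200 Hz
S2_HZ = 10    # ASSUMPTION: this note gives no System 2 rate; 10 Hz is a placeholder


@dataclass
class TickCounts:
    s2: int = 0
    s1: int = 0
    s0: int = 0


def run_stack(duration_s: float, s2_hz: int = S2_HZ) -> TickCounts:
    """Step a base clock at the S0 rate and fire slower systems on their divisors."""
    if S0_HZ % S1_HZ or S0_HZ % s2_hz:
        raise ValueError("slower rates must divide the base clock evenly")
    s1_every = S0_HZ // S1_HZ  # 5 base ticks per System 1 update
    s2_every = S0_HZ // s2_hz

    counts = TickCounts()
    latent = None         # hypothetical stand-in for System 2 latents
    joint_targets = None  # hypothetical stand-in for System 1 output

    for tick in range(round(duration_s * S0_HZ)):
        if tick % s2_every == 0:
            latent = ("latent", counts.s2)
            counts.s2 += 1
        if tick % s1_every == 0:
            # System 1 is conditioned on the most recent System 2 latent.
            joint_targets = ("targets", latent)
            counts.s1 += 1
        # System 0 runs every base tick and tracks the latest joint targets.
        _command = ("command", joint_targets)
        counts.s0 += 1
    return counts


if __name__ == "__main__":
    print(run_stack(240.0))
```

At the stated rates, a 240-second run corresponds to 48,000 System 1 updates and 240,000 System 0 updates, with System 0 reusing each joint target for five consecutive ticks.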

Evidence And Limitations

Figure reports a 4-minute autonomous dishwasher task, 61 ordered loco-manipulation actions, bimanual transfers, and dexterous contact tasks, and states that all videos show autonomous rather than teleoperated operation. This remains company-published demonstration evidence: there are no public weights, dataset, ablations, benchmark protocol, or failure-rate statistics.
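
To make the missing failure-rate statistics concrete, the sketch below computes a Wilson score interval for a success rate from a trial count. It is a generic statistics illustration, not anything Figure has published: the function name and the example trial counts are hypothetical, and the source reports no trial counts for Helix 02.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial success rate (z=1.96 is roughly 95%)."""
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("need trials > 0 and 0 <= successes <= trials")
    p_hat = successes / trials
    z2 = z * z
    denom = 1.0 + z2 / trials
    center = (p_hat + z2 / (2 * trials)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / trials + z2 / (4 * trials * trials)) / denom
    return max(0.0, center - half), min(1.0, center + half)


if __name__ == "__main__":
    # Hypothetical counts, chosen only to show how the interval narrows with more trials.
    for k, n in [(1, 1), (9, 10), (90, 100)]:
        lo, hi = wilson_interval(k, n)
        print(f"{k}/{n}: [{lo:.3f}, {hi:.3f}]")
```

A single successful run leaves the interval very wide (the lower bound is roughly 0.21 at this confidence level), which is why a reported trial count and failure rate would matter more than any one demonstration video.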

Foundation TSFM Relevance

Agenda slot | Verdict | Evidence | Missing pieces
Dynamic compute allocation | adjacent | The S2/S1/S0 stack separates semantic planning, trajectory-level control, and high-rate actuator control. | Company writeup only; no open model, dataset, ablations, or digital-system equivalent.
Streaming state and constant updates | adjacent | System 1 is described as producing joint targets at 200 Hz and System 0 actuator commands at 1 kHz from continuous sensor inputs. | No explicit latent-state maintenance interface or long-horizon memory evaluation.
Benchmarks | warning | Autonomous demonstration evidence shows a plausible hierarchy but lacks failure-rate statistics. | Needs reproducible benchmark protocols and raw trajectory data.

Open Questions

  • Should the S0/S1/S2 split be the canonical fast/slow abstraction for whole-body humanoids?
  • How should this wiki compare company demonstration evidence against open paper benchmarks?