PDR+RTV

Summary

PDR+RTV is an agentic-coding test-time scaling recipe that combines structured rollout summaries, Recursive Tournament Voting, and agentic Parallel-Distill-Refine. It treats a long agent rollout as experience that should be compressed before it is compared or reused.

Interface

problem
  -> N independent action/observation rollouts
  -> structured summaries
  -> recursive tournament selection
  -> selected summaries as refinement context
  -> fresh rollouts
  -> final recursive selection
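
The pipeline above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the method's reference implementation: the names `tournament_select` and `pdr_rtv`, the `rollout`, `summarize`, and `judge` callables, and the `n`, `keep`, and `group_size` parameters are all hypothetical. The sketch also assumes that the tournament compares summaries in small groups and that the final selection returns a summary; the source does not specify either detail.

```python
from typing import Callable, List, Sequence

Rollout = List[str]   # hypothetical: a flat list of action/observation strings
Summary = str         # hypothetical: a structured summary rendered as text


def tournament_select(
    candidates: Sequence[Summary],
    judge: Callable[[Sequence[Summary]], int],
    group_size: int = 2,
    keep: int = 1,
) -> List[Summary]:
    """Run elimination rounds until at most `keep` candidates remain.

    `judge` receives one group and returns the index of its winner.
    """
    if group_size < 2 or keep < 1:
        raise ValueError("group_size must be >= 2 and keep must be >= 1")
    pool = list(candidates)
    while len(pool) > keep:
        next_pool = []
        for i in range(0, len(pool), group_size):
            group = pool[i:i + group_size]
            # A leftover group of one advances without a vote.
            winner = group[0] if len(group) == 1 else group[judge(group)]
            next_pool.append(winner)
        pool = next_pool
    return pool


def pdr_rtv(
    problem: str,
    rollout: Callable[[str, Sequence[Summary]], Rollout],
    summarize: Callable[[Rollout], Summary],
    judge: Callable[[Sequence[Summary]], int],
    n: int = 4,
    keep: int = 2,
) -> Summary:
    """One parallel -> distill -> refine pass followed by a final selection."""
    # N independent rollouts, each compressed before any comparison.
    first = [summarize(rollout(problem, [])) for _ in range(n)]
    # Selected summaries become the refinement context.
    context = tournament_select(first, judge, keep=keep)
    # Fresh rollouts conditioned on that context, again summarized.
    fresh = [summarize(rollout(problem, context)) for _ in range(n)]
    # Final recursive selection down to a single winner.
    return tournament_select(fresh, judge, keep=1)[0]
```

In this sketch the judge never sees a raw trajectory; it only sees summaries, which is the point of the compression step. In a real system `rollout`, `summarize`, and `judge` would each be model calls.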

Role In The Wiki

PDR+RTV is not a world model, but it is useful evidence about the representation boundary for long-horizon agents. Raw trajectories are too long and noisy to be the default substrate for reuse. The method makes prior agent experience decision-usable by compressing it into structured summaries.

For Alex’s digital-world and observability agenda, the transfer hypothesis is that telemetry agents need an analogous state-summary layer. That layer must be stricter than a generic natural-language summary because it must preserve timing, magnitude, topology, action status, uncertainty, and safety constraints.
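
One way to picture such a stricter layer is a typed record that cannot omit the fields listed above. The schema below is purely illustrative: the class name `TelemetrySummary`, every field name, the set of action states, and the `validate` helper are assumptions made for this sketch, not something PDR+RTV or any telemetry system defines.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical set of action states; a real system would define its own.
ACTION_STATES = {"none", "proposed", "applied", "rolled_back"}


@dataclass(frozen=True)
class TelemetrySummary:
    window_start_s: float                          # timing: start of the window
    window_end_s: float                            # timing: end of the window
    metric: str                                    # what was measured
    magnitude: float                               # how large the effect was
    unit: str                                      # unit for `magnitude`
    affected_edges: Tuple[Tuple[str, str], ...]    # topology: (source, target)
    action_status: str                             # one of ACTION_STATES
    confidence: float                              # uncertainty, in [0, 1]
    safety_constraints: Tuple[str, ...]            # constraints that must hold


def validate(s: TelemetrySummary) -> List[str]:
    """Return a list of problems; an empty list means the record is usable."""
    problems = []
    if s.window_end_s < s.window_start_s:
        problems.append("window ends before it starts")
    if not s.unit:
        problems.append("magnitude has no unit")
    if s.action_status not in ACTION_STATES:
        problems.append("unknown action status")
    if not 0.0 <= s.confidence <= 1.0:
        problems.append("confidence outside [0, 1]")
    return problems
```

A free-text summary can silently drop the unit or the action status; a record like this makes the omission visible before the summary is compared or reused.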

Evidence