PDR+RTV
Summary
PDR+RTV is an agentic-coding test-time scaling recipe that combines structured rollout summaries, Recursive Tournament Voting, and agentic Parallel-Distill-Refine. It treats a long agent rollout as experience that should be compressed before it is compared or reused.
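A minimal sketch of how the three pieces might fit together, under stated assumptions. The names `tournament_select`, `pdr_rtv`, `rollout`, `summarize`, and `pick_winner` are hypothetical, as are the parameters `n`, `keep`, and `group_size`; the note does not specify group size, how many summaries survive selection, or how a judge compares summaries. The sketch also returns the winning summary rather than the rollout behind it, which is a simplification.

```python
from typing import Callable, List, Sequence

Summary = str
# Hypothetical judge: given a group of summaries, return the index of the preferred one.
Judge = Callable[[Sequence[Summary]], int]


def tournament_select(candidates: Sequence[Summary], pick_winner: Judge,
                      group_size: int = 2, keep: int = 1) -> List[Summary]:
    """Narrow candidates round by round, one winner per group.

    Stops once at most `keep` remain, or earlier if another round would
    leave fewer than `keep` (so the result can be larger than `keep`).
    """
    if group_size < 2 or keep < 1:
        raise ValueError("group_size must be >= 2 and keep >= 1")
    pool = list(candidates)
    while len(pool) > keep:
        groups = [pool[i:i + group_size] for i in range(0, len(pool), group_size)]
        winners = [g[pick_winner(g)] for g in groups]
        if len(winners) < keep:
            break
        pool = winners
    return pool


def pdr_rtv(problem: str,
            rollout: Callable[[str, List[Summary]], str],
            summarize: Callable[[str], Summary],
            pick_winner: Judge,
            n: int = 8, keep: int = 2, group_size: int = 2) -> Summary:
    """Assumed two-pass flow: roll out, summarize, select, refine, select again."""
    # Pass 1: N independent rollouts, each compressed before any comparison.
    first = [summarize(rollout(problem, [])) for _ in range(n)]
    # Selected summaries become the refinement context for the next pass.
    context = tournament_select(first, pick_winner, group_size, keep)
    # Pass 2: fresh rollouts conditioned on the selected summaries.
    second = [summarize(rollout(problem, context)) for _ in range(n)]
    # Final selection narrows the fresh candidates to a single winner.
    return tournament_select(second, pick_winner, group_size, keep=1)[0]
```

In this sketch the judge only ever sees summaries, never raw trajectories, which mirrors the idea that a rollout should be compressed before it is compared or reused.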
Interface
problem
-> N independent action/observation rollouts
-> structured summaries
-> recursive tournament selection
-> selected summaries as refinement context
-> fresh rollouts
-> final recursive selection
Role In The Wiki
PDR+RTV is not a world model, but it is useful evidence about the representation boundary for long-horizon agents. Raw trajectories are too long and noisy to be the default substrate for reuse. The method makes prior agent experience decision-usable by compressing it into structured summaries.
For Alex’s digital-world and observability agenda, the transfer hypothesis is that telemetry agents need an analogous state-summary layer. That layer must be stricter than a generic natural-language summary because it must preserve timing, magnitude, topology, action status, uncertainty, and safety constraints.
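One way to picture such a state-summary layer is as a typed record with one field per property that must be preserved. The schema below is an illustrative assumption, not something the note defines: `StateSummary`, `ActionStatus`, every field name, and the checks in `problems` are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Tuple


class ActionStatus(Enum):
    # Hypothetical status values; the note only says action status must survive.
    PROPOSED = "proposed"
    EXECUTED = "executed"
    FAILED = "failed"
    ROLLED_BACK = "rolled_back"


@dataclass
class StateSummary:
    window_start_s: float                          # timing: start of observed window
    window_end_s: float                            # timing: end of observed window
    magnitudes: Dict[str, Tuple[float, str]]       # metric name -> (value, unit)
    topology_edges: List[Tuple[str, str]]          # (upstream, downstream) pairs
    actions: List[Tuple[str, ActionStatus]]        # what was tried and how it ended
    confidence: float                              # uncertainty, as a value in [0, 1]
    safety_constraints: List[str]                  # constraints any reuse must respect

    def problems(self) -> List[str]:
        """Return human-readable issues; an empty list means the checks passed."""
        issues = []
        if self.window_end_s < self.window_start_s:
            issues.append("time window ends before it starts")
        if not 0.0 <= self.confidence <= 1.0:
            issues.append("confidence outside [0, 1]")
        for name, (_value, unit) in self.magnitudes.items():
            if not unit:
                issues.append(f"magnitude '{name}' has no unit")
        return issues
```

The point of a typed record over free text is that a missing unit or an inverted time window can be rejected mechanically before the summary is compared or reused.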