The LLM-as-Compiler Pattern: Separating Plan Generation from Execution
When a PlanCompiler-style agent is benchmarked against direct LLM-to-code generation on 300 stratified multi-step tasks, the structured approach achieves 92.67% success at $0.00128 per task. The direct approach, where the LLM decides actions step-by-step in a free-form loop, achieves 62% success at $0.0106 per task. That is about 31 percentage points higher (roughly 1.5 times the success rate) at about one-eighth the cost.
The difference isn't model capability. Both approaches use the same model. The difference is architecture: one separates plan generation from plan execution; the other conflates them.
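
To make the separation concrete, here is a minimal sketch in Python. It is not PlanCompiler's actual implementation: the tool registry, the JSON plan format, the `$step_id` reference syntax, and every function name below are assumptions made for illustration, and `generate_plan` returns a canned plan where a real system would make its single model call.

```python
import json
from typing import Any, Callable

# Hypothetical tool registry: the executor can only call what is listed here.
TOOLS: dict[str, Callable[..., Any]] = {
    "add": lambda a, b: a + b,
    "mul": lambda a, b: a * b,
}


def generate_plan(task: str) -> str:
    """Stand-in for the one LLM call. Returns a canned JSON plan (assumed format)."""
    return json.dumps([
        {"id": "s1", "tool": "add", "args": [2, 3]},
        {"id": "s2", "tool": "mul", "args": ["$s1", 4]},  # "$s1" refers to step s1's result
    ])


def is_ref(arg: Any) -> bool:
    """True if an argument points at an earlier step's result."""
    return isinstance(arg, str) and arg.startswith("$")


def validate_plan(plan: list[dict]) -> None:
    """Reject a bad plan before anything runs: unknown tools, forward references."""
    seen: set[str] = set()
    for step in plan:
        if step["tool"] not in TOOLS:
            raise ValueError(f"unknown tool: {step['tool']}")
        for arg in step["args"]:
            if is_ref(arg) and arg[1:] not in seen:
                raise ValueError(f"step {step['id']} uses undefined result {arg}")
        seen.add(step["id"])


def execute_plan(plan: list[dict]) -> Any:
    """Deterministic executor: no model calls, only registered tools."""
    results: dict[str, Any] = {}
    last = None
    for step in plan:
        args = [results[a[1:]] if is_ref(a) else a for a in step["args"]]
        last = TOOLS[step["tool"]](*args)
        results[step["id"]] = last
    return last


def run(task: str) -> Any:
    plan = json.loads(generate_plan(task))  # plan generation: one model call
    validate_plan(plan)                     # static check between the two phases
    return execute_plan(plan)               # plan execution: plain code


print(run("compute (2 + 3) * 4"))
```

In this sketch the model is consulted once, and the plan is checked as data before any tool runs. A free-form loop would instead call the model again after every step, so each step adds cost and another chance to go off course.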
