Planner sampling, scoring, and debugging
Planner runs score and debug each outline candidate before execution. Use these controls to understand scoring and inspect alternatives.
Sampling controls
PLANNER_SAMPLE_COUNTis defined but currently pinned to1inplanner.sh, so only one candidate is generated and scored per run.PLANNER_TEMPERATUREforwards directly to llama.cpp for planner generations. Lower values keep plans conservative; higher values explore more tool permutations. Values should stay between0and1for predictable entropy.
The normalized candidate is scored before selection to ensure the highest-quality plan is chosen, even when early samples look promising.
Scoring rules
Planner scoring rewards concise, compliant plans and penalizes risky or invalid suggestions:
- Plans within the
PLANNER_MAX_PLAN_STEPSbudget earn a baseline bonus; going over budget subtracts points per extra step. - Ending with
final_answeris required; missing it incurs a heavy penalty. - Steps that reference unknown tools reduce the score, while registered tools earn a small bonus.
- Side-effecting tools that appear after information-gathering steps receive a positive adjustment; starting with a side-effecting action introduces a deduction.
- A tie-breaker favors shorter plans when scores are equal by comparing remaining budget vs. overages.
See src/lib/planning/scoring.sh for the exact heuristics applied to each candidate.
Debugging planner output
Every candidate plan is normalized and appended to PLANNER_DEBUG_LOG (default ${TMPDIR:-/tmp}/okso_planner_candidates.log) as a JSON object containing:
index: 1-based sample order.scoreandtie_breaker: numeric values produced by the scoring pass.rationale: explanation strings backing each score component.response: the normalized planner output with the validated plan steps.
The log is truncated at the start of each planner invocation to keep runs isolated. Use the file to audit why the winning plan beat the alternatives or to reproduce scoring decisions during development.