Architecture

This page follows a typical run from the first prompt through tool execution so you can see where planning happens, when llama.cpp is invoked, and how tools are ranked and executed.

End-to-end flow

sequenceDiagram
    participant User
    participant CLI as okso CLI
    participant Planner
    participant Intent as Intent filter
    participant Llama as llama.cpp
    participant Approver as Approval prompts
    participant Executor as Executor
    participant Tool as Tool runner
    participant Trace as Trace/logs

    User->>CLI: Provide request
    CLI->>Intent: Classify intent + filter tool catalog
    Intent-->>Planner: Filtered tools + intent context
    CLI->>Planner: Build planner prompt (tools, guardrails)
    Planner->>Llama: Generate JSON plan (schema)
    Llama-->>Planner: Plan outline draft
    Planner-->>Approver: Show plan for confirmation/refinement
    Approver-->>Executor: Approved plan
    Executor->>Llama: Fill context-marked arguments (optional)
    Llama-->>Executor: Enriched arguments
    Executor->>Tool: Execute with sandbox/guards
    Tool-->>Trace: Stream stdout/stderr and status
    Trace-->>Executor: Observations captured
    Executor-->>User: Final answer once all steps complete
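The sequence above can be sketched as a single control loop. This is a hypothetical sketch, not okso's actual API: the function names, the toy intent filter, and the faked plan are all assumptions standing in for the real intent classifier, the llama.cpp planner call, and the sandboxed tool runner.

```python
from dataclasses import dataclass, field

@dataclass
class Step:
    tool: str
    args: dict = field(default_factory=dict)

# Stub stages standing in for the real intent filter, planner, and executor.
def filter_tools(request: str, catalog: list[str]) -> list[str]:
    # Toy intent filter: keep tools whose name appears in the request,
    # falling back to the full catalog if nothing matches.
    return [t for t in catalog if t in request] or catalog

def generate_plan(request: str, tools: list[str]) -> list[Step]:
    # Real code would prompt llama.cpp with the filtered tools and a JSON
    # schema; here we fake a one-step plan.
    return [Step(tool=tools[0], args={"input": request})]

def execute(plan: list[Step]) -> list[str]:
    observations = []
    for step in plan:
        # Real code would run the tool in its sandbox and stream
        # stdout/stderr to the trace; here we record a stub observation.
        observations.append(f"{step.tool} ok")
    return observations

def run_request(request: str, catalog: list[str]) -> str:
    tools = filter_tools(request, catalog)
    plan = generate_plan(request, tools)  # approval prompt omitted here
    observations = execute(plan)
    return "; ".join(observations)

print(run_request("use python to sum a list", ["python", "terminal"]))
```

The approval step between planning and execution is elided in this sketch; in the diagram it sits between the Planner and the Executor.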

Planner pass

Executor

Step-by-step execution checklist

  1. Load the approved plan and current step guidance.
  2. Invoke the tool planned for that step, keeping the plan's order; do not reorder or substitute tools.
  3. Run the tool with its sandbox (for example, the terminal’s guarded rm -i or the Python REPL sandbox).
  4. Record stdout/stderr and exit status for the execution summary.
  5. Continue until every planned step has executed or final_answer is returned.
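The checklist above can be sketched as a replay loop. This is a hedged sketch under assumed names (the plan shape, the terminal-only dispatch, and the final_answer convention are illustrative, not okso's real executor); it only covers shell commands via subprocess, whereas the real executor routes each tool through its own sandbox.

```python
import subprocess

def execute_plan(plan):
    """Replay approved steps in order, capturing output for the summary.

    `plan` is assumed to be a list of dicts like
    {"tool": "terminal", "cmd": [...]} or {"tool": "final_answer", "text": ...}.
    """
    summary = []
    for step in plan:
        if step["tool"] == "final_answer":
            # Terminal step: record the answer and stop replaying.
            summary.append({"tool": "final_answer", "text": step["text"]})
            break
        # Run the command, recording stdout/stderr and exit status
        # for the execution summary (checklist step 4).
        proc = subprocess.run(step["cmd"], capture_output=True, text=True)
        summary.append({
            "tool": step["tool"],
            "exit": proc.returncode,
            "stdout": proc.stdout,
            "stderr": proc.stderr,
        })
    return summary

plan = [
    {"tool": "terminal", "cmd": ["echo", "hello"]},
    {"tool": "final_answer", "text": "done"},
]
for entry in execute_plan(plan):
    print(entry["tool"])
```

Each summary entry carries the exit status alongside the captured streams, which is what the trace/logs participant in the diagram consumes.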

llama.cpp dependency and fallbacks

Tool ranking and execution