Cost Tracking¶
Pipelit tracks token usage and USD costs for every LLM call across every execution. Costs roll up from individual nodes to executions, from executions to tasks, and from tasks to epics -- giving you visibility at every level and enabling budget enforcement that prevents runaway spending.
How costs are tracked¶
flowchart TB
subgraph Execution
N1[Node 1\n+150 input, +80 output] --> Acc[Accumulator\n_execution_token_usage]
N2[Node 2\n+200 input, +120 output] --> Acc
N3[Node 3\n+500 input, +300 output] --> Acc
end
Acc -->|persist on completion| Exec[WorkflowExecution\ntotal_tokens, total_cost_usd]
Exec -->|sync to linked task| Task[Task\nactual_tokens, actual_usd]
Task -->|roll up to epic| Epic[Epic\nspent_tokens, spent_usd] Per-node token extraction¶
After each node executes, the orchestrator extracts token usage from the component's output via the _token_usage reserved key. The token usage dict contains:
| Field | Description |
|---|---|
input_tokens | Tokens sent to the LLM (prompt + context) |
output_tokens | Tokens generated by the LLM (response) |
total_tokens | Sum of input and output tokens |
llm_calls | Number of LLM API calls in this node |
For agent nodes that make multiple LLM calls (e.g., during a ReAct reasoning loop with tool calls), the usage is extracted from all AI messages in the conversation via extract_usage_from_messages().
Execution-level accumulation¶
Token usage from each node is merged into a running total stored in the execution state as _execution_token_usage:
The merge_usage() function sums all numeric fields across the two dicts. When the execution completes (success or failure), the accumulated totals are persisted to the WorkflowExecution model.
USD cost calculation¶
USD cost is calculated using a built-in pricing table that maps model name prefixes to per-million-token rates:
| Model Prefix | Input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|
gpt-4o-mini | $0.15 | $0.60 |
gpt-4o | $2.50 | $10.00 |
gpt-4-turbo | $10.00 | $30.00 |
gpt-4 | $30.00 | $60.00 |
gpt-3.5-turbo | $0.50 | $1.50 |
o3-mini | $1.10 | $4.40 |
o1-mini | $3.00 | $12.00 |
o1 | $15.00 | $60.00 |
claude-3-5-sonnet | $3.00 | $15.00 |
claude-3-5-haiku | $0.80 | $4.00 |
claude-3-opus | $15.00 | $75.00 |
claude-sonnet-4 | $3.00 | $15.00 |
claude-opus-4 | $15.00 | $75.00 |
The pricing lookup uses longest prefix matching -- the model name is compared against prefixes in order, so gpt-4o-mini-2024-07-18 correctly matches the gpt-4o-mini rate rather than the more expensive gpt-4o rate.
Unknown models
Models that do not match any known prefix are tracked with $0.00 cost. Token counts are still recorded accurately -- only the USD calculation is skipped.
The cost formula is:
Execution cost fields¶
Every WorkflowExecution records:
| Field | Type | Description |
|---|---|---|
total_input_tokens | int | Total input tokens across all nodes |
total_output_tokens | int | Total output tokens across all nodes |
total_tokens | int | Sum of input + output tokens |
total_cost_usd | decimal | Calculated USD cost |
llm_calls | int | Total number of LLM API calls |
These fields are populated when the execution completes (or fails) via _persist_execution_costs().
Epic-level budget enforcement¶
When an execution is linked to a task within an epic, the orchestrator enforces budget limits before every node execution.
flowchart TD
Start[Node about to execute] --> Check{Budget check}
Check -->|within budget| Execute[Execute node]
Check -->|tokens exceeded| Fail[Fail execution]
Check -->|USD exceeded| Fail
Check -->|no epic linked| Execute
Execute --> Next[Continue workflow]
Fail --> Stop[Execution status: failed] How budget enforcement works¶
The _check_budget() function runs before each node in the orchestrator:
- Look up the
Tasklinked to this execution (viaTask.execution_id) - If no task or no epic, skip budget checks
- Read the current execution's accumulated token usage from state
- Token budget: Add current execution tokens to the epic's
spent_tokens. If the total exceedsepic.budget_tokens, fail the execution. - USD budget: Add current execution cost to the epic's
spent_usd. If the total exceedsepic.budget_usd, fail the execution.
Budget overruns are hard stops
When a budget is exceeded, the execution is immediately marked as failed with a descriptive error message. The execution's accumulated costs are persisted, and the task and epic costs are synced before the failure is broadcast.
Cost roll-up¶
When a child execution completes, the orchestrator syncs costs in two steps:
Step 1 -- Execution to task:
task.actual_tokens = execution.total_tokens
task.actual_usd = float(execution.total_cost_usd)
task.llm_calls = execution.llm_calls
task.tool_invocations = exec_usage.get("tool_invocations", 0)
Step 2 -- Tasks to epic (recalculated, not incremental):
epic.spent_tokens = SUM(task.actual_tokens) for all tasks in epic
epic.spent_usd = SUM(task.actual_usd) for all tasks in epic
The epic totals are always recalculated from scratch by summing all task costs. This avoids double-counting issues that could arise from incremental updates.
Monitoring costs¶
Execution detail page¶
The Execution Detail page (/executions/:id) displays token counts and cost for each completed execution.
WebSocket events¶
Node-level token usage is included in node_status WebSocket events when a node completes with status: success:
{
"type": "node_status",
"node_id": "agent_abc123",
"status": "success",
"token_usage": {
"input_tokens": 1500,
"output_tokens": 450,
"total_tokens": 1950,
"llm_calls": 3
}
}
API endpoints¶
Query execution costs via the REST API:
| Endpoint | Description |
|---|---|
GET /api/v1/executions/ | List executions with cost fields |
GET /api/v1/executions/{id}/ | Execution detail including token counts and USD cost |
Epic and task costs are available through the epic_tools and task_tools agent tools (e.g., epic_status returns spent_tokens and spent_usd).
What's next?¶
- Learn how epics organize tasks with budgets: Epics & Tasks
- Understand how workflows execute: Execution
- Configure recurring executions: Scheduler