Slipstream/auto_review

feat(review): attribute per-review LLM costs #261

Merged

jwilger merged 19 commits from issue-28-cost-per-review into main

2026-05-18 22:19:33 -07:00

jwilger commented

2026-05-18 21:48:47 -07:00

Owner

Why:

Review history needs per-review LLM cost attribution so operators can understand and tune review spend.
See issue #28.

What:

Add LLM pricing defaults, provider-qualified price overrides, routed usage capture, and embedding usage capture.
Append an LLM usage/cost footer to posted reviews when usage and pricing are available.
Expose estimated review cost on ReviewOutcome and persist it to SQLite review history.
Support AR_PRICE_TABLE_PATH for operator pricing overrides and AR_REVIEW_COST_FOOTER=false to suppress the public footer while preserving persisted cost attribution.
Document review cost attribution settings and systemd/deployment examples.
Move configured subagents from gpt-5.3-codex-spark to standard gpt-5.3-codex per branch request.

Validation:

cargo test -p ar-orchestrator --no-run
cargo clippy -p ar-review --tests -- -D warnings
cargo test -p ar-review review_pull_request_posts_review_with_llm_usage_cost_footer -- --nocapture
cargo nextest run -p ar-orchestrator run_review_job_records_review_outcome_cost_in_sqlite_history
cargo test -p ar-review review_pull_request_omits_llm_usage_cost_footer_when_disabled_by_env
cargo test -p ar-review review_pull_request_cost_footer_uses_price_table_override_from_env_path
python -m json.tool opencode.json
just opencode-test
just fmt
just clippy
just test
just ci

Closes #28
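The two settings described above can be sketched as follows. This is a minimal illustration, not the code from this branch: the `CostSettings` struct and the `cost_settings_from` and `apply_footer` functions are hypothetical names, and treating the value `false` case-insensitively is an assumption. Only the variable names `AR_PRICE_TABLE_PATH` and `AR_REVIEW_COST_FOOTER`, the enabled-by-default footer, and the rule that cost is preserved when the footer is hidden come from the PR description.

```rust
use std::path::PathBuf;

/// Hypothetical settings struct; the real ar-review types are not shown in this PR.
#[derive(Debug, PartialEq)]
struct CostSettings {
    footer_enabled: bool,
    price_table_path: Option<PathBuf>,
}

/// Builds settings from a lookup function so the logic can be exercised
/// without mutating the process environment.
fn cost_settings_from(lookup: impl Fn(&str) -> Option<String>) -> CostSettings {
    // The footer stays enabled unless the variable is explicitly "false"
    // (case-insensitive matching is an assumption of this sketch).
    let footer_enabled = lookup("AR_REVIEW_COST_FOOTER")
        .map(|v| !v.trim().eq_ignore_ascii_case("false"))
        .unwrap_or(true);
    // An unset or empty path means "use the built-in price defaults".
    let price_table_path = lookup("AR_PRICE_TABLE_PATH")
        .filter(|v| !v.trim().is_empty())
        .map(PathBuf::from);
    CostSettings { footer_enabled, price_table_path }
}

/// Returns the body to post and the cost to persist. The cost is passed
/// through unchanged whether or not the footer is shown.
fn apply_footer(body: &str, footer: &str, cost_usd: f64, settings: &CostSettings) -> (String, f64) {
    let posted = if settings.footer_enabled {
        format!("{body}\n\n{footer}")
    } else {
        body.to_string()
    };
    (posted, cost_usd)
}

fn main() {
    let settings = cost_settings_from(|key| std::env::var(key).ok());
    let (posted, cost) = apply_footer("No findings.", "Estimated total USD: $0.10", 0.10, &settings);
    println!("{settings:?}\n{posted}\npersisted cost: {cost}");
}
```

Taking the lookup as a closure keeps the parsing testable; `main` wires it to `std::env::var`.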

jwilger added 19 commits

2026-05-18 21:48:47 -07:00

feat(llm): add price table defaults 4c40cc1559

Why:
- Operators need stable default rates and overrideable model pricing before per-review cost attribution can calculate estimates.

What:
- Add an ar-llm price table with pinned OpenAI defaults and JSON override loading.

Validation:
- cargo nextest run -p ar-llm openai_price_table_has_defaults_and_operator_overrides

feat(llm): collect routed token usage 9e00145758

Why:
- Per-review cost attribution needs a scoped way to observe token usage from routed LLM calls.

What:
- Add an optional Router usage collector and record completion and embedding calls by tier.

Validation:
- cargo nextest run -p ar-llm router_usage_collector_records_complete_and_embedding_calls

feat(llm): estimate usage costs 6e969c1f29

Why:
- Per-review cost reporting needs model pricing to convert token usage into an estimated USD value.

What:
- Add provider/model-aware price estimation with provider-qualified override precedence.

Validation:
- cargo nextest run -p ar-llm price_table_estimates_usage_by_provider_and_model estimate_usage_uses_provider_qualified_override_before_model_fallback openai_price_table_has_defaults_and_operator_overrides
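The precedence rule named in this commit (a provider-qualified override is consulted before the model-only entry) might look roughly like the sketch below. The `PriceTable` and `ModelPrice` shapes, the tuple key, the per-million-token unit, and the model name `example-model` are all assumptions made for illustration; the actual ar-llm types and JSON schema are not shown on this page.

```rust
use std::collections::HashMap;

/// USD per one million tokens. Field names are illustrative.
#[derive(Debug, Clone, Copy, PartialEq)]
struct ModelPrice {
    input_per_mtok: f64,
    output_per_mtok: f64,
}

/// Hypothetical price table keyed by (optional provider, model).
#[derive(Default)]
struct PriceTable {
    entries: HashMap<(Option<String>, String), ModelPrice>,
}

impl PriceTable {
    /// `provider = None` registers a model-only fallback entry.
    fn insert(&mut self, provider: Option<&str>, model: &str, price: ModelPrice) {
        self.entries
            .insert((provider.map(str::to_string), model.to_string()), price);
    }

    /// The provider-qualified entry wins; otherwise fall back to the bare model.
    fn lookup(&self, provider: &str, model: &str) -> Option<ModelPrice> {
        let qualified = (Some(provider.to_string()), model.to_string());
        let fallback = (None, model.to_string());
        self.entries
            .get(&qualified)
            .or_else(|| self.entries.get(&fallback))
            .copied()
    }

    /// Returns None when no price is known, so callers can skip the estimate.
    fn estimate_usd(&self, provider: &str, model: &str, input: u64, output: u64) -> Option<f64> {
        let p = self.lookup(provider, model)?;
        Some((input as f64 * p.input_per_mtok + output as f64 * p.output_per_mtok) / 1_000_000.0)
    }
}

fn main() {
    let mut table = PriceTable::default();
    // Model-only default, then an override for one specific provider.
    table.insert(None, "example-model", ModelPrice { input_per_mtok: 1.0, output_per_mtok: 2.0 });
    table.insert(
        Some("https://proxy.example"),
        "example-model",
        ModelPrice { input_per_mtok: 0.5, output_per_mtok: 1.0 },
    );
    println!("{:?}", table.estimate_usd("https://proxy.example", "example-model", 1_000_000, 0));
    println!("{:?}", table.estimate_usd("https://api.openai.com", "example-model", 1_000_000, 0));
}
```

Returning `Option<f64>` matches the description's "when usage and pricing are available": an unknown model yields no estimate rather than a zero cost.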

feat(llm): include provider metadata in usage 5647fa1021

Why:
- Per-review cost attribution must price usage by provider and model instead of placeholder tier labels.

What:
- Add LlmProvider metadata hooks and have Router pass provider base URL and model names to usage collectors.

Validation:
- cargo nextest run -p ar-llm router_usage_collector_records_provider_and_model_names

feat(llm): record embedding token usage 980297436e

Why:
- Per-review cost attribution needs embedding token counts from provider responses, not zeroed placeholder usage.

What:
- Add provider embed_with_usage plumbing and parse OpenAI-compatible embedding usage into routed usage collection.

Validation:
- cargo test -p ar-llm --test usage_capture router_usage_collector_records_embedding_prompt_tokens_from_openai_response

feat(review): append LLM cost footer b3a88f9a33

Why:
- Review authors and operators need per-review visibility into token usage and estimated cost where the bot posts its findings.

What:
- Collect routed LLM usage during review generation and append an estimated usage/cost footer before posting the review.
- Chain usage collectors so pipeline-local accounting preserves existing observers.

Validation:
- cargo nextest run -p ar-review review_pull_request_posts_review_with_llm_usage_cost_footer
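Chaining collectors so that pipeline-local accounting does not displace an existing observer could be sketched as below. The `UsageCollector` trait, `UsageEvent`, `VecCollector`, and `ChainedCollector` are hypothetical names for this illustration; the real Router collector interface may differ.

```rust
use std::sync::{Arc, Mutex};

/// One routed LLM call. Fields are illustrative.
#[derive(Debug, Clone, PartialEq)]
struct UsageEvent {
    tier: String,
    prompt_tokens: u64,
    completion_tokens: u64,
}

/// Hypothetical observer interface for routed usage.
trait UsageCollector: Send + Sync {
    fn record(&self, event: &UsageEvent);
}

/// Collector that simply remembers every event it sees.
#[derive(Default)]
struct VecCollector {
    events: Mutex<Vec<UsageEvent>>,
}

impl VecCollector {
    fn total_tokens(&self) -> u64 {
        let events = self.events.lock().unwrap();
        events.iter().map(|e| e.prompt_tokens + e.completion_tokens).sum()
    }
}

impl UsageCollector for VecCollector {
    fn record(&self, event: &UsageEvent) {
        self.events.lock().unwrap().push(event.clone());
    }
}

/// Forwards each event to the review-local collector and then to whatever
/// collector was already installed, so neither loses data.
struct ChainedCollector {
    local: Arc<dyn UsageCollector>,
    existing: Option<Arc<dyn UsageCollector>>,
}

impl UsageCollector for ChainedCollector {
    fn record(&self, event: &UsageEvent) {
        self.local.record(event);
        if let Some(existing) = &self.existing {
            existing.record(event);
        }
    }
}

fn main() {
    let local = Arc::new(VecCollector::default());
    let observer = Arc::new(VecCollector::default());
    let local_dyn: Arc<dyn UsageCollector> = local.clone();
    let observer_dyn: Arc<dyn UsageCollector> = observer.clone();
    let chain = ChainedCollector { local: local_dyn, existing: Some(observer_dyn) };

    chain.record(&UsageEvent { tier: "Reasoning".into(), prompt_tokens: 21391, completion_tokens: 361 });
    chain.record(&UsageEvent { tier: "Cheap".into(), prompt_tokens: 609, completion_tokens: 51 });

    // Both collectors saw both calls.
    println!("local={} observer={}", local.total_tokens(), observer.total_tokens());
}
```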

refactor(llm): format cost usage code 6b5c738298

Why:
- Keep the issue 28 LLM cost and usage helpers aligned with workspace formatting before continuing behavior work.

What:
- Apply formatting-only cleanup to the previously approved ar-llm pricing and usage changes.

Validation:
- cargo nextest run -p ar-review review_pull_request_posts_review_with_llm_usage_cost_footer

feat(orchestrator): add review cost history column 452bed1976

Why:
- Review-history storage needs a place to persist per-review cost aggregates for issue 28.

What:
- Add a per-review cost column to the SQLite review_history table with focused coverage.

Validation:
- cargo test -p ar-orchestrator persist_sha_with_per_review_cost_aggregate -- --nocapture

feat(orchestrator): record explicit review costs 00939cffa8

Why:
- Review history needs to store the actual per-review cost estimate, not only a placeholder column.

What:
- Add a SQLite history helper to record a concrete per-review cost with the reviewed SHA.

Validation:
- cargo test -p ar-orchestrator caller_can_record_explicit_per_review_cost_and_read_it_back -- --nocapture

feat(review): expose estimated review cost 63ee31bdd5

Why:
- Orchestrator history needs the actual per-review LLM cost computed by the review pipeline.

What:
- Return the cost footer total on ReviewOutcome and update compile fallout.
- Clean up lint-only usage capture types and unused test bindings.

Validation:
- cargo test -p ar-orchestrator --no-run
- cargo clippy -p ar-review --tests -- -D warnings
- cargo test -p ar-review review_pull_request_posts_review_with_llm_usage_cost_footer -- --nocapture

feat(orchestrator): persist review outcome cost f9e2a5c80d

Why:
- Review history should attribute the actual cost computed for each successful review instead of relying on the SQLite default.

What:
- Add a cost-aware ReviewHistory recording method with a default fallback.
- Wire successful review jobs to persist ReviewOutcome estimated cost through SQLite history.

Validation:
- cargo nextest run -p ar-orchestrator run_review_job_records_review_outcome_cost_in_sqlite_history

chore(opencode): use standard codex subagents 674b708a5a

Why:
- The issue branch should run specialist subagents on gpt-5.3-codex with standard reasoning instead of the spark variant.

What:
- Add a standard gpt-5.3-codex model alias and point affected subagents at it.
- Mirror the model choice in subagent frontmatter.

Validation:
- python -m json.tool opencode.json
- just opencode-test

feat(review): allow disabling cost footer 584416ff1d

Why:
- Operators may want per-review cost attribution persisted without adding a public usage footer to every review comment.

What:
- Honor AR_REVIEW_COST_FOOTER=false by skipping the LLM usage/cost footer while keeping the default enabled.

Validation:
- cargo test -p ar-review review_pull_request_omits_llm_usage_cost_footer_when_disabled_by_env

test(review): assert disabled footer reports zero cost 09528a352c

Why:
- The disabled footer contract should cover both posted review text and the returned cost outcome.

What:
- Assert AR_REVIEW_COST_FOOTER=false yields a zero estimated review cost.

Validation:
- cargo test -p ar-review review_pull_request_omits_llm_usage_cost_footer_when_disabled_by_env -- --nocapture

feat(review): load cost pricing overrides 1632a61030

Why:
- Operators need per-review cost attribution to reflect their configured model pricing instead of built-in defaults.

What:
- Honor AR_PRICE_TABLE_PATH when estimating review footer and outcome costs, falling back to defaults if absent or invalid.
- Cover the override path with a focused review pipeline test.

Validation:
- cargo nextest run -p ar-review review_pull_request_cost_footer_uses_price_table_override_from_env_path --no-fail-fast
- cargo test -p ar-review review_pull_request_omits_llm_usage_cost_footer_when_disabled_by_env -- --nocapture
- cargo clippy -p ar-review --tests -- -D warnings

fix(review): preserve cost when footer hidden 61e4c090e1

Why:
- Per-review cost attribution should remain available for history even when operators suppress the public review footer.

What:
- Compute the review cost with the same pricing path while leaving the posted review body unchanged when AR_REVIEW_COST_FOOTER=false.

Validation:
- cargo test -p ar-review review_pull_request_omits_llm_usage_cost_footer_when_disabled_by_env
- cargo test -p ar-review review_pull_request_cost_footer_uses_price_table_override_from_env_path

docs: describe review cost attribution 7c8e20004d

Why:
- Operators need to know how per-review cost attribution is persisted and how to tune pricing/footer visibility.

What:
- Document AR_PRICE_TABLE_PATH, AR_REVIEW_COST_FOOTER, and per_review_cost_usd behavior.
- Add deployment and systemd env examples for cost attribution settings.

Validation:
- docs-operator-reviewer approved the operator docs diff

refactor(review): format cost tests 53ce258bac

Why:
- The issue branch should satisfy the workspace formatting gate before broader verification.

What:
- Apply cargo fmt output for the review cost attribution tests.

Validation:
- cargo fmt --all

test(llm): satisfy clippy in cost tests 919313eaed

Why:
- The issue branch should pass the workspace clippy gate with warnings denied.

What:
- Remove redundant PathBuf and vec conversions in LLM cost usage tests.

Validation:
- just clippy

CI / Classify changed paths (pull_request) Successful in 3s
CI / Clippy (pull_request) Has been skipped
CI / Format check (pull_request) Has been skipped
CI / Test (pull_request) Has been skipped
CI / Dependency policy (pull_request) Has been skipped
CI / Build (pull_request) Has been skipped
CI / Request auto_review semantic review (pull_request) Successful in 1s
CI / opencode plugin tests (pull_request) Successful in 8s
CI / Build PR artifacts (no token) (pull_request) Has been skipped
auto_review auto_review: no findings

auto-review approved these changes

2026-05-18 21:49:13 -07:00

auto-review left a comment

Walkthrough

LLM Cost Attribution:
- Added functionality to estimate and record LLM usage costs per review.
- Introduced environment variables AR_PRICE_TABLE_PATH and AR_REVIEW_COST_FOOTER for cost management.
- Updated documentation to reflect new configuration options.
Code Changes:
- Modified Router to include usage collection.
- Added record_with_cost method to ReviewHistory and its implementations.
- Updated tests to cover new cost attribution features.
Documentation:
- Updated deployment and operations guides to include new cost attribution settings.
- Provided examples for setting up cost attribution in systemd environments.

LLM usage and cost

Reasoning (gpt-4o) in=21391 out=361 cost=$0.112370
Cheap (gpt-4o-mini) in=609 out=51 cost=$0.000122
Estimated total USD: $0.112492 via https://api.openai.com and https://api.openai.com

This PR introduces per-review LLM cost attribution, allowing operators to understand and manage review expenses. It includes changes to capture and estimate LLM usage costs, append cost footers to reviews, and persist cost data in SQLite. The changes appear well-tested and safe to merge.
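The footer figures above can be reproduced with simple per-token arithmetic. The sketch below assumes rates of $5.00 / $15.00 per million input/output tokens for gpt-4o and $0.15 / $0.60 for gpt-4o-mini; those rates are inferred because they match the posted numbers, not confirmed as the pinned defaults in ar-llm. The `TierUsage` struct and the `cost_usd` and `footer_line` helpers are hypothetical names.

```rust
/// Token usage for one tier plus the assumed USD rates per million tokens.
struct TierUsage {
    label: &'static str,
    model: &'static str,
    input_tokens: u64,
    output_tokens: u64,
    input_usd_per_mtok: f64,
    output_usd_per_mtok: f64,
}

/// cost = (input * input_rate + output * output_rate) / 1,000,000
fn cost_usd(u: &TierUsage) -> f64 {
    (u.input_tokens as f64 * u.input_usd_per_mtok
        + u.output_tokens as f64 * u.output_usd_per_mtok)
        / 1_000_000.0
}

/// Renders one line in the same shape as the posted footer.
fn footer_line(u: &TierUsage) -> String {
    format!(
        "{} ({}) in={} out={} cost=${:.6}",
        u.label, u.model, u.input_tokens, u.output_tokens, cost_usd(u)
    )
}

fn main() {
    let tiers = [
        TierUsage {
            label: "Reasoning",
            model: "gpt-4o",
            input_tokens: 21391,
            output_tokens: 361,
            input_usd_per_mtok: 5.0, // assumed rate
            output_usd_per_mtok: 15.0, // assumed rate
        },
        TierUsage {
            label: "Cheap",
            model: "gpt-4o-mini",
            input_tokens: 609,
            output_tokens: 51,
            input_usd_per_mtok: 0.15, // assumed rate
            output_usd_per_mtok: 0.60, // assumed rate
        },
    ];
    for tier in &tiers {
        println!("{}", footer_line(tier));
    }
    let total: f64 = tiers.iter().map(cost_usd).sum();
    // Prints: Estimated total USD: $0.112492
    println!("Estimated total USD: ${total:.6}");
}
```

For the Reasoning tier this works out to (21391 × 5 + 361 × 15) / 1,000,000 = 0.112370, and for the Cheap tier (609 × 0.15 + 51 × 0.60) / 1,000,000 ≈ 0.000122, which sum to the posted total of $0.112492.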

jwilger merged commit d143b56324 into main

2026-05-18 22:19:33 -07:00

jwilger referenced this pull request from a commit

2026-05-18 22:19:33 -07:00

feat(review): attribute per-review LLM costs (#261)