feat: persist runtime state across restarts (closes #19) #25

Merged
jwilger merged 1 commit from feat/persist-state into main 2026-05-02 06:47:29 -07:00
Owner

Summary

Persistent SQLite is now the default for the four runtime stores: a restart no longer wipes learnings, review history, symbol embeddings, or webhook delivery dedup.

  • Each store opens at `$XDG_STATE_HOME/auto_review/.db` (`$HOME/.local/state/auto_review/...` fallback). Operators opt out with `=:memory:` or override the path. New env vars: `AR_VECTOR_DB`, `AR_DEDUP_DB` (existing `AR_LEARNINGS_DB` / `AR_HISTORY_DB` keep their semantics; only the default changes).
  • `VectorStore::fetch_by_keys` lets `build_review_context_with_store` reuse cached embeddings whose stored snippet matches the current source bytes — re-reviewing an open PR no longer re-embeds unchanged symbols. Content equality is the invalidation rule.
  • New `DeliveryDedup` trait abstracts in-memory LRU vs new `SqliteDeliveries` (PK on delivery id) so a Forgejo retry that arrives during or after a gateway bounce is still recognised as a duplicate. Webhook handler awaits + fail-opens on storage errors.
  • `/info` exposes the concrete backing for every store; new fields `vector` and `dedup`. `learnings` and `history` change from a tag (`"sqlite"`) to a full backing string (`"sqlite:/path/learnings.db"`). A single startup log line summarises all four.

Closes #19.

Test plan

  • `cargo nextest run --workspace` — 1012 pass (with FORGEJO_/LLM_ env vars cleared, since two pre-existing CLI tests assume an unset env)
  • `cargo clippy --workspace --all-targets -- -D warnings` — clean
  • `cargo fmt --all -- --check` — clean
  • `cargo deny check licenses bans sources` — clean
  • `nix flake check` — all five derivations pass
  • Manual smoke: bring up gateway with default env; confirm `/info` shows four `sqlite:` backings and the startup log emits the summary line
  • Manual smoke: restart gateway; confirm `SqliteDeliveries` deduplicates a redelivered webhook UUID across the bounce
  • Manual smoke: re-review the same PR twice; confirm second review skips symbol embeds in the gateway log

🤖 Generated with Claude Code

## Summary Persistent SQLite is now the default for the four runtime stores: a restart no longer wipes learnings, review history, symbol embeddings, or webhook delivery dedup. * Each store opens at \`\$XDG_STATE_HOME/auto_review/<name>.db\` (\`\$HOME/.local/state/auto_review/...\` fallback). Operators opt out with \`<VAR>=:memory:\` or override the path. New env vars: \`AR_VECTOR_DB\`, \`AR_DEDUP_DB\` (existing \`AR_LEARNINGS_DB\` / \`AR_HISTORY_DB\` keep their semantics; only the default changes). * \`VectorStore::fetch_by_keys\` lets \`build_review_context_with_store\` reuse cached embeddings whose stored snippet matches the current source bytes — re-reviewing an open PR no longer re-embeds unchanged symbols. Content equality is the invalidation rule. * New \`DeliveryDedup\` trait abstracts in-memory LRU vs new \`SqliteDeliveries\` (PK on delivery id) so a Forgejo retry that arrives during or after a gateway bounce is still recognised as a duplicate. Webhook handler awaits + fail-opens on storage errors. * \`/info\` exposes the concrete backing for every store; new fields \`vector\` and \`dedup\`. \`learnings\` and \`history\` change from a tag (\`"sqlite"\`) to a full backing string (\`"sqlite:/path/learnings.db"\`). A single startup log line summarises all four. Closes #19. ## Test plan - [x] \`cargo nextest run --workspace\` — 1012 pass (with FORGEJO_*/LLM_* env vars cleared, since two pre-existing CLI tests assume an unset env) - [x] \`cargo clippy --workspace --all-targets -- -D warnings\` — clean - [x] \`cargo fmt --all -- --check\` — clean - [x] \`cargo deny check licenses bans sources\` — clean - [x] \`nix flake check\` — all five derivations pass - [ ] Manual smoke: bring up gateway with default env; confirm \`/info\` shows four \`sqlite:<path>\` backings and the startup log emits the summary line - [ ] Manual smoke: restart gateway; confirm \`SqliteDeliveries\` deduplicates a redelivered webhook UUID across the bounce - [ ] Manual smoke: re-review the same PR twice; confirm second review skips symbol embeds in the gateway log 🤖 Generated with [Claude Code](https://claude.com/claude-code)
feat: persist runtime state across restarts (closes #19)
All checks were successful
auto_review auto_review: no findings
CI / Nix flake check (pull_request) Successful in 3m25s
0094005744
Persistent SQLite is now the default for the four runtime stores: a
restart no longer wipes learnings, review history, symbol embeddings,
or webhook delivery dedup. The motivation — every restart was a
silent reset of the bot's working memory, costing real tokens
(re-embedding the full workspace on every PR), real correctness
(redeliveries dispatched a second time, full re-reviews on open
PRs), and real user-visible behaviour (`remember`/`forget` evaporated).

* Each store opens at `$XDG_STATE_HOME/auto_review/<name>.db`
  (`$HOME/.local/state/auto_review/...` fallback). Operators opt out
  with `<VAR>=:memory:` or override the path. New env vars
  `AR_VECTOR_DB` and `AR_DEDUP_DB` join the existing
  `AR_LEARNINGS_DB` / `AR_HISTORY_DB` with uniform semantics.
* `VectorStore::fetch_by_keys` lets `build_review_context_with_store`
  reuse cached embeddings whose stored snippet matches the current
  source bytes — re-reviewing an open PR no longer re-embeds the
  unchanged symbols. Content equality is the invalidation rule
  (no schema or mtime coupling). The shared store threads through
  `SpawningDispatcher::with_vector_store` → `run_review_job`.
* `DeliveryDedup` trait abstracts in-memory LRU vs new
  `SqliteDeliveries` (PK on delivery id) so a Forgejo retry that
  arrives during or after a gateway bounce is still recognised as
  a duplicate. Webhook handler awaits + fail-opens on storage errors.
* `/info` exposes the concrete backing for every store
  (`sqlite:<path>` or `in-memory(...)`); a single startup log line
  summarises the four backings so an operator confirms config at a
  glance without diffing four separate lines.

Closes #19.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
auto-review left a comment

This PR introduces persistent SQLite storage for runtime state, enhancing the robustness of the application by maintaining state across restarts. The changes appear well-structured and beneficial, with appropriate abstractions and tests. Ensure that the new dependencies and environment variable configurations are well-documented and compatible with existing systems.

Walkthrough

  • CHANGELOG.md: Provides a detailed overview of the new persistence features, including default paths and environment variable configurations.
  • Cargo.lock: Updates dependencies to include sqlx and tempfile, necessary for SQLite operations.
  • crates/ar-gateway/src/config.rs: Introduces a new module for resolving database backings, enhancing configurability and testability.
  • crates/ar-gateway/src/dedup.rs: Implements a new DeliveryDedup trait with both in-memory and SQLite-backed implementations, improving webhook deduplication.
  • crates/ar-gateway/src/main.rs: Modifies the main application logic to initialize persistent storage backings, ensuring state persistence across restarts.
  • crates/ar-index/src/vector_store.rs: Adds a fetch_by_keys method to the VectorStore trait, optimizing symbol embedding retrieval.
  • crates/ar-orchestrator/src/dispatcher.rs: Updates the dispatcher to support a shared vector store, reducing redundant computations.
  • crates/ar-review/src/context_builder.rs: Enhances the context builder to support persistent symbol embeddings, improving performance.

Pre-merge checks

  • CHANGELOG updated — CHANGELOG.md is in the diff
  • Tests touched — source changed but no test file appears in the diff
  • No new TODO/FIXME comments — no new TODO/FIXME markers
This PR introduces persistent SQLite storage for runtime state, enhancing the robustness of the application by maintaining state across restarts. The changes appear well-structured and beneficial, with appropriate abstractions and tests. Ensure that the new dependencies and environment variable configurations are well-documented and compatible with existing systems. ## Walkthrough - **CHANGELOG.md**: Provides a detailed overview of the new persistence features, including default paths and environment variable configurations. - **Cargo.lock**: Updates dependencies to include `sqlx` and `tempfile`, necessary for SQLite operations. - **crates/ar-gateway/src/config.rs**: Introduces a new module for resolving database backings, enhancing configurability and testability. - **crates/ar-gateway/src/dedup.rs**: Implements a new `DeliveryDedup` trait with both in-memory and SQLite-backed implementations, improving webhook deduplication. - **crates/ar-gateway/src/main.rs**: Modifies the main application logic to initialize persistent storage backings, ensuring state persistence across restarts. - **crates/ar-index/src/vector_store.rs**: Adds a `fetch_by_keys` method to the `VectorStore` trait, optimizing symbol embedding retrieval. - **crates/ar-orchestrator/src/dispatcher.rs**: Updates the dispatcher to support a shared vector store, reducing redundant computations. - **crates/ar-review/src/context_builder.rs**: Enhances the context builder to support persistent symbol embeddings, improving performance. ## Pre-merge checks - [x] CHANGELOG updated — CHANGELOG.md is in the diff - [ ] Tests touched — source changed but no test file appears in the diff - [x] No new TODO/FIXME comments — no new TODO/FIXME markers
jwilger force-pushed feat/persist-state from 0094005744
All checks were successful
auto_review auto_review: no findings
CI / Nix flake check (pull_request) Successful in 3m25s
to a88d7eb725
All checks were successful
CI / Nix flake check (pull_request) Successful in 1m42s
auto_review auto_review: no findings
2026-05-01 23:44:59 -07:00
Compare
auto-review left a comment

This PR introduces persistent SQLite storage for runtime state, enhancing the resilience of learnings, review history, symbol embeddings, and webhook delivery deduplication across restarts. The changes appear well-implemented and safe to merge, with comprehensive test coverage and documentation.

Pre-merge checks

  • CHANGELOG updated — CHANGELOG.md is in the diff
  • Tests touched — source changed but no test file appears in the diff
  • No new TODO/FIXME comments — no new TODO/FIXME markers
This PR introduces persistent SQLite storage for runtime state, enhancing the resilience of learnings, review history, symbol embeddings, and webhook delivery deduplication across restarts. The changes appear well-implemented and safe to merge, with comprehensive test coverage and documentation. ## Pre-merge checks - [x] CHANGELOG updated — CHANGELOG.md is in the diff - [ ] Tests touched — source changed but no test file appears in the diff - [x] No new TODO/FIXME comments — no new TODO/FIXME markers
jwilger deleted branch feat/persist-state 2026-05-02 06:47:30 -07:00
Sign in to join this conversation.
No reviewers
No labels
No milestone
No project
No assignees
2 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
jwilger/auto_review!25
No description provided.