Persist runtime state (learnings, review history, vector index, webhook dedup) across restarts #19
Problem
State the gateway accumulates during operation does not survive a restart unless the operator opts in via env vars, and one of the most expensive pieces of state, the symbol embedding index, has no persistent backing wired into the review pipeline at all. Every `systemctl restart auto_review` (or, in our current dev workflow, every push to `main` that the watcher picks up) effectively resets the bot's working memory.

Current behaviour
- `crates/ar-gateway/src/main.rs:174-211` defaults both stores to the in-memory implementations:
  - `learnings` → `InMemoryLearningsStore` unless `AR_LEARNINGS_DB` is set
  - `review_history` → `InMemoryReviewHistory` unless `AR_HISTORY_DB` is set

  `GatewayInfo` (`main.rs:297-306`) reports `"in-memory"` for both in `/info` so this is observable, but operators have to read the docs to discover the env-var opt-in.
- `crates/ar-review/src/context_builder.rs:134` constructs a fresh `InMemoryVectorStore::new()` on every review; `SqliteVectorStore` exists in `ar-index` but is never wired in. Result: we re-walk and re-embed the entire workspace on every PR even though the symbol set rarely changes between reviews of the same repo.
- `crates/ar-gateway/src/main.rs:329-337` keeps webhook delivery dedup (`RecentDeliveries`) in an in-memory LRU. After a restart, in-flight redeliveries from Forgejo can trigger duplicate reviews of the same SHA.
- `crates/ar-orchestrator/src/dispatcher.rs:156` defaults the dispatcher's history to `InMemoryReviewHistory`, which is the right factory default but means the gateway has to remember to override it.

Why this matters
- `ReviewHistory.last_reviewed`: after a restart, every open PR's next webhook becomes a full review again, which means wasted tokens and duplicate inline comments on lines that haven't changed.
- `remember`/`forget` chat commands write to `LearningsStore`. With the in-memory default, anything the bot was asked to remember evaporates on restart.

Proposed direction
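As a rough illustration of the default-location idea in this section, here is a minimal sketch of how one store's location might be resolved. `StoreLocation` and `resolve_store` are hypothetical names, not existing code in the gateway. The fallback to `$HOME/.local/state` follows the XDG Base Directory default for `XDG_STATE_HOME`. Falling back to in-memory when neither variable is set is an assumption made only for this sketch.

```rust
use std::path::PathBuf;

/// Where a store keeps its data (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
enum StoreLocation {
    /// Explicit opt-out, or no usable state directory.
    InMemory,
    /// SQLite file on disk.
    File(PathBuf),
}

/// Resolve the location for one store (`learnings`, `history`, `vector`).
///
/// * `explicit`: value of the per-store variable (e.g. `AR_LEARNINGS_DB`), if set
/// * `xdg_state_home`: value of `XDG_STATE_HOME`, if set
/// * `home`: value of `HOME`, if set
fn resolve_store(
    name: &str,
    explicit: Option<&str>,
    xdg_state_home: Option<&str>,
    home: Option<&str>,
) -> StoreLocation {
    match explicit {
        // The operator explicitly opted out of persistence.
        Some(":memory:") => StoreLocation::InMemory,
        // The operator chose a path; use it as given.
        Some(path) if !path.is_empty() => StoreLocation::File(PathBuf::from(path)),
        // Nothing set (or empty): fall back to the state directory.
        _ => {
            let base = match xdg_state_home.filter(|s| !s.is_empty()) {
                Some(dir) => PathBuf::from(dir),
                None => match home.filter(|s| !s.is_empty()) {
                    // XDG default for the state directory.
                    Some(h) => PathBuf::from(h).join(".local").join("state"),
                    // Assumption for this sketch: with nowhere to write,
                    // stay in memory rather than fail.
                    None => return StoreLocation::InMemory,
                },
            };
            StoreLocation::File(base.join("auto_review").join(format!("{name}.db")))
        }
    }
}
```

At startup the caller would read each variable with `std::env::var(..).ok()` and pass it in via `as_deref()`. Keeping the function free of direct environment access makes the resolution rules easy to unit-test.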
- Pick a default state location (e.g. `$XDG_STATE_HOME/auto_review/{learnings,history,vector}.db`) and open SQLite there unless the operator explicitly opts out via `AR_*_DB=:memory:` or similar.
- Wire `SqliteVectorStore` into `embed_and_query_symbols` (or thread a shared store through `BuildReviewContext`) so symbol embeddings persist across reviews and restarts. Decide on an invalidation strategy keyed off file mtime or content hash so stale entries don't accumulate.
- Persist `RecentDeliveries` (or move dedup into the `review_history` table keyed by `delivery_id`) so post-restart redeliveries are still suppressed.
- Report the active backend in `/info` and structured startup logs so operators can confirm at a glance.

Out of scope