Review-quality gaps surfaced by dogfooding (PR #4) #5

Closed
opened 2026-05-01 13:56:41 -07:00 by auto-review · 1 comment
Collaborator

Closed in favour of separate per-concern issues — see linked tickets below. Keeping the original body intact for the dogfooding context.

Author
Collaborator

Adding a design note from continued dogfooding:

The note tier is for the LLM, not the human. Severity=note observations seem to actively help the LLM produce a more thorough review pass — they force it to externalize what it noticed about the diff. But those same observations are noise on the PR. The 16 notes across reviews 709–711 are the failure mode: every one is a restatement of "this change does X."

Proposed fix, lighter than rewriting the prompt:

  • Default AR_SEVERITY_FLOOR=warning instead of note. Notes still appear in the schema (so the LLM can emit them as scratchpad reasoning) but the post-step filter drops them before anything reaches Forgejo.
  • Operators who want notes can opt back in by setting AR_SEVERITY_FLOOR=note.
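
The post-step filter described in the two bullets above could look roughly like the sketch below. It is an illustration under assumptions, not the project's actual code: only `AR_SEVERITY_FLOOR` and the `note` and `warning` levels come from this thread, while the `error` level, the function names, and the dict-shaped findings are hypothetical.

```python
import os

# Assumed severity ranking, lowest first. Only "note" and "warning" are
# named in the thread; "error" is a hypothetical higher tier.
SEVERITY_RANK = {"note": 0, "warning": 1, "error": 2}


def severity_floor(env=None):
    """Read AR_SEVERITY_FLOOR, defaulting to "warning" as proposed."""
    env = os.environ if env is None else env
    floor = env.get("AR_SEVERITY_FLOOR", "warning").strip().lower()
    if floor not in SEVERITY_RANK:
        raise ValueError(f"unknown AR_SEVERITY_FLOOR: {floor!r}")
    return floor


def filter_findings(findings, floor):
    """Drop findings below the floor before anything is posted.

    The LLM may still emit notes in its structured output; they are
    discarded here rather than suppressed via the prompt.
    """
    threshold = SEVERITY_RANK[floor]
    return [f for f in findings if SEVERITY_RANK[f["severity"]] >= threshold]
```

With the proposed default, a finding whose severity is `note` is dropped and only `warning` and above survive; setting `AR_SEVERITY_FLOOR=note` lets everything through again.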

This sidesteps having to teach the LLM "don't be noisy" via the prompt, which is much harder, and recovers most of section 1's value mechanically. Combined with section 2 (pre-merge fail → RequestChanges) and section 3 (dedup-on-resolved), the bot's output should converge on "silent unless there's something actionable."


Reference
jwilger/auto_review#5