On Fri, 12 Jun 2026 15:08:07 -0400 Matthew Gee <[email protected]> wrote:
> Previous review-patch.py would detect and report and error or warning > based off of the occurrence of the headers of the error and warning > sections. This led to consistent false positives as often AI reviewers > will create the header but put "none" or similar filler text within > the following body. > > This patch updates the code in order to check if the AI review has a > body with error or warnings to fix and not just filler text. This is > done by querying multiple lines at once and adjusting the regex to > filter out headers followed only by filler text or formatting in the > review. > > These changes were tested against 10+ AI review outputs with several > variations in formatting and filler text in order to catch a good > variety of cases to make sure code reviews with actual errors or > warnings are caught and not missed. > > Signed-off-by: Matthew Gee <[email protected]> > --- AI saw some stuff which I missed. Matthew, The windowing (tee/islice/chain) is aligned correctly, but the new matching has a few problems. The bigger issue is that this only recognizes markdown '#' headers now. The old scan matched an optional '#' prefix, '**' bold, and '<h1-3>' too. rgx_should_match requires "#+\s", so plain-text "Errors:" / "**Errors**" and HTML "<h2>Errors</h2>" no longer match. Default --format is text and html is supported, so reviews in those formats classify clean (exit 0) even when they report errors, which silently breaks the 2/3 exit codes compare-patch-reviews.sh relies on. The '#'/'**'/'<h>' alternatives need to stay. The 3-line concatenation also reintroduces the false positive you're trying to kill. curr+next+next_next are joined with no separator, so ## Errors None ## Warnings collapses to "## errorsnone## warnings"; the 'none...$' filler anchor fails because the next header follows, and has_errors gets set. Any output that doesn't blank-line-separate sections hits this. And in the other direction, content more than two lines below a header is missed: two blank lines between "## Errors" and the body leave stripped == "## errors", the '$' branch matches, and a real finding is suppressed. The heuristic ends up sensitive to exact line spacing both ways. Minor: iter[str] / iter[str | None] aren't valid generics (use the already-imported Iterator); the vars are also re-annotated with conflicting types and annotate the iterators rather than the loop targets. mypy will reject these. Run black too - the new for/zip line and the stripped assignment are over width. Suggest keeping header matching in all three formats, then scanning past blank lines to the first non-empty body line and suppressing only if it's filler (none/n/a/empty). That decouples detection from spacing instead of fixing it to a 3-line shape.

