The GitHub Copilot reviews run on my PRs yesterday certainly helped: they 
already pinpoint a lot of issues the reviewer would otherwise have to find, 
and they can spot things we as humans might not directly think of. Not all 
remarks are spot on, but most of them are, so I just downvote the ones that 
are not helpful. So yes, that is a good thing to have, imho.
________________________________
From: Jarek Potiuk <[email protected]>
Sent: Friday, April 3, 2026 14:12
To: [email protected] <[email protected]>
Subject: [DISCUSS] Current auto-triage stats/learnings and ask for maintainers

*TL;DR: After a long triage session yesterday, I have a kind request for
maintainers: if you could look at "ready for maintainer review" PRs in your
areas, that would be great.*

PRs marked "ready for maintainer review" have passed initial triage and
require your attention. The tool's main focus now is to move any pull
requests that simply fail initial validation out of your view (Draft and
Close). Also, Kaxil ran GitHub Copilot reviews on many of those "ready" ones
(and some others) yesterday, and we want to see if that is helpful for
reviews as well.

I have been busy the last few weeks and it took longer than expected, but
yesterday I finally completed a four-week loop of trying it and triaged all
(!) 500+ open PRs in about 4 hours. During this triage, the number of open
PRs decreased from 540 to 492. The average triage time was about 1 minute
per PR during a focused session. Many PRs were skipped automatically because
they do not need triage. Triage is only for PRs from non-collaborators and
for already triaged PRs that might need some action (like Draft / Not
responded / Closed).

More details and instructions on how to provide feedback follow for
interested parties.

------

*# Feedback*

I would also love to hear specific feedback from maintainers, reviewers,
and contributors. I created the #auto-triage-feedback channel
<https://apache-airflow.slack.com/archives/C0AQNS4DV2A> - I do not promise
to engage with all feedback if there is a lot, as the tools are in early
stages, meaning there might be many issues and areas for improvement. At
this stage let's gather the feedback and try to refine it so more people
can use it regularly, and possibly we can automate it further. Or maybe
we will even learn that it does not help at all, and only gets in the way.

*# Findings so far*

Some current findings (See the stats below):

* We have about 80% of our PRs currently coming from external contributors
(i.e. non-committers, non-collaborators) - that's a lot of work for
maintainers

* About 40% of the PRs marked as "done" are already merged (which is good),
and most of those received responses and incorporated the triage comments.
Which is cool.

* About 60% of them were closed without being merged - some immediately, but
mostly following this path: Draft -> Triage -> No response (more than 2
weeks) -> Close. This means those are really drive-by contributors.

* We have 127 PRs that seem ready for the "ready for maintainer review"
label. It would be great if in your reviews of contributor issues in "your"
areas you focus on those.

*# Current status of the tool*
I have not asked others to participate much yet, but if anyone wants to
try it, feel free to start using it - with `breeze pr auto-triage
--reviews-for-me`. This will only select PRs where CODEOWNERS
automatically sets you as a reviewer or where you are mentioned.

But for that, https://github.com/apache/airflow/pull/64669 will have to be
merged. Yes, a lot of changes and tweaks have accumulated - sorry for such a
huge PR. This will likely finally stabilise and I will refactor the
algorithms, split them into smaller pieces, and then we can proceed with
more incremental updates. Next week I am at the PyCon LT conference but I
will focus mostly on incremental triaging and tweaking.

It includes cumulative learning from about 20 smaller triage sessions I've
done in the past weeks. I also have a few things to add after yesterday's
longer session, specifically cleaning up the algorithmic choices to better
determine default actions.

The tool is not perfect yet, and requires careful choices, especially since
we still have many flakes. I had to do more manual assessment than I would
like to - I hope we can stabilise them after the 3.0.0 release, make the
tool more useful, and get it ready for others to participate. I am
also going to look at the responses - I guess in some cases the triage was
"unfair", and I am trying to optimise it. It's still far from full
automation; it requires close human supervision (as expected at this
stage).

I am iterating **fast** on it - learning with every triage run while also
doing other things. I will try to make it really simple to follow.
We have a TUI mode that is good for testing and debugging (and possibly
later for a focused review mode - which we already have but it's not as
useful) - but I found the CLI mode far more useful overall. TUI is far too
much of a distraction - but it might be cool if you want to focus on smaller
groups of PRs to review, and later it can help with review - and we have Andre
Ahlert, who already contributes some nice improvements there.

*# Stats*

I've also built the `pr stats` command:
https://github.com/apache/airflow/pull/64667 - happy to receive reviews,
and this stats command still needs some tweaking and improvement, which
will follow.

I have also built stats and track the current status of triaged
collaborator PRs.

*## Triaged "final" state:*

In short, 40 out of 102 have already been merged after responding to triage,
62 have been closed without merging (no response to triage).

https://ibb.co/Lz2pj259 - image was too large to attach
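
As a quick sanity check, the "about 40%" and "about 60%" figures in the findings follow from these counts. This is only a sketch of the arithmetic; the variable and function names are made up for illustration and are not part of the `pr stats` command.

```python
# Final-state counts as quoted in this mail (not read from the tool's output).
merged = 40
closed_unmerged = 62
total_done = merged + closed_unmerged  # 102 triaged PRs in a "final" state


def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


merged_pct = share(merged, total_done)
closed_pct = share(closed_unmerged, total_done)
print(f"merged: {merged_pct}%, closed without merge: {closed_pct}%")
# prints: merged: 39.2%, closed without merge: 60.8%
```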

*## Current open PRs status*

* As of yesterday we had 492 open PRs
* 400 of those are contributor PRs
* 126 of those are "ready for maintainer review"
* 200 of those are already drafted and triaged, waiting for the
contributor's response (128) or they are simply unfinished drafts.
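
To put those counts in proportion, here is a small sketch expressing each one as a share of all 492 open PRs. The denominator is an assumption made for illustration: "of those" in the list could also be read as relative to the 400 contributor PRs, and the labels are hypothetical, not output of the `pr stats` command.

```python
# Counts as quoted in the list above; all are treated as subsets of open PRs.
open_prs = 492
counts = {
    "contributor PRs": 400,
    "ready for maintainer review": 126,
    "drafted and triaged": 200,
}

# Share of each group relative to all open PRs (assumed denominator).
shares = {label: round(100 * n / open_prs, 1) for label, n in counts.items()}

for label, pct in shares.items():
    print(f"{label}: {pct}% of open PRs")
```

The contributor share comes out at roughly 81%, in line with the "about 80%" mentioned in the findings.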

Those stats will change daily - and some things might be missing there; I
will track those and add them over the coming days (after Easter).

https://ibb.co/NHzdrQx - image was too large to attach

Also - if you have general feedback and comments on it - feel free to share.

I will pick it up after Easter - and for those who celebrate Easter, have a
happy, AI-free, family-focused one.

I certainly plan it this way.

J.
