Thanks Dave. I do believe ATR is one of the best things that has happened to
the ASF since sliced bread :D and I try to stay very, very up-to-date on
where we are and where we are heading.

But to add to it - a very interesting discussion, SUPER relevant to the one
we **just** had (Justin - VERY MUCH to your point):

https://github.com/apache/airflow/pull/58578

There we are discussing two different approaches to our release
verification:

1) a fully automated check of everything as a single command returning
"OK/NOK" (full automation with the Python `breeze` tool we have in Airflow)
2) the current way, where each of the checks is a separate step (also
somewhat automated, but many of those steps are just copy-and-pasted shell
commands that run the same checks one at a time)
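To make the contrast concrete, here is a minimal Python sketch of the two
styles. The check functions and names are placeholders I made up for
illustration - they are not the actual breeze checks:

```python
# Illustrative only: each check function is a stand-in for a real
# verification step (signature verification, checksum validation, etc.).

def check_signatures(path: str) -> bool:
    return True  # placeholder: e.g. gpg --verify on each artifact

def check_checksums(path: str) -> bool:
    return True  # placeholder: e.g. sha512sum -c on the checksum files

def check_licenses(path: str) -> bool:
    return True  # placeholder: e.g. a RAT run over the source tree

CHECKS = [check_signatures, check_checksums, check_licenses]

def verify_all(path: str) -> str:
    """Style 1: one command, a single aggregated OK/NOK verdict."""
    return "OK" if all(check(path) for check in CHECKS) else "NOK"

def verify_step_by_step(path: str):
    """Style 2: run the checks one at a time, reporting each result
    separately so the RM can pause and inspect between steps."""
    for check in CHECKS:
        yield check.__name__, check(path)
```

The trade-off discussed in the PR is exactly the difference between these
two entry points: the first optimises for speed and consistency, the second
keeps the human in the loop at every step.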

Different people might have different opinions on those. I have mine, but I
am not 100% sure it's correct. My last comment (also in line with what
Justin wrote): there is value in "manual" checks, because they allow the RM
to pause and inspect things rather than rely on a fully automated script
that does everything. This is very much my philosophy - that removing the
"human-in-the-loop" hits a point of diminishing returns, and that when you
remove "too much of the human" it might have negative effects. I do not
know this for sure; it's my belief, and I have no hard data to back it up.
Currently no one in the ASF has that data.

So yes, I am a big fan of automating things, but I am also aware of the
limits. I think there is HUGE added value in ATR - once we deploy it and
make our PMCs want to use it to simplify their work, we (the ASF) get a
fantastic "single point of control" where we can experiment and try
different things as part of the workflow. This gives us a fantastic way not
only to rely on things we "believe" are good, but on things we can actually
check - we can ask our users (the PMCs) to participate and try different
things, we can "enforce" other things, and we can track misbehaviours and
gather some stats about them.

The ASF, thanks to that tool, might get a way to influence, monitor and
sometimes enforce the most important part of our mission: "Release software
for the public good". And we can start measuring things and getting some
data to back up our beliefs.

J,



On Sat, Nov 22, 2025 at 8:54 PM Dave Fisher <[email protected]> wrote:

> Hi Jarek,
>
> Thanks for all your help in testing ATR and your excellent understanding
> of the current state.
>
> > On Nov 22, 2025, at 8:07 AM, Jarek Potiuk <[email protected]> wrote:
> >
> > I think even if ATR does not **currently** support more checks than
> > **basic** checks for binary releases, there is absolutely nothing wrong
> > in adding them there. ATR will (hopefully) be one of the most commonly
> > used tools in the ASF, and we have a tooling team that supports its
> > development and maintenance. All the code is super-easy Python code
> > using modern standards, with uv to run the tooling, and if anyone would
> > like to contribute a check for certain artifact types - like PyPI RCs -
> > I am 100% sure Sean and Dave and others who are already contributing
> > and adding issues and tools will be super happy to accept it.
>
> We will definitely be adding checks. One that will be important is SBOM
> validation, which will go a long way toward being able to check
> dependency licenses.
>
>
> >
> > What my post mostly meant to suggest is that very soon we will have a
> > common "platform" for release verification - we (the ASF) already do
> > basic checks with ATR on our binary artifacts, we already use RAT from
> > Creadur mentioned above for licence checking, and there is **absolutely
> > no reason** anyone here could not add a new check there - I am sure
> > contributions will be very welcome. My cooperation with the tooling
> > team has been nothing but stellar.
>
> The RAT check we have in place includes Creadur’s recent developments.
> One area we need to refactor is the handling of exclude patterns. Another
> is that we are discussing moving that check to a GitHub Actions workflow.
> If anyone has some Python and GitHub API experience, your help would be
> awesome.
>
> Feel free to share any and all ideas for new checks either in
> https://github.com/apache/tooling-trusted-releases/ or
> [email protected]
>
> Thanks PJ for sharing some of your ideas already.
>
> >
> > So my main point is that if there are ideas for how to improve this
> > "common platform" we are going to have, which is already plugging into
> > our release process - they are absolutely welcome, but ideally they
> > should be added to ATR rather than developed separately. It could also
> > - of course - be developed separately in Creadur (like RAT is) and used
> > in ATR, but I think having those checks integrated with ATR all but
> > guarantees that they are going to be useful across the whole ASF.
> >
> > That's all I wanted to stress. I may have taken a bit of a defensive
> > approach when I mentioned ATR, but that was more "Hey - we have this
> > great platform for releases which is already funded by Alpha-Omega and
> > driven by a board decision, so we should rather work on strengthening
> > it and adding things to something that is **precisely** targeted at
> > automating the workflow mentioned here - the one that **is in need of
> > automation**".
> >
> > Yes, it is, and we have an ASF-wide effort to improve exactly that
> > workflow - one that the board not only recognised, secured funds for
> > and staffed, but that also (in a recent conversation with some board
> > members) has been named the absolute game-changer for the ASF (which I
> > 100% agree with).
> >
> > So ... let's do it as a combined effort - as simple as that :) .
> >
> > J.
> >
> >
> >
> > On Sat, Nov 22, 2025 at 3:20 PM sebb <[email protected]> wrote:
> >
> >> On Sat, 22 Nov 2025 at 14:03, PJ Fanning <[email protected]> wrote:
> >>>
> >>> My issue is not really about the source release - there is some
> >>> tooling, and typically the review checks are done at vote time.
> >>> Here is a check that might be useful to automate and that can't be
> >>> properly done without automation: does the source code in the tarball
> >>> match what is announced as the git commit? If there is a pre-existing
> >>> tool that does that check, I'd love to use it.
> >>
> >> I agree that this is vital, as the tarballs are generally created from
> >> whatever happens to be in the source directories.
> >> It's very easy for spurious files to be added to the tarball, e.g.
> >> files left over from testing.
> >> An exact match is not necessary, so long as every file in the source
> >> tarball can be derived from the source tag.
> >>
> >> I have used diff -r in the past, and some editors can show recursive
> >> directory differences.
> >>
> >>> My issue is really with the convenience binaries. Are reviewers really
> >>> unzipping jar files to check the contents and checking the text in the
> >>> pom files?
> >>>
> >>> What format are the pypi RCs supposed to be in? Are we sure that the
> >>> apache prefix appears in the target pypi project?
> >>>
> >>> And the big binary tarballs that some teams ship, full of jars or
> >>> other compiled components? Those can be a real time sink to review
> >>> manually.
> >>>
> >>> Some reviewers do these convenience binary checks and maybe it's my
> >>> bad luck to try checking on votes but I see a lot of issues when I
> >>> review convenience binaries.
> >>>
> >>>
> >>>
> >>> On Sat, 22 Nov 2025 at 14:49, tison <[email protected]> wrote:
> >>>>
> >>>>> a mention of a GPL license can be fine
> >>>>
> >>>> Typically, you'd end up with an allow list, like [1][2]
> >>>>
> >>>> [1]
> >>
> https://github.com/apache/flink/blob/d0c9ed9ff47cd0f0fae62958521a0b18e5cd9bf3/tools/ci/flink-ci-tools/src/main/java/org/apache/flink/tools/ci/licensecheck/JarFileChecker.java#L194-L260
> >>>> [2]
> >>
> https://github.com/apache/opendal/blob/c35da0d92442756d5742eaf70a2259dd23621b53/deny.toml#L28-L48
> >>>>
> >>>> Best,
> >>>> tison.
> >>>>
> >>>> <[email protected]> 于2025年11月22日周六 21:44写道:
> >>>>>
> >>>>> Hi,
> >>>>>
> >>>>> One extra point that is worth mentioning. On several occasions, I’ve
> >> seen automation give a false sense of security. A tool reports
> everything
> >> as clean, and people assume the release is fine when it is not. It’s
> only
> >> when humans look deeper that a serious issue is discovered. For
> example, a
> >> mention of a GPL license can be fine, depending on the context, and
> >> automation is unlikely to detect it.
> >>>>>
> >>>>> Kind Regards.
> >>>>>
> >>>>> Justin
> >>>>
> >>>> ---------------------------------------------------------------------
> >>>> To unsubscribe, e-mail: [email protected]
> >>>> For additional commands, e-mail: [email protected]
> >>>>
> >>>
> >>
>
>
