Hi Simon,

On Mon, May 25, 2026 at 04:25:54PM +0200, Simon Josefsson wrote:
> I have made the above kind of mistake embarrassingly often lately.  A
> cross-package file conflict seems like a simple problem that our QA
> tooling should have caught for me earlier than waiting for reports.

Yes, that thinking is quite popular, but what looks simple turns out not
to be, at least in my experience.

> I would prefer a Salsa pipeline job to detect these file conflicts.
> Second to that, a lintian check.  Third, some testing migration blocking
> check that ends up visible in the PTS.  Maybe the third really ought to
> be implemented anyway.
> 
> But how to implement such a check?

We can readily rule out lintian. It focuses on a source package and/or a
build for one architecture. Most file conflicts (Multi-Arch or not)
require several builds (of different or same packages, often covering
different suites).

The problem here is that this kind of check is not a package-level
check; it is a vendor-level check covering several releases and their
interaction.

> Helmut, is your code to detect this situation available somewhere?  I
> guess it involves an online check at some Debian service that knows
> about all files shipped by all (current) packages?  Your (excellent) bug
> reports about these findings suggest it is all automated somehow.

I started experimenting with what is available in
https://salsa.debian.org/helmutg/dacpa. Unfortunately, it turns out that
there are very many tricky corner cases that look obvious but are not.
I'm glad that you call my reports excellent, but that's largely due to
me double checking every single one of them.

Roughly speaking what you need here is:
 * Every single file installed into every single package of every
   architecture of a released suite. This is largely covered by Contents
   indices as you point out.
 * Also all symbolic links. Contents do not tell you about file types.
 * Also all directories. Contents do not include directories.
 * Ideally also file modes and ownership information.
 * Also checksums of all files (for M-A:same). Contents don't give
   these.
 * Lots of package relations (Conflicts, Pre-Depends, Provides,
   Replaces, ...). Otherwise you'll be generating a lot of false
   positives.
 * Knowledge of what diversions packages install. Again, not doing this
   causes lots of false positives. Keep in mind that diversions are not
   declarative.
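
To make the list above a bit more concrete, here is a minimal sketch
of the kind of query such a database enables. It is not taken from
dacpa; the table layout, column names and sample packages (foo, bar,
baz, qux, libx) are all made up for illustration. It also ignores
versioned relations, Provides and diversions, so in practice it would
still produce false positives.

```python
import sqlite3

# Hypothetical schema; real tooling needs far more detail.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE content (
    package TEXT, architecture TEXT, path TEXT,
    filetype TEXT,  -- 'file', 'dir' or 'symlink'
    checksum TEXT   -- NULL for directories and symlinks
);
CREATE TABLE relation (package TEXT, kind TEXT, target TEXT);
CREATE TABLE multiarch_same (package TEXT);
""")
db.executemany("INSERT INTO content VALUES (?, ?, ?, ?, ?)", [
    ("foo", "amd64", "/usr/bin/tool", "file", "aa"),
    ("bar", "amd64", "/usr/bin/tool", "file", "bb"),
    ("foo", "amd64", "/usr/share/doc", "dir", None),
    ("bar", "amd64", "/usr/share/doc", "dir", None),
    ("baz", "amd64", "/usr/bin/other", "file", "cc"),
    ("qux", "amd64", "/usr/bin/other", "file", "dd"),
    ("libx", "amd64", "/usr/share/libx/data", "file", "11"),
    ("libx", "arm64", "/usr/share/libx/data", "file", "22"),
])
db.execute("INSERT INTO relation VALUES ('qux', 'Replaces', 'baz')")
db.execute("INSERT INTO multiarch_same VALUES ('libx')")


def unexplained_conflicts(db):
    """Pairs of distinct packages sharing a path that is not a directory
    in both, with no Conflicts or Replaces declared in either direction."""
    return db.execute("""
        SELECT DISTINCT a.package, b.package, a.path
        FROM content a JOIN content b
          ON a.path = b.path AND a.package < b.package
        WHERE NOT (a.filetype = 'dir' AND b.filetype = 'dir')
          AND NOT EXISTS (
            SELECT 1 FROM relation r
            WHERE r.kind IN ('Conflicts', 'Replaces')
              AND ((r.package = a.package AND r.target = b.package)
                OR (r.package = b.package AND r.target = a.package)))
        ORDER BY a.package, b.package, a.path
    """).fetchall()


def multiarch_mismatches(db):
    """Paths of M-A:same packages whose checksum differs across
    architectures."""
    return db.execute("""
        SELECT DISTINCT a.package, a.path
        FROM content a
        JOIN content b
          ON a.package = b.package AND a.path = b.path
         AND a.architecture < b.architecture
        JOIN multiarch_same m ON m.package = a.package
        WHERE a.checksum IS NOT b.checksum
        ORDER BY a.package, a.path
    """).fetchall()


print(unexplained_conflicts(db))
print(multiarch_mismatches(db))
```

Note how the shared directory and the pair covered by Replaces drop
out only because file types and relations are present in the data.
With path names alone, both would be reported.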

Let me argue that no, this data is not readily available for download
from our mirrors or from some other available database such as UDD.
Acquiring this data is not utterly difficult and it all tends to fit into
a little 15GB sqlite3 file.
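
For comparison, this is roughly all that a Contents index gives you
per line: a path and the packages shipping it. The sketch below
assumes the usual layout of a path, whitespace, then a comma-separated
list of section-qualified package names. The function names and sample
lines are made up for illustration.

```python
from collections import defaultdict


def parse_contents_line(line):
    """Split one Contents line into (path, [package, ...]).

    Paths may contain spaces, so split on the last run of whitespace.
    """
    path, qualified = line.rstrip("\n").rsplit(None, 1)
    # Qualified names look like "section/package" or
    # "area/section/package"; keep only the package name.
    return path, [q.rsplit("/", 1)[-1] for q in qualified.split(",")]


def shared_paths(lines):
    """Map each path listed for more than one package to its packages."""
    owners = defaultdict(set)
    for line in lines:
        path, packages = parse_contents_line(line)
        owners[path].update(packages)
    return {p: sorted(o) for p, o in owners.items() if len(o) > 1}


sample = [
    "usr/bin/tool\t\t    utils/foo,utils/bar\n",
    "usr/share/my file.txt  non-free/misc/baz\n",
]
print(shared_paths(sample))
```

The result only says that a path is shared. It cannot tell a genuine
conflict from a diverted file, a pair of conflicting packages or
identical M-A:same content, which is what the additional data is for.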

When analyzing a package, you never know in advance which of this data
you will need. After all, it's about unrelated packages having a file
conflict, right?

So in your additional Salsa job you now have a choice. Either you gather
all this data for your package and send it off to a network service that
has those 15GB, or you download those 15GB. Neither of these sounds
particularly attractive for a Salsa job to me.

From my pov, the best way forward is testing migration blocking. We
partially get that due to me filing properly versioned RC bugs now. To
get it fully automated, there are two prerequisites from my pov:
 * We must fix tooling issues and issues that affect many packages. For
   instance:
    - binNMUs systematically violate M-A:same.
    - gcc versions before gcc-14 are utterly broken.
    - llvm versions before llvm-toolchain-21 are utterly broken.
 * The false positive rate must be reduced to an acceptable level.

Are you actually interested in contributing to this or merely interested
in $someone doing the work?

Helmut
