Hi Simon,

On Mon, May 25, 2026 at 04:25:54PM +0200, Simon Josefsson wrote:
> I have made the above kind of mistake embarrassingly often lately.  A
> cross-package file conflict seems like a simple problem that our QA
> tooling should have caught for me earlier than waiting for reports.
Yes, that thinking is quite popular, but what looks simple turns out
not to be.  At least in my experience.

> I would prefer a Salsa pipeline job to detect these file conflicts.
> Second to that, a lintian check.  Third, some testing migration blocking
> check that ends up visible in the PTS.  Maybe the third really ought to
> be implemented anyway.
>
> But how to implement such a check?

We can readily rule out lintian.  It focuses on a source package and/or
a build for one architecture.  Most file conflicts (Multi-Arch or not)
require several builds (of different or same packages, often covering
different suites).  The problem here is that this kind of check is not a
package-level check; it is a vendor-level check covering several
releases and their interaction.

> Helmut, is your code to detect this situation available somewhere?  I
> guess it involves an online check at some Debian service that knows
> about all files shipped by all (current) packages?  Your (excellent) bug
> reports about these findings suggest it is all automated somehow.

I started experimenting with what is available in
https://salsa.debian.org/helmutg/dacpa.  Unfortunately, it turns out
that there are very many tricky corner cases that look obvious but are
not.  I'm glad that you call my reports excellent, but that's largely
due to me double checking every single one of them.

Roughly speaking what you need here is:

 * Every single file installed into every single package of every
   architecture of a released suite.  This is largely covered by
   Contents indices as you point out.
 * Also all symbolic links.  Contents do not tell you about file types.
 * Also all directories.  Contents do not include directories.
 * Ideally also file modes and ownership information.
 * Also checksums of all files (for M-A:same).  Contents don't give
   these.
 * Lots of package relations (Conflicts, Pre-Depends, Provides,
   Replaces, ...).  Otherwise you'll be generating a lot of false
   positives.
 * Knowledge of what diversions packages install.  Again, not doing this
   causes lots of false positives.  Keep in mind that diversions are not
   declarative.

Let me argue that no, this data is not readily available for download
from our mirrors or some other available database such as udd.
Acquiring this data is not utterly difficult and it all tends to fit
into a little 15GB sqlite3 file.

When analyzing a package, you never know in advance which of this data
you will need.  After all, it's about unrelated packages having a file
conflict, right?  So in your additional salsa job you now have a choice.
You gather all this data for your package and send it off to a network
service that has those 15GB.  Or you download those 15GB.  Neither of
these sounds particularly attractive for a salsa job to me.

From my pov, the best way forward is testing migration blocking.  We
partially get that due to me filing properly versioned RC bugs now.  To
get it fully automated, there are two prerequisites from my pov:

 * We must fix tooling issues and issues that affect many packages.  For
   instance binNMUs systematically violate M-A:same.
    * gcc versions before gcc-14 are utterly broken.
    * llvm versions before llvm-toolchain-21 are utterly broken.
 * The false positive rate must be reduced to an acceptable level.

Are you actually interested in contributing to this or merely interested
in $someone doing the work?

Helmut
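
For illustration only, a minimal Python sketch of the naive first step
discussed above: group Contents-style lines by path and drop pairs of
packages that name each other in Conflicts or Replaces.  This is not the
dacpa implementation.  The function names, the sample data and the
`declared` mapping are hypothetical, and the sketch assumes the usual
Contents layout (path, whitespace, comma-separated section/package
entries).  It deliberately ignores directories, symlinks, diversions,
checksums, architectures and versioned relations, which is where the
false positives come from.

```python
from collections import defaultdict
from itertools import combinations


def parse_contents(lines):
    """Map each path to the set of packages that ship it.

    Assumes the Contents index layout: a path, whitespace, then a
    comma-separated list of section/package entries.
    """
    owners = defaultdict(set)
    for line in lines:
        # Paths may contain spaces, so split on the last whitespace run.
        parts = line.rstrip("\n").rsplit(None, 1)
        if len(parts) != 2:
            continue
        path, locations = parts
        for location in locations.split(","):
            # Drop the section prefix ("utils/foo" -> "foo").
            owners[path].add(location.rsplit("/", 1)[-1])
    return owners


def naive_conflicts(owners, declared):
    """Yield (path, pkg_a, pkg_b) for co-owned paths with no declared relation.

    `declared` is a hypothetical mapping: package -> set of packages it
    names in Conflicts or Replaces (versions are ignored here).
    """
    for path in sorted(owners):
        for a, b in combinations(sorted(owners[path]), 2):
            if b in declared.get(a, set()) or a in declared.get(b, set()):
                continue
            yield path, a, b


# Made-up sample data, not taken from any real suite.
SAMPLE = [
    "usr/bin/frob                 utils/frob,admin/frob-ng",
    "usr/bin/tool                 utils/tool-a,utils/tool-b",
    "usr/share/doc/frob/README    utils/frob",
]
DECLARED = {"tool-b": {"tool-a"}}

candidates = list(naive_conflicts(parse_contents(SAMPLE), DECLARED))
print(candidates)
```

On the sample data this reports only usr/bin/frob, shared by frob and
frob-ng.  Every such candidate would still have to be checked against
diversions, file types and the versions in the relations before it is
worth a bug report.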

