Hi Simon,

On Tue, Apr 15, 2025 at 01:47:54PM +0100, Simon McVittie wrote:
> I think a regression for amd64/i386 co-installation would have a
> considerably larger practical negative impact on Debian users than
> ABI conflicts between less-commonly-used architecture pairs like
> armel/armhf, and a very much larger practical negative impact than
> conflicts between architecture pairs involving -ports (amd64/hurd-amd64
> or amd64/kfreebsd-amd64) or architectures that are not yet in Debian at
> all (amd64/musl-linux-amd64).

This reasoning convinces me. As it stands, I only see solutions to this
problem that are inappropriate for trixie.

> (b) glibc/ld.so can't distinguish between armel and armhf libraries.
> But if this is true, then we will already have the problem that loading
> an ordinary library dependency like "libc.so.6" or "libvulkan.so.1" to
> satisfy DT_NEEDED can load the wrong flavour, so we have already lost,
> even before loading a Vulkan driver plugin; and I don't see how Mesa
> doing a dlopen("libvulkan_radeon.so", ...) is going to make this
> any worse.

This is an argument that I missed earlier. Thank you. It changes my
preference among the solutions, as it removes my argument against
preferring option 2.

> But can you dlopen() kfreebsd-amd64 libraries into a running Linux
> amd64 process? That's what matters here. If you can't, then the drivers
> from mesa-vulkan-drivers:kfreebsd-amd64 will gracefully fail to load,
> no harm done. Or if you can, then we likely already have worse problems.

I fear I can no longer answer that question, given that kfreebsd-amd64
no longer exists. Experiments with armel and armhf (see my other mail)
show, though, that the error reported by dlerror() differs from the one
you get when the ELF class is incompatible. I'd like to understand the
semantics here, but my reading of the glibc source was inconclusive in
this regard.

> Presumably musl-linux-amd64 has a library search path (either hard-coded
> into it or via configuration) that is distinct from glibc's; if it didn't,
> and if glibc's and musl's dynamic linkers are unable to avoid loading
> libraries from the "other" ABI (scenario b above), we would already have
> worse problems.

The primary reason we don't have those problems is that the ongoing
disagreement between the musl and systemd upstreams makes it impossible
to bootstrap musl-based Debian ports, and therefore no one attempts to
mix musl and glibc on a single system.
In any case, I expect musl to search /usr/lib, so there is at least that
shared path.

> However if a musl process calls
> dlopen("/usr/lib/x86_64-linux-gnu/libvulkan_radeon.so", ...), as it
> would if option 1 is taken, then I can see how that might accidentally
> succeed if their ELF flags happen to be the same, leading to problems
> when musl and glibc ABI assumptions collide.

This is another fairly convincing argument!

> This works as intended for the most common multiarch scenarios like
> amd64/i386. I suspect it also works as intended for armel/armhf, although
> your assertion is that it does not.

Indeed, my assumption was that you could dlopen an armel library on
armhf, but I didn't succeed in doing so in practice. I cannot tell
whether this is due to me not trying hard enough or whether there is
some mechanism systematically preventing this from working in a
reliable way.

I propose the following consensus: None of the known solutions (options
1 and 2) or workarounds (dropping m-a:same) is appropriate for Debian
trixie, and the best short-term course of action is not to fix this bug
for trixie while working towards a long-term solution in forky.

Regarding the precise implementation going forward, I prefer deferring
to you (plural), as I've shared my limited knowledge and trust that you
will find a more sensible solution than I could.

Helmut
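
For anyone who wants to repeat the armel/armhf comparison, the following is a minimal Python sketch (not part of the experiments mentioned above) that prints the ELF header fields this discussion turns on: the ELF class, e_machine and e_flags. The numeric constants are those from <elf.h>; the function names read_elf_identity and arm_float_abi are made up for illustration. It assumes that armel and armhf objects share class and machine and differ at most in the EABI v5 float-ABI bits of e_flags. It says nothing about which of these fields glibc or musl actually check at dlopen() time, which remains the open question.

```python
import struct
import sys

# Constants as defined in <elf.h>.
ELFCLASS32, ELFCLASS64 = 1, 2
ELFDATA2LSB, ELFDATA2MSB = 1, 2
EM_ARM = 40
EF_ARM_ABI_FLOAT_SOFT = 0x200  # soft-float calling convention (armel-style)
EF_ARM_ABI_FLOAT_HARD = 0x400  # hard-float calling convention (armhf-style)


def read_elf_identity(header: bytes) -> dict:
    """Extract ELF class, e_machine and e_flags from the start of a file."""
    if len(header) < 52 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF header")
    elf_class, data = header[4], header[5]
    if elf_class not in (ELFCLASS32, ELFCLASS64):
        raise ValueError("unknown ELF class")
    if data not in (ELFDATA2LSB, ELFDATA2MSB):
        raise ValueError("unknown ELF data encoding")
    endian = "<" if data == ELFDATA2LSB else ">"
    # e_machine follows the 16-byte e_ident and the 2-byte e_type.
    (e_machine,) = struct.unpack_from(endian + "H", header, 18)
    # e_flags follows e_version, e_entry, e_phoff and e_shoff; the last
    # three are 4 bytes wide in ELF32 and 8 bytes wide in ELF64.
    flags_offset = 36 if elf_class == ELFCLASS32 else 48
    (e_flags,) = struct.unpack_from(endian + "I", header, flags_offset)
    return {"class": elf_class, "machine": e_machine, "flags": e_flags}


def arm_float_abi(info: dict) -> str:
    """Decode the float-ABI bits of e_flags for ARM objects."""
    if info["machine"] != EM_ARM:
        return "n/a"
    if info["flags"] & EF_ARM_ABI_FLOAT_HARD:
        return "hard"
    if info["flags"] & EF_ARM_ABI_FLOAT_SOFT:
        return "soft"
    return "unspecified"


if __name__ == "__main__":
    # Paths are supplied by the caller, e.g. the armel and armhf builds
    # of the same shared library.
    for path in sys.argv[1:]:
        with open(path, "rb") as f:
            info = read_elf_identity(f.read(64))
        print(path, info, arm_float_abi(info))
```

Running it on the armel and armhf builds of the same library (paths given as arguments) shows whether their headers differ at all; objects that carry neither float-ABI bit are reported as "unspecified".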