Control: reassign -1 dkms
Control: close -1
On 4/27/25 10:25, Russell Coker wrote:
On Sunday, 27 April 2025 17:04:25 AEST Andreas Beckmann wrote:
The error about the module version not being newer would be because the
postinst has been run many times (every time I install packages) and
compiles the same files. Maybe there should be a --force to address that
case.
dkms fails to get the version from the module because of that weird modinfo
failure and then uses an empty version string (notice the double space in
the error message where the version should have been printed)
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1104199
OK thanks for pointing that out. It turned out that there was EACCES which
Good. I don't think dkms could do much in that case...
And having no version still makes a valid module.
After I fixed that error I still got the following (which happens even if
running in permissive mode so it's not SE Linux at fault) so it looks like the
--force is needed:
Signing module /var/lib/dkms/nvidia-current/535.216.03/build/nvidia-uvm.ko
Signing module /var/lib/dkms/nvidia-current/535.216.03/build/nvidia-peermem.ko
Module /lib/modules/6.12.22-amd64/updates/dkms/nvidia-current.ko.xz already installed at version 535.216.03, override by specifying --force
Module /lib/modules/6.12.22-amd64/updates/dkms/nvidia-current-modeset.ko.xz already installed at version 535.216.03, override by specifying --force
Module /lib/modules/6.12.22-amd64/updates/dkms/nvidia-current-drm.ko.xz already installed at version 535.216.03, override by specifying --force
Module /lib/modules/6.12.22-amd64/updates/dkms/nvidia-current-uvm.ko.xz already installed at version 535.216.03, override by specifying --force
Module /lib/modules/6.12.22-amd64/updates/dkms/nvidia-current-peermem.ko.xz already installed at version 535.216.03, override by specifying --force
At some point dkms got confused and lost the information that the module
was already installed. Or could it be that it couldn't delete the
previously installed module because of some permission error?
(I've never used SE Linux and I doubt I could do tests for these issues
in a chroot on a "normal" host kernel.)
This could also be caused by an earlier dkms version that made it easier
to get dkms into a messier state.
Maybe add --force
# dkms install -k 6.12.22-amd64 nvidia-current/535.216.03 --force
Found pre-existing /lib/modules/6.12.22-amd64/updates/dkms/nvidia-current.ko.xz, archiving for uninstallation
Installing /lib/modules/6.12.22-amd64/updates/dkms/nvidia-current.ko.xz
Found pre-existing /lib/modules/6.12.22-amd64/updates/dkms/nvidia-current-modeset.ko.xz, archiving for uninstallation
Installing /lib/modules/6.12.22-amd64/updates/dkms/nvidia-current-modeset.ko.xz
Found pre-existing /lib/modules/6.12.22-amd64/updates/dkms/nvidia-current-drm.ko.xz, archiving for uninstallation
Installing /lib/modules/6.12.22-amd64/updates/dkms/nvidia-current-drm.ko.xz
Found pre-existing /lib/modules/6.12.22-amd64/updates/dkms/nvidia-current-uvm.ko.xz, archiving for uninstallation
Installing /lib/modules/6.12.22-amd64/updates/dkms/nvidia-current-uvm.ko.xz
Found pre-existing /lib/modules/6.12.22-amd64/updates/dkms/nvidia-current-peermem.ko.xz, archiving for uninstallation
Installing /lib/modules/6.12.22-amd64/updates/dkms/nvidia-current-peermem.ko.xz
Running depmod..... done.
# echo $?
0
Yes, dkms stomps over the not-cleaned up module of the same version.
Reinstalling on 6.12.22 would need --force again, but 6.12.25 was
recently uploaded and you should get clean results without manual steps
there. Upon removing 6.12.22 you may have some leftover stray modules in
/lib/modules/6.12.22-* but it will be hard for dkms to clean that up
properly.
I'm reassigning to dkms and closing this bug report for now, but I'm
open to try fixing this in dkms if there is a reproducible way for dkms
(3.1.8+) to get into this state. (Even if it involves "user errors" to
e.g. get the SE Linux label wrong.) Please reopen if there is more
action needed.
Andreas
PS: If you want to experiment, you could try with the dkms-test-dkms
package. Builds a single trivial module which does nothing. ;-)
PPS: I have an idea what could have happened:
- initial 'dkms install' succeeded
- SE label gets corrupted
- upon 'dkms uninstall', the "inaccessible" built module (in
/var/lib/dkms) has version '' (empty) and is thus older than the version
found in /lib/modules - so dkms concludes it didn't install its
"outdated" module and therefore does not delete it from /lib/modules
- after processing all modules (and removing none after the version
check always failed) dkms moves the module state from 'installed' to 'built'
- on a subsequent 'dkms install' dkms stomps over the supposedly
"in-kernel" module of the same version.
==> https://github.com/dell/dkms/issues/525
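
To make the suspected sequence concrete, here is a rough shell model of the uninstall decision described above. It is only a sketch of the hypothesis, not dkms's actual code: the function names version_lt and uninstall_decision are made up for illustration, and the comparison assumes GNU sort with -V.

```shell
# Hypothetical helper (not part of dkms): succeeds if $1 is strictly
# older than $2. Assumes GNU sort -V, where an empty string sorts
# before any real version.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Simplified model of the suspected uninstall logic:
#   $1 = version read from the built module in /var/lib/dkms
#        (empty when modinfo cannot read the file)
#   $2 = version of the module found in /lib/modules
uninstall_decision() {
    built=$1
    installed=$2
    if version_lt "$built" "$installed"; then
        # The built module looks "outdated", so the installed file is
        # assumed not to be ours and is left in place.
        echo keep
    else
        echo remove
    fi
}

uninstall_decision "" 535.216.03            # unreadable built module
uninstall_decision 535.216.03 535.216.03    # normal case
```

In this model the unreadable built module always loses the comparison, so nothing is deleted from /lib/modules even though the state still moves from 'installed' to 'built'.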