Package: nvidia-driver
Version: 525.89.02-1

Nvidia drivers newer than the 510 series fail to load on my system, a Lenovo ThinkPad P51 with a Quadro M2200 GPU, BIOS 1.60 and ECP 1.10. I have encountered this bug with driver versions 515, 520 and now the 525 that landed in testing, as well as with a 525 build installed using Nvidia's official installer. The kernels I tried include 6.0.7 and 6.2.2 from xanmod and 6.1.0-5-amd64 from Debian's official repository. My system is a mixture of packages from stable and testing, with libc6=2.36-7. Driver version 510.108.03-1 works (but is unstable across sleep and broken with hibernation).

Below is an excerpt from the journalctl output containing the clusters of lines that look pertinent to me. All logs are from a boot on the xanmod 6.2.2 kernel, but there is no appreciable difference in the relevant output when running Debian's 6.1.0-5. The actual failure points seem to be the ones pertaining to RmInitAdapter and NvKmsKapiDevice, but I'm not sure what causal relationship, if any, there is between the two.

Mar 08 21:28:46 tangerine kernel: Command line: BOOT_IMAGE=/boot/vmlinuz-6.2.2-x64v1-xanmod1 root=(***) ro quiet mitigations=off psi=1 nvidia-drm.modeset=1
(...)
Mar 08 21:28:46 tangerine kernel: nvidia: module verification failed: signature and/or required key missing - tainting kernel
Mar 08 21:28:46 tangerine kernel: nvidia-nvlink: Nvlink Core is being initialized, major device number 235
Mar 08 21:28:46 tangerine kernel:
Mar 08 21:28:46 tangerine kernel: nvidia 0000:01:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=io+mem
Mar 08 21:28:46 tangerine kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module 525.89.02 Wed Feb 1 23:23:25 UTC 2023
Mar 08 21:28:46 tangerine kernel: nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 525.89.02 Wed Feb 1 23:09:40 UTC 2023
Mar 08 21:28:46 tangerine systemd[1]: Finished Rebuild Hardware Database.
Mar 08 21:28:46 tangerine systemd[1]: Starting Rule-based Manager for Device Events and Files...
Mar 08 21:28:46 tangerine kernel: [drm] [nvidia-drm] [GPU ID 0x00000100] Loading driver
Mar 08 21:28:46 tangerine kernel: ACPI Warning: \_SB.PCI0.PEG0.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20221020/nsarguments-61)
Mar 08 21:28:46 tangerine systemd[1]: Started Rule-based Manager for Device Events and Files.
Mar 08 21:28:46 tangerine systemd[1]: Starting Show Plymouth Boot Screen...
(...)
Mar 08 21:28:55 tangerine kernel: NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x25:0x65:1457)
Mar 08 21:28:55 tangerine kernel: NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
Mar 08 21:28:55 tangerine kernel: [drm:nv_drm_load [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to allocate NvKmsKapiDevice
Mar 08 21:28:55 tangerine kernel: [drm:nv_drm_probe_devices [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to register device
Mar 08 21:28:55 tangerine systemd-modules-load[306]: Inserted module 'nvidia_drm'

Some possibly pertinent information from nvidia-bug-report.log.gz:

____________________________________________

*** /sys/bus/pci/devices/0000:01:00.0/power/control ***
ls: -rw-r--r-- 1 root root 4096 2023-03-08 22:17:25.703945619 +0100 /sys/bus/pci/devices/0000:01:00.0/power/control
on

____________________________________________

*** /sys/bus/pci/devices/0000:01:00.0/power/runtime_status ***
ls: -r--r--r-- 1 root root 4096 2023-03-08 22:17:25.711945656 +0100 /sys/bus/pci/devices/0000:01:00.0/power/runtime_status
active

____________________________________________

*** /sys/bus/pci/devices/0000:01:00.0/power/runtime_usage ***
ls: -r--r--r-- 1 root root 4096 2023-03-08 22:17:25.715945675 +0100 /sys/bus/pci/devices/0000:01:00.0/power/runtime_usage
3

____________________________________________

*** /sys/bus/pci/devices/0000:01:00.1/power/control ***
ls: -rw-r--r-- 1 root root 4096 2023-03-08 22:17:25.749945832 +0100 /sys/bus/pci/devices/0000:01:00.1/power/control
auto

____________________________________________

*** /sys/bus/pci/devices/0000:01:00.1/power/runtime_status ***
ls: -r--r--r-- 1 root root 4096 2023-03-08 22:17:25.761945887 +0100 /sys/bus/pci/devices/0000:01:00.1/power/runtime_status
suspended

____________________________________________

*** /sys/bus/pci/devices/0000:01:00.1/power/runtime_usage ***
ls: -r--r--r-- 1 root root 4096 2023-03-08 22:17:25.807946100 +0100 /sys/bus/pci/devices/0000:01:00.1/power/runtime_usage
0

____________________________________________

*** /proc/driver/nvidia/./gpus/0000:01:00.0/power ***
ls: -r--r--r-- 1 root root 0 2023-03-08 22:17:25.819946156 +0100 /proc/driver/nvidia/./gpus/0000:01:00.0/power
Runtime D3 status: ?
Video Memory: ?

GPU Hardware Support:
 Video Memory Self Refresh: ?
 Video Memory Off: ?

____________________________________________

/usr/bin/lspci -d "10de:*" -v -xxx

01:00.0 VGA compatible controller: NVIDIA Corporation GM206GLM [Quadro M2200 Mobile] (rev a1) (prog-if 00 [VGA controller])
        Subsystem: Lenovo GM206GLM [Quadro M2200 Mobile]
        Flags: bus master, fast devsel, latency 0, IRQ 16
        Memory at eb000000 (32-bit, non-prefetchable) [size=16M]
        Memory at c0000000 (64-bit, prefetchable) [size=256M]
        Memory at d0000000 (64-bit, prefetchable) [size=32M]
        I/O ports at d000 [size=128]
        Expansion ROM at 000c0000 [virtual] [disabled] [size=128K]
        Capabilities: [60] Power Management version 3
        Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [78] Express Endpoint, MSI 00
        Capabilities: [100] Virtual Channel
        Capabilities: [250] Latency Tolerance Reporting
        Capabilities: [258] L1 PM Substates
        Capabilities: [128] Power Budgeting <?>
        Capabilities: [420] Advanced Error Reporting
        Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
        Capabilities: [900] Secondary PCI Express
        Kernel driver in use: nvidia
        Kernel modules: nvidia
00: de 10 36 14 07 00 10 00 a1 00 00 03 00 00 80 00
10: 00 00 00 eb 0c 00 00 c0 00 00 00 00 0c 00 00 d0
20: 00 00 00 00 01 d0 00 00 00 00 00 00 aa 17 51 22
30: 00 00 00 00 60 00 00 00 00 00 00 00 0a 01 00 00
40: aa 17 51 22 00 00 00 00 00 00 00 00 00 00 00 00
50: 01 00 00 00 01 00 00 00 ce d6 23 00 00 00 00 00
60: 01 68 03 00 08 00 00 00 05 78 80 00 00 00 00 00
70: 00 00 00 00 00 00 00 00 10 00 02 00 e1 8d 2c 01
80: 30 21 00 00 03 3d 45 00 40 01 01 11 00 00 00 00
90: 00 00 00 00 00 00 00 00 00 00 00 00 13 08 04 00
a0: 00 04 00 00 0e 00 00 00 03 00 1f 00 00 00 00 00
b0: 00 00 00 00 09 00 14 01 00 00 10 80 00 00 00 00
c0: 61 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

01:00.1 Audio device: NVIDIA Corporation GM206 High Definition Audio Controller (rev a1)
        Flags: bus master, fast devsel, latency 0, IRQ 17
        Memory at ec000000 (32-bit, non-prefetchable) [size=16K]
        Capabilities: [60] Power Management version 3
        Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [78] Express Endpoint, MSI 00
        Capabilities: [100] Advanced Error Reporting
        Kernel driver in use: snd_hda_intel
        Kernel modules: snd_hda_intel
00: de 10 ba 0f 06 00 10 00 a1 00 03 04 00 00 80 00
10: 00 00 00 ec 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
30: 00 00 00 00 60 00 00 00 00 00 00 00 ff 02 00 00
40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
50: 00 00 00 00 00 00 00 00 ce d6 23 00 00 00 00 00
60: 01 68 03 00 0b 00 00 00 05 78 80 00 00 00 00 00
70: 00 00 00 00 00 00 00 00 10 00 02 00 e1 8d 2c 01
80: 30 29 09 00 03 3d 45 00 43 00 01 11 00 00 00 00
90: 00 00 00 00 00 00 00 00 00 00 00 00 13 08 04 00
a0: 00 00 00 00 0e 00 00 00 00 00 01 00 00 00 00 00
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

____________________________________________

/usr/bin/lspci -d "10b5:*" -v -xxx

____________________________________________

/usr/bin/lspci -t

-[0000:00]-+-00.0
           +-01.0-[01]--+-00.0
           |            \-00.1
           +-08.0
           +-14.0
           +-14.2
           +-15.0
           +-16.0
           +-16.3
           +-17.0
           +-1c.0-[03]--
           +-1c.2-[04]----00.0
           +-1c.4-[05-3d]--
           +-1d.0-[3e]----00.0
           +-1d.4-[3f]----00.0
           +-1f.0
           +-1f.2
           +-1f.3
           +-1f.4
           \-1f.6

____________________________________________

/usr/bin/lspci -nn

00:00.0 Host bridge [0600]: Intel Corporation Xeon E3-1200 v6/7th Gen Core Processor Host Bridge/DRAM Registers [8086:5918] (rev 05)
00:01.0 PCI bridge [0604]: Intel Corporation 6th-10th Gen Core Processor PCIe Controller (x16) [8086:1901] (rev 05)
00:08.0 System peripheral [0880]: Intel Corporation Xeon E3-1200 v5/v6 / E3-1500 v5 / 6th/7th/8th Gen Core Processor Gaussian Mixture Model [8086:1911]
00:14.0 USB controller [0c03]: Intel Corporation 100 Series/C230 Series Chipset Family USB 3.0 xHCI Controller [8086:a12f] (rev 31)
00:14.2 Signal processing controller [1180]: Intel Corporation 100 Series/C230 Series Chipset Family Thermal Subsystem [8086:a131] (rev 31)
00:15.0 Signal processing controller [1180]: Intel Corporation 100 Series/C230 Series Chipset Family Serial IO I2C Controller #0 [8086:a160] (rev 31)
00:16.0 Communication controller [0780]: Intel Corporation 100 Series/C230 Series Chipset Family MEI Controller #1 [8086:a13a] (rev 31)
00:16.3 Serial controller [0700]: Intel Corporation 100 Series/C230 Series Chipset Family KT Redirection [8086:a13d] (rev 31)
00:17.0 SATA controller [0106]: Intel Corporation Q170/Q150/B150/H170/H110/Z170/CM236 Chipset SATA Controller [AHCI Mode] [8086:a102] (rev 31)
00:1c.0 PCI bridge [0604]: Intel Corporation 100 Series/C230 Series Chipset Family PCI Express Root Port #1 [8086:a110] (rev f1)
00:1c.2 PCI bridge [0604]: Intel Corporation 100 Series/C230 Series Chipset Family PCI Express Root Port #3 [8086:a112] (rev f1)
00:1c.4 PCI bridge [0604]: Intel Corporation 100 Series/C230 Series Chipset Family PCI Express Root Port #5 [8086:a114] (rev f1)
00:1d.0 PCI bridge [0604]: Intel Corporation 100 Series/C230 Series Chipset Family PCI Express Root Port #9 [8086:a118] (rev f1)
00:1d.4 PCI bridge [0604]: Intel Corporation 100 Series/C230 Series Chipset Family PCI Express Root Port #13 [8086:a11c] (rev f1)
00:1f.0 ISA bridge [0601]: Intel Corporation CM238 Chipset LPC/eSPI Controller [8086:a154] (rev 31)
00:1f.2 Memory controller [0580]: Intel Corporation 100 Series/C230 Series Chipset Family Power Management Controller [8086:a121] (rev 31)
00:1f.3 Audio device [0403]: Intel Corporation CM238 HD Audio Controller [8086:a171] (rev 31)
00:1f.4 SMBus [0c05]: Intel Corporation 100 Series/C230 Series Chipset Family SMBus [8086:a123] (rev 31)
00:1f.6 Ethernet controller [0200]: Intel Corporation Ethernet Connection (5) I219-LM [8086:15e3] (rev 31)
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GM206GLM [Quadro M2200 Mobile] [10de:1436] (rev a1)
01:00.1 Audio device [0403]: NVIDIA Corporation GM206 High Definition Audio Controller [10de:0fba] (rev a1)
04:00.0 Network controller [0280]: Intel Corporation Wireless 8265 / 8275 [8086:24fd] (rev 78)
3e:00.0 Non-Volatile memory controller [0108]: Lenovo Device [17aa:0004]
3f:00.0 Unassigned class [ff00]: Realtek Semiconductor Co., Ltd. RTS525A PCI Express Card Reader [10ec:525a] (rev 01)

____________________________________________

____________________________________________

*** /sys/devices/system/node/has_cpu ***
ls: -r--r--r-- 1 root root 4096 2023-03-08 22:17:28.447958149 +0100 /sys/devices/system/node/has_cpu
0

____________________________________________

*** /sys/devices/system/node/has_memory ***
ls: -r--r--r-- 1 root root 4096 2023-03-08 22:17:28.449958159 +0100 /sys/devices/system/node/has_memory
0

____________________________________________

*** /sys/devices/system/node/has_normal_memory ***
ls: -r--r--r-- 1 root root 4096 2023-03-08 22:17:28.449958159 +0100 /sys/devices/system/node/has_normal_memory
0

____________________________________________

*** /sys/devices/system/node/online ***
ls: -r--r--r-- 1 root root 4096 2023-03-08 22:17:28.451958167 +0100 /sys/devices/system/node/online
0

____________________________________________

*** /sys/devices/system/node/possible ***
ls: -r--r--r-- 1 root root 4096 2023-03-08 22:17:28.453958177 +0100 /sys/devices/system/node/possible
0

____________________________________________

*** /sys/bus/pci/devices/0000:01:00.0/local_cpulist ***
ls: -r--r--r-- 1 root root 4096 2023-03-08 21:54:55.862015759 +0100 /sys/bus/pci/devices/0000:01:00.0/local_cpulist
0-7

____________________________________________

*** /sys/bus/pci/devices/0000:01:00.0/numa_node ***
ls: -rw-r--r-- 1 root root 4096 2023-03-08 21:54:55.862015759 +0100 /sys/bus/pci/devices/0000:01:00.0/numa_node
-1

____________________________________________

*** /proc/driver/nvidia/./version ***
ls: -r--r--r-- 1 root root 0 2023-03-08 22:17:25.667945453 +0100 /proc/driver/nvidia/./version
NVRM version: NVIDIA UNIX x86_64 Kernel Module 525.89.02 Wed Feb 1 23:23:25 UTC 2023
GCC version: gcc version 11.3.0 (Debian 11.3.0-8)

____________________________________________

*** /proc/driver/nvidia/./gpus/0000:01:00.0/information ***
ls: -r--r--r-- 1 root root 0 2023-03-08 22:17:41.954016177 +0100 /proc/driver/nvidia/./gpus/0000:01:00.0/information
Model: Quadro M2200
IRQ: 140
GPU UUID: GPU-????????-????-????-????-????????????
Video BIOS: ??.??.??.??.??
Bus Type: PCIe
DMA Size: 40 bits
DMA Mask: 0xffffffffff
Bus Location: 0000:01:00.0
Device Minor: 0
GPU Excluded: No

____________________________________________

*** /proc/driver/nvidia/./gpus/0000:01:00.0/registry ***
ls: -rw-r--r-- 1 root root 0 2023-03-08 22:17:42.072016657 +0100 /proc/driver/nvidia/./gpus/0000:01:00.0/registry
Binary: ""

____________________________________________

*** /proc/driver/nvidia/./params ***
ls: -r--r--r-- 1 root root 0 2023-03-08 21:55:09.912015263 +0100 /proc/driver/nvidia/./params
ResmanDebugLevel: 4294967295
RmLogonRC: 1
ModifyDeviceFiles: 1
DeviceFileUID: 0
DeviceFileGID: 0
DeviceFileMode: 438
InitializeSystemMemoryAllocations: 1
UsePageAttributeTable: 4294967295
EnableMSI: 1
EnablePCIeGen3: 0
MemoryPoolSize: 0
KMallocHeapMaxSize: 0
VMallocHeapMaxSize: 0
IgnoreMMIOCheck: 0
TCEBypassMode: 0
EnableStreamMemOPs: 0
EnableUserNUMAManagement: 1
NvLinkDisable: 0
RmProfilingAdminOnly: 1
PreserveVideoMemoryAllocations: 0
EnableS0ixPowerManagement: 0
S0ixPowerManagementVideoMemoryThreshold: 256
DynamicPowerManagement: 3
DynamicPowerManagementVideoMemoryThreshold: 200
RegisterPCIDriver: 1
EnablePCIERelaxedOrderingMode: 0
EnableGpuFirmware: 18
EnableGpuFirmwareLogs: 2
EnableDbgBreakpoint: 0
OpenRmEnableUnsupportedGpus: 0
DmaRemapPeerMmio: 1
RegistryDwords: ""
RegistryDwordsPerDevice: ""
RmMsg: ""
GpuBlacklist: ""
TemporaryFilePath: ""
ExcludedGpus: ""

____________________________________________

*** /proc/driver/nvidia/./registry ***
ls: -rw-r--r-- 1 root root 0 2023-03-08 22:17:42.076016674 +0100 /proc/driver/nvidia/./registry
Binary: ""

If it would help, I can try to provide the more complete information gathered by reportbug, although that would be somewhat burdensome, since I have just reverted my machine to 510.108.03-1 to restore functionality. I also have a full nvidia-bug-report.log.gz gathered in the broken configuration, but I wasn't sure whether the bug tracker supports attachments.
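
For anyone trying to compare their own logs against this report, here is a minimal sketch of how the failure lines quoted in the journal excerpt could be pulled out of saved journal text. The function name `nvidia_failures` and the marker list are my own assumptions for illustration; the markers are simply the message fragments seen in this report, not an exhaustive or official list of driver errors.

```python
# Message fragments copied from the journal excerpt in this report.
# Assumption: other failure modes may log different text entirely.
FAILURE_MARKERS = (
    "RmInitAdapter failed",
    "rm_init_adapter failed",
    "Failed to allocate NvKmsKapiDevice",
    "Failed to register device",
)


def nvidia_failures(journal_text):
    """Return the kernel log lines that contain one of the failure markers."""
    hits = []
    for line in journal_text.splitlines():
        # Only kernel messages are of interest; skip systemd and other units.
        if " kernel: " not in line:
            continue
        if any(marker in line for marker in FAILURE_MARKERS):
            hits.append(line.strip())
    return hits
```

The text could come from something like `journalctl -k -b` redirected to a file and then read with `open(path).read()`, where `path` is whatever file name was chosen. An empty result on a 510 boot and a non-empty one on a 525 boot would match the behaviour described in this report.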