I manage 300+ machines that run OpenAFS, which, like the NVIDIA driver, has a kernel module that is built through DKMS. I also manage dozens of NVIDIA GPU servers where users have sudo access and can install anything they want. Here is a snippet of what I found. Note that this is for 16.04 systems, but 14.04 systems running the 4.4.0-116 kernel will have similar problems:
Short story: if your machine is not using the Ubuntu-supplied gcc, you will have issues with the OpenAFS and NVIDIA kernel modules, or with any other DKMS-built kernel module. Longer story below.

NOTE! This problem affects at least OpenAFS, NVIDIA, VirtualBox, and any other DKMS-built module. I am going to forward this info to gr...@cs.unc.edu.

This started with the latest Ubuntu kernel version, 4.4.0-116. Looking through that bug and testing took me hours. The machines having issues with the openafs.ko module are the ones that have the Ubuntu toolchain PPA enabled. That PPA provides a gcc compiler suite that does not support the "retpoline" feature, which was recently added to mitigate the Spectre security issue. The nvidia module will also have issues on those machines. The machines using the Ubuntu-supplied gcc compiler are the ones that are not having issues. But host olympia was a special case.

The compiler that works, as reported by "gcc -v":

  gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.9)

The one that doesn't work, on hosts like bvisionserver8:

  gcc version 5.4.1 20160904

You can use "apt-cache policy gcc" to show which repo the compiler comes from. WARNING: /usr/bin/gcc is a link to /usr/bin/gcc-5, and the gcc package is a meta package, so you need to query gcc-5. If you query gcc, it shows as coming from the standard Ubuntu repo even though /usr/bin/gcc-5 comes from the toolchain repo.
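The provenance check described above can be scripted. The following is only a rough sketch, not something taken from the original report: installed_origins and uses_ppa are made-up helper names, and the awk filter assumes the xenial-style "apt-cache policy" layout shown in this message, where the installed version is marked with "***" and its sources follow on lines that start with a numeric priority.

```shell
#!/bin/sh
# installed_origins (hypothetical helper): read `apt-cache policy <pkg>`
# output on stdin and print the repository URLs listed under the
# installed version, which apt marks with "***".
installed_origins() {
    awk '
        $1 == "***" { installed = 1; next }      # installed version entry starts
        installed && $1 ~ /^[0-9]+$/ {           # source line: "<priority> <origin> ..."
            if ($2 ~ /^https?:/) print $2        # skip the local /var/lib/dpkg/status entry
            next
        }
        installed { installed = 0 }              # next version entry: stop collecting
    '
}

# uses_ppa (hypothetical helper): succeed if any origin on stdin is a
# Launchpad PPA.
uses_ppa() {
    grep -q 'ppa\.launchpad\.net'
}

# Usage on a real host (commented out so the sketch has no side effects).
# dpkg -S resolves the owning package of the real binary, so the gcc
# meta package is never queried by mistake:
#   pkg=$(dpkg -S "$(readlink -f /usr/bin/gcc)" | cut -d: -f1)
#   apt-cache policy "$pkg" | installed_origins | uses_ppa \
#       && echo "WARNING: $pkg comes from a PPA"
```

Resolving the link first and asking dpkg which package owns the binary avoids the trap with the gcc meta package: the package that gets queried is always the one that actually ships the compiler.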
A good gcc-5 shows:
----------------------------
classroom:55% apt-cache policy gcc-5
gcc-5:
  Installed: 5.4.0-6ubuntu1~16.04.9
  Candidate: 5.4.0-6ubuntu1~16.04.9
  Version table:
 *** 5.4.0-6ubuntu1~16.04.9 500
        500 http://us.archive.ubuntu.com/ubuntu xenial-updates/main amd64 Packages
        500 http://security.ubuntu.com/ubuntu xenial-security/main amd64 Packages
        100 /var/lib/dpkg/status
     5.3.1-14ubuntu2 500
        500 http://us.archive.ubuntu.com/ubuntu xenial/main amd64 Packages

The bad compilers show:
----------------------------------
bvisionserver8:/> apt-cache policy gcc-5
gcc-5:
  Installed: 5.4.1-2ubuntu1~16.04
  Candidate: 5.4.1-2ubuntu1~16.04
  Version table:
 *** 5.4.1-2ubuntu1~16.04 500
        500 http://ppa.launchpad.net/ubuntu-toolchain-r/test/ubuntu xenial/main amd64 Packages
        100 /var/lib/dpkg/status
     5.4.0-6ubuntu1~16.04.9 500
        500 http://us.archive.ubuntu.com/ubuntu xenial-updates/main amd64 Packages
        500 http://security.ubuntu.com/ubuntu xenial-security/main amd64 Packages
     5.3.1-14ubuntu2 500
        500 http://us.archive.ubuntu.com/ubuntu xenial/main amd64 Packages

And you can see that the /etc/apt/sources.list.d/ubuntu-toolchain-r-ubuntu-test-xenial.list repo is configured on those machines.

On a good machine, modinfo openafs shows that retpoline is turned on in the vermagic: line:

classroom:56% modinfo openafs
filename:       /lib/modules/4.4.0-116-generic/updates/dkms/openafs.ko
license:        http://www.openafs.org/dl/license10.html
srcversion:     4E1BEB8CE16072EF8E64542
depends:
vermagic:       4.4.0-116-generic SMP mod_unload modversions retpoline

And it is not turned on on a bad machine:

bvisionserver8:/> modinfo openafs
filename:       /lib/modules/4.4.0-116-generic/updates/dkms/openafs.ko
license:        http://www.openafs.org/dl/license10.html
srcversion:     66044F5DC18AA3288DB22FF
depends:
vermagic:       4.4.0-116-generic SMP mod_unload modversions
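The vermagic comparison above can be automated along these lines. This is a minimal sketch under assumptions, not part of the original report: has_retpoline and check_module are hypothetical helper names, and the sketch relies on "modinfo -F vermagic" printing only the vermagic field of the module.

```shell
#!/bin/sh
# has_retpoline (hypothetical helper): succeed if the vermagic string passed
# as $1 contains "retpoline" as a whole, space-separated token.
has_retpoline() {
    case " $1 " in
        *" retpoline "*) return 0 ;;
        *)               return 1 ;;
    esac
}

# check_module (hypothetical helper): print a one-line verdict for a module
# name or .ko path. Returns 2 if modinfo cannot read the module.
check_module() {
    vermagic=$(modinfo -F vermagic "$1") || return 2
    if has_retpoline "$vermagic"; then
        echo "$1: retpoline"
    else
        echo "$1: NO retpoline"
    fi
}

# Usage on a real host (commented out so the sketch has no side effects):
#   check_module openafs
#   for ko in /lib/modules/"$(uname -r)"/updates/dkms/*.ko; do
#       check_module "$ko"
#   done
```

Looping over the updates/dkms directory of the running kernel covers every DKMS-built module at once, which matters here because the problem is not specific to openafs or nvidia.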
https://bugs.launchpad.net/bugs/1750937

Title:
  4.4.0-116 Kernel update on 2/21 breaks Nvidia drivers (on 14.04 and 16.04) by an insufficient compiler!

Status in gcc:
  New
Status in linux package in Ubuntu:
  Confirmed
Status in nvidia-graphics-drivers-384 package in Ubuntu:
  Confirmed
Status in xorg package in Ubuntu:
  Confirmed

Bug description:
  Running fine with nvidia-384 until this kernel update came along.
  When booted into the new kernel, got super low resolution and
  nvidia-settings was missing most of its functionality - could not
  change resolution.

  Rebooted into 4.4.0-112 kernel and all was well.

  The root cause of the problem has been found to be installing the -116
  kernel without a sufficiently updated version of gcc. In my case, my
  system received the gcc update AFTER the kernel update. Uninstalling
  the -116 kernel and reinstalling it with the updated version of gcc
  solved the problem for me.

  ProblemType: Bug
  DistroRelease: Ubuntu 14.04
  Package: xorg 1:7.7+1ubuntu8.1
  ProcVersionSignature: Ubuntu 4.4.0-112.135~14.04.1-generic 4.4.98
  Uname: Linux 4.4.0-112-generic x86_64
  NonfreeKernelModules: nvidia_uvm nvidia_drm nvidia_modeset nvidia
  .proc.driver.nvidia.registry: Binary: ""
  .proc.driver.nvidia.version:
   NVRM version: NVIDIA UNIX x86_64 Kernel Module 384.111 Tue Dec 19 23:51:45 PST 2017
   GCC version: gcc version 4.8.4 (Ubuntu 4.8.4-2ubuntu1~14.04.3)
  ApportVersion: 2.14.1-0ubuntu3.27
  Architecture: amd64
  CurrentDesktop: LXDE
  Date: Wed Feb 21 19:23:39 2018
  DistUpgraded: Fresh install
  DistroCodename: trusty
  DistroVariant: ubuntu
  DkmsStatus:
   bbswitch, 0.7, 4.4.0-112-generic, x86_64: installed
   bbswitch, 0.7, 4.4.0-116-generic, x86_64: installed
   nvidia-384, 384.111, 4.4.0-112-generic, x86_64: installed
   nvidia-384, 384.111, 4.4.0-116-generic, x86_64: installed
  GraphicsCard:
   NVIDIA Corporation Device [10de:1c82] (rev a1) (prog-if 00 [VGA controller])
     Subsystem: eVga.com. Corp. Device [3842:6253]
  InstallationDate: Installed on 2015-03-03 (1086 days ago)
  InstallationMedia: Ubuntu 14.04.2 LTS "Trusty Tahr" - Release amd64 (20150218.1)
  MachineType: ASUSTeK COMPUTER INC. M11AD
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.4.0-112-generic root=UUID=5a88d2a1-0a24-415b-adc2-28435b13248a ro quiet splash vt.handoff=7
  SourcePackage: xorg
  Symptom: display
  UpgradeStatus: No upgrade log present (probably fresh install)
  dmi.bios.date: 08/15/2013
  dmi.bios.vendor: American Megatrends Inc.
  dmi.bios.version: 0302
  dmi.board.asset.tag: To be filled by O.E.M.
  dmi.board.name: M11AD
  dmi.board.vendor: ASUSTeK COMPUTER INC.
  dmi.board.version: Rev X.0x
  dmi.chassis.asset.tag: Asset-1234567890
  dmi.chassis.type: 3
  dmi.chassis.vendor: Chassis Manufacture
  dmi.chassis.version: Chassis Version
  dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr0302:bd08/15/2013:svnASUSTeKCOMPUTERINC.:pnM11AD:pvrSystemVersion:rvnASUSTeKCOMPUTERINC.:rnM11AD:rvrRevX.0x:cvnChassisManufacture:ct3:cvrChassisVersion:
  dmi.product.name: M11AD
  dmi.product.version: System Version
  dmi.sys.vendor: ASUSTeK COMPUTER INC.
  version.compiz: compiz 1:0.9.11.3+14.04.20160425-0ubuntu1
  version.ia32-libs: ia32-libs N/A
  version.libdrm2: libdrm2 2.4.67-1ubuntu0.14.04.2
  version.libgl1-mesa-dri: libgl1-mesa-dri N/A
  version.libgl1-mesa-dri-experimental: libgl1-mesa-dri-experimental N/A
  version.libgl1-mesa-glx: libgl1-mesa-glx N/A
  version.nvidia-graphics-drivers: nvidia-graphics-drivers N/A
  version.xserver-xorg-core: xserver-xorg-core N/A
  version.xserver-xorg-input-evdev: xserver-xorg-input-evdev N/A
  version.xserver-xorg-video-ati: xserver-xorg-video-ati N/A
  version.xserver-xorg-video-intel: xserver-xorg-video-intel N/A
  version.xserver-xorg-video-nouveau: xserver-xorg-video-nouveau N/A
  xserver.bootTime: Wed Feb 21 18:48:14 2018
  xserver.configfile: default
  xserver.errors:
  xserver.logfile: /var/log/Xorg.0.log
  xserver.outputs:
  xserver.version: 2:1.18.3-1ubuntu2.3~trusty4

--
Mailing list: https://launchpad.net/~touch-packages