Let's take this one point at a time:

* fan not running at full speed in disengaged mode in a thermal emergency - as mentioned earlier, the default fan mode on the machine is to run under firmware control, in which case it runs in engaged mode with a feedback loop controller, so it never exceeds a top speed of 3500 RPM. This matches the manufacturer's original thermal design. So either they made a mistake and all machines like yours overheat (in which case we would see lots of owners of your machine reporting this bug), or this issue is particular to your machine.
* CPU not throttling in a thermal emergency (unless the frequency readings are wrong) - that needs investigation, as thermald should be doing that (but as I mentioned earlier, I will examine the thermald issues later).

* shutting down when supposed to suspend as a reaction to overheat, unnecessarily destroying session - when a critical thermal event occurs, there is only a very short time window in which to react. The silicon may be permanently damaged, so the kernel chooses to power down rather than try to suspend (since a suspend can get stuck and exacerbate the issue). Without the handling of this thermal event, the next step is for the hardware to physically shut itself down, which is outside any form of operating system control. Either way, the machine is desperately trying to save itself from breaking.

* destroying session in a shut down/restart cycle (I heard rumours this may be fixed later in Snappy with containers) - again, in a rush to save your silicon from becoming irreparably damaged, shutdown is the fastest mechanism. Snappy containers will not help.

I'd recommend reading https://en.wikipedia.org/wiki/Thermal_design_power; there is a paragraph that states:

"Most modern processors will cause a therm-trip only upon a catastrophic cooling failure, such as a no longer operational fan or an incorrectly mounted heatsink."

So, the next step is to see what thermald is doing.

1. Stop thermald so we can re-run it with full debug on:

sudo systemctl stop thermald

(if you are using systemd) or

sudo service thermald stop

(if you are using upstart)

2. Run thermald for a while from the command line and capture debug output:

sudo thermald --no-daemon --dbus-enable --loglevel=debug | tee thermald.log

Run this for, say, 5-10 minutes while using your machine, then attach thermald.log to the bug report.

3. Restart thermald:

sudo systemctl start thermald

(if you are using systemd) or

sudo service thermald start

(if you are using upstart)

-- 
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1491797

Title:
  Shuts down when supposed to suspend as a reaction to self-caused overheat, session lost

Status in linux package in Ubuntu:
  In Progress

Bug description:
  Error: Kernel foolishly shuts down the computer when it overheats.

  /var/log/kern.log
  W500 kernel: [1448.648529] thermal thermal_zone1: critical temperature reached (100 C), shutting down

  Consequence: Shutting down destroys session in Ubuntu, Gnome, and all applications that can't remember their latest conscious state (most applications).

  Attempted repair, failed: Laptop has suspending ability, but I can't find the setting for the kernel to make the computer suspend instead of shutting down.

  Repair suggestions:
  1. Persistence of session, so that everything would reappear after the restart. (this would also make updating less disruptive)
  2. Do not heat the machine like crazy; speed up fans or slow down processes. (problematic Lenovo Thinkpad W500 fan on low speed right up to the fiery end)
  3. Put the computer to suspend when it's too hot.

  (The problem has remained the same from at least Ubuntu 11.10 through 14.04)

  ProblemType: Bug
  DistroRelease: Ubuntu 14.04
  Package: linux-image-3.13.0-62-generic 3.13.0-62.102
  ProcVersionSignature: Ubuntu 3.13.0-62.102-generic 3.13.11-ckt24
  Uname: Linux 3.13.0-62-generic x86_64
  ApportVersion: 2.14.1-0ubuntu3.12
  Architecture: amd64
  AudioDevicesInUse:
   USER PID ACCESS COMMAND
   /dev/snd/controlC0: user 2171 F.... pulseaudio
  CurrentDesktop: Unity
  Date: Thu Sep 3 13:42:28 2015
  HibernationDevice: RESUME=UUID=991e1383-ff5b-46c1-84c4-c904e1d81256
  InstallationDate: Installed on 2013-12-29 (612 days ago)
  InstallationMedia: Ubuntu 13.10 "Saucy Salamander" - Release amd64 (20131016.1)
  MachineType: LENOVO 4063B22
  PccardctlIdent: Socket 0: no product info available
  PccardctlStatus: Socket 0: no card
  ProcFB: 0 radeondrmfb
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.13.0-62-generic root=UUID=bd426989-b545-41b3-97b8-de9410f27aa6 ro persistent quiet splash vt.handoff=7
  RelatedPackageVersions:
   linux-restricted-modules-3.13.0-62-generic N/A
   linux-backports-modules-3.13.0-62-generic N/A
   linux-firmware 1.127.15
  SourcePackage: linux
  UpgradeStatus: Upgraded to trusty on 2014-04-27 (494 days ago)
  dmi.bios.date: 12/14/2011
  dmi.bios.vendor: LENOVO
  dmi.bios.version: 6FET92WW (3.22 )
  dmi.board.name: 4063B22
  dmi.board.vendor: LENOVO
  dmi.board.version: Not Available
  dmi.chassis.asset.tag: No Asset Information
  dmi.chassis.type: 10
  dmi.chassis.vendor: LENOVO
  dmi.chassis.version: Not Available
  dmi.modalias: dmi:bvnLENOVO:bvr6FET92WW(3.22):bd12/14/2011:svnLENOVO:pn4063B22:pvrThinkPadW500:rvnLENOVO:rn4063B22:rvrNotAvailable:cvnLENOVO:ct10:cvrNotAvailable:
  dmi.product.name: 4063B22
  dmi.product.version: ThinkPad W500
  dmi.sys.vendor: LENOVO
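
When checking how often the machine has hit the critical trip point quoted in the bug description, it may help to pull those events out of the kernel log. The following is only a minimal sketch: the function name critical_trips is hypothetical, and it assumes the message is worded exactly like the kern.log line quoted in the report (other kernel versions may phrase it differently).

```sh
#!/bin/sh
# Hypothetical helper: read kernel log lines on stdin and print
# "<zone> <temperature>" for each critical thermal trip found.
# Assumes lines shaped like the one quoted in the bug description:
#   ... thermal thermal_zone1: critical temperature reached (100 C), shutting down
critical_trips() {
    sed -n 's/.*thermal \(thermal_zone[0-9]*\): critical temperature reached (\([0-9]*\) C).*/\1 \2/p'
}

# Example usage (uncomment if the log is readable on your system):
# critical_trips < /var/log/kern.log
```

Lines that do not match the assumed format are silently skipped, so an empty result means either no critical trips were logged or the message format differs on that kernel.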