On Fri, 12 Sep 2014 09:23:46 -0500 Rafael Vega <rv...@elsoftwarehamuerto.org> wrote:
> Package: thermald
> Version: 1.3-3
> Severity: critical
> Justification: breaks the whole system
>
> Dear Maintainer,
>
>
> * What exactly did you do (or not do) that was effective (or
> ineffective)?
>
> Installed thermald 1.3-3 from testing repos and booted into realtime kernel
> (3.12-4-rt-amd64)
>
> * What was the outcome of this action?
>
> Sometimes the system will stop responding completely, no keyboard or mouse is
> accepted, not even switching to ttys. A hard shutdown is required and it has
> lead to file system corruption. Inspecting dmesg logs, I found this:
>
> [ 39.776121] BUG: scheduling while atomic: Xorg/1007/0x00010001
> [ 39.776123] BUG: scheduling while atomic: swapper/2/0/0x00010002
>
> .....
>
> [ 39.776221] CPU: 1 PID: 1007 Comm: Xorg Tainted: P O 3.14-2-rt-amd64 #1 Debian 3.14.15-2
> [ 39.776222] Hardware name: Apple Inc. MacBookPro8,1/Mac-94245B3640C91C81, BIOS MBP81.88Z.0047.B27.1201241646 01/24/12
> [ 39.776223] ffff88026401db20 ffffffff814e04fc ffff88026401db20 ffffffff814dce83
> [ 39.776224] ffffffff814e2cd8 0000000000065340 0000000000065340 ffff88026392ffd8
> [ 39.776225] ffff88026401db20 ffff880267083ee0 ffff88026401e278 ffff88026401db20
> [ 39.776225] Call Trace:
> [ 39.776230] <IRQ> [<ffffffff814e04fc>] ? dump_stack+0x4a/0x75
> [ 39.776232] [<ffffffff814dce83>] ? __schedule_bug+0x96/0xa3
> [ 39.776234] [<ffffffff814e2cd8>] ? __schedule+0x5f8/0x670
> [ 39.776235] [<ffffffff814e2d77>] ? schedule+0x27/0xa0
> [ 39.776237] [<ffffffff814e4335>] ? rt_spin_lock_slowlock+0xb5/0x220
> [ 39.776240] [<ffffffffa0cf8ada>] ? pkg_temp_thermal_platform_thermal_notify+0x3a/0x126 [x86_pkg_temp_thermal]
> [ 39.776242] [<ffffffff8103b140>] ? therm_throt_process+0x10/0x140
> [ 39.776243] [<ffffffff8103b471>] ? intel_thermal_interrupt+0x201/0x240
> [ 39.776244] [<ffffffff8103b4f8>] ? smp_thermal_interrupt+0x18/0x40
> [ 39.776247] [<ffffffff814edadd>] ? thermal_interrupt+0x6d/0x80
> [ 39.776249] <EOI> [<ffffffff814ec9fd>] ? system_call_fast_compare_end+0x10/0x15
> [ 39.776251] CPU: 2 PID: 0 Comm: swapper/2 Tainted: P W O 3.14-2-rt-amd64 #1 Debian 3.14.15-2
> [ 39.776252] Hardware name: Apple Inc. MacBookPro8,1/Mac-94245B3640C91C81, BIOS MBP81.88Z.0047.B27.1201241646 01/24/12
> [ 39.776253] ffff8802655c32a0 ffffffff814e04fc ffff8802655c32a0 ffffffff814dce83
> [ 39.776254] ffffffff814e2cd8 0000000000065340 0000000000065340 ffff8802655f5fd8
> [ 39.776255] ffff8802655c32a0 ffff880267103ee0 ffff8802655c39f8 ffff8802655c32a0
> [ 39.776255] Call Trace:

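This looks like a scheduling issue with the PREEMPT_RT config. The short-term solution is a change in the kernel .config that leaves the x86_pkg_temp_thermal driver out; thermald will then fall back to the coretemp driver:

    # CONFIG_X86_PKG_TEMP_THERMAL is not set

I have to look at fixing the driver bug when PREEMPT_RT is defined. I am not very familiar with scheduling in an RT context, except that here spin_lock_irqsave is preemptible: in the trace, pkg_temp_thermal_platform_thermal_notify ends up in rt_spin_lock_slowlock and then schedule while still inside the thermal interrupt.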
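
To confirm whether a given kernel build already has the workaround applied, something like the following could be used. This is only a minimal sketch: the function name option_state is made up for illustration, and it assumes the usual kernel config text format, where an enabled option appears as CONFIG_FOO=y (or =m) and a disabled one as a "# CONFIG_FOO is not set" comment.

```python
import re

# Option named in the workaround above.
OPTION = "CONFIG_X86_PKG_TEMP_THERMAL"


def option_state(config_text, option=OPTION):
    """Return the option's value ('y', 'm', ...) if it is set, else None.

    A "# CONFIG_FOO is not set" comment line, or no mention of the
    option at all, both count as "not set" and yield None.
    """
    pattern = re.compile(r"^" + re.escape(option) + r"=(.*)$")
    for line in config_text.splitlines():
        match = pattern.match(line.strip())
        if match:
            return match.group(1)
    return None


def describe(config_text, option=OPTION):
    """Turn the option state into a short human-readable message."""
    state = option_state(config_text, option)
    if state is None:
        return option + " is not set (workaround applied)"
    return option + "=" + state + " (driver is built)"
```

On Debian the config of an installed kernel is normally available as /boot/config-<release> (that path is an assumption about the system, not something from the report), so the text of that file can be read and passed to describe().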