Re: perf interrupt took too long

Darac Marjal Mon, 22 Sep 2014 06:57:03 -0700

On Mon, Sep 22, 2014 at 03:26:39PM +0200, Thorsten Glaser wrote:
> Hi,
> 
> just got this vomited onto the console and into dmesg:
> 
> [  998.354300] perf interrupt took too long (2516 > 2500), lowering 
> kernel.perf_event_max_sample_rate to 50000
> 
> What sort of problem is this, and why is it so important that it
> occurs basically on every boot, and what can I do to “fix” it,
> whatever a “fix” is. I do not use perf.


As I understand it (from the patch at
https://lkml.org/lkml/2014/2/11/314), perf (the part of the kernel that
does performance monitoring) tries to queue up some work in response to
an IRQ firing. If it tries to queue up too much work, you get into a
state where "NMIs [are] firing off so fast that nothing else [gets] a
chance to run". I think that means the kernel is spending all its time
monitoring its own performance and no time doing what you want it to do.

So, instead of filling that work queue, it prints out a warning to the
kernel buffer (which, in your case, is also sent to the console, probably
by your syslogd) saying that it is reducing the rate at which it takes
performance samples. If you were to use the perf tool, this would be a
warning that you'd lose resolution in your numbers.
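
To make the numbers in that warning easier to read, the line can be
picked apart with a few lines of Python. This is only a sketch: it
assumes the exact wording quoted above (other kernel versions may phrase
the message differently), and the name parse_perf_warning is made up for
illustration.

```python
import re

# Pattern assumes the message wording quoted in this thread; other kernel
# versions may phrase the warning differently.
PERF_WARNING = re.compile(
    r"perf interrupt took too long \((?P<took>\d+) > (?P<limit>\d+)\), "
    r"lowering\s+kernel\.perf_event_max_sample_rate to (?P<rate>\d+)"
)


def parse_perf_warning(line):
    """Return the three numbers from the warning as a dict, or None if no match."""
    match = PERF_WARNING.search(line)
    if match is None:
        return None
    return {name: int(value) for name, value in match.groupdict().items()}


# The dmesg line from the original report, joined back onto one line.
line = (
    "[  998.354300] perf interrupt took too long (2516 > 2500), lowering "
    "kernel.perf_event_max_sample_rate to 50000"
)
info = parse_perf_warning(line)
print(info)
```

Here "took" is the measured time, "limit" is the threshold it exceeded,
and "rate" is the new value of the sysctl after the kernel lowered it.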

As for a root cause, I've not been able to find anything concrete. I
don't know whether large numbers of other interrupts (e.g. disk access,
network access etc.) affect this bit of code, or whether the kernel is
simply busy at one particular point and can't service the interrupt
quickly enough, or whether it is something else entirely.

So far, though, it doesn't look like a real problem. You've been
given a warning that the sample rate for a feature you don't use has
been reduced.

If you'd like to avoid the message, you could probably just set the
sysctl named in the warning, kernel.perf_event_max_sample_rate, to a
suitably low value yourself.
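
For what it's worth, sysctls are exposed as files under /proc/sys, so
this one can be read and set by reading and writing
/proc/sys/kernel/perf_event_max_sample_rate. A minimal sketch in Python,
untested on the X61: the function names are made up, writing needs root,
and restricting the value to positive integers is this sketch's own
precaution rather than a kernel rule I have checked.

```python
from pathlib import Path

# Standard /proc/sys location for the kernel.perf_event_max_sample_rate sysctl.
SYSCTL_PATH = Path("/proc/sys/kernel/perf_event_max_sample_rate")


def read_sample_rate(path=SYSCTL_PATH):
    """Return the current maximum perf sample rate as an integer."""
    return int(Path(path).read_text().strip())


def write_sample_rate(rate, path=SYSCTL_PATH):
    """Set the maximum perf sample rate; needs root on the real sysctl file."""
    rate = int(rate)
    if rate <= 0:
        # Precaution taken by this sketch only, not a documented kernel limit.
        raise ValueError("sample rate must be a positive integer")
    Path(path).write_text(f"{rate}\n")
```

Calling write_sample_rate(50000) as root would set the value the kernel
ended up with in the log above; whether that is low enough to keep the
warning away on a given machine is a guess. To make a setting survive a
reboot, the usual place is a line such as
"kernel.perf_event_max_sample_rate = 50000" in /etc/sysctl.conf.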

> 
> This is an IBM X61 laptop.
> 
> processor       : 1
> vendor_id       : GenuineIntel
> cpu family      : 6
> model           : 15
> model name      : Intel(R) Core(TM)2 Duo CPU     T7300  @ 2.00GHz
> stepping        : 11
> microcode       : 0xba
> cpu MHz         : 800.000
> cache size      : 4096 KB
> physical id     : 0
> siblings        : 2
> core id         : 1
> cpu cores       : 2
> apicid          : 1
> initial apicid  : 1
> fpu             : yes
> fpu_exception   : yes
> cpuid level     : 10
> wp              : yes
> flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca 
> cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm 
> constant_tsc arch_perfmon pebs bts rep_good nopl aperfmperf pni dtes64 
> monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm lahf_lm ida dtherm tpr_shadow 
> vnmi flexpriority
> bogomips        : 3990.28
> clflush size    : 64
> cache_alignment : 64
> address sizes   : 36 bits physical, 48 bits virtual
> power management:
> 
> 
> Thanks,
> //mirabilos
> -- 
> tarent solutions GmbH
> Rochusstraße 2-4, D-53123 Bonn • http://www.tarent.de/
> Tel: +49 228 54881-393 • Fax: +49 228 54881-235
> HRB 5168 (AG Bonn) • USt-ID (VAT): DE122264941
> Geschäftsführer: Dr. Stefan Barth, Kai Ebenrett, Boris Esser, Alexander Steeg
> 
> 
> --
> To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org
> with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
> Archive: 
> https://lists.debian.org/alpine.deb.2.11.1409221524530....@tglase.lan.tarent.de
> 


--
To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Archive: https://lists.debian.org/20140922135624.ga13...@darac.org.uk
