Hi Chris,

Chris Hofstaedtler wrote:
> > As far as I can see, I didn't get a reply back from you on these
> > suggestions of mine. Maybe my mail fell through the cracks. But I
> > think we should take the discussion up again, probably in this bug
> > report.
> 
> Right. I think I forgot to reply back then - sorry.

Happens…

> Experimental should have util-linux-extra 2.38-4+exp1 very soon,
> with irqtop installed. Obviously this can only be used for testing.

Thanks. That package though seems to miss the "Conflicts: irqtop". :-/
But I was aware of it and uninstalled irqtop beforehand. :-)

> Personally I think we should have only one irqtop - from my point of
> view it does not matter which one. Maybe the new version is
> superior.

Hmmmm.

> In any case we should not confuse our users.

Fully agree. Nevertheless, Debian is a lot about having choice between
different implementations (compared to e.g. Ubuntu). And choice
sometimes makes things harder to understand.

> > Another point which comes to my mind now is that it might make sense
> > to rename the current irqtop package to irqtop-nf (or irqtop-ruby)
> > just to make clear that it does not contain the irqtop tool from
> > util-linux.
> 
> Might be an idea. But lets see what the differences are, first.

Ack.

zhenwei pi wrote:
> The main difference between the two versions:
> - original irqtop shows separated interrupt information
> - new irqtop shows aggregate interrupt information

Thanks for that summary.

(I btw. just noticed that zhenwei is actually the author of
util-linux's implementation of irqtop:
https://git.kernel.org/pub/scm/utils/util-linux/util-linux.git/commit/?id=d511011c
refers to https://github.com/pizhenwei/irqtop as previous place of
development. :-)

> Test env: Debian 10; 96 CPUs on a server, 230 characters per line in
> termial.
> 
> - irqtop (original version) shows uncompleted interrupts(31 / 96 CPUs).

Hrm, interesting.

> n194-087-081 - irqtop - 2022-04-15 09:42:48 +0800
>               CPU0   CPU1   CPU2   CPU3   CPU4   CPU5   CPU6   CPU7   CPU8   
> CPU9  CPU10  CPU11  CPU12  CPU13  CPU14  CPU15  CPU16  CPU17  CPU18  CPU19  
> CPU20  CPU21  CPU22  CPU23  CPU24  CPU25  CPU26  CPU27  CPU28  CPU29  CPU30  
> […]
>   cpuUtil:     0.0    0.0    0.4    0.0    1.3    0.0    0.0    0.2    0.0    
> 0.0    0.0    0.0    0.0    0.0    0.2    0.0    0.0    0.0    0.0    0.0    
> 0.0    0.0    0.0    0.0    0.0    0.2    0.2    0.2    0.2    0.0    0.2
>      %irq:     0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    
> 0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    
> 0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0
>     %sirq:     0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    
> 0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    
> 0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0
>  irqTotal:      32    293    477      5     34      7      5   1805    112    
>  51      3      2     28      2   1901      1     13     16      0      6     
>  1     19      2      2     67     29     51     51     42      9     34
> i       9:       .      2      0      0      0      .      .      .      .    
>   .      .      .      .      .      .      .      .      .      .      .     
>  .      .      .      .      .      .      .      .      .      .      .
> i      48:       .      .      .      .      .      .      0      .      .    
>   .      .      .      .      .      .      .      .      .      .      .     
>  .      .      .      .      .      .      .      .      .      .      .
> i      49:       .      .      .      .      .      .      .      .      .    
>   .      .      .      .      .      .      .      .      .      .      .     
>  .      .      .      .      .      .      .      .      .      .      .
> i      50:       .      .      .      .      .      .      .      .      .    
>   .      .      .      .      .      .      .      .      .      .      .     
>  .      .      .      .      .      .      .      .      .      .      .
> i      51:       .      .      .      .      .      .      .      .      .    
>   .      .      .      .      .      .      .      .      .      .      .     
>  .      .      .      .      .      .      .      .      .      .      .


I currently only have access to boxes with 32 cores, but there it shows
all of them, plus some additional information in the last column, which
was probably cut off in your instance by the limited terminal width.
Mine looks like this and also shows IRQ names instead of numbers:

somehost - irqtop - 2022-04-15 15:13:12 +0000
              CPU0   CPU1   CPU2   CPU3   CPU4   CPU5   CPU6   CPU7   CPU8   
CPU9  CPU10  CPU11  CPU12  CPU13  CPU14  CPU15  CPU16  CPU17  CPU18  CPU19  
CPU20  CPU21  CPU22  CPU23  CPU24  CPU25  CPU26  CPU27  CPU28  CPU29  CPU30  
CPU31
  cpuUtil:     0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    
0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    
0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    3.8    
0.0   total CPU utilization %
     %irq:     0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    
0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    
0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    
0.0   hardware IRQ CPU util%
    %sirq:     0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    
0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    
0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    
0.0   software IRQ CPU util%
 irqTotal:       5      0      1      0      0      5      0      0      0      
0      0      0      5      0      0     36      0      0      0      1      0  
    0      0      0      0      0      0      0      0      0     17      0   
total hardware IRQs
i     117:       .      .      .      .      .      .      .      .      .      
.      .      .      3      .      .      .      .      .      .      .      .  
    .      .      .      .      .      .      .      .      .      .      .   
IR-PCI-MSI 4194317-edge      i40e-eno1-TxRx-12
i     LOC:       5      0      1      0      0      5      0      0      0      
0      0      0      1      0      0     36      0      0      0      1      0  
    0      0      0      0      0      0      0      0      0     17      0    
Local timer interrupts
s   TIMER:       5      0      0      0      0      7      0      0      0      
0      0      0      1      0      0      1      0      0      0      1      0  
    0      0      0      0      0      0      0      0      0      0      0
s  NET_RX:       0      0      0      0      0      0      0      0      0      
0      0      0      3      0      0      0      0      0      0      0      0  
    0      0      0      0      0      0      0      0      0      0      0
s   SCHED:       5      0      0      0      0      7      0      0      0      
0      0      0      1      0      0     33      0      0      0      1      0  
    0      0      0      0      0      0      0      0      0      9      0
s     RCU:       1      0      0      0      0      7      0      0      0      
0      0      0      1      0      0      7      0      0      0      1      0  
    0      0      0      0      0      0      0      0      0     13      0

(I currently suspect that zhenwei used the irqtop from Debian 10
Buster, i.e. version 2.3 instead of the current version 2.6 as can be
found in Debian Unstable and Testing. That might have caused these
differences.)

> - irqtop (from util-linux) shows aggregate interrupt information.
> irqtop | total: 575548749447 delta: 518913 | n148-134-075 | 2022-04-15 10:02:14+08:00
> 
>       IRQ        TOTAL     DELTA NAME
> 
>       LOC 218396041027    393883 Local timer interrupts
>       RES 217686711923     50039 Rescheduling interrupts
>       PIN  40532867503     10053 Posted-interrupt notification event
>       CAL  15012540676      2013 Function call interrupts
>       PIW  13810059255     57692 Posted-interrupt wakeup event
>       TLB   8699607720      1597 TLB shootdowns
>       221   4235495788        88 IR-PCI-MSI 50331656-edge eth0-4

That's quite a difference IMHO.

The irqtop from util-linux, though, also shows some per-CPU (or rather
per-core) stats on my box (Debian Unstable, with irqtop from
Christian's util-linux-extra package version 2.38-4+exp1 from Debian
Experimental):

irqtop | total: 22014142315 delta: 9471 | c6 | 2022-04-15 16:47:26+02:00

        cpu0 cpu1 cpu2 cpu3
  %irq: 30.4 24.1 20.0 25.5
%delta: 36.1 18.5 16.5 28.8

            IRQ           TOTAL           DELTA NAME

            LOC     14019433563            6020 Local timer interrupts
            129      2573943707            1722 IR-PCI-MSI 520192-edge enp0s31f6
            RES      2091668263             575 Rescheduling interrupts
            130       794066902              17 IR-PCI-MSI 376832-edge ahci[0000:00:17.0]
            CAL       763171794             140 Function call interrupts
            138       612474851             790 IR-PCI-MSI 524288-edge nvkm
            128       463147433              67 IR-PCI-MSI 327680-edge xhci_hcd
            TLB       455459030               0 TLB shootdowns
            137       221045281             140 IR-PCI-MSI 514048-edge snd_hda_intel:card0
            131        19266170               0 IR-PCI-MSI 1572864-edge xhci_hcd
            NMI          198160               0 Non-maskable interrupts
            PMI          198160               0 Performance monitoring interrupts
            MCP           68615               0 Machine check polls
             17             217               0 IR-IO-APIC 17-fasteoi snd_hda_intel:card1

> Other enhanced features from the new version:
>  - sort by several rules, include IRQ, TOTAL, DELTA and NAME.
>  - specify cpus in list format to monitor.
>  - specify output columns to print.
>  - enable/disable per-cpu statistics by specified mode.

From my point of view, they seem to have quite different feature
sets. The main advantage of the irqtop from util-linux seems to be that
it is more readable with a lot of CPUs, but it gives less detailed
statistics.

The ruby-written irqtop shows more detailed per-CPU/per-core statistics,
which might be helpful with a few cores, but you'll lose the overview
with a lot of cores, as seen in zhenwei's "screenshot", which is
truncated at CPU 30.
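
To make the "separated vs. aggregate" difference a bit more concrete,
here is a minimal shell sketch that collapses the per-CPU counters of
/proc/interrupts into one total per IRQ, roughly the kind of view the
util-linux irqtop gives. It is only an illustration: the function name
irq_totals is made up, and this is not how either irqtop is actually
implemented.

```shell
#!/bin/sh
# Illustration only: sum the per-CPU columns of /proc/interrupts-style
# input into a single total per IRQ, busiest IRQ first.
# "irq_totals" is a hypothetical helper name, not part of any package.
irq_totals() {
    awk '
        # The header row has one field per CPU (CPU0 CPU1 ...).
        NR == 1 { ncpu = NF; next }
        {
            label = $1
            sub(/:$/, "", label)            # "LOC:" -> "LOC"
            total = 0
            # Fields 2..ncpu+1 are per-CPU counters; rows such as
            # ERR/MIS carry fewer numbers, so stop at the first
            # non-numeric field or at the end of the row.
            for (i = 2; i <= ncpu + 1 && i <= NF; i++) {
                if ($i !~ /^[0-9]+$/) break
                total += $i
            }
            # %.0f avoids integer overflow in some awk implementations.
            printf "%s %.0f\n", label, total
        }
    ' "$@" | sort -k2,2nr
}
```

For example, "irq_totals /proc/interrupts | head" lists the IRQs with
the highest counts since boot. It has no DELTA column like the
util-linux output above.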

>  - performance improvement. New irqtop written by C uses a little CPU when
> running 'irqtop -d 1', the Ruby version spends more time(quite obvious on a
> 96 CPUs platform).

Yeah, it's obvious, but the reason is not the 96 CPUs but the fact
that it's written in an interpreted language and not compiled.

Anyway, IMHO we should:

* Figure out how to get the util-linux implementation into Debian
  proper.

* irqtop from util-linux should in some way become the future default,
  as it's probably what the user usually expects. The ruby-written
  irqtop is only a niche tool written for analysing the performance of
  the ipt_NETFLOW.ko iptables plugin kernel module. (But it seems to
  have been useful elsewhere, too, as probably shown by the fact that
  util-linux added a similar tool, which is probably less focussed on
  that one job. :-)

Regarding the ruby-written irqtop:

* It is currently in danger of being removed from testing because of
  the horribly outdated ruby-curses (https://bugs.debian.org/958973) in
  Debian, which is also no longer maintained; see
  https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=959115#10 and
  https://bugs.debian.org/1009727 (Christian: I X-Debbugs-Cc'ed you on
  #1009727 for that and because I know that you're also active in
  Debian's Ruby packaging.)

* It has a higher popularity than I expected:
  
https://qa.debian.org/popcon-graph.php?packages=irqtop&show_installed=on&show_vote=on&want_legend=on&beenhere=1

Because I, as a user and Linux admin, prefer having choice, and because
the two irqtop implementations seem to be rather different, I would
really prefer to keep the ruby-written irqtop in Debian, at least
for now.

My currently preferred way forward (it probably needs a bit more
polishing) is:

* Renaming the current irqtop package (and binary) to irqtop-nf.

* Making "irqtop" a transitional package which pulls in either
  irqtop-nf or util-linux-extra, i.e. has a

    Depends: irqtop-nf | util-linux-extra

  in its control file. That way those who upgrade automatically get
  the same implementation as before. But those who look at the package
  see that there are two choices.

* After the Bookworm release, the "irqtop" package should be removed
  and provided by the util-linux-extra package, so that those who do
  "apt install irqtop" actually get the more expected implementation
  from util-linux.

* I think we should also try to use /etc/alternatives/irqtop +
  update-alternatives, with irqtop from util-linux-extra having the
  higher priority, so that those who install both get the probably
  more expected implementation from util-linux-extra by default.
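
The alternatives part could look roughly like the following
maintainer-script helpers. This is a sketch under assumptions: the path
/usr/bin/irqtop.util-linux, the priorities 50 and 30 and the function
names are all hypothetical, and util-linux-extra would have to ship its
binary under a different name for this to work.

```shell
#!/bin/sh
# Hypothetical maintainer-script helpers; all paths, priorities and
# function names below are assumptions made for illustration.

# Would be called from the postinst of util-linux-extra ("configure").
# The higher priority makes this the default in automatic mode.
register_util_linux_irqtop() {
    update-alternatives --install /usr/bin/irqtop irqtop \
        /usr/bin/irqtop.util-linux 50
}

# Would be called from the postinst of irqtop-nf ("configure").
# The lower priority means it only wins in automatic mode when the
# util-linux variant is not registered.
register_netflow_irqtop() {
    update-alternatives --install /usr/bin/irqtop irqtop \
        /usr/bin/irqtop-nf 30
}

# Would be called from the matching prerm ("remove"), with the path of
# the alternative to drop as the only argument.
unregister_irqtop() {
    update-alternatives --remove irqtop "$1"
}
```

An admin who prefers the other implementation could still switch with
"update-alternatives --config irqtop".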

In case you agree, I'd upload an updated iptables-netflow source
package to Debian Experimental implementing these changes so we can
test cross-installability and upgrade paths.

                Regards, Axel
-- 
 ,''`.  |  Axel Beckert <a...@debian.org>, https://people.debian.org/~abe/
: :' :  |  Debian Developer, ftp.ch.debian.org Admin
`. `'   |  4096R: 2517 B724 C5F6 CA99 5329  6E61 2FF9 CD59 6126 16B5
  `-    |  1024D: F067 EA27 26B9 C3FC 1486  202E C09E 1D89 9593 0EDE
