Does it seem correct to say that the general intention of irqbalance with respect to system performance is to improve throughput (translating in some cases to a more responsive system) at the cost of increased processing latency?
If so, then it should be considered and tuned with regard to usage scenarios that weigh latency against throughput. Eg, digital audio workstations and gaming machines might disable it. But as Steve says, we don't have any data on the tradeoffs. How much more throughput/responsiveness, for what cost in latency, under what configurations?

All I can find is a recommendation not to use it on CPUs with 2 or fewer cores, as the overhead is said to be too high (which, according to the above, would translate to "unreasonable amount of latency for relatively little or even no throughput gains"). But even then, are we talking physical or logical/virtual cores? It seems like the more cores a system has, the more trivial the overhead of running irqbalance relative to the performance/responsiveness gain. Is there a threshold number of cores beyond which something like irqbalance becomes strongly recommended for general computing applications? Even then, just like power scaling, I can imagine it might still add undesirable or even critical latency in applications that are highly latency sensitive (eg when milliseconds or fractions of milliseconds matter).

This website gave me some clarity on the theory and purpose:
https://www.baeldung.com/linux/irqbalance-modern-hardware

There is another dimension, related to one of the reasons why Apple was known as the "AV professional's workstation" for so long: (apparently for fascinating reasons of historical accident) the multimedia system engineers gained enough influence in the company to tune the default system configuration to prioritize latency, and then system responsiveness, over throughput (even at the price of some compromises in system security), so that applications requiring both low and consistent (eg low jitter) latency needed minimal system configuration.
As it turned out, they got away with doing this for so many years in part because the growing "AV professional wannabe" crowd, who mostly used the system for general (rather than latency-sensitive) applications, didn't really notice or care about the hit to throughput or the security vulnerabilities. Noticeable in benchmarks, but not in real life.

My first point in saying this is that benchmarks don't necessarily tell us what will give the greatest benefit to the greatest number of users with minimal or no reconfiguration. Eg, who cares if it takes even 10% longer to transcode an AV file or compile code (on the same hardware configured differently) if it means you could also run latency-sensitive apps at a consistent (low jitter) and low latency without having to reconfigure anything, while maintaining a generally responsive system? People often just walk away from such jobs anyway (either physically, eg a smoke or coffee break, or figuratively, eg task switching, in which case a responsive system would be a higher priority than crunching the numbers slightly faster).

My second point is that I think obsession over benchmarking risks missing the forest for the trees, and really often doesn't account for anything close to real world performance optimizations. But even then it could be argued this is only because we fail to consider important parameters in common performance benchmarks, such as "responsiveness", "jitter" and "latency", alongside obsessing over throughput and, to a lesser extent, power management.

For me, the core of this question means finally coming to clarity about what "optimal balance" means for the widest variety of desktop and server applications, just as Apple did accidentally a few decades ago with its client systems. I think this requires considering a variety of factors instead of an unrealistically narrow idea of "performance" that does not factor in real world user experience.
Eg, most users will often appreciate improvements in latency and responsiveness without much noticing the cost in throughput, until someone starts obsessing over throughput benchmarks whose differences are minute as far as our intuitive or subconscious experience is concerned. Less user frustration from fewer or no buffer overruns or perceived interface hiccups again draws on concepts such as "reliability" and "default breadth of utility".

FWIW I think a lot of throughput obsession is about internalized and institutionalized planned obsolescence. It's the primary benchmark of OEM system performance, and a fairly lazy way to measure performance at that. 4min 30sec for a transcoded file would be considered hugely different from 5min 30sec on the same hardware, but for the average user, who would just take a smoke break or switch tasks, it doesn't matter as long as the system remains responsive and functional. And you'll never get to the transcoding in the first place if the system keeps recording buffer xruns that ruin a file being processed or recorded in real time, because system latency is too high or too variable for the sort of performance and responsiveness needed.

Another way to put this is that there's the social-economic dimension of performance vs the psychological dimension of performance. While the psychological one is arguably more important and humane, it is often at odds with the social-economic dimension, which seeks to sell more new systems because they are "faster" (and only very marginally and narrowly so).

I could easily be out of the loop, but I just don't see this stuff considered often enough in discussions of performance optimization.
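
To make "latency" and "jitter" a bit more concrete as benchmark parameters, here is a minimal sketch of how one might sample them from userspace. It is only an illustration under my own assumptions: the function names, the 1 ms interval and the sample count are made up for the example, it measures how late a sleeping process wakes up rather than interrupt latency itself, and it is no substitute for a proper tool such as cyclictest. The idea would be to run it under each irqbalance configuration and compare the numbers.

```python
import statistics
import time


def measure_wakeup_latency(interval_s=0.001, samples=1000):
    """Return how late each periodic wakeup was, in microseconds.

    interval_s and samples are illustrative defaults, not recommendations.
    """
    overshoots_us = []
    for _ in range(samples):
        start = time.perf_counter_ns()
        time.sleep(interval_s)
        elapsed_ns = time.perf_counter_ns() - start
        # Overshoot: time slept beyond what was requested.
        overshoots_us.append((elapsed_ns - interval_s * 1e9) / 1000.0)
    return overshoots_us


def summarize(overshoots_us):
    """Summarize latency (mean, worst case) and jitter (std deviation)."""
    return {
        "mean_us": statistics.fmean(overshoots_us),
        "max_us": max(overshoots_us),
        "jitter_us": statistics.pstdev(overshoots_us),
    }


if __name__ == "__main__":
    print(summarize(measure_wakeup_latency()))
```

The worst case and the spread are the interesting figures here, since a single late wakeup is what produces an xrun, even when the mean looks fine.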
Ethan

On Sat, Jan 6, 2024, 20:20 Steve Langasek <1833...@bugs.launchpad.net> wrote:

> Hi Christian,
>
> I see a lot of strong opinions being given, but aside from the "don't
> use it in KVM" guidance which appears to be based on GCE's engineering
> expertise, very little evidence that irqbalance is actually a problem.
>
> I think it's true that in the default config, irqbalance can interfere
> with putting CPUs into higher C states to conserve power. However, I
> don't see any indication of quantitative analysis showing the impact.
>
> Recent versions of irqbalance have a '--powerthresh' argument that can
> be used to tell irqbalance to rebalance across fewer cores when CPU load
> is low, to allow some of the cores to be put into a sleep state and
> conserve power. My own initial testing on my desktop shows that this
> gets used for all of about 10 seconds at a time every few hours, before
> the load increases and irqbalance wakes the core back up...
>
> I would want any decision to remove irqbalance from the desktop to be
> based on evidence, not conjecture. At a minimum, I think what I would
> like to see is output from powertop showing both power consumption and
> CPU idle stats over a reasonable amount of time (10 minutes?), on a
> representative client machine, for a 2x3 matrix of configurations:
>
> - idle vs normal desktop load
> - irqbalance disabled vs irqbalance enabled with defaults vs irqbalance
> enabled with IRQBALANCE_ARGS=--powerthresh=1
>
> System should be rebooted between each of the irqbalance configurations,
> as I'm not sure what does or doesn't persist in the CPU config after
> irqbalance exits.
>
> I am specifically not going to try to rebut the various webpages
> referenced here, beyond saying that there's an awful lot of these pages
> pointing to one other as authoritative sources on irqbalance without
> there actually being evidence to back them up (and a heaping spoonful of
> misinformation / outdated information along the way). So if we're going
> to make a change, there should be due diligence to demonstrate a
> benefit, it should not be based on Internet hype.
>
> --
> You received this bug notification because you are subscribed to the bug
> report.
> https://bugs.launchpad.net/bugs/1833322
>
> Title:
> Consider removing irqbalance from default install on desktop images
>
> Status in irqbalance package in Ubuntu:
> New
> Status in ubuntu-meta package in Ubuntu:
> Confirmed
>
> Bug description:
> as per https://github.com/pop-os/default-settings/issues/60
>
> Distribution (run cat /etc/os-release):
>
> $ cat /etc/os-release
> NAME="Pop!_OS"
> VERSION="19.04"
> ID=ubuntu
> ID_LIKE=debian
> PRETTY_NAME="Pop!_OS 19.04"
> VERSION_ID="19.04"
> HOME_URL="https://system76.com/pop"
> SUPPORT_URL="http://support.system76.com"
> BUG_REPORT_URL="https://github.com/pop-os/pop/issues"
> PRIVACY_POLICY_URL="https://system76.com/privacy"
> VERSION_CODENAME=disco
> UBUNTU_CODENAME=disco
>
> Related Application and/or Package Version (run apt policy $PACKAGE
> NAME):
>
> $ apt policy irqbalance
> irqbalance:
> Installed: 1.5.0-3ubuntu1
> Candidate: 1.5.0-3ubuntu1
> Version table:
> *** 1.5.0-3ubuntu1 500
> 500 http://us.archive.ubuntu.com/ubuntu disco/main amd64 Packages
> 100 /var/lib/dpkg/status
>
> $ apt rdepends irqbalance
> irqbalance
> Reverse Depends:
> Recommends: ubuntu-standard
> gce-compute-image-packages
>
> Issue/Bug Description:
>
> as per konkor/cpufreq#48 and
> http://konkor.github.io/cpufreq/faq/#irqbalance-detected
>
> irqbalance is technically not needed on desktop systems (supposedly it
> is mainly for servers), and may actually reduce performance and power
> savings. It appears to provide benefits only to server environments
> that have relatively-constant loading. If it is truly a
> server-oriented package, then it shouldn't be installed by default on a
> desktop/laptop system and shouldn't be included in desktop OS images.
>
> Steps to reproduce (if you know):
>
> This is potentially an issue with all default installs.
>
> Expected behavior:
>
> n/a
>
> Other Notes:
>
> I can safely remove it via "sudo apt purge irqbalance" without any
> apparent adverse side-effects. If someone is running a situation where
> they need it, then they always have the option of installing it from
> the repositories.
>
> To manage notifications about this bug go to:
>
> https://bugs.launchpad.net/ubuntu/+source/irqbalance/+bug/1833322/+subscriptions
>

** Bug watch added: github.com/pop-os/default-settings/issues #60
   https://github.com/pop-os/default-settings/issues/60

--
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to ubuntu-meta in Ubuntu.
https://bugs.launchpad.net/bugs/1833322

--
Mailing list: https://launchpad.net/~touch-packages
Post to     : touch-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~touch-packages
More help   : https://help.launchpad.net/ListHelp