On Thu, 26 Mar 2026 17:42:05 GMT, Alan Bateman <[email protected]> wrote:
>> Francesco Nigro has updated the pull request incrementally with one
>> additional commit since the last revision:
>>
>>   Disable edge-triggered epoll for POLLER_PER_CARRIER mode
>>
>>   Per-carrier sub-pollers have carrier affinity, which creates a
>>   scheduling conflict with edge-triggered registrations: the sub-poller
>>   competes with user VTs for the same carrier. By the time the sub-poller
>>   runs, user VTs have often already consumed data via tryRead(), causing
>>   the sub-poller to find a POLLED sentinel and waste a full park/unpark
>>   cycle on the master (each costing an epoll_ctl). Under load this
>>   causes a 2x throughput regression.
>>
>>   VTHREAD_POLLERS mode is unaffected because its sub-pollers have no
>>   carrier affinity and can run on any available carrier, processing
>>   events before user VTs consume the data.
>
> This looks like a 1% improvement in ops/sec. I think we'll need to get a more
> real-world benchmark. Do you have something other than the micro.
>
> Do you agree with the proposal to put this in its own branch so that we can
> iterate on it?

@AlanBateman end 2 end results via https://github.com/franz1981/Netty-VirtualThread-Scheduler/pull/97 (will be merged soon!)
## EPOLL ET vs ONE_SHOT: VIRTUAL_NETTY, 10K connections, 30ms RTT, 2 server cores

### Command

    JAVA_HOME=<jdk> OUTPUT_DIR=<out> ./run-benchmark.sh \
      --mode VIRTUAL_NETTY --threads 2 --io nio \
      --server-cpuset "4-5" --mock-cpuset "8-11" --load-cpuset "0-3" \
      --jvm-args "-Xms8g -Xmx8g" \
      --connections 10000 --load-threads 4 \
      --mock-think-time 30 --mock-threads 4 \
      --perf-stat

- **ONE_SHOT**: Shipilev openjdk-jdk-loom b549 (2026-03-20)
- **ET**: Custom Loom build with EPOLL ET patch

### Throughput (req/s, best of 3)

| JDK | Run 1 | Run 2 | Run 3 | Best |
|-----|-------|-------|-------|------|
| ONE_SHOT | 46,137 | 45,717 | 45,171 | **46,137** |
| ET | 46,771 | 48,350 | 46,173 | **48,350** |
| **Delta** | | | | **+4.8%** |

### Server perf stat (best runs)

| Metric | ONE_SHOT | ET |
|--------|----------|-----|
| CPUs utilized | 1.999 | 1.999 |
| Ctx switches/sec | 51 | 52 |
| IPC | 1.20 | 1.24 |
| Branch misses | 2.73% | 2.51% |

-------------

PR Comment: https://git.openjdk.org/loom/pull/223#issuecomment-4142895965
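For anyone who wants to re-check the throughput delta, here is a minimal sketch of the arithmetic using the per-run figures from the table above. The variable and function names are illustrative and not part of the benchmark scripts. The mean-based delta is an extra figure derived here from the same runs, not one reported in the comment.

```python
# Per-run throughput (req/s) copied from the table in the comment.
runs = {
    "ONE_SHOT": [46137, 45717, 45171],
    "ET": [46771, 48350, 46173],
}


def delta_percent(baseline: float, candidate: float) -> float:
    """Relative change of candidate over baseline, in percent."""
    return (candidate - baseline) / baseline * 100.0


def mean(values: list[int]) -> float:
    """Arithmetic mean of the runs."""
    return sum(values) / len(values)


# "Best of 3" as used in the comment: the highest throughput per JDK.
best_one_shot = max(runs["ONE_SHOT"])
best_et = max(runs["ET"])
best_delta = delta_percent(best_one_shot, best_et)

# Derived here (not in the comment): the same comparison on the means,
# which is less sensitive to a single unusually good run.
mean_delta = delta_percent(mean(runs["ONE_SHOT"]), mean(runs["ET"]))

print(f"best-of-3 delta: {best_delta:+.1f}%")
print(f"mean delta:      {mean_delta:+.1f}%")
```

The best-of-3 comparison reproduces the +4.8% in the table. The mean-based comparison comes out lower, at roughly +3.1%, because ET run 2 is noticeably above the other two ET runs.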
