On Thu, 26 Mar 2026 17:42:05 GMT, Alan Bateman <[email protected]> wrote:

>> Francesco Nigro has updated the pull request incrementally with one 
>> additional commit since the last revision:
>> 
>>   Disable edge-triggered epoll for POLLER_PER_CARRIER mode
>>   
>>   Per-carrier sub-pollers have carrier affinity, which creates a
>>   scheduling conflict with edge-triggered registrations: the sub-poller
>>   competes with user VTs for the same carrier. By the time the sub-poller
>>   runs, user VTs have often already consumed data via tryRead(), causing
>>   the sub-poller to find a POLLED sentinel and waste a full park/unpark
>>   cycle on the master (each costing an epoll_ctl). Under load this
>>   causes a 2x throughput regression.
>>   
>>   VTHREAD_POLLERS mode is unaffected because its sub-pollers have no
>>   carrier affinity and can run on any available carrier, processing
>>   events before user VTs consume the data.
>
> This looks like a 1% improvement in ops/sec. I think we'll need to get a more 
> real-world benchmark. Do you have something other than the micro?
> 
> Do you agree with the proposal to put this in its own branch so that we can 
> iterate on it?

@AlanBateman end-to-end results via 
https://github.com/franz1981/Netty-VirtualThread-Scheduler/pull/97 (will be 
merged soon!)

## EPOLL ET vs ONE_SHOT: VIRTUAL_NETTY, 10K connections, 30ms RTT, 2 server cores

### Command


```shell
JAVA_HOME=<jdk> OUTPUT_DIR=<out> ./run-benchmark.sh \
  --mode VIRTUAL_NETTY --threads 2 --io nio \
  --server-cpuset "4-5" --mock-cpuset "8-11" --load-cpuset "0-3" \
  --jvm-args "-Xms8g -Xmx8g" \
  --connections 10000 --load-threads 4 \
  --mock-think-time 30 --mock-threads 4 \
  --perf-stat
```


- **ONE_SHOT**: Shipilev openjdk-jdk-loom b549 (2026-03-20)
- **ET**: Custom Loom build with EPOLL ET patch

### Throughput (req/s, best of 3)

| JDK | Run 1 | Run 2 | Run 3 | Best |
|-----|-------|-------|-------|------|
| ONE_SHOT | 46,137 | 45,717 | 45,171 | **46,137** |
| ET | 46,771 | 48,350 | 46,173 | **48,350** |
| **Delta** | | | | **+4.8%** |

### Server perf stat (best runs)

| Metric | ONE_SHOT | ET |
|--------|----------|-----|
| CPUs utilized | 1.999 | 1.999 |
| Ctx switches/sec | 51 | 52 |
| IPC | 1.20 | 1.24 |
| Branch misses | 2.73% | 2.51% |

-------------

PR Comment: https://git.openjdk.org/loom/pull/223#issuecomment-4142895965
