https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86952

Daniel Borkmann <daniel at iogearbox dot net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |daniel at iogearbox dot net

--- Comment #12 from Daniel Borkmann <daniel at iogearbox dot net> ---
I've been looking into this issue quite recently and improved the benchmark
tool a bit along the way. There are multiple considerations with regard to how
the switch cases are traversed; the case here is doing round robin, but
additional distributions / tests could be added. Pushed here just in case:

  https://github.com/borkmann/microbenchmark

Numbers I'm getting are stable:

* Xeon E3-1240, packet.net c1.small.x86 instance:

  # make prep
  [...]
  # make
  gcc -g -I. -O2 -c -o test.o test.c
  gcc -g -I. -O2 -mindirect-branch=thunk --param=case-values-threshold=20 -c -o switch-no-table.o switch-no-table.c
  gcc -g -I. -O2 -mindirect-branch=thunk -c -o switch.o switch.c
  gcc -g -I. -O2 -c -o switch-no-retpol.o switch-no-retpol.c
  gcc -o test test.o switch-no-table.o switch.o switch-no-retpol.o
  taskset 1 ./test
  no retpoline : 6098325270
  no jump table: 6298192058 (no retpoline: 103.28%)
  jump table   : 22081802856 (no retpoline: 362.10%, no jump table: 350.61%)
  # make
  taskset 1 ./test
  no retpoline : 6098439816
  no jump table: 6298242270 (no retpoline: 103.28%)
  jump table   : 22107872854 (no retpoline: 362.52%, no jump table: 351.02%)
  # make
  taskset 1 ./test
  no retpoline : 6098187038
  no jump table: 6298308128 (no retpoline: 103.28%)
  jump table   : 22071053524 (no retpoline: 361.93%, no jump table: 350.43%)

* Xeon Gold 5120, packet.net m2.xlarge.x86 instance:

  # make prep
  [...]
  # make
  gcc -g -I. -O2 -c -o test.o test.c
  gcc -g -I. -O2 -mindirect-branch=thunk --param=case-values-threshold=20 -c -o switch-no-table.o switch-no-table.c
  gcc -g -I. -O2 -mindirect-branch=thunk -c -o switch.o switch.c
  gcc -g -I. -O2 -c -o switch-no-retpol.o switch-no-retpol.c
  gcc -o test test.o switch-no-table.o switch.o switch-no-retpol.o
  taskset 1 ./test
  no retpoline : 5450356814
  no jump table: 5620673036 (no retpoline: 103.12%)
  jump table   : 21448285314 (no retpoline: 393.52%, no jump table: 381.60%)
  # make
  taskset 1 ./test
  no retpoline : 5450356100
  no jump table: 5620678302 (no retpoline: 103.12%)
  jump table   : 21448119720 (no retpoline: 393.52%, no jump table: 381.59%)
  # make
  taskset 1 ./test
  no retpoline : 5450331258
  no jump table: 5620839740 (no retpoline: 103.13%)
  jump table   : 21446922902 (no retpoline: 393.50%, no jump table: 381.56%)

I've also looked into clang and its -mretpoline flag, and they generally turn
off jump table generation in this case. For gcc, the s390 folks implemented a
target override for the default case-values-threshold to raise it to 20. For
x86 something similar could be done.

Anyway, H.J. Lu asked me to reopen this issue (but it seems I cannot make this
change from my account).
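
For readers without the repository at hand, the round-robin traversal mentioned in the comment can be sketched roughly as follows. This is not the code from the linked benchmark; the names dispatch and run_round_robin, the NUM_CASES value, and the per-case arithmetic are all hypothetical and chosen only for illustration. Whether gcc actually emits a jump table for this switch depends on compiler version, target, and flags, so the generated assembly should be inspected rather than assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical number of dense case values (not taken from the benchmark). */
#define NUM_CASES 8u

/* Compile-time guard: the switch below has exactly eight cases, 0..7. */
static_assert(NUM_CASES == 8u, "dispatch() must have one case per value below NUM_CASES");

/*
 * Dense switch over consecutive values. The cases do unrelated arithmetic
 * so the compiler is less likely to collapse them into a data lookup.
 * noinline (a GCC/clang attribute) keeps the dispatch as a real call.
 */
__attribute__((noinline))
uint64_t dispatch(unsigned int op, uint64_t acc)
{
    switch (op) {
    case 0: return acc + 1;
    case 1: return acc * 3;
    case 2: return acc ^ 0x5;
    case 3: return acc - 2;
    case 4: return acc << 1;
    case 5: return acc | 0x10;
    case 6: return acc >> 1;
    case 7: return acc + 7;
    default: return acc; /* out-of-range op leaves acc unchanged */
    }
}

/*
 * Round-robin traversal: visit case 0, 1, ..., NUM_CASES - 1 and wrap
 * around. The accumulator is threaded through every call and returned,
 * so the calls cannot be discarded as dead code.
 */
uint64_t run_round_robin(uint64_t iterations)
{
    uint64_t acc = 0;
    unsigned int op = 0;

    for (uint64_t i = 0; i < iterations; i++) {
        acc = dispatch(op, acc);
        op = (op + 1) % NUM_CASES;
    }
    return acc;
}
```

Building such a file once with -O2 -mindirect-branch=thunk and once with --param=case-values-threshold=20 added, as in the build log above, and then comparing the disassembly would be one way to see whether the indirect jump through a table is replaced by a compare-and-branch sequence. A strict round-robin order is also very predictable, which is one reason other traversal distributions may give different numbers.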