https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86952
--- Comment #21 from Martin Liška <marxin at gcc dot gnu.org> ---
(In reply to Daniel Borkmann from comment #20)
> (In reply to Martin Liška from comment #19)
> > Ok, I updated the benchmark and push it here:
> > https://github.com/marxin/microbenchmark-1
> >
> > And I see following on my Haswell machine:
>
> Thanks for working on it! Bit strange why some of your numbers are quite
> fluctuating e.g. in your 'normal' column. What do you use to tune your setup
> for testing? I've been running the `make prep` part which I added back then,
> and the numbers I see are quite stable. I ran a quick test this morning with
> your repo, and here's what I got for the round-robin walk:

Yes, my numbers were measured without taskset and without tuned. I don't have
any experience with tuned.

>
> * Xeon E3-1240 (3.7GHz):
>
> # ./test.py
>              normal       retpoline    retpo+no-JT  retpo+JT=20  retpo+JT=40
> cases:    8: 0.49 (100%)  2.09 (426%)  0.53 (108%)  0.53 (108%)  0.53 (108%)
> cases:   16: 0.49 (100%)  2.09 (426%)  0.58 (119%)  0.58 (119%)  0.58 (119%)
> cases:   32: 0.49 (100%)  2.09 (426%)  0.61 (125%)  2.09 (426%)  0.61 (125%)
> cases:   64: 0.49 (100%)  2.26 (458%)  0.69 (140%)  2.27 (459%)  2.27 (459%)
> cases:  128: 0.50 (100%)  2.37 (476%)  0.76 (153%)  2.32 (466%)  2.41 (483%)
> cases:  256: 0.52 (100%)  2.33 (451%)  0.91 (175%)  2.33 (450%)  2.36 (456%)
> cases: 1024: 1.05 (100%)  2.54 (242%)  1.08 (103%)  2.59 (246%)  2.54 (242%)
> cases: 2048: 1.63 (100%)  2.56 (157%)  1.94 (119%)  2.61 (160%)  2.59 (159%)
> cases: 4096: 2.19 (100%)  3.12 (143%)  3.22 (147%)  3.09 (142%)  3.13 (143%)
>
> * Xeon Gold 5120 (2.6GHz):
>
> # ./test.py
>              normal       retpoline    retpo+no-JT  retpo+JT=20  retpo+JT=40
> cases:    8: 0.70 (100%)  2.98 (425%)  0.75 (107%)  0.75 (107%)  0.75 (107%)
> cases:   16: 0.70 (100%)  2.98 (425%)  0.82 (117%)  0.82 (117%)  0.82 (117%)
> cases:   32: 0.70 (100%)  3.01 (430%)  0.87 (124%)  2.98 (426%)  0.87 (124%)
> cases:   64: 0.70 (100%)  3.52 (501%)  0.94 (134%)  3.52 (501%)  3.52 (501%)
> cases:  128: 0.71 (100%)  3.51 (495%)  1.07 (151%)  3.50 (495%)  3.50 (494%)
> cases:  256: 0.76 (100%)  3.14 (414%)  1.27 (167%)  3.14 (414%)  3.14 (414%)
> cases: 1024: 1.46 (100%)  3.36 (230%)  1.49 (102%)  3.36 (230%)  3.36 (230%)
> cases: 2048: 2.25 (100%)  3.19 (142%)  2.70 (120%)  3.19 (142%)  3.19 (142%)
> cases: 4096: 2.90 (100%)  3.74 (129%)  4.48 (155%)  3.73 (129%)  3.72 (129%)
>
> Probably makes sense to also add other walk tests aka input distributions
> for foo{,_no_table,_no_retpol}(<x>) for further comparison if plan would be
> to disable jump tables entirely.

Here are the numbers for the following input distribution:

+      int x = i % 57;
+      foo ((3 * x * x + 17 * x) / 100);

             normal       retpoline    retpo+no-JT  retpo+JT=20  retpo+JT=40
cases:    8: 1.55 (100%)  2.65 (171%)  0.59 ( 38%)  0.60 ( 39%)  0.60 ( 39%)
cases:   16: 1.53 (100%)  2.66 (174%)  0.67 ( 44%)  0.66 ( 43%)  0.66 ( 43%)
cases:   32: 1.76 (100%)  2.68 (152%)  0.70 ( 40%)  2.69 (153%)  0.70 ( 39%)
cases:   64: 1.31 (100%)  2.71 (206%)  0.75 ( 57%)  2.69 (205%)  2.66 (202%)
cases:  128: 0.53 (100%)  2.75 (515%)  0.78 (147%)  2.73 (513%)  2.75 (516%)
cases:  256: 0.55 (100%)  2.76 (504%)  0.85 (154%)  2.76 (504%)  2.76 (503%)
cases: 1024: 0.54 (100%)  2.73 (506%)  0.96 (178%)  2.76 (511%)  2.74 (507%)
cases: 2048: 0.54 (100%)  2.74 (507%)  1.23 (228%)  2.73 (505%)  2.71 (501%)
cases: 4096: 0.54 (100%)  2.73 (503%)  1.44 (266%)  2.73 (502%)  2.73 (503%)

The conclusion is the same for me: I'm going to prepare a patch that will
disable JTs for retpolines.

Thank you for testing.
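
For readers who want to see which values the quadratic walk actually feeds to foo(), here is a minimal sketch in C. Only the two-line formula comes from the patch hunk quoted in this comment. The helper names walk_value, hits_below and print_coverage are hypothetical, and the idea that a switch with N cases uses the labels 0..N-1 is an assumption about the benchmark, not something stated in this thread.

```c
#include <assert.h>
#include <stdio.h>

/* Value passed to foo() at iteration i, using the formula from the
   quoted hunk.  walk_value is a hypothetical name for illustration. */
int walk_value(int i)
{
    assert(i >= 0);              /* keep i % 57 non-negative */
    int x = i % 57;
    return (3 * x * x + 17 * x) / 100;
}

/* Number of inputs in one 57-step period whose value is below ncases.
   ASSUMPTION: case labels are 0..ncases-1; the benchmark may differ. */
int hits_below(int ncases)
{
    int hits = 0;
    for (int i = 0; i < 57; i++)
        if (walk_value(i) < ncases)
            hits++;
    return hits;
}

/* Hypothetical helper: print the coverage for a few switch sizes. */
void print_coverage(void)
{
    static const int sizes[] = { 8, 16, 32, 64, 128 };
    for (size_t k = 0; k < sizeof sizes / sizeof sizes[0]; k++)
        printf("cases: %4d: %2d of 57 inputs in range\n",
               sizes[k], hits_below(sizes[k]));
}
```

Calling print_coverage() from a main() shows how much of the walk would stay inside the assumed label range for each switch size. The values span 0 to 103 per period, so under that assumption the smaller switches would see many out-of-range inputs. Whether this explains the different 'normal' column for 8 to 64 cases is not established here.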