On Thu, 1 May 2025, Filip Kastl wrote: > Hi, > > while developing GCC 15, we found out that the bit-test switch lowering > algorithm is quadratic and that it is possible to construct a switch that > causes huge compile time (PR117091). > > Andi Kleen came up with a fast bit-test lowering algorithm which was > 1) linear > 2) in some cases better than the original algorithm > 3) in some cases worse than the original algorithm > > For GCC 15 we decided to use the original algorithm when the given switch is > small and the fast algorithm when the switch is too big. > > In this patch set I combine both algorithms. The new algorithm gives strictly > better solutions than Andi's and the original one and runs in linear time. > > More details below and in commit messages. Bootstrapped and regtested on > x86_64 linux. > > Ok for trunk? > Filip Kastl > > > Better solutions > ---------------- > > When I say "better solutions", I mean the number of clusters. Cluster is a > concept from switch lowering pass. It is basically a group of cases that can > be handled by constantly many operations. > > On every switch the new algorithm should output clustering of the cases such > that the number of clusters is the smallest possible. This should be > guaranteed by how the algorithm works. > > To be extra sure, I tested this by compiling GCC itself. I bootstrapped GCC > once for each of the three algorithms (the original one, Andi's and the new > one) and I counted the number of clusters for each function (I didn't find a > way to easily compare individual switches). Indeed the new algorithm either > gave a function with fewer clusters or a function with more switches > (if-to-switch conversion converts if-else chains into switches based on if > switch lowering can lower the resulting switch -- I think that this is also a > win). > > > A note about --param switch-lower-slow-alg-max-cases > ---------------------------------------------------- > > The param switch-lower-slow-alg-max-cases specified how big a switch had to be > so that GCC would switch to the fast algorithm. After this patch set it will > serve no purpose. I kept it though. There is the jump-table switch lowering > algorithm. That algorithm is still quadratic. Nobody ran into any real > issues > with that yet, but it would be nice to come up with a fast algorithm for > jump-table lowering and set GCC to fall back to it when a switch is too big > (pr118353). Then the param would be relevant again. I plan to look into that > in this Stage 1. > > > Testing > ------- > > - I compiled the testcase from PR117091 to confirm that the new algorithm > spends reasonable time on it. > > > gcc.sh -O2 -ftime-report big-switch.c > Time variable wall GGC > ipa SRA : 12.58 ( 5%) 536 ( 0%) > dominator optimization : 55.13 ( 22%) 4687k ( 1%) > tree FRE : 71.05 ( 29%) 744 ( 0%) > tree switch lowering : 27.41 ( 11%) 13M ( 3%) > tree modref : 13.24 ( 5%) 480 ( 0%) > rest of compilation : 23.09 ( 9%) 2816 ( 0%) > TOTAL : 248.27 413M > (some entries omitted) > > The original bug report claimed that this testcase ran over 30 minutes. > That's > not the case here. Switch lowering doesn't even dominate the compilation > time. > > - I also ran a few SPEC CPU 2017 benchmarks (x86_64 linux). > > with this patch set: > > O2 generic > 500.perlbench_r NR 1 237 > 6.72 * > 502.gcc_r NR 1 216 > 6.57 * > > Ofast native > 500.perlbench_r NR 1 251 > 6.33 * > 502.gcc_r NR 1 210 > 6.73 * > > without this patch set: > > O2 generic > 500.perlbench_r NR 1 242 > 6.57 * > 502.gcc_r NR 1 216 > 6.56 * > > Ofast native > 500.perlbench_r NR 1 253 > 6.29 * > 502.gcc_r NR 1 210 > 6.73 * > > The fourth row is the exec time in seconds. There are only small differences > (2% max) and they favor the new algorithm. They could just be noise, btw. > > I've selected the perl and gcc benchmarks because I expect that those contain > many switches. > > - I bootstrapped and regtested this on x86_64 linux.
This looks OK. Thanks for the work, Richard. > > Filip Kastl (4): > gimple: Merge slow and fast bit-test switch lowering [PR117091] > gimple: Make bit-test switch lowering more powerful > gimple: Don't warn about using different algs for big switch lowering > [PR117091] > gimple: Switch bit-test lowering testcases for the more powerful alg > > gcc/testsuite/gcc.dg/tree-ssa/switch-5.c | 60 ++++++ > gcc/testsuite/gcc.dg/tree-ssa/switch-6.c | 51 +++++ > gcc/tree-switch-conversion.cc | 245 +++++++---------------- > gcc/tree-switch-conversion.h | 10 - > 4 files changed, 184 insertions(+), 182 deletions(-) > create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/switch-5.c > create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/switch-6.c > > -- Richard Biener <rguent...@suse.de> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)