https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402
--- Comment #63 from rguenther at suse dot de <rguenther at suse dot de> --- On Tue, 28 Mar 2023, amonakov at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402 > > Alexander Monakov <amonakov at gcc dot gnu.org> changed: > > What |Removed |Added > ---------------------------------------------------------------------------- > CC| |amonakov at gcc dot gnu.org > > --- Comment #61 from Alexander Monakov <amonakov at gcc dot gnu.org> --- > (In reply to Richard Biener from comment #60) > > > This one is btw. a known issue PR108129. > > > > But the revision only sligthly changes the patterns so I'm very curious > > how it arrived at 30% slowdown. > > It adds an extra 'convert2?' to 'nop_atomic_bit_test_and_p' matchers, and > since > match.pd expansion works by emitting match subtrees twice for each '?' > component, that gives an extra 2x factor to the already bad combinatorial > explosion going on in those patterns. > > We really need to rework match-and-simplify emission in a smarter way. I've > looked at that in January once, but there's a few things I'd need help > understanding, such as... > > > The "trivial" improvement of course would be to special-case > > iterator uses als for (match ...) like we do for (simplify ...) where > > we can delay substitution. > > ... this. Is there a short explanation what's 'delayed substitution' in this > context? 'delayed substitution' works for (simplify (...)) by not expanding the substitution for each (for ..) iterator but instead passing it as variable to a split out common function. For (match (...)) the "substitution" part is trivial so there's no point doing that. But instead we can look to apply something similar to the "matching" part. When we have (for X (A B ...) (simplify (op (X (op2 ...) ...)) ... we get for the matching of 'X' (if it's not at the toplevel) switch (...) { case A: { .. match the rest .. } case B: { .. match the rest .. } ... but we can instead emit (maybe only in a subset of cases?) switch (...) { case A: case B: case ...: { .. mach the rest .. } in theory we support things like (for X (plus IFN_POW) (... as both operators are binary - so that's cases we cannot handle this way. Basically we'd keep the user-defined operator in the AST and adjust code-generation to deal with that. I will try to do that.