https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69270
Bug ID: 69270 Summary: DOM should exploit range information to create more equivalences Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: law at redhat dot com Target Milestone: --- Given a conditional like if (a != 0) TRUE-ARM else FALSE-ARM DOM will exploit the knowledge that A must have the value 0 in the FALSE arm of the conditional. DOM can not be as aggressive in the TRUE arm because A could have any value other than zero. However, if we have range information for A which indicates its range is [0..1], then DOM can exploit the fact that A must have the value 1 in the TRUE arm of the conditional. DOM knows how to do this for booleans where the range is implicit in the type, but isn't exploiting VRP information to do this on larger integral types. Opportunities show up in the adpcm benchmark when compiled with -fsplit-paths as well as within GCC itself. Here's the greatly simplified adpcm code: extern int *stepsizeTable; void adpcm_coder (signed char *outdata, int len) { signed char *outp; int delta; int outputbuffer; int bufferstep = 0; outp = (signed char *) outdata; int step = 0; int index = 0; int diff = 0; for (; len > 0; len--) { delta = 0; if (diff >= step) delta = 4; step = stepsizeTable[index]; if (bufferstep) outputbuffer = (delta << 4) & 0xf0; else *outp++ = (delta & 0x0f) | outputbuffer; bufferstep = !bufferstep; } } Path splitting essentially turns the loop into: for (; len > 0; len--) { delta = 0; if (diff >= step) delta = 4; step = stepsizeTable[index]; if (bufferstep) { outputbuffer = (delta << 4) & 0xf0; bufferstep = !bufferstep; } else { *outp++ = (delta & 0x0f) | outputbuffer; bufferstep = !bufferstep; } } VRP can determine that the bufferstep has the range [0..1] so we can propagate those constants into each arm of the conditional, causing the !bufferstep expression to have a constant value. That's a win. The astute reader will realize that we've got a flip-flop variable and that unrolling the loop 1X would allow dramatic simplifications. In this example there wouldn't be any code bloat, but in the real adpcm code the bloat would likely be significant.