https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67438

--- Comment #11 from Yuri Rumyantsev <ysrumyan at gmail dot com> ---
In fact, the problem is quite different, although it is caused by the
unprofitable pattern match ~X CMP ~Y -> Y CMP X. In general this pattern is
helpful if it lets us delete the NOT operations, e.g.
  x1 = ~x;
  y1 = ~y;
  if (x1 <cmp> y1) ...
when there are no other uses of x1 and y1, i.e. x1 and y1 each have a single
use. If that is not true, we increase register pressure, since x and x1 (and
likewise y and y1) are live at the same time and cannot share a register.
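
As a minimal sketch of the two situations (the function names are made up for
illustration and are not taken from the test case): in the first function the
NOT results feed only the comparison, so folding to y < x removes both NOTs; in
the second they are still needed afterwards, so folding would keep x, y, x1 and
y1 all live at the comparison.

```c
#include <stdbool.h>

/* Single-use case: x1 and y1 feed only the comparison, so rewriting
   ~x < ~y as y < x lets both NOT operations be deleted. */
bool cmp_not_single_use(unsigned x, unsigned y)
{
    unsigned x1 = ~x;
    unsigned y1 = ~y;
    return x1 < y1;               /* same result as y < x */
}

/* Multi-use case: x1 and y1 are also needed after the comparison.
   Comparing y < x instead would not remove the NOTs and would extend
   the live ranges of x and y up to the comparison. */
unsigned cmp_not_multi_use(unsigned x, unsigned y)
{
    unsigned x1 = ~x;
    unsigned y1 = ~y;
    return (x1 < y1) ? x1 : y1;   /* min(~x, ~y) */
}
```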

Richard proposed using the same simplification for min/max operations, but
in the original test case the nested min/max operation (min(x,min(y,z))) or
multi-operand min/max (min(x,y,z)) is not recognized by gcc (note that icc does
perform such a transformation), so this won't help: we have the same register
pressure issue:
    c = ~r; 
    m = ~g;
    y = ~b;
    k = min(c, m, y);
    *out++ = c - k;
    *out++ = m - k;
    *out++ = y - k;
    *out++ = k;
and we can see that the value of 'c' is used both in the min computation and in
the resulting store, so if we use the r <cmp> g comparison we extend the live
ranges of r, g and b, and additional registers are required for them (until the
comparison).
Note also that there is another issue with path splitting (aka tail
duplication), which duplicates the loop back edge and in effect moves the tail
block into the hammock. This transformation does not look useful (at least at
the current stage of design), but that is another topic for discussion.
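
Returning to the register-pressure issue, the kernel above could be written
out in C roughly as follows. This is only a sketch: the function name, the
min3 helper, and the use of unsigned char are assumptions for illustration and
may differ from the actual test case. It shows that c, m and y are each used
twice (in the min and in a store), so the NOTs cannot be removed by comparing
r, g and b directly.

```c
/* Hypothetical helper: three-operand min written as nested two-operand min. */
static unsigned char min3(unsigned char a, unsigned char b, unsigned char c)
{
    unsigned char m = a < b ? a : b;
    return m < c ? m : c;
}

/* Hypothetical per-pixel kernel; writes four bytes and returns the
   advanced output pointer. */
unsigned char *rgb_to_cmyk_pixel(unsigned char r, unsigned char g,
                                 unsigned char b, unsigned char *out)
{
    /* ~ promotes to int, so cast back to get 255 - value. */
    unsigned char c = (unsigned char)~r;
    unsigned char m = (unsigned char)~g;
    unsigned char y = (unsigned char)~b;
    unsigned char k = min3(c, m, y);   /* first use of c, m, y */

    *out++ = (unsigned char)(c - k);   /* second use of c */
    *out++ = (unsigned char)(m - k);   /* second use of m */
    *out++ = (unsigned char)(y - k);   /* second use of y */
    *out++ = k;
    return out;
}
```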

I'd like to propose introducing a new predicate for pattern matching that tells
us how many uses the left-hand side of ~x has.
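
To illustrate the idea only, here is a toy model in C. It is not GCC code: the
toy_value structure and both function names are invented for this sketch, and
the real predicate would operate on GCC's own IR. The point is simply that the
~X CMP ~Y -> Y CMP X rewrite is gated on each NOT result having at most one
use.

```c
#include <stdbool.h>

/* Toy stand-in for an SSA value; not a GCC data structure. */
struct toy_value {
    const char *name;
    int num_uses;      /* number of statements that read this value */
};

/* Hypothetical predicate: true if the value has at most one use. */
static bool toy_single_use(const struct toy_value *v)
{
    return v->num_uses <= 1;
}

/* Hypothetical gate for the rewrite: fold x1 CMP y1 (with x1 = ~x,
   y1 = ~y) into y CMP x only when both NOT results would become dead. */
bool toy_should_fold_not_compare(const struct toy_value *x1,
                                 const struct toy_value *y1)
{
    return toy_single_use(x1) && toy_single_use(y1);
}
```

With this gate, the x1/y1 example with a lone comparison would be folded, while
c, m and y in the kernel above (two uses each) would be left alone.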
