On 23.02.2017 06:59, Jeff Law wrote:
On 02/22/2017 02:40 PM, Jakub Jelinek wrote:
Hi!

If both arguments of integer division or modulo are known to be non-negative
in the corresponding signed type, then signed as well as unsigned
division/modulo have exactly the same result, and therefore we can choose
between those two depending on which one is faster (or shorter for -Os),
which varies a lot depending on the target and, especially for constant
divisors, on the exact divisor.  expand_divmod itself is too complicated and
we don't even have the ability to ask about costs e.g. for highpart
multiplication without actually expanding it, so in that case this patch
just tries both sequences, computes their costs and uses the cheaper one
(and for equal cost honors the actual original signedness of the operation).
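
To illustrate the equivalence the patch relies on, here is a minimal C sketch for 16-bit operands. The helper name divmod_agree is hypothetical and not part of the patch; it only checks that signed and unsigned division and modulo produce the same bit pattern for a given pair of operands.

```c
#include <stdint.h>

/* Hypothetical helper, for illustration only.
   Returns 1 if signed and unsigned 16-bit division and modulo give the
   same bit pattern for a and b, 0 otherwise.  b must be non-zero. */
int divmod_agree(int16_t a, int16_t b)
{
    uint16_t ua = (uint16_t)a, ub = (uint16_t)b;

    /* Signed variant: truncating division on the signed values. */
    int16_t sq = (int16_t)(a / b);
    int16_t sr = (int16_t)(a % b);

    /* Unsigned variant: same bits, reinterpreted as unsigned. */
    uint16_t uq = (uint16_t)(ua / ub);
    uint16_t ur = (uint16_t)(ua % ub);

    return (uint16_t)sq == uq && (uint16_t)sr == ur;
}
```

For non-negative operands such as 1000 and 255 both variants agree, while for -1000 and 255 they differ, which is why the range check on both arguments is needed before choosing the cheaper sequence.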

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2017-02-22  Jakub Jelinek  <ja...@redhat.com>

    PR middle-end/79665
    * internal-fn.c (get_range_pos_neg): Moved to ...
    * tree.c (get_range_pos_neg): ... here.  No longer static.
    * tree.h (get_range_pos_neg): New prototype.
    * expr.c (expand_expr_real_2) <case TRUNC_DIV_EXPR>: If both arguments
    are known to be in between 0 and signed maximum inclusive, try to
    expand both unsigned and signed divmod and use the cheaper one from
    those.
OK.
jeff

Hi, this causes a performance degradation for avr.

When optimizing for speed with a known denominator, v6 uses the
s/umulMM3_highpart insn to avoid division because no div instruction is
available.

unsigned scale256 (unsigned val)
{
    return val / 255;
}
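
For readers unfamiliar with the highpart trick, the following is a rough C sketch of how a 16-bit unsigned division by 255 can be done with a multiplication and a shift. The function names and the constant/shift pair are my own illustration, not necessarily the exact sequence GCC emits for avr.

```c
#include <stdint.h>

/* High 16 bits of the 32-bit product of two 16-bit values; this models
   what an unsigned highpart multiplication computes. */
static uint16_t umul16_highpart(uint16_t a, uint16_t b)
{
    return (uint16_t)(((uint32_t)a * b) >> 16);
}

/* val / 255 for a 16-bit unsigned val without a divide.
   0x8081 == ceil(2^23 / 255), so val / 255 == (val * 0x8081) >> 23,
   i.e. the high part of the product shifted right by another 7 bits. */
uint16_t div255(uint16_t val)
{
    return (uint16_t)(umul16_highpart(val, 0x8081) >> 7);
}
```

The rounding error of the constant is small enough that the result matches val / 255 for every 16-bit input, which can be confirmed by looping over all 65536 values.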

With this patch, v7 now uses __divmodhi4, which is very expensive, but
its cost is not computed because rtlanal.c:seq_cost assumes a cost of
ONE:

  for (; seq; seq = NEXT_INSN (seq))
    {
      set = single_set (seq);
      if (set)
        cost += set_rtx_cost (set, speed);
      else
        cost++;
    }

because divmod is not a single_set:
(gdb) p seq
$10 = (const rtx_insn *) 0x7ffff730d500
(gdb) pr
warning: Expression is not an assignment (and might have no effect)
(insn 14 13 0 (parallel [
            (set (reg:HI 52)
                (div:HI (reg:HI 47)
                    (reg:HI 54)))
            (set (reg:HI 53)
                (mod:HI (reg:HI 47)
                    (reg:HI 54)))
            (clobber (reg:QI 21 r21))
            (clobber (reg:HI 22 r22))
            (clobber (reg:HI 24 r24))
            (clobber (reg:HI 26 r26))
        ]) "scale.c":7 -1
     (nil))
(gdb)

Hence the divmod appears to be much less expensive than the unsigned
variant, for which the costs of mult_highpart were actually computed.
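
The effect can be reproduced with a toy model of the loop quoted above. This is only a sketch: struct toy_insn and toy_seq_cost are hypothetical stand-ins for rtx_insn and seq_cost, and any cost numbers fed into it are made up.

```c
#include <stddef.h>

/* Hypothetical stand-in for an insn.  is_single_set models single_set ()
   returning non-NULL; set_cost models the result of set_rtx_cost (). */
struct toy_insn {
    int is_single_set;
    int set_cost;
    const struct toy_insn *next;
};

/* Same shape as the seq_cost loop: insns that are not a single_set
   contribute exactly 1, whatever they really cost. */
int toy_seq_cost(const struct toy_insn *seq)
{
    int cost = 0;
    for (; seq; seq = seq->next) {
        if (seq->is_single_set)
            cost += seq->set_cost;
        else
            cost++;   /* a PARALLEL with two SETs (divmod) lands here */
    }
    return cost;
}
```

With made-up numbers, a lone divmod PARALLEL is counted as 1 even if its real cost were 100, whereas a sequence of three single-set insns costing 4 each sums to 12, so the comparison picks the divmod.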


Johann




