On Thu, 30 Aug 2018, Jakub Jelinek wrote: > Hi! > > The following testcase is miscompiled, because expand_mult_const adds an > incorrect REG_EQUAL note to a temporary pseudo. In the testcase, we > do an unsigned __int128 multiplication of a variable by > 0x7fffffffffffffff, which is determined to be best performed as > shift left by 63 (multiplication by 0x8000000000000000U) followed by > subtraction, i.e. (x << 63) - x. The val_so_far is tracked in an UHWI and > is necessarily always non-negative even because the caller ensures that, > if (is_neg && mode_bitsize > HOST_BITS_PER_WIDE_INT) then it performs > expand_mult_const on the negation and negates afterwards the result. > The problem is that it calls gen_int_mode, where the argument is signed hwi > (well, these days poly_int64). That is fine for DImode multiplication, we > don't really care about bits above the mode, but for TImode multiplication > it is significant. Without this patch we emit in that case: > (expr_list:REG_EQUAL (mult:TI (reg/v:TI 85 [ x ]) > (const_int -9223372036854775808 [0x8000000000000000])) > but that is for TImode actually multiplication by > 0xffffffffffffffff8000000000000000 > rather than > 0x00000000000000008000000000000000, so we need to emit > (const_wide_int 0x08000000000000000) > instead. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for > trunk and after a while for 8.3?
Ugh. I wonder if we should add a rtx gen_int_mode (poly_uint64 c, machine_mode mode) and assert that the topmost bit is not set if GET_MODE_PRECISION (mode) > 64? But I guess passing unsigned HOST_WIDE_INT will make this ambiguous. So maybe a unsigned HOST_WIDE_INT overload instead. Just to catch the possibly very many places where things can go wrong... I also wonder why callers have to decide whether to use a CONST_INT, CONST_WIDE_INT or CONST_DOUBLE and why we cannot have a single interface here, eventually taking a sign in addition to the mode. Code looks a bit ugly below, let's see if Richard can come up with sth nicer. Conceptually it is OK, but I wonder why not simply unconditionally use a CONST_WIDE_INT/CONST_DOUBLE since the value-range of unsigned HOST_WIDE_INT cannot be expressed in a CONST_INT (unless the mode restricts the value-range appropriately). Richard. > 2018-08-30 Jakub Jelinek <ja...@redhat.com> > > PR middle-end/87138 > * expmed.c (expand_mult_const): If val_so_far has MSB set, use > immed_wide_int_const instead of gen_int_mode. Formatting fixes. > > * gcc.target/i386/avx512bw-pr87138.c: New test. > > --- gcc/expmed.c.jj 2018-08-29 14:20:53.427109187 +0200 > +++ gcc/expmed.c 2018-08-29 17:22:52.416860714 +0200 > @@ -3347,19 +3347,27 @@ expand_mult_const (machine_mode mode, rt > /* Write a REG_EQUAL note on the last insn so that we can cse > multiplication sequences. Note that if ACCUM is a SUBREG, > we've set the inner register and must properly indicate that. */ > - tem = op0, nmode = mode; > - accum_inner = accum; > - if (GET_CODE (accum) == SUBREG) > + tem = op0, nmode = mode; > + accum_inner = accum; > + if (GET_CODE (accum) == SUBREG) > { > accum_inner = SUBREG_REG (accum); > nmode = GET_MODE (accum_inner); > tem = gen_lowpart (nmode, op0); > } > > - insn = get_last_insn (); > - set_dst_reg_note (insn, REG_EQUAL, > - gen_rtx_MULT (nmode, tem, > - gen_int_mode (val_so_far, nmode)), > + insn = get_last_insn (); > + rtx c; > + if (val_so_far > (unsigned HOST_WIDE_INT) HOST_WIDE_INT_MAX) > + { > + wide_int wval_so_far > + = wi::uhwi (val_so_far, > + GET_MODE_PRECISION (as_a <scalar_mode> (nmode))); > + c = immed_wide_int_const (wval_so_far, nmode); > + } > + else > + c = gen_int_mode (val_so_far, nmode); > + set_dst_reg_note (insn, REG_EQUAL, gen_rtx_MULT (nmode, tem, c), > accum_inner); > } > } > --- gcc/testsuite/gcc.target/i386/avx512bw-pr87138.c.jj 2018-08-29 > 17:55:08.550154082 +0200 > +++ gcc/testsuite/gcc.target/i386/avx512bw-pr87138.c 2018-08-29 > 17:56:11.112131910 +0200 > @@ -0,0 +1,29 @@ > +/* PR middle-end/87138 */ > +/* { dg-do run { target int128 } } */ > +/* { dg-options "-O -fno-tree-fre -mavx512bw -mtune=k8" } */ > +/* { dg-require-effective-target avx512bw } */ > + > +#include "avx512bw-check.h" > + > +typedef int U __attribute__ ((vector_size (64))); > +typedef __int128 V __attribute__ ((vector_size (64))); > +V g, i; > + > +static inline void > +foo (unsigned h, V j, U k, V n) > +{ > + k /= h; > + __builtin_memmove (&h, &n, 1); > + n[j[1]] *= 0x7FFFFFFFFFFFFFFF; > + j[k[5]] = 0; > + g = n; > + i = h + j + n; > +} > + > +void > +avx512bw_test () > +{ > + foo (~0, (V) { }, (U) { 5 }, (V) { 3 }); > + if (g[0] != (__int128) 3 * 0x7FFFFFFFFFFFFFFF) > + abort (); > +} > > Jakub > > -- Richard Biener <rguent...@suse.de> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)