On Mon, 4 Nov 2013, Richard Biener wrote:
> On Fri, 1 Nov 2013, Richard Sandiford wrote:
> 
> > I'm building one target for each supported CPU and comparing the wide-int
> > assembly output of gcc.c-torture, gcc.dg and g++.dg with the corresponding
> > output from the merge point.  This patch removes all the differences I saw
> > for alpha-linux-gnu in gcc.c-torture.
> > 
> > Hunk 1: Preserve the current trunk behaviour in which the shift count
> > is truncated if SHIFT_COUNT_TRUNCATED and not otherwise.  This was by
> > inspection after hunk 5.
> > 
> > Hunks 2 and 3: Two cases where force_fit_type could extend to wider
> > types and where we weren't extending according to the sign of the source.
> > We should probably assert that the input is at least as wide as the type...
> > 
> > Hunk 4: The "&" in:
> > 
> >       if ((TREE_INT_CST_HIGH (arg1) & mask_hi) == mask_hi
> >           && (TREE_INT_CST_LOW (arg1) & mask_lo) == mask_lo)
> > 
> > had got dropped during the conversion.
> > 
> > Hunk 5: The old code was:
> > 
> >       if (shift > 0)
> >         {
> >           *mask = r1mask.llshift (shift, TYPE_PRECISION (type));
> >           *val = r1val.llshift (shift, TYPE_PRECISION (type));
> >         }
> >       else if (shift < 0)
> >         {
> >           shift = -shift;
> >           *mask = r1mask.rshift (shift, TYPE_PRECISION (type), !uns);
> >           *val = r1val.rshift (shift, TYPE_PRECISION (type), !uns);
> >         }
> > 
> > and these precision arguments had two purposes: to control which
> > bits of the first argument were shifted, and to control the truncation
> > mask for SHIFT_COUNT_TRUNCATED.  We need to pass a width to the shift
> > functions for the second.
> > 
> > (BTW, I'm running the comparisons with CONST_WIDE_INT locally moved to the
> > end of the !GENERATOR_FILE list in rtl.def, since the current position caused
> > some spurious differences.  The "problem" AFAICT is that hash_rtx hashes
> > on code, RTL PRE creates registers in the hash order of the associated
> > expressions, RA uses register numbers as a tie-breaker during ordering,
> > and so the order of rtx_code can indirectly influence register allocation.
> > First time I'd realised that could happen, so just thought I'd mention it.
> > I think we should keep rtl.def in the current (logical) order though.)
> > 
> > Tested on x86_64-linux-gnu and powerpc64-linux-gnu.  OK for wide-int?
> 
> Bah - can you instead try removing the use of SHIFT_COUNT_TRUNCATED
> from double-int.c on the trunk?  If not then putting this at the
> callers of wi::rshift and friends is clearly asking for future
> mistakes.
> 
> Oh, and I think that honoring SHIFT_COUNT_TRUNCATED anywhere else
> than in machine instruction patterns was a mistake in the past.
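To make the convention in those hunks concrete - a non-zero width passed
to wi::lshift / wi::rshift truncates the shift count modulo that width,
a zero width leaves the count alone - here is a minimal standalone sketch.
It is plain C++, not GCC or wide-int code, and that reading of the width
argument is only my interpretation of the patch quoted below:

/* Minimal sketch of a shift whose count may or may not be truncated,
   mirroring the SHIFT_COUNT_TRUNCATED ? bitsize : 0 arguments below.  */
#include <stdint.h>
#include <stdio.h>

static uint32_t
shift_left (uint32_t x, unsigned int count, unsigned int width)
{
  if (width != 0)
    count %= width;   /* truncating target: count reduced modulo the width */
  if (count >= 32)
    return 0;         /* non-truncating: an over-wide count shifts everything out */
  return x << count;
}

int
main ()
{
  /* Shifting 1 by 33 in a 32-bit type: a truncating target behaves like
     a shift by 1, a non-truncating fold produces 0.  */
  printf ("%u\n", (unsigned) shift_left (1, 33, 32));   /* prints 2 */
  printf ("%u\n", (unsigned) shift_left (1, 33, 0));    /* prints 0 */
  return 0;
}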
Oh, and my understanding is that honoring it is maybe required for
correctness on the RTL side once middle-end code removes masking
operations - later constant propagation must not change the result
because of that.  I'm not sure this is the way it happens, but if we have

  (lshift:si (reg:si 1) (and:qi (reg:qi 2) 0x1f))

and we'd transform it to (based on SHIFT_COUNT_TRUNCATED)

  (lshift:si (reg:si 1) (reg:qi 2))

and later constant propagate 77 to reg:qi 2, then the folded shift no
longer computes what the truncating hardware shift would (see the sketch
at the end of this mail).  That isn't the way SHIFT_COUNT_TRUNCATED should
work, of course; instead the backends should have a define_insn which
matches the 'and', and combine should try to match that.  You can see that
various architectures may have truncating shifts but not for all patterns,
which results in stuff like

config/aarch64/aarch64.h:#define SHIFT_COUNT_TRUNCATED !TARGET_SIMD

which clearly means that it's not implemented in terms of providing shift
insns which can match a shift operand that is masked.

That is, either try to clean that up (hooray!), or preserve the
double-int.[ch] behavior (how does CONST_INT RTL constant folding honor
it?).

Richard.

> Thanks,
> Richard.
> 
> [btw, please use diff -p to get us at least some context if you
> are not writing changelogs ...]
> 
> > Thanks,
> > Richard
> > 
> > 
> > Index: gcc/fold-const.c
> > ===================================================================
> > --- gcc/fold-const.c    (revision 204247)
> > +++ gcc/fold-const.c    (working copy)
> > @@ -1008,9 +1008,12 @@
> >             The following code ignores overflow; perhaps a C standard
> >             interpretation ruling is needed.  */
> >          res = wi::rshift (arg1, arg2, sign,
> > -                          GET_MODE_BITSIZE (TYPE_MODE (type)));
> > +                          SHIFT_COUNT_TRUNCATED
> > +                          ? GET_MODE_BITSIZE (TYPE_MODE (type)) : 0);
> >        else
> > -        res = wi::lshift (arg1, arg2, GET_MODE_BITSIZE (TYPE_MODE (type)));
> > +        res = wi::lshift (arg1, arg2,
> > +                          SHIFT_COUNT_TRUNCATED
> > +                          ? GET_MODE_BITSIZE (TYPE_MODE (type)) : 0);
> >        break;
> > 
> >      case RROTATE_EXPR:
> > @@ -6870,7 +6873,8 @@
> >      return NULL_TREE;
> > 
> >    if (TREE_CODE (arg1) == INTEGER_CST)
> > -    arg1 = force_fit_type (inner_type, arg1, 0, TREE_OVERFLOW (arg1));
> > +    arg1 = force_fit_type (inner_type, wi::to_widest (arg1), 0,
> > +                           TREE_OVERFLOW (arg1));
> >    else
> >      arg1 = fold_convert_loc (loc, inner_type, arg1);
> > 
> > @@ -8081,7 +8085,8 @@
> >      }
> >    if (change)
> >      {
> > -      tem = force_fit_type (type, and1, 0, TREE_OVERFLOW (and1));
> > +      tem = force_fit_type (type, wi::to_widest (and1), 0,
> > +                            TREE_OVERFLOW (and1));
> >        return fold_build2_loc (loc, BIT_AND_EXPR, type,
> >                                fold_convert_loc (loc, type, and0), tem);
> >      }
> > @@ -14098,12 +14103,13 @@
> >          (inner_width, outer_width - inner_width, false,
> >           TYPE_PRECISION (TREE_TYPE (arg1)));
> > 
> > -      if (mask == arg1)
> > +      wide_int common = mask & arg1;
> > +      if (common == mask)
> >          {
> >            tem_type = signed_type_for (TREE_TYPE (tem));
> >            tem = fold_convert_loc (loc, tem_type, tem);
> >          }
> > -      else if ((mask & arg1) == 0)
> > +      else if (common == 0)
> >          {
> >            tem_type = unsigned_type_for (TREE_TYPE (tem));
> >            tem = fold_convert_loc (loc, tem_type, tem);
> > Index: gcc/tree-ssa-ccp.c
> > ===================================================================
> > --- gcc/tree-ssa-ccp.c    (revision 204247)
> > +++ gcc/tree-ssa-ccp.c    (working copy)
> > @@ -1238,15 +1238,20 @@
> >            else
> >              code = RSHIFT_EXPR;
> >          }
> > +      int shift_precision = SHIFT_COUNT_TRUNCATED ? width : 0;
> >        if (code == RSHIFT_EXPR)
> >          {
> > -          *mask = wi::rshift (wi::ext (r1mask, width, sgn), shift, sgn);
> > -          *val = wi::rshift (wi::ext (r1val, width, sgn), shift, sgn);
> > +          *mask = wi::rshift (wi::ext (r1mask, width, sgn),
> > +                              shift, sgn, shift_precision);
> > +          *val = wi::rshift (wi::ext (r1val, width, sgn),
> > +                             shift, sgn, shift_precision);
> >          }
> >        else
> >          {
> > -          *mask = wi::sext (wi::lshift (r1mask, shift), width);
> > -          *val = wi::sext (wi::lshift (r1val, shift), width);
> > +          *mask = wi::ext (wi::lshift (r1mask, shift, shift_precision),
> > +                           width, sgn);
> > +          *val = wi::ext (wi::lshift (r1val, shift, shift_precision),
> > +                          width, sgn);
> >          }
> >      }
> >  }
> > 
> > 

-- 
Richard Biener <rguent...@suse.de>
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imend
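The constant-propagation hazard above as a minimal standalone sketch.
Again this is plain C++, not GCC code; it assumes a 32-bit target whose
shift instruction only looks at the low 5 bits of the count, which is the
situation SHIFT_COUNT_TRUNCATED describes:

/* What the truncating hardware shift computes vs. what a naive constant
   fold computes once the explicit (and count 0x1f) has been removed.  */
#include <stdint.h>
#include <stdio.h>

static uint32_t
hw_shift (uint32_t x, unsigned int count)
{
  return x << (count & 0x1f);   /* hardware truncates the count to 5 bits */
}

static uint32_t
naive_fold (uint32_t x, unsigned int count)
{
  return count >= 32 ? 0 : x << count;   /* fold that ignores truncation */
}

int
main ()
{
  /* Propagate the constant 77 into the count, as in the example above:
     the hardware shifts by 77 & 0x1f == 13, the naive fold by 77.  */
  printf ("hw:    %u\n", (unsigned) hw_shift (1, 77));    /* 8192 */
  printf ("naive: %u\n", (unsigned) naive_fold (1, 77));  /* 0 */
  return 0;
}

So a fold that runs after the masking operation has been removed must keep
honoring the truncation, otherwise propagating a constant count changes
the result.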