https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88739

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|aarch64_be armb             |aarch64be
           Priority|P3                          |P2
                 CC|                            |ebotcazou at gcc dot gnu.org,
                   |                            |rguenth at gcc dot gnu.org
          Component|middle-end                  |tree-optimization
   Target Milestone|---                         |7.5
            Summary|[7,8,9 Regression ]         |[7/8/9 Regression]
                   |Big-endian union bug        |Big-endian union bug

--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Wilco from comment #3)
> (In reply to Richard Earnshaw from comment #2)
> > >   _23 = BIT_FIELD_REF <_2, 16, 0>;   // WRONG: should be _2, 14, 0
> > 
> > _2 is declared as a 30-bit integer, so perhaps the statement is right, but
> > expand needs to understand that the shift extract of the top 16 bits comes
> > from a different location in big-endian.
> 
> So the question becomes what format is this in?
> 
>   <unnamed-unsigned:30> _2;
> 
> Is it big-endian memory format (so value is in top 30 bits) or simply a
> 30-bit value in a virtual register?

The middle-end (GIMPLE) thinks this is a 30-bit value in a virtual register.
And BIT_FIELD_REF <..., 16, 0> reads the first (counting from LSB) 16 bits.

That is, as far as I understand, endianness is irrelevant for registers
but matters for memory.
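
To make the register-versus-memory distinction concrete, here is a minimal
C sketch. It assumes a big-endian layout in which the 30-bit field occupies
the most significant 30 bits of a 32-bit storage unit, followed by a 2-bit
field. The helper names reg_low16 and be_mem_first16 are hypothetical and
come neither from the testcase nor from GCC.

```c
#include <stdint.h>

/* Register view: the low 16 bits of the 30-bit value, i.e. what
   BIT_FIELD_REF <v, 16, 0> means when counting from the LSB.  */
static uint16_t
reg_low16 (uint32_t v30)
{
  return (uint16_t) (v30 & 0xffffu);
}

/* Memory view (ASSUMPTION: big-endian, 30-bit field in the most
   significant bits of a 32-bit unit, 2-bit field h2 below it).
   Serialize the unit to bytes in big-endian order, then read the
   first two bytes back as a big-endian 16-bit value.  */
static uint16_t
be_mem_first16 (uint32_t v30, uint32_t h2)
{
  uint32_t word = ((v30 & 0x3fffffffu) << 2) | (h2 & 3u);
  unsigned char bytes[4];

  bytes[0] = (unsigned char) ((word >> 24) & 0xffu);
  bytes[1] = (unsigned char) ((word >> 16) & 0xffu);
  bytes[2] = (unsigned char) ((word >> 8) & 0xffu);
  bytes[3] = (unsigned char) (word & 0xffu);

  return (uint16_t) (((unsigned) bytes[0] << 8) | bytes[1]);
}
```

For v30 = 0x2ABCDEF1 the register view yields 0xDEF1, while the simulated
memory read yields 0xAAF3, which is v30 >> 14. Under the stated layout
assumption, the first 16 bits in memory are the top 16 bits of the field,
not the low ones.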

We expand

  _1 = ulAddr_3(D) >> 2;
  _2 = (<unnamed-unsigned:30>) _1;
  _6 = BIT_FIELD_REF <_2, 16, 0>;

to (_6 is unsigned short)

(insn 6 5 7 (set (reg:SI 95)
        (lshiftrt:SI (reg/v:SI 94 [ ulAddr ])
            (const_int 2 [0x2]))) "t.c":42:48 -1
     (nil))

(insn 7 6 8 (set (reg:SI 96)
        (and:SI (reg:SI 95)
            (const_int 1073741823 [0x3fffffff]))) "t.c":42:48 -1
     (nil))

(insn 8 7 9 (set (subreg:DI (reg:HI 97) 0)
        (zero_extract:DI (subreg:DI (reg:SI 96) 0)
            (const_int 16 [0x10])
            (const_int 16 [0x10]))) "t.c":44:8 -1
     (nil))

Now I suppose that for subregs (and their offsets) endianness starts to
matter. And my head of course starts to hurt when we build a paradoxical
DImode subreg of a SImode reg in big-endian.
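
As a rough aid for reading the three insns, here is a C model of their
arithmetic. It is only a sketch under two assumptions: that the
zero_extract position counts from the least significant bit (that is,
BITS_BIG_ENDIAN is 0 for the target), and that the upper half of the
paradoxical DImode subreg can be modelled as zero, which does not affect
the extracted bits. The function name model_insns_6_to_8 is hypothetical.

```c
#include <stdint.h>

/* Hypothetical model of insns 6..8 above; not generated by GCC.  */
static uint16_t
model_insns_6_to_8 (uint32_t ulAddr)
{
  /* insn 6: (lshiftrt:SI ulAddr 2)  */
  uint32_t r95 = ulAddr >> 2;

  /* insn 7: (and:SI r95 0x3fffffff), the truncation to 30 bits  */
  uint32_t r96 = r95 & 0x3fffffffu;

  /* insn 8: (zero_extract:DI (subreg:DI r96) 16 16).
     ASSUMPTION: position 16 is counted from the LSB, and the upper
     32 bits of the paradoxical subreg are treated as zero.  */
  uint64_t wide = (uint64_t) r96;
  return (uint16_t) ((wide >> 16) & 0xffffu);
}
```

If that reading is right, the result for ulAddr = 0xAAF37BC7 is 0x2ABC,
that is bits 16 and up of _2. That is neither its low 16 bits (0xDEF1)
nor its top 16 bits (0xAAF3).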

But going back, what possibly goes wrong is when FRE does

Value numbering stmt = unData.strMemHead.b30AddrL = _2;
No store match
Value numbering store unData.strMemHead.b30AddrL to _2
..
Value numbering stmt = _3 = unData.ausValue[6];
Inserting name _9 for expression BIT_FIELD_REF <_2, 16, 0>
Setting value number of _3 to _9 (changed)

it analyzes the store to unData.strMemHead.b30AddrL as a reference at
a bit-offset with some bit-size, matches that against the load of the
same data from unData.ausValue[6], and translates the load into
a BIT_FIELD_REF:

      base2 = get_ref_base_and_extent (gimple_assign_lhs (def_stmt),
                                       &offset2, &size2, &maxsize2,
                                       &reverse);
      if (!reverse
          && known_size_p (maxsize2)
          && known_eq (maxsize2, size2)
          && operand_equal_p (base, base2, 0)
          && known_subrange_p (offset, maxsize, offset2, size2)
          /* ???  We can't handle bitfield precision extracts without
             either using an alternate type for the BIT_FIELD_REF and
             then doing a conversion or possibly adjusting the offset
             according to endianness.  */
          && (! INTEGRAL_TYPE_P (vr->type)
              || known_eq (ref->size, TYPE_PRECISION (vr->type)))
          && multiple_p (ref->size, BITS_PER_UNIT))
        {
          gimple_match_op op (gimple_match_cond::UNCOND,
                              BIT_FIELD_REF, vr->type,
                              vn_valueize (gimple_assign_rhs1 (def_stmt)),
                              bitsize_int (ref->size),
                              bitsize_int (offset - offset2));

Here def_stmt is unData.strMemHead.b30AddrL = _2, while offset / ref
are from the load.  There's already a comment about endianness, but
it's oddly applied to the ref->size vs. vr->type precision equality.
I think we need adjustments whenever ref->size (from the load) is
not equal to size2 (from the store)?  That is, arbitrary sub-parts,
while contiguous in memory, might not be contiguous in the register?

*head hurts*

for the testcase

(gdb) p ref->size
$1 = {<poly_int_pod<2u, long>> = {coeffs = {16, 0}}, <No data fields>}
(gdb) p size2
$2 = {<poly_int_pod<2u, long>> = {coeffs = {30, 0}}, <No data fields>}

and offset == offset2 == 0.
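
One candidate for the kind of adjustment meant above can be sketched in C.
This is not the actual fix and not GCC code: the helper
reg_bitpos_from_lsb and its plain integer parameters are hypothetical
stand-ins for the poly_int values, and the sketch assumes that on a
big-endian target the stored value's most significant bit sits at the
lowest memory bit-offset.

```c
#include <stdint.h>

/* Hypothetical helper: given the memory bit-offset of the load
   (offset) and of the store (offset2), the load size (ref_size) and
   the store size (size2), return the bit position, counted from the
   LSB of the stored register value, at which the loaded bits start.

   ASSUMPTION: on big-endian the memory bit-offset counts down from
   the MSB of the stored value, so the position is mirrored within
   size2.  On little-endian the memory offset is used as is.  */
static int64_t
reg_bitpos_from_lsb (int64_t offset, int64_t offset2,
                     int64_t ref_size, int64_t size2,
                     int bytes_big_endian)
{
  int64_t off = offset - offset2;

  if (bytes_big_endian)
    return size2 - off - ref_size;
  return off;
}
```

With the testcase values (ref_size 16, size2 30, both offsets 0) this
gives 14 for big-endian and 0 for little-endian. When ref_size equals
size2 the position is 0 either way, which is consistent with only the
partial-read case needing an adjustment.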

Oh, and there's of course a plethora of variants, not to mention
FLOAT_WORDS_BIG_ENDIAN and REG_WORDS_BIG_ENDIAN.

Note the code above was added exactly to elide this kind of memory
operation...

Since the folding has only happened since GCC 7, this is a regression.

Andrew somewhere mentioned that BIT_INSERT_EXPR expansion is also
wrong for BE (it's currently only used for vector element stuff
so that's a latent issue).
