https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88739
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|aarch64_be armb             |aarch64be
           Priority|P3                          |P2
                 CC|                            |ebotcazou at gcc dot gnu.org,
                   |                            |rguenth at gcc dot gnu.org
          Component|middle-end                  |tree-optimization
   Target Milestone|---                         |7.5
            Summary|[7,8,9 Regression ]         |[7/8/9 Regression]
                   |Big-endian union bug        |Big-endian union bug

--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Wilco from comment #3)
> (In reply to Richard Earnshaw from comment #2)
> > >   _23 = BIT_FIELD_REF <_2, 16, 0>;   // WRONG: should be _2, 14, 0
> >
> > _2 is declared as a 30-bit integer, so perhaps the statement is right, but
> > expand needs to understand that the shift extract of the top 16 bits comes
> > from a different location in big-endian.
>
> So the question becomes what format is this in?
>
> <unnamed-unsigned:30> _2;
>
> Is it big-endian memory format (so value is in top 30 bits) or simply a
> 30-bit value in a virtual register?

The middle-end (GIMPLE) thinks this is a 30-bit value in a virtual register.
And BIT_FIELD_REF <..., 16, 0> reads the first (counting from LSB) 16 bits.
That is, as far as I understand, "endianness" is irrelevant for registers but
matters for memory.

We expand

  _1 = ulAddr_3(D) >> 2;
  _2 = (<unnamed-unsigned:30>) _1;
  _6 = BIT_FIELD_REF <_2, 16, 0>;

to (_6 is unsigned short)

(insn 6 5 7 (set (reg:SI 95)
        (lshiftrt:SI (reg/v:SI 94 [ ulAddr ])
            (const_int 2 [0x2]))) "t.c":42:48 -1
     (nil))
(insn 7 6 8 (set (reg:SI 96)
        (and:SI (reg:SI 95)
            (const_int 1073741823 [0x3fffffff]))) "t.c":42:48 -1
     (nil))
(insn 8 7 9 (set (subreg:DI (reg:HI 97) 0)
        (zero_extract:DI (subreg:DI (reg:SI 96) 0)
            (const_int 16 [0x10])
            (const_int 16 [0x10]))) "t.c":44:8 -1
     (nil))

Now I suppose for subregs (and their offsets) endianness starts to matter. Now my
head of course starts to hurt when we build a paradoxical DImode subreg of
a SImode reg in big endian.

But going back, what possibly goes wrong is when FRE does

Value numbering stmt = unData.strMemHead.b30AddrL = _2;
No store match
Value numbering store unData.strMemHead.b30AddrL to _2
..
Value numbering stmt = _3 = unData.ausValue[6];
Inserting name _9 for expression BIT_FIELD_REF <_2, 16, 0>
Setting value number of _3 to _9 (changed)

it analyzes unData.strMemHead.b30AddrL to be a reference at a bit-offset
with some bit-size, matching that up with the same data from
unData.ausValue[6] and translating that to a BIT_FIELD_REF:

      base2 = get_ref_base_and_extent (gimple_assign_lhs (def_stmt),
                                       &offset2, &size2, &maxsize2,
                                       &reverse);
      if (!reverse
          && known_size_p (maxsize2)
          && known_eq (maxsize2, size2)
          && operand_equal_p (base, base2, 0)
          && known_subrange_p (offset, maxsize, offset2, size2)
          /* ??? We can't handle bitfield precision extracts without
             either using an alternate type for the BIT_FIELD_REF and
             then doing a conversion or possibly adjusting the offset
             according to endianness.  */
          && (! INTEGRAL_TYPE_P (vr->type)
              || known_eq (ref->size, TYPE_PRECISION (vr->type)))
          && multiple_p (ref->size, BITS_PER_UNIT))
        {
          gimple_match_op op (gimple_match_cond::UNCOND,
                              BIT_FIELD_REF, vr->type,
                              vn_valueize (gimple_assign_rhs1 (def_stmt)),
                              bitsize_int (ref->size),
                              bitsize_int (offset - offset2));

Here def_stmt is unData.strMemHead.b30AddrL = _2 while offset / ref are from
the load. There's already a comment about endianness but it's oddly applied
to ref->size vs. vr->type precision equality. I think we need adjustments
whenever ref->size (from the load) is not equal to size2 (from the store)?
That is, arbitrary sub-parts, while contiguous in memory, might not be
contiguous in the register? *head hurts*

For the testcase

(gdb) p ref->size
$1 = {<poly_int_pod<2u, long>> = {coeffs = {16, 0}}, <No data fields>}
(gdb) p size2
$2 = {<poly_int_pod<2u, long>> = {coeffs = {30, 0}}, <No data fields>}

and offset == offset2 == 0.

Oh, and there's of course a plethora of variants, not to mention FLOAT_WORDS_BIG_ENDIAN and REG_WORDS_BIG_ENDIAN. Note the code above was exactly added to elide this kind of memory operation... Since the folding happens only since GCC 7 this is a regression. Andrew somewhere mentioned that BIT_INSERT_EXPR expansion is also wrong for BE (it's currently only used for vector element stuff so that's a latent issue).
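
To make the register-versus-memory mismatch above concrete, here is a small host-independent sketch in C. It is not GCC code and not the fix for this PR. It assumes the usual ABI convention that a big-endian target allocates bit-fields starting at the most significant bit of the storage unit, while a little-endian target starts at the least significant bit. All helper names (store30, load16, reg_bitpos, extract16) are hypothetical and exist only for this illustration. The sizes mirror the testcase: a 30-bit store, a 16-bit load, both at memory bit offset 0.

```c
#include <assert.h>
#include <stdint.h>

/* Model storing a 30-bit value as the first bit-field of a 32-bit unit.
   Assumption: big-endian layouts place the field in the most significant
   30 bits of the unit, little-endian layouts in the least significant.  */
static void store30(uint8_t mem[4], uint32_t val, int big_endian)
{
    val &= 0x3fffffffu;
    if (big_endian) {
        uint32_t word = val << 2;          /* field sits in bits 31..2 */
        mem[0] = (uint8_t)(word >> 24);
        mem[1] = (uint8_t)(word >> 16);
        mem[2] = (uint8_t)(word >> 8);
        mem[3] = (uint8_t)word;
    } else {
        mem[0] = (uint8_t)val;             /* field sits in bits 29..0 */
        mem[1] = (uint8_t)(val >> 8);
        mem[2] = (uint8_t)(val >> 16);
        mem[3] = (uint8_t)(val >> 24);
    }
}

/* Model a 16-bit load from memory byte offset 0 in the given byte order.  */
static uint16_t load16(const uint8_t mem[4], int big_endian)
{
    return big_endian ? (uint16_t)((mem[0] << 8) | mem[1])
                      : (uint16_t)(mem[0] | (mem[1] << 8));
}

/* Map a sub-range of a store (memory bit offset OFF relative to the store,
   SIZE bits, store is SIZE2 bits) to a bit position counted from the LSB
   of the stored register value.  */
static unsigned reg_bitpos(unsigned size2, unsigned off, unsigned size,
                           int big_endian)
{
    assert(off + size <= size2);
    return big_endian ? size2 - off - size : off;
}

/* Extract 16 bits starting at LSB-relative position POS.  */
static uint16_t extract16(uint32_t val, unsigned pos)
{
    return (uint16_t)(val >> pos);
}
```

Under these assumptions, for the value 0x12345678 the big-endian load sees the top 16 bits of the 30-bit value (LSB-relative position 30 - 0 - 16 = 14), whereas the little-endian load sees the low 16 bits (position 0). Using offset - offset2 = 0 unchanged as an LSB-relative position therefore only matches the little-endian case.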