Sorry for the slow reply!
Jeff Law <[email protected]> writes:
> We don't really have a good place for discussions anymore other than
> gcc-patches, but this really isn't a patch.. Oh well.
>
>
> I'm debugging a failure of ext-dce on a big endian target (m68k) and I
> can't help but think it's actually exposed a latent bug in subreg
> handling. Your input would be appreciated.
>
>
> So we have this insn:
>
>> (insn 14 13 15 2 (set (reg:DI 55 [ _6 ])
>> (zero_extend:DI (subreg:SI (reg:DI 35 [ _5 ]) 0))) "j.c":21:27 85
>> {*zero_extendsidi2}
>> (expr_list:REG_DEAD (reg:DI 35 [ _5 ])
>> (nil)))
> So bits 32..63 from (reg:DI 35) and shove them into bits 0..31 of
> (reg:DI 55) with zero extension.
>
> ext-dce has correctly determined that we only need bits 0..31 of (reg:DI
> 55). It it conceptually wants to transform that into:
>
> (set (reg:DI 55) (subreg:DI (subreg:SI (reg:DI 35) 0) 0))
>
> Of course we don't want nested subregs, so we actually use
> simplify_gen_subreg.
>
> It's simplifying that to...
>
> (set (reg:DI 55) (reg:DI 35))
>
> Which seems wrong for the SET_SRC expression.
>
> This code looks suspicious to me (simplify-rtx.cc):
>
>
>
>> /* Changing mode twice with SUBREG => just change it once,
>> or not at all if changing back op starting mode. */
>> if (GET_CODE (op) == SUBREG)
>> {
>> machine_mode innermostmode = GET_MODE (SUBREG_REG (op));
>> poly_uint64 innermostsize = GET_MODE_SIZE (innermostmode);
>> rtx newx;
>>
>> /* Make sure that the relationship between the two subregs is
>> known at compile time. */
>> if (!ordered_p (outersize, innermostsize))
>> return NULL_RTX;
>>
>> if (outermode == innermostmode
>> && known_eq (byte, 0U)
>> && known_eq (SUBREG_BYTE (op), 0))
>> return SUBREG_REG (op);
>
>
> That test just seems wrong on a big endian target. If fact if I take
> out that early exit and let the rest of the code run we get into this:
>
>> /* Work out the memory offset of the final OUTERMODE value relative
>> to the inner value of OP. */
>> poly_int64 mem_offset = subreg_memory_offset (outermode,
>> innermode, byte);
>> poly_int64 op_mem_offset = subreg_memory_offset (op);
>> poly_int64 final_offset = mem_offset + op_mem_offset;
>>
>> /* See whether resulting subreg will be paradoxical. */
>> if (!paradoxical_subreg_p (outermode, innermostmode))
>> {
>> /* Bail out in case resulting subreg would be incorrect. */
>> if (maybe_lt (final_offset, 0)
>> || maybe_ge (poly_uint64 (final_offset), innermostsize)
>> || !multiple_p (final_offset, outersize))
>> return NULL_RTX;
>> }
>
> OUTERMODE and INNERMOSTMODE are both DImode, so it's not paradoxical at
> this point. But FINAL_OFFSET is -4, which triggers rejecting the
> simplification.
>
> Am I totally offbase here? Or should we just not have called
> gen_simplify_subreg in the way we did?
No, I agree it looks like a bug. Seems like:
if (outermode == innermostmode
&& known_eq (byte, 0U)
&& known_eq (SUBREG_BYTE (op), 0))
return SUBREG_REG (op);
should be testing for subreg_lowpart_offset in both cases.
Richard