On Wed, Aug 03, 2016 at 12:02:54AM +0930, Alan Modra wrote:
> This is a hack I tried to avoid having to poke at lra code for
> pr71680..
>
> The idea of adding force_reg here was that it removes subregs from
> fix_trunc, emitting the subreg mode conversion in a separate set insn.
> That avoids the underlying lra issue, by virtue of combine merging the
> SF subreg with the SI mem load, at least for -m64.
>
> For -m32 combine rejects the combination due to the mem address being
> a lo_sum which is a mode dependent address. Of course even for -m64,
> combine probably shouldn't allow this combination, and wouldn't if the
> rs6000 rtx_costs function said that SLOW_UNALIGNED_ACCESS mems
> actually cost more.
>
> So this patch isn't a particularly good solution to pr71680, but
> a) force_reg for an operand that must be a reg is 100% safe, and
> b) it's good to expose more combine opportunities.
>
> Bootstrapped and regression tested powerpc64le-linux and
> powerpc64-linux.
>
> * config/rs6000/rs6000.md (fix_trunc<mode>si2): Force source operand
> to a reg. Localize vars.
>
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index 5afae92..45ad661 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -5357,15 +5357,15 @@
> {
> if (!<E500_CONVERT>)
> {
> - rtx tmp, stack;
> + rtx src = force_reg (SFmode, operands[1]);
>
> if (TARGET_STFIWX)
> - emit_insn (gen_fix_trunc<mode>si2_stfiwx (operands[0], operands[1]));
> + emit_insn (gen_fix_trunc<mode>si2_stfiwx (operands[0], src));
> else
> {
> - tmp = gen_reg_rtx (DImode);
> - stack = rs6000_allocate_stack_temp (DImode, true, false);
> - emit_insn (gen_fix_trunc<mode>si2_internal (operands[0], operands[1],
> + rtx tmp = gen_reg_rtx (DImode);
> + rtx stack = rs6000_allocate_stack_temp (DImode, true, false);
> + emit_insn (gen_fix_trunc<mode>si2_internal (operands[0], src,
> tmp, stack));
> }
> DONE;
Ummm, this patch looks wrong. Because the insn uses the SFDF iterator, the
mode of operands[1] could be either SFmode or DFmode. I think it should be:
- rtx tmp, stack;
+ rtx src = force_reg (<MODE>mode, operands[1]);
--
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: [email protected], phone: +1 (978) 899-4797