On Wed, Aug 03, 2016 at 12:02:54AM +0930, Alan Modra wrote: > This is a hack I tried to avoid having to poke at lra code for > pr71680.. > > The idea of adding force_reg here was that it removes subregs from > fix_trunc, emitting the subreg mode conversion in a separate set insn. > That avoids the underlying lra issue, by virtue of combine merging the > SF subreg with the SI mem load, at least for -m64. > > For -m32 combine rejects the combination due to the mem address being > a lo_sum which is a mode dependent address. Of course even for -m64, > combine probably shouldn't allow this combination, and wouldn't if the > rs6000 rtx_costs function said that SLOW_UNALIGNED_ACCESS mems > actually cost more. > > So this patch isn't a particularly good solution to pr71680, but > a) force_reg for an operand that must be a reg is 100% safe, and > b) it's good to expose more combine opportunities. > > Bootstrapped and regression tested powerpc64le-linux and > powerpc64-linux. > > * config/rs6000/rs6000.md (fix_trunc<mode>si2): Force source operand > to a reg. Localize vars. > > diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md > index 5afae92..45ad661 100644 > --- a/gcc/config/rs6000/rs6000.md > +++ b/gcc/config/rs6000/rs6000.md > @@ -5357,15 +5357,15 @@ > { > if (!<E500_CONVERT>) > { > - rtx tmp, stack; > + rtx src = force_reg (SFmode, operands[1]); > > if (TARGET_STFIWX) > - emit_insn (gen_fix_trunc<mode>si2_stfiwx (operands[0], operands[1])); > + emit_insn (gen_fix_trunc<mode>si2_stfiwx (operands[0], src)); > else > { > - tmp = gen_reg_rtx (DImode); > - stack = rs6000_allocate_stack_temp (DImode, true, false); > - emit_insn (gen_fix_trunc<mode>si2_internal (operands[0], operands[1], > + rtx tmp = gen_reg_rtx (DImode); > + rtx stack = rs6000_allocate_stack_temp (DImode, true, false); > + emit_insn (gen_fix_trunc<mode>si2_internal (operands[0], src, > tmp, stack)); > } > DONE;
Ummm, this patch looks wrong. Because the insn uses the SFDF iterator, the mode of operands[1] could be either SFmode or DFmode. I think it should be: - rtx tmp, stack; + rtx src = force_reg (<MODE>mode, operands[1]); -- Michael Meissner, IBM IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797