Re: [PATCH] AArch64: Block combine_and_move from creating FP literal loads

Richard Sandiford Fri, 08 Nov 2024 05:16:35 -0800

Wilco Dijkstra <wilco.dijks...@arm.com> writes:
> Hi Richard,
>
>> It's ok for instructions to require properties that are false during
>> early RTL passes and then transition to true.  But they can't require
>> properties that go from true to false, since that would mean that
>> existing instructions become unrecognisable at certain points during
>> the compilation process.
>
> Only invalid cases are rejected - can_create_pseudo_p is used to reject
> instructions that cannot be split after regalloc. This is basically a small
> extension to that: we always split the aarch64_float_const_rtx_p case before
> regalloc, and thus no such instructions should exist/created during IRA.
>
> So this simply blocks any code that tries to undo the split.


But I think you missed my point.  Suppose we have:

  (define_insn
    [(...)]
    "FOO"
    "..."
  )

In general, as the RTL pipeline progresses, FOO is only allowed to go
from 0 to 1.  It shouldn't go from 1 back to 0.

That's because, once an instruction matches, the instruction should
continue to match.  It should always be possible to set the INSN_CODE of
an existing instruction to -1, rerun recog, and get the same instruction
code back.

Because of that, insn conditions shouldn't depend on can_create_pseudo_p.
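
To illustrate the rule, here is a toy model in C++. It is only a sketch and not GCC code: PassState, CODE_FOO, toy_recog and survives_rerecog are all hypothetical names invented for this example, and the real recog also matches the pattern itself rather than just the condition.

```cpp
#include <cassert>

// Toy stand-in for the compiler's global pass state (assumption: this is
// not the real can_create_pseudo_p, just a flag that behaves like it).
struct PassState {
  bool can_create_pseudo;  // true before register allocation, false after
};

// Hypothetical insn code; -1 means "not recognised", as with INSN_CODE.
constexpr int CODE_FOO = 7;

// A condition that goes from 0 to 1 as the pipeline progresses (allowed).
bool cond_monotonic(const PassState &s) { return !s.can_create_pseudo; }

// A condition that goes from 1 back to 0 (the problematic kind).
bool cond_regressing(const PassState &s) { return s.can_create_pseudo; }

// Toy recogniser: the pattern is assumed to match structurally, so the
// insn condition alone decides the result.
int toy_recog(bool (*cond)(const PassState &), const PassState &s) {
  return cond(s) ? CODE_FOO : -1;
}

// Returns true if an insn recognised in state `before` gets the same code
// back when it is re-recognised in state `after`.
bool survives_rerecog(bool (*cond)(const PassState &),
                      PassState before, PassState after) {
  int code = toy_recog(cond, before);
  if (code < 0)
    return true;  // never matched, so there is nothing to invalidate
  return toy_recog(cond, after) == code;
}
```

With before = PassState{true} and after = PassState{false}, survives_rerecog holds for cond_monotonic but fails for cond_regressing: an instruction that matched early would stop matching later.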

>> Also, why are the conditions tighter for aarch64_float_const_rtx_p
>> (which we can split) but not for the general case (which we can't,
>> and presumably need to force to memory)?  I.e. for what cases do we want
>> the final return to be (sometimes) true?  If it's going to be forced
>> into memory anyway then wouldn't we get better optimisation by exposing
>> that early?
>
> This is the only case that is always split before regalloc. The forced into
> memory case works exactly like it does now for all other FP immediates.
>
>> Would it be possible to handle the split during expand instead?
>> Or do we expect to discover new FP constants during RTL optimisation?
>> If so, where do they come from?
>
> The split is done during the split passes. The issue is that combine_and_move
> undoes this split during IRA and creates new FP constants that then require
> to be split, but they aren't because we don't run split passes during IRA.

Yeah, I realise it's done by the split pass at the moment.  My question was:
why do we need to wait till then?  Why can't we do it in expand instead?
Are there cases where we expect to discover new FP constants during RTL
optimisation that weren't present in gimple?  And if so, which cases are
they?  Where do the constants come from?

Thanks,
Richard
