Re: [RFC][RISC-V] Add target dependent pass to optimize related permutation constants

Jeff Law Fri, 15 Nov 2024 06:06:19 -0800



On 11/15/24 12:17 AM, Richard Biener wrote:

On Thu, Nov 14, 2024 at 10:41 PM Jeff Law <j...@ventanamicro.com> wrote:



Several weeks ago I was looking at SATD and realized that we had loads
of permutation constants that could be implemented as a trivial
adjustment to a prior loaded permutation constant.

For example, if we had loaded a permutation constant like 1, 3, 1, 3, 5,
7, 5, 7 and we needed 0, 2, 0, 2, 4, 6, 4, 6, we could use vadd.vi to
adjust the already loaded constant into the new constant we wanted.

This has a lot of similarities to SLSR and reload_cse_mov2add, but with
the added complexity that we're dealing with vectors and whether or not
we can profitably derive one constant from the other is a target
dependent decision.

So this is implemented as a mini pass after CSE2.  If other targets are
interested, we can certainly look to make this more generic.  I'm sure
we could use a hook to help with the profitability question.

The implementation works by normalizing permutation constants so that
the first element is 0 and we hash based on the normalized form.  So in
the case above, the normalized form is 0, 2, 0, 2, 4, 6, 4, 6, that's
what gets entered into the hash table for 1, 3, 1, 3, 5, 7, 5, 7,
allowing us to realize that all the elements differ by the same value
when we later encounter 0, 2, 0, 2, 4, 6, 4, 6.
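
The normalize-and-hash idea described above can be sketched roughly as
follows. This is only an illustration and not the code from the patch:
the names normalize, RelatedPermTable and lookup_or_record are
hypothetical, and the profitability test (does the delta fit the 5-bit
signed immediate of vadd.vi) is an assumption about what a target might
check.

```python
from typing import Dict, Optional, Sequence, Tuple

# vadd.vi takes a 5-bit signed immediate, so only small deltas qualify.
# Using this as the sole profitability test is an assumption of the sketch.
SIMM5_MIN, SIMM5_MAX = -16, 15


def normalize(perm: Sequence[int]) -> Tuple[Tuple[int, ...], int]:
    """Return (normalized form, base) so that the first element becomes 0."""
    base = perm[0]
    return tuple(e - base for e in perm), base


class RelatedPermTable:
    """Hypothetical table keyed on the normalized permutation constant."""

    def __init__(self) -> None:
        # normalized form -> (register holding the constant, its base)
        self._seen: Dict[Tuple[int, ...], Tuple[str, int]] = {}

    def lookup_or_record(
        self, reg: str, perm: Sequence[int]
    ) -> Optional[Tuple[str, int]]:
        """Return (prior register, delta) if perm can be derived from an
        earlier constant with one add-immediate; otherwise return None.
        A constant with no earlier match is recorded for later lookups."""
        key, base = normalize(perm)
        prior = self._seen.get(key)
        if prior is None:
            self._seen[key] = (reg, base)
            return None
        prior_reg, prior_base = prior
        delta = base - prior_base
        if SIMM5_MIN <= delta <= SIMM5_MAX:
            return prior_reg, delta
        return None


table = RelatedPermTable()
table.lookup_or_record("v1", [1, 3, 1, 3, 5, 7, 5, 7])          # recorded
match = table.lookup_or_record("v2", [0, 2, 0, 2, 4, 6, 4, 6])  # derived
```

Here match names the earlier register and a delta of -1, which would
correspond to something like "vadd.vi v2, v1, -1" instead of a second
load from the constant pool.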


Note this in principle applies to all (non-vector) constants.  The issues are
  a) can the target directly generate the constant
  b) requiring an earlier constant and some adjustment might increase
      register lifetime and thus cause spilling in the end
  c) a load from L1 might be better than the dependence on the earlier
      generated constant

I think it makes sense to consider this as part of LRA rematerialization
support?

Yea, it definitely applies to other constants, which is why I mentioned
the related values stuff from CSE, move2add and SLSR.

(a) is probably the least "interesting" problem.  An appropriate hook
would give the target a chance to answer that question.

(b) is common to most CSE based transformations.  We've typically driven
CSE's decisions based on localized cost modeling, which we could
certainly do here.  If we moved the REG_EQUAL/REG_EQUIV note that would
likely be a good thing for IRA/LRA.  I'm not sure if they're currently
rematerializing constants, but it'd at least provide a critical tidbit
of information.

(c) is tough as well since it can be fairly dependent on the precise
instruction mix.  I'm pretty sure issues in this space are why the patch
actually caused performance to go backwards a couple months ago.  With
some design adjustments and a sensible scheduler model it's likely a
small win now in general on our design.  But I could well see it
behaving differently on other designs.


But yes, this could well be thought of as a remat problem in two ways.

First we could continue down the path of trying to optimize the related
value in a CSE-like manner, but provide enough infrastructure for
IRA/LRA to rematerialize the constant from the constant pool when
register pressure is high.  That would probably fit into the current
IRA/LRA model.

Or we could extend remat to more generally work on trying to materialize
constant pool accesses using existing values in the IL.

Or we could punt it to post-reload CSE since this is just a vector
version of move2add.  I'm not a fan of the move2add code, so I
discounted this approach.  But it would largely address (b) above.


Jeff
