Re: [PATCH v2] cselib: add function to check if SET is redundant [PR106187]

Jeff Law via Gcc-patches Tue, 02 Aug 2022 16:36:42 -0700



On 8/2/2022 10:06 AM, Richard Earnshaw wrote:

On 01/08/2022 11:38, Richard Earnshaw via Gcc-patches wrote:
On 30/07/2022 20:57, Jeff Law via Gcc-patches wrote:
On 7/29/2022 7:52 AM, Richard Earnshaw via Gcc-patches wrote:
A SET operation that writes memory may have the same value as an earlier store but if the alias sets of the new and earlier store do not conflict then the set is not truly redundant. This can happen, for example, if objects of different types share a stack slot.
To fix this we define a new function in cselib that first checks for
equality and if that is successful then finds the earlier store in the
value history and checks the alias sets.
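
As a rough illustration of the two-step check described above (value equality first, then the alias sets of the earlier store), here is a minimal self-contained sketch. It is a toy model under stated assumptions, not the code from the patch: `Store`, `StoreHistory` and `alias_sets_conflict` are hypothetical names, and the conflict rule is deliberately simplified.

```cpp
#include <cassert>
#include <vector>

// Toy model only. Store, StoreHistory and alias_sets_conflict are
// hypothetical names; they are not GCC's rtx, cselib or alias.cc interfaces.
struct Store {
    int slot;       // which stack slot is written
    long value;     // the value being stored
    int alias_set;  // alias set attached to the memory reference
};

// Assumed conflict rule for this sketch: set 0 conflicts with everything,
// otherwise two sets conflict only when they are equal.
inline bool alias_sets_conflict(int a, int b) {
    return a == 0 || b == 0 || a == b;
}

class StoreHistory {
    std::vector<Store> history_;

    // Most recent recorded store to the slot, or nullptr if there is none.
    const Store *last_store_to(int slot) const {
        for (auto it = history_.rbegin(); it != history_.rend(); ++it)
            if (it->slot == slot)
                return &*it;
        return nullptr;
    }

public:
    void record(const Store &s) { history_.push_back(s); }

    // Step 1: the slot must already hold the same value.
    // Step 2: the earlier store must conflict with the new one; otherwise
    // the new store is kept even though the value is unchanged.
    bool is_redundant(const Store &s) const {
        const Store *prev = last_store_to(s.slot);
        if (prev == nullptr || prev->value != s.value)
            return false;
        return alias_sets_conflict(prev->alias_set, s.alias_set);
    }
};

// Usage: the same value written through a non-conflicting alias set is
// not treated as redundant.
inline void demo() {
    StoreHistory h;
    h.record({0, 42, 1});
    assert(h.is_redundant({0, 42, 1}));
    assert(!h.is_redundant({0, 42, 2}));
}
```

The point of the second step is that removing the later store would leave only a store whose alias set does not conflict with subsequent accesses, which is the situation the description says must be avoided.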

The routine is used in two places elsewhere in the compiler. Firstly
in cfgcleanup and secondly in postreload.

gcc/ChangeLog:
    * alias.h (mems_same_for_tbaa_p): Declare.
    * alias.cc (mems_same_for_tbaa_p): New function.
    * dse.cc (record_store): Use it instead of open-coding
    alias check.
    * cselib.h (cselib_redundant_set_p): Declare.
    * cselib.cc: Include alias.h
    (cselib_redundant_set_p): New function.
    * cfgcleanup.cc: (mark_effect): Use cselib_redundant_set_p instead
    of rtx_equal_for_cselib_p.
    * postreload.c (reload_cse_simplify): Use cselib_redundant_set_p.
    (reload_cse_noop_set_p): Delete.
Seems quite reasonable. The only question I would have would be whether or not you considered including the aliasing info into the hashing used by cselib. You'd probably still need the bulk of this patch as well since we could presumably still get a hash conflict with two stores of the same value to the same location, but with different alias sets (it's just much less likely), so perhaps it doesn't really buy us anything.
I thought about this, but if the alias set were included in the hash, then surely you'd get every alias set in a different value. Then you'd miss the cases where the alias sets do conflict even though they are not the same. Anyway, the values /are/ the same so in some circumstances you might want to know that.
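
The trade-off in this exchange can be sketched with a toy value-numbering table. This is only an illustration under assumptions: `ValueNumbers` and its methods are hypothetical and do not reflect cselib's actual data structures or hashing.

```cpp
#include <cassert>
#include <map>
#include <utility>

// Hypothetical value-numbering tables; not cselib's data structures.
class ValueNumbers {
    std::map<long, int> by_value_;
    std::map<std::pair<long, int>, int> by_value_and_set_;

public:
    // Key on the stored value only: equal values share one number.
    int lookup(long value) {
        int fresh = static_cast<int>(by_value_.size());
        return by_value_.emplace(value, fresh).first->second;
    }

    // Key on (value, alias set): every alias set gets its own number,
    // even for alias sets that would conflict with each other.
    int lookup_with_set(long value, int alias_set) {
        int fresh = static_cast<int>(by_value_and_set_.size());
        return by_value_and_set_
            .emplace(std::make_pair(value, alias_set), fresh)
            .first->second;
    }
};

inline void demo_value_numbers() {
    ValueNumbers vn;
    // Value-only keying still reports that the two stores write the same value.
    assert(vn.lookup(42) == vn.lookup(42));
    // Keying on the alias set as well splits them into distinct numbers.
    assert(vn.lookup_with_set(42, 1) != vn.lookup_with_set(42, 2));
}
```

With the alias set folded into the key, a lookup can no longer tell that two stores wrote the same value through different but conflicting sets, which is why the check is kept as a separate step after the equality test.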
Ideally this would include a testcase. You might be able to turn that non-executable reduced case into something useful by scanning the post-reload dumps.
I considered this as well, but the testcase I have is far too fragile, I think. The existing test only fails on Arm, only fails on 11.2 (not 11.3 or gcc-12 onwards), relies on two objects with the same value being in distinct alias sets but still assigned to the same stack slot and for some copy dance to end up trying to write back the original value to the same slot but with a non-conflicting set. And finally, the scheduler has to then try to move a load past the non-aliasing store.
To get anywhere close to this I think we'd need something akin to the gimple reader but for RTL so that we could set up all the conditions for the failure without the risk of an earlier transform blowing the test away.
I wasn't aware of the rtl reader already in the compiler. But it doesn't really get me any closer as it is lacking in so many regards:
- It can't handle (const_double:SF ...) - it tries to handle the argument as an int. This is a consequence, I think, of the reader being based on that for reading machine descriptions where FP const_double is simply never encountered.
- It doesn't seem to handle anything much more than very basic types, and in particular appears to have no way of ensuring that alias sets match up with the type system.
I even considered whether we could start with a gimple dump and bypass all the tree/gimple transformations, but even that would still be at the mercy of the stack-slot allocation algorithm.
I spent a while trying to get some gimple out of the dumpers in a form that was usable, but that's pretty much a non-starter. To make it work we'd need to add support for gimple clobbers on objects - without that there's no way to get the stack-slot sharing code to work. Furthermore, even feeding fully-optimized gimple directly into expand is such a long way from the postreload pass that I can't believe the testcase would remain stable for long.
And the other major issue is that the original testcase is heavily templated C++ and neither of the parsers (gimple or rtl) is supported in cc1plus: converting the boilerplate to be C-friendly is probably going to be hard.
I can't afford to spend much more time on this, especially given the low-quality test we're going to get out of the end of the process.

Understood. Let's just go with the patch as-is. That's normal for cases where we can't produce a reasonable test.


jeff
