On 8/10/23 03:28, Manolis Tsamis wrote:
Hi Jeff,
Thanks a lot for providing all this information and testcase! I have
been able to reproduce the issue with it.
I have investigated the cause of the issue and it's not what you
mention; the uses of all intermediate calculations are properly taken
into account.
In this case it would be fine to alter the runtime value of insn 58
because its uses are also memory accesses that can have a folded
offset. The real issue is that the offset for these two (insn 60/61)
is not updated.
[ ... ]
This instruction doesn't match any of these, since the `if` for the
CONST_INT case only accepts arg1 being a single REG (that could be
extended, but that's not the point now), and as a result we do
`return 0;`
But `return 0;` at this point loses the offset 1 that was previously
calculated from arg1 and stored in `offset`. And that's our bug :)
Changing that return 0 to return offset (i.e. return what we have up
to now) fixes this testcase with the insn being folded and all offsets
updating properly. But it got me thinking that this is a more general
issue that I need to address differently.
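
A toy reduction of the failure described above may help. It is only a
sketch and not the pass's actual code: `Expr`, `make`, `eval`, `fold`
and `address_preserved` are hypothetical names, and a small expression
tree stands in for the chain of defining insns. `fold` strips the
constants it understands out of the expression and returns their sum.
When the outer PLUS has a constant second operand but a first operand
that is not a single register, the `keep_partial == false` path mimics
the `return 0;` and drops what the recursion has already stripped.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <utility>

// Toy stand-in for an address expression (hypothetical, not GCC's rtx).
struct Expr {
  enum Kind { Const, Reg, Plus } kind = Const;
  std::int64_t value = 0;          // used by Const only
  std::shared_ptr<Expr> lhs, rhs;  // used by Plus only
};
using ExprPtr = std::shared_ptr<Expr>;

ExprPtr make(Expr::Kind k, std::int64_t v = 0,
             ExprPtr l = nullptr, ExprPtr r = nullptr) {
  auto e = std::make_shared<Expr>();
  e->kind = k;
  e->value = v;
  e->lhs = std::move(l);
  e->rhs = std::move(r);
  return e;
}

// Value of the expression when the (single) register holds `reg`.
std::int64_t eval(const Expr &e, std::int64_t reg) {
  switch (e.kind) {
  case Expr::Const: return e.value;
  case Expr::Reg:   return reg;
  case Expr::Plus:
    assert(e.lhs && e.rhs);
    return eval(*e.lhs, reg) + eval(*e.rhs, reg);
  }
  return 0;
}

// Strip foldable constants out of `e` (side effect) and return their sum.
// keep_partial == false mimics the early `return 0;`.
std::int64_t fold(Expr &e, bool keep_partial) {
  if (e.kind != Expr::Plus)
    return 0;  // a bare register or constant: nothing to strip here
  assert(e.lhs && e.rhs);
  std::int64_t offset = fold(*e.lhs, keep_partial);  // may already strip
  if (e.rhs->kind == Expr::Const) {
    if (e.lhs->kind != Expr::Reg)
      return keep_partial ? offset : 0;  // unsupported shape: punt
    offset += e.rhs->value;
    e.rhs->value = 0;  // the constant now lives in the returned offset
    return offset;
  }
  return offset + fold(*e.rhs, keep_partial);
}

// Does "rewritten expression + returned offset" still equal the
// original address for the sample ((reg + 1) + 4)?
bool address_preserved(bool keep_partial) {
  ExprPtr e = make(Expr::Plus, 0,
                   make(Expr::Plus, 0, make(Expr::Reg), make(Expr::Const, 1)),
                   make(Expr::Const, 4));
  const std::int64_t reg = 100;
  const std::int64_t before = eval(*e, reg);
  const std::int64_t offset = fold(*e, keep_partial);
  return eval(*e, reg) + offset == before;
}
```

In this model `address_preserved(false)` is false: the inner constant 1
has been stripped, the rewritten expression evaluates to reg + 4, and
the returned offset is 0, against an original reg + 5.
`address_preserved(true)` holds, which corresponds to returning what
has been accumulated so far.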
There are more `return 0;` cases in the code which say "We know how to
handle this rtx code, but don't know how to propagate", but returning
0 is not enough.
It's also not correct to propagate an offset from just one argument
and punt on the other, because the argument we punt on might contain
references to the other argument.
Funny, I'd looked at that as well (return 0 signaling two different
things), but from the standpoint of the analysis phase it didn't matter
as we don't use the returned value. So I set it aside.
So the general solution that solves all the issues is: If we don't
fully understand how to handle an instruction and its arguments in
fold_offsets then we need to mark it in one of the bitsets (either set
in cannot_fold or don't set in can_fold), whereas currently an insn
whose uses are all transitively foldable is itself considered foldable.
I'm still struggling a bit with using the transitive set as a global
like we do. I haven't come up with a case where it fails, but every
time I look at it I wonder if it's going to go awry at some point.
Basically we'd be looking for a case where we have two MEMs which share
some bit of address calculation, where one of the MEMs is foldable, but
the other is not for some reason.
If we adjust the address calculations and the foldable MEM, then do we
run the risk of needing to change the non-foldable MEM?
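
To make the scenario concrete, here is a toy model of it. This is a
sketch under assumptions, not the pass's data structures: `Insn`,
`all_uses_foldable` and `shared_address` are hypothetical names, the
def-use graph is assumed to be acyclic, and the check is recomputed per
query instead of being cached in global bitsets. Insn 0 is the shared
address calculation and insns 1 and 2 are the two MEMs that use it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified view of an insn in a def-use graph.
struct Insn {
  bool is_mem = false;          // this insn is a memory access
  bool accepts_offset = false;  // for MEMs: address can take a folded offset
  bool understood = true;       // false: the walk does not know this pattern
  std::vector<int> uses;        // indices of insns using this insn's result
};

// A definition may only be rewritten if every use, transitively, ends in
// a MEM that can absorb the offset. Assumes the graph has no cycles.
bool all_uses_foldable(const std::vector<Insn> &insns, int id) {
  assert(id >= 0 && static_cast<std::size_t>(id) < insns.size());
  const Insn &insn = insns[static_cast<std::size_t>(id)];
  if (insn.is_mem)
    return insn.accepts_offset;
  if (!insn.understood)
    return false;
  for (int use : insn.uses)
    if (!all_uses_foldable(insns, use))
      return false;
  return true;
}

// Insn 0 computes a shared address; insns 1 and 2 are MEMs using it.
std::vector<Insn> shared_address(bool second_mem_accepts) {
  std::vector<Insn> insns(3);
  insns[0].uses = {1, 2};
  insns[1].is_mem = true;
  insns[1].accepts_offset = true;
  insns[2].is_mem = true;
  insns[2].accepts_offset = second_mem_accepts;
  return insns;
}
```

In this model `all_uses_foldable(shared_address(false), 0)` is false,
so the shared calculation is left alone and the non-foldable MEM never
needs to change. Whether the global transitive set always gives the
same guarantee in the real pass is exactly the open question.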
Jeff