On Fri, 1 Apr 2016, Jakub Jelinek wrote:
> On Fri, Apr 01, 2016 at 11:08:09AM +0200, Richard Biener wrote:
> >
> > RTL DSE uses true_dependence to see whether a store may be killed by
> > anothe store - that's obviously broken. The following patch makes
> > it use output_dependence instead (introducing a canon_ variant of that).
>
> I think it would be interesting to see some stats on what effect does this
> have on the optimization RTL DSE is doing (say gather during
> unpatched bootstrap/regtest number of successfully optimized replace_read
> calls, and the same with patched bootstrap/regtest).
> From quick look at the patch, this wouldn't optimize even the cases that
> could be optimized (return *pi;) at the RTL level. If the statistics would
> show this affects it significantly, perhaps we could do both
> canon_true_dependence and canon_output_dependence, and if the two calls
> differ, don't clear the rhs, but mark it somehow and then in replace_read
> check what alias set is used for the read or something similar?
Well, I don't believe it is DSEs job to do CSE. And I don't see how
we can efficiently do what you suggest - it seems DSE doesn't check
all possible aliases when CSEing.
But of course I didn't try to understand how it works and thus happily
defer to somebody else coming up with a better fix. Maybe the correct
fix is to
while (i_ptr)
{
bool remove = false;
store_info *store_info = i_ptr->store_rec;
/* Skip the clobbers. */
while (!store_info->is_set)
store_info = store_info->next;
...
/* If this read is just reading back something that we just
stored, rewrite the read. */
else
{
if (store_info->rhs
&& offset >= store_info->begin
&& offset + width <= store_info->end
&& all_positions_needed_p (store_info,
offset -
store_info->begin,
width)
&& replace_read (store_info, i_ptr, read_info,
insn_info, loc,
bb_info->regs_live))
return;
/* The bases are the same, just see if the offsets
overlap. */
if ((offset < store_info->end)
&& (offset + width > store_info->begin))
remove = true;
which lacks sth like a canon_true_dependence check so that if
replace_read fails it doesn't try again with other stores it finds
(hopefully the store_info->next list is "properly" ordered - you'd
want to walk backwards starting from the reads - I don't think this
is what the code above does).
So it really seems to be that the code I fixed (quoted again below)
is supposed to ensure the validity of the transform as the comment
suggests.
else if (s_info->rhs)
/* Need to see if it is possible for this store to overwrite
the value of store_info. If it is, set the rhs to NULL to
keep it from being used to remove a load. */
{
if (canon_output_dependence (s_info->mem, true,
mem, GET_MODE (mem),
mem_addr))
{
s_info->rhs = NULL;
s_info->const_rhs = NULL;
}
As said, I don't believe in a DSE algorithm doing CSE, so ...
Richard.