> I was hoping that this patch might help me fix pr121393 (on which I'm still 
> stuck), if I just disabled the unsigned gather/scatter patterns, but no such 
> luck: it just zero-extends the supposed-to-be-negative offset and suffers 
> exactly the same bug.
>
> On the plus side, whereas previously if I disabled unsigned offsets I
> got a total failure to vectorize, with your patch it now vectorizes
> just fine, albeit with the bug.

Once things have settled down here, I can also have a look at the PR, but
so far I haven't had much time.

> Assuming your patch is approved, should I continue to accept signed 
> offsets in the backend? The hardware instruction that accepts 32-bit 
> offsets only accepts unsigned offsets, so I'm supporting signed offsets 
> by calculating the absolute address explicitly. Of course, the backend 
> has no way to know whether the range of possible offsets is actually safe to
> treat as unsigned, so is your way likely to produce better code?

We also accept "everything" in the riscv backend and convert the offsets
to unsigned.  Similar to yours, our instructions can only do unsigned offsets.
I wouldn't expect this patch to improve on what a backend can do.  For us the
motivation is that costing, as well as choosing the offset type, will be easier.

I'm not doing any range checks, though.  The idea is that for signed -> 
unsigned we need a new pointer-sized offset and things will just work via
two's complement.  The range truncation for strided gather/scatter is not 
touched.
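
To illustrate (a minimal standalone sketch, not from the patch, assuming a
64-bit target): once the signed offset has been widened to pointer size,
adding it as an unsigned value reaches the same address through modular
wraparound, whereas zero-extending the original 32-bit value does not:

#include <stdint.h>
#include <stdio.h>

int
main (void)
{
  double buf[16];
  double *base = &buf[8];
  int32_t off = -3;                          /* supposed-to-be-negative offset */

  /* Sign-extend to pointer width, then reinterpret as unsigned.  */
  uint64_t uoff = (uint64_t) (int64_t) off;  /* 0xfffffffffffffffd */

  /* The unsigned add wraps modulo 2^64 and lands on base - 3 elements.
     Zero-extending the 32-bit value instead (0xfffffffd) would end up far
     beyond base, which is exactly the wrong-address failure in the PR.  */
  double *p = (double *) ((uintptr_t) base + uoff * sizeof (double));

  printf ("%d\n", p == base + off);          /* prints 1 */
  return 0;
}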

> The backend also accepts 64-bit offsets, but there's no hardware 
> instruction for that, so both variants are converted to absolute 
> addresses explicitly. This variant only exists because I found that the 
> SPEC HPC "lbm" benchmark fails to vectorize at all without it. It would
> be better if the middle end didn't choose 64-bit offsets unless absolutely
> necessary, but vectorizing is better than not vectorizing.

Hmm, we try to use smaller offset types first, generally.
If a 64-bit offset is used, that's likely due to a double array being used just
like in SPECfp 2017's lbm?  How do you treat 64-bit offsets when the hardware
can only handle 32 bits?  Maybe we could use ranger range information
in vect_truncate_gather_scatter_offset?
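
As a rough sketch of that idea (hypothetical helper, not the actual ranger
interface nor the existing vect_truncate_gather_scatter_offset logic): if the
proven value range of the 64-bit offset already fits into 32 bits, truncating
it cannot change any address, so the 32-bit hardware instruction would still
be usable.

#include <stdint.h>
#include <stdbool.h>

/* Hypothetical check, not GCC code: given a proven value range [min, max]
   of a 64-bit offset, truncation to an unsigned 32-bit offset is safe only
   if every possible value already fits into the narrower type.  */
static bool
offset_range_fits_u32 (int64_t min, int64_t max)
{
  return min >= 0 && max <= (int64_t) UINT32_MAX;
}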

-- 
Regards
 Robin
