On 1/27/2021 6:47 AM, Alexandre Oliva wrote:
While looking into the possibility of introducing setmemM patterns on
RISC-V to undo the transformation from loops of word writes into
memset, I was disappointed to find out that get_nonzero_bits would
take into account the range of the length passed to memset, but not
the trivially-available observation that this length was a multiple of
the word size. This knowledge, if passed on to setmemM, could enable
setmemM to output more efficient code.
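
For illustration only, a minimal sketch of the kind of source involved; the function names are hypothetical and not taken from the patch. A loop of word-sized stores may be recognized and replaced by a memset call whose length is a product with the word size, so its low bits are known to be zero:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical example: a loop of word-sized stores that the compiler
   may recognize and replace with a call to memset.  */
void clear_words(unsigned long *p, size_t n)
{
    for (size_t i = 0; i < n; i++)
        p[i] = 0;
}

/* Roughly the equivalent memset form.  The length is
   n * sizeof (unsigned long), so it is always a multiple of the
   word size even though n itself is unknown.  */
void clear_words_as_memset(unsigned long *p, size_t n)
{
    memset(p, 0, n * sizeof(unsigned long));
}
```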
In the end, I did not introduce a setmemM pattern, nor the machinery
to pass the ctz of the length on to it along with other useful
information, but I figured this small improvement to nonzero_bits
could still improve code generation elsewhere.
https://gcc.gnu.org/pipermail/gcc-patches/2021-January/564341.html
Regstrapped on x86_64-linux-gnu. No analysis of codegen impact yet.
Does this seem worth pursuing, presumably for stage1?
for gcc/ChangeLog
* tree-ssanames.c (get_nonzero_bits): Zero out low bits of
integral types, when a MULT_EXPR INTEGER_CST operand ensures
the result will be a multiple of a power of two.
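
The idea behind this ChangeLog entry can be sketched in standalone C. This is not the GCC implementation: trailing_zeros and refine_nonzero_bits are hypothetical helpers, and the 64-bit mask stands in for whatever representation get_nonzero_bits actually uses. If a value is computed as x * c for a constant c, then the low ctz(c) bits of the result are zero, so they can be cleared from a conservative mask of possibly-nonzero bits:

```c
#include <stdint.h>

/* Hypothetical helper, not GCC code: count the trailing zero bits
   of a nonzero constant.  */
static unsigned trailing_zeros(uint64_t c)
{
    unsigned n = 0;
    while ((c & 1) == 0) {
        c >>= 1;
        n++;
    }
    return n;
}

/* Hypothetical helper: given a conservative mask of the bits that may
   be nonzero in x * c, clear the low bits that the constant factor c
   guarantees to be zero.  */
static uint64_t refine_nonzero_bits(uint64_t mask, uint64_t c)
{
    if (c == 0)
        return 0;                 /* x * 0 is always 0 */
    unsigned tz = trailing_zeros(c);
    return mask & ~((UINT64_C(1) << tz) - 1);
}
```

For example, with a multiplier of 8 and an initial mask of 0xFF, the refined mask is 0xF8, since the three low bits of any multiple of 8 are zero.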
Your call on whether or not to pursue -- I'm not sure how often this
helps us in practice.
If you want to pursue it, I'd suggest adding some tests to show when and how it's helpful.
jeff