On 17/08/18 19:28 +0200, Marc Glisse wrote:
On Mon, 13 Aug 2018, Jonathan Wakely wrote:
Thanks to Lars for the suggestions.
* libsupc++/new_opa.cc (operator new(size_t, align_val_t)): Use
__ispow2 to check for valid alignment. Avoid branching when rounding
size to multiple of alignment.
Tested x86_64-linux, committed to trunk.
Are you getting better code with __ispow2 on many platforms? As far
as I can tell from a quick look at the patch (I didn't actually test
it, I could be completely off), this replaces (x&(x-1))==0 with
popcount(x)==1. On a basic x86_64, popcount calls into libgcc, which
doesn't seem so good. On a more recent x86_64 (BMI1), x&(x-1) is a
single instruction that sets a flag when the result is 0, that's hard
to beat.
Then shouldn't we do that in __ispow2?
Even better would be a peephole optimisation to turn
__builtin_popcount(x)==1 into that.
Or was the goal to accept an alignment of 0, and not an optimization?
Accepting alignment of 0 isn't the goal :-)
std::ispow2 should be the best way to check if an unsigned integer is
a power of two, so I wanted to use that instead of manual bit
twiddling.
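
For illustration only (this is not code from the patch), here is a
minimal sketch of the two checks being compared. The function names
are hypothetical, and __builtin_popcountll is assumed to be available
(GCC or Clang). The point worth noticing is that the two forms
disagree for zero, which is what the question about accepting an
alignment of 0 is getting at.

```cpp
#include <cstddef>

// Hypothetical helper: the classic bit trick. Clearing the lowest set
// bit leaves zero when at most one bit is set, so it is also true for 0.
constexpr bool pow2_bittrick(std::size_t x) {
  return (x & (x - 1)) == 0;
}

// Hypothetical helper: the population-count form. Exactly one bit must
// be set, so 0 is rejected. Uses a GCC/Clang builtin.
constexpr bool pow2_popcount(std::size_t x) {
  return __builtin_popcountll(x) == 1;
}

// Hypothetical helper: the bit trick with an explicit zero check, which
// agrees with the popcount form for every input.
constexpr bool pow2_nonzero(std::size_t x) {
  return x != 0 && (x & (x - 1)) == 0;
}

static_assert(pow2_bittrick(64) && pow2_popcount(64) && pow2_nonzero(64), "");
static_assert(!pow2_bittrick(48) && !pow2_popcount(48) && !pow2_nonzero(48), "");
// The forms differ only at zero.
static_assert(pow2_bittrick(0) && !pow2_popcount(0) && !pow2_nonzero(0), "");
```

Which form generates better code depends on the target and on the
flags in use, as discussed above, so this sketch says nothing about
performance.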
I hope that check will go away soon, if the compiler starts checking
for valid alignments at the call site. (That won't catch all misuses,
as there could be calls with non-constants, but we can't make it
completely foolproof, some people just deserve to get UB!)
+ sz = (sz + align - 1) & ~(align - 1);
Note that gcc immediately replaces ~(align - 1) with -align. It does
it even if you compute align-1 on the previous line and write
(sz+X)&~X in the hope of sharing the subtraction.
The goal there was to replace the branch for the 'if' and just do the
adjustment unconditionally.
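
As a rough sketch of the rounding discussed above (again, not the
code from the patch): the first function shows one plausible shape
of a branching version, which is an assumption about what the 'if'
looked like, and the other two show the unconditional adjustment
written with ~(align - 1) and with -align. All function names are
hypothetical, and align is assumed to be a nonzero power of two.

```cpp
#include <cstddef>

// Assumed shape of a branching version: adjust sz only when it is not
// already a multiple of align.
constexpr std::size_t round_up_branch(std::size_t sz, std::size_t align) {
  if (std::size_t rem = sz & (align - 1))
    sz += align - rem;
  return sz;
}

// Unconditional version, as in the quoted line of the patch.
constexpr std::size_t round_up_mask(std::size_t sz, std::size_t align) {
  return (sz + align - 1) & ~(align - 1);
}

// Same mask written as -align. For unsigned arithmetic -x == ~x + 1,
// hence ~(align - 1) == -align.
constexpr std::size_t round_up_neg(std::size_t sz, std::size_t align) {
  return (sz + align - 1) & -align;
}

static_assert(round_up_branch(13, 8) == 16, "");
static_assert(round_up_mask(13, 8) == 16, "");
static_assert(round_up_neg(13, 8) == 16, "");
// Sizes that are already multiples are left unchanged.
static_assert(round_up_mask(16, 8) == 16 && round_up_mask(0, 16) == 0, "");
```

Note that sz + align - 1 can wrap around for sizes close to the
maximum value of size_t. The sketch does not handle that case.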