On Mon, 13 Aug 2018, Jonathan Wakely wrote:

> Thanks to Lars for the suggestions.
>
>         * libsupc++/new_opa.cc (operator new(size_t, align_val_t)): Use
>         __is_pow2 to check for valid alignment. Avoid branching when rounding
>         size to multiple of alignment.
>
> Tested x86_64-linux, committed to trunk.

Are you getting better code with __is_pow2 on many platforms? As far as I can tell from a quick look at the patch (I didn't actually test it, so I could be completely off), this replaces (x&(x-1))==0 with popcount(x)==1. On a basic x86_64, popcount calls into libgcc, which doesn't seem so good. On a more recent x86_64 (BMI1), x&(x-1) is a single instruction that sets a flag when the result is 0, and that's hard to beat.

Or was the goal to accept an alignment of 0, and not an optimization?
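
To make the comparison concrete, here is a minimal sketch of the two tests side by side. The helper names are hypothetical (they are not the libstdc++ ones), and __builtin_popcountll is assumed to be available, as it is with GCC and Clang. As far as I can tell the two only disagree at x == 0.

```cpp
#include <cstddef>

// Hypothetical helper names, for illustration only.

// Bit-trick test: clear the lowest set bit and check that nothing is left.
// Note that this is also true for x == 0.
constexpr bool pow2_bittrick(std::size_t x) {
    return (x & (x - 1)) == 0;
}

// Population-count test: exactly one bit set, so x == 0 is rejected.
// Assumes the GCC/Clang builtin __builtin_popcountll.
constexpr bool pow2_popcount(std::size_t x) {
    return __builtin_popcountll(x) == 1;
}

// The two agree on nonzero inputs...
static_assert(pow2_bittrick(64) && pow2_popcount(64), "");
static_assert(!pow2_bittrick(48) && !pow2_popcount(48), "");
// ...and differ only at zero.
static_assert(pow2_bittrick(0) && !pow2_popcount(0), "");
```

Comparing the output of g++ -O2 -S with and without -mbmi / -mpopcnt on these two functions should show whether the libgcc call actually appears on a given target.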


> +  sz = (sz + align - 1) & ~(align - 1);

Note that gcc immediately replaces ~(align - 1) with -align. It does it even if you compute align-1 on the previous line and write (sz+X)&~X in the hope of sharing the subtraction.
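
For reference, the identity behind that rewrite is that for unsigned x, ~(x - 1) == -x modulo 2^N. A small sketch of the two spellings of the rounding, with hypothetical function names and assuming align is a nonzero power of two:

```cpp
#include <cstddef>

// Hypothetical names, for illustration only.
// Both assume align is a nonzero power of two.

// Spelling used in the patch: mask off the low bits with ~(align - 1).
constexpr std::size_t round_up_not(std::size_t sz, std::size_t align) {
    return (sz + align - 1) & ~(align - 1);
}

// Equivalent spelling: unsigned negation wraps modulo 2^N,
// so -align has the same bit pattern as ~(align - 1).
constexpr std::size_t round_up_neg(std::size_t sz, std::size_t align) {
    return (sz + align - 1) & -align;
}

static_assert(round_up_not(13, 8) == 16, "13 rounds up to 16");
static_assert(round_up_neg(13, 8) == 16, "same result via -align");
static_assert(round_up_not(16, 8) == 16, "already aligned sizes are unchanged");
static_assert(round_up_not(0, 64) == 0, "zero stays zero");
```

Since the mask is computed as -align either way, writing align - 1 once in the source does not by itself mean the subtraction is shared in the generated code.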

--
Marc Glisse
