Hi,

+(simplify
+  (convert
+    (rshift
+      (mult

> is the outer convert really necessary?  That is, if we change
> the simplification result to

Indeed that should be "convert?" to make it optional.

> Is the Hamming weight popcount
> faster than the libgcc table-based approach?  I wonder if we really
> need to restrict this conversion to the case where the target
> has an expander.

Well, libgcc uses the exact same sequence (not a table):

objdump -d ./aarch64-unknown-linux-gnu/libgcc/_popcountsi2.o

0000000000000000 <__popcountdi2>:
   0:   d341fc01        lsr     x1, x0, #1
   4:   b200c3e3        mov     x3, #0x101010101010101          // #72340172838076673
   8:   9200f021        and     x1, x1, #0x5555555555555555
   c:   cb010001        sub     x1, x0, x1
  10:   9200e422        and     x2, x1, #0x3333333333333333
  14:   d342fc21        lsr     x1, x1, #2
  18:   9200e421        and     x1, x1, #0x3333333333333333
  1c:   8b010041        add     x1, x2, x1
  20:   8b411021        add     x1, x1, x1, lsr #4
  24:   9200cc20        and     x0, x1, #0xf0f0f0f0f0f0f0f
  28:   9b037c00        mul     x0, x0, x3
  2c:   d378fc00        lsr     x0, x0, #56
  30:   d65f03c0        ret

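So if you don't check for an expander, you get an endless loop in libgcc: the
sequence inside __popcountdi2 would itself be recognized as a popcount and
turned back into a call to __popcountdi2, since the makefile doesn't appear to
use -fno-builtin anywhere...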
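
For reference, the disassembly above corresponds roughly to the C below. This
is only a sketch reconstructed from the instructions, not the actual libgcc
source, and the function name popcount64_swar is made up for illustration:

```c
#include <stdint.h>

/* Hypothetical name; mirrors the AArch64 sequence shown above. */
static unsigned popcount64_swar(uint64_t x)
{
    /* lsr/and/sub: count bits in each 2-bit field. */
    x = x - ((x >> 1) & 0x5555555555555555ULL);
    /* and/lsr/and/add: sum adjacent pairs into 4-bit fields. */
    x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
    /* add with lsr #4, then and: sum adjacent nibbles into bytes. */
    x = (x + (x >> 4)) & 0x0f0f0f0f0f0f0f0fULL;
    /* mul/lsr #56: add all bytes together into the top byte. */
    return (unsigned)((x * 0x0101010101010101ULL) >> 56);
}
```

The multiply and final shift by 56 are what the (rshift (mult ...)) at the top
of the pattern matches.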

Wilco
