GCC makes the problem even worse if only SSE, and not SSE2, instructions
are enabled.  Since the integer bitwise instructions are only available
with SSE2, using casts instead of intrinsics causes GCC to expand the
operation into a long series of instructions.
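
To make the two forms concrete, here is a minimal sketch of the same
bitwise AND on a float vector, written once with the SSE intrinsic and
once with casts to an integer vector type.  The names v4si_sketch,
and_intrinsic, and_cast and masks_agree are hypothetical and used only for
illustration; the code assumes GCC (or a compatible compiler) on x86 with
<xmmintrin.h> available.

```c
#include <string.h>     /* memcmp */
#include <xmmintrin.h>  /* SSE: __m128, _mm_and_ps, _mm_set_ps, _mm_set1_ps */

/* Hypothetical 4 x 32-bit integer vector type (GCC vector extension). */
typedef int v4si_sketch __attribute__((vector_size(16)));

/* Intrinsic form: _mm_and_ps corresponds to ANDPS, which plain SSE has. */
static __m128 and_intrinsic(__m128 a, __m128 b)
{
    return _mm_and_ps(a, b);
}

/* Cast form: reinterpret the bits as integers and AND them.  This is the
   form that needs an integer vector AND, which is an SSE2 instruction. */
static __m128 and_cast(__m128 a, __m128 b)
{
    return (__m128)((v4si_sketch)a & (v4si_sketch)b);
}

/* Returns 1 if both forms produce bit-identical results for a sample
   input (clearing the sign bit of each lane), 0 otherwise. */
static int masks_agree(void)
{
    union { unsigned u; float f; } m = { 0x7fffffffu };
    __m128 x = _mm_set_ps(-1.5f, 2.0f, -0.0f, 3.25f);
    __m128 mask = _mm_set1_ps(m.f);
    __m128 r1 = and_intrinsic(x, mask);
    __m128 r2 = and_cast(x, mask);
    return memcmp(&r1, &r2, sizeof r1) == 0;
}
```

Both functions compute the same value; the difference is only in the
code GCC generates.  One way to compare them is to compile with -S for a
target where SSE is enabled and SSE2 is not (for example a 32-bit x86
target with -msse -mno-sse2) and look at the assembly for and_cast.  The
exact output depends on the GCC version.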

This was also a bug; a patch for it has been posted and approved.

Paolo
