On 09/02/2015 07:31 PM, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <[email protected]>
> ---
>  target-tilegx/Makefile.objs |  2 +-
>  target-tilegx/helper.h      |  4 +++
>  target-tilegx/simd_helper.c | 63 +++++++++++++++++++++++++++++++++++++++++++++
>  target-tilegx/translate.c   | 17 +++++++++++-
>  4 files changed, 84 insertions(+), 2 deletions(-)
>  create mode 100644 target-tilegx/simd_helper.c

Naive question:
> +
> +uint64_t helper_v1shl(uint64_t a, uint64_t b)
> +{
> +    uint64_t r = 0;
> +    int i;
> +
> +    b &= 7;
> +    for (i = 0; i < 64; i += 8) {
> +        uint64_t m = 0xffULL << i;
> +        r |= ((a & m) << b) & m;
> +    }

Is it any more efficient to use multiplies instead of looping, as in:

    uint64_t m;
    b &= 7;
    m = 0x0101010101010101ULL * ((1 << (8 - b)) - 1);
    return (a & m) << b;

Or if multiplies are bad, what about straight-line expansion of the
mask, as in:

    uint64_t m;
    b &= 7;
    m = (1 << (8 - b)) - 1;
    m |= m << 32;
    m |= m << 16;
    m |= m << 8;
    return (a & m) << b;

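
For what it's worth, here is a rough standalone sketch for checking that both alternatives agree with the loop. The names v1shl_loop, v1shl_mul, v1shl_expand and v1shl_count_mismatches are made up for this test and are not part of the patch; the loop body is copied from the quoted hunk with a return added, and the inputs come from a simple xorshift generator rather than anything QEMU provides. It says nothing about which variant is faster, only that they compute the same result:

```c
#include <assert.h>
#include <stdint.h>

/* Reference: the byte-at-a-time loop from the quoted patch. */
static uint64_t v1shl_loop(uint64_t a, uint64_t b)
{
    uint64_t r = 0;
    int i;

    b &= 7;
    for (i = 0; i < 64; i += 8) {
        uint64_t m = 0xffULL << i;
        r |= ((a & m) << b) & m;
    }
    return r;
}

/* Variant 1: replicate the per-byte mask with a multiply. */
static uint64_t v1shl_mul(uint64_t a, uint64_t b)
{
    uint64_t m;

    b &= 7;
    /* Keep the low (8 - b) bits of every byte, then shift all at once. */
    m = 0x0101010101010101ULL * ((1 << (8 - b)) - 1);
    return (a & m) << b;
}

/* Variant 2: replicate the per-byte mask with shifts and ORs. */
static uint64_t v1shl_expand(uint64_t a, uint64_t b)
{
    uint64_t m;

    b &= 7;
    m = (1 << (8 - b)) - 1;
    m |= m << 32;   /* bytes 0 and 4 */
    m |= m << 16;   /* bytes 0, 2, 4, 6 */
    m |= m << 8;    /* all eight bytes */
    return (a & m) << b;
}

/* Compare both variants against the loop; returns the mismatch count. */
static int v1shl_count_mismatches(void)
{
    uint64_t x = 0x9e3779b97f4a7c15ULL;   /* arbitrary nonzero seed */
    int bad = 0;

    /* One hand-checked vector: 0x80 shifted by 1 must drop out of its byte. */
    assert(v1shl_loop(0x0102040810204080ULL, 1) == 0x0204081020408000ULL);

    for (int n = 0; n < 10000; n++) {
        /* xorshift64 step, used only to get varied bit patterns */
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        /* Shift counts above 7 exercise the "b &= 7" masking too. */
        for (uint64_t b = 0; b < 16; b++) {
            uint64_t want = v1shl_loop(x, b);
            bad += v1shl_mul(x, b) != want;
            bad += v1shl_expand(x, b) != want;
        }
    }
    return bad;
}
```

Calling v1shl_count_mismatches() from a small main() and checking for a zero result is enough to exercise it.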
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
