Catching up on backlog, sorry for the very late response:
Tamar Christina writes:
> Hi All,
>
> Consider the following case
>
> #include <arm_neon.h>
>
> uint64_t
> test4 (uint8x16_t input)
> {
>   uint8x16_t bool_input = vshrq_n_u8(input, 7);
>   poly64x2_t mask = vdupq_n_p64(0x0102040810204080UL);
>
  poly64_t prodL = vmull_p64((poly64_t)vgetq_lane_p64((poly64x2_t)bool_input, 0),