https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98856
--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
The following testcase reproduces the assembly:
typedef __UINT64_TYPE__ uint64_t;
void poly_double_le2 (unsigned char *out, const unsigned char *in)
{
  uint64_t W[2];
  __builtin_memcpy (&W, in, 16);
  uint64_t carry = (W[1] >> 63) * 135;
  W[1] = (W[1] << 1) ^ (W[0] >> 63);
  W[0] = (W[0] << 1) ^ carry;
  __builtin_memcpy (out, &W[0], 8);
  __builtin_memcpy (out + 8, &W[1], 8);
}
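
For anyone who wants to sanity-check what the testcase computes before
looking at the generated code, here is a minimal sketch that compares it
against a byte-wise version.  It assumes a little-endian host, since the
testcase memcpy's raw bytes into uint64_t.  poly_double_ref and
poly_double_selftest are hypothetical helper names, not part of the
original testcase, and <stdint.h> stands in for the __UINT64_TYPE__
typedef.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copied from the testcase above; assumes a little-endian host.  */
void poly_double_le2 (unsigned char *out, const unsigned char *in)
{
  uint64_t W[2];
  __builtin_memcpy (&W, in, 16);
  uint64_t carry = (W[1] >> 63) * 135;
  W[1] = (W[1] << 1) ^ (W[0] >> 63);
  W[0] = (W[0] << 1) ^ carry;
  __builtin_memcpy (out, &W[0], 8);
  __builtin_memcpy (out + 8, &W[1], 8);
}

/* Hypothetical byte-wise reference: shift the 16-byte little-endian
   value left by one bit and, if the top bit fell out, XOR 135 (0x87)
   into the lowest byte.  */
void poly_double_ref (unsigned char *out, const unsigned char *in)
{
  unsigned char carry = in[15] >> 7;
  unsigned char prev = 0;
  for (int i = 0; i < 16; i++)
    {
      unsigned char cur = in[i];
      out[i] = (unsigned char) ((cur << 1) | prev);
      prev = cur >> 7;
    }
  out[0] ^= (unsigned char) (carry * 0x87);
}

/* Hypothetical self-test: both versions must agree on a few inputs
   that exercise the carry between and out of the two 64-bit words.  */
void poly_double_selftest (void)
{
  static const int bit_bytes[] = { 0, 7, 8, 15 };
  for (int k = 0; k < 4; k++)
    {
      unsigned char in[16] = { 0 }, a[16], b[16];
      in[bit_bytes[k]] = 0x80;
      poly_double_le2 (a, in);
      poly_double_ref (b, in);
      assert (memcmp (a, b, 16) == 0);
    }
}
```

Calling poly_double_selftest () from a small main is enough to run the
check.  It only confirms the arithmetic and says nothing about the code
generation issue itself.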
