>> >      for (i = 0; i < lim; i++) {
>> > -        xts_tweak_encdec(datactx, decfunc, src, dst, (uint8_t *)&T);
>> > +        xts_uint128 S, D;
>> > +
>> > +        memcpy(&S, src, XTS_BLOCK_SIZE);
>> > +        xts_tweak_encdec(datactx, decfunc, &S, &D, &T);
>> > +        memcpy(dst, &D, XTS_BLOCK_SIZE);
>> 
>> Why do you need S and D?
>
> I think the src & dst pointers can't be guaranteed to be sufficiently
> aligned for int64 operations if we just cast from uint8_t *.

I see. I did a quick test without the memcpy() calls and saw no visible
effect on performance. If the copies do turn out to be costly, then
maybe this is worth investigating further. I suspect all buffers
received by this code are allocated with qemu_try_blockalign() anyway,
so casting directly should be safe in practice.
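
For reference, here is a minimal standalone sketch of the pattern being
discussed: copy the block into an aligned local, do the 64-bit work
there, and copy the result back. The names xts_block, xts_block_xor()
and xor_tweak_unaligned() are hypothetical stand-ins, not the
identifiers from the patch, and the XOR only stands in for the real
tweak/encrypt step.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define XTS_BLOCK_SIZE 16

/* Hypothetical stand-in for the patch's 128-bit type: two 64-bit words. */
typedef struct {
    uint64_t u[2];
} xts_block;

static_assert(sizeof(xts_block) == XTS_BLOCK_SIZE,
              "block type must be exactly 16 bytes");

/* XOR two properly aligned blocks using 64-bit operations. */
static void xts_block_xor(xts_block *d, const xts_block *a,
                          const xts_block *b)
{
    d->u[0] = a->u[0] ^ b->u[0];
    d->u[1] = a->u[1] ^ b->u[1];
}

/*
 * src and dst may have any alignment. Instead of casting them to
 * xts_block *, copy through aligned locals so the 64-bit loads and
 * stores never touch a misaligned address.
 */
static void xor_tweak_unaligned(uint8_t *dst, const uint8_t *src,
                                const xts_block *t)
{
    xts_block s, d;

    memcpy(&s, src, XTS_BLOCK_SIZE);
    xts_block_xor(&d, &s, t);
    memcpy(dst, &d, XTS_BLOCK_SIZE);
}
```

With a fixed 16-byte size, compilers commonly turn these memcpy() calls
into plain loads and stores, which would be consistent with the copies
not showing up in a quick benchmark.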

Berto
