On 21/11/14 13:31, Bernhard R. Link wrote:
> Otherwise that memory
> might afterwards be regarded as lzo_memops_TU2_struct

lzo_memops_TU2_struct is declared with __attribute__((__may_alias__)),
so actually the right thing should be happening WRT aliasing in this case.
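
For reference, the pattern in question looks roughly like this (a
simplified sketch, not the library's exact definition):

  /* Two-byte "transfer unit" in the style of lzo_memops_TU2_struct. */
  typedef struct { unsigned char a[2]; }
          __attribute__((__may_alias__)) TU2;

  /* Copy two bytes through the may_alias type: even if dst/src really
     point to objects of some other type, the attribute tells GCC not
     to apply strict-aliasing assumptions to accesses through TU2. */
  static void copy2(void *dst, const void *src)
  {
      *(TU2 *)dst = *(const TU2 *)src;
  }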

On 21/11/14 13:21, Thorsten Glaser wrote:
> • for i386 and especially amd64, all subarchitectures supported
>   by Debian/Linux jessie suffer so much from unaligned access,
>   speed-wise, that it’s worth the overhead of forcing aligned
>   access (i386, i486 maybe were not as badly affected)

I was hoping this statement was correct, because if it were, avoiding
unaligned accesses would be a clear win regardless, and the right thing
to do would be entirely uncontroversial.

Unfortunately, on my x86-64 laptop, my liblzo2 patched with
-DLZO_CFG_NO_UNALIGNED on all architectures seems to be roughly half as
fast as the unpatched one in a simple test case (uncompress
linux_3.17.orig.tar.xz to linux_3.17.orig.tar in a tmpfs, then
time lzop -c < linux_3.17.orig.tar > /dev/null; repeated 3 times, the
results agree within 10%).

I'm trying out a slightly different approach: keep the unaligned
accesses done via casts like *(uint16_t *) on architectures where
lzodefs.h specifically allows them, but disable the casts done via
struct { char[n] } types when alignof(that struct) == 1, since those
seem to be the problematic ones.
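
The shape I have in mind is roughly the following (illustrative only,
not the actual patch; ARCH_ALLOWS_UNALIGNED16 is a placeholder for
whatever per-architecture test lzodefs.h really provides):

  #include <stdint.h>
  #include <string.h>

  static uint16_t load16(const void *p)
  {
  #if defined(ARCH_ALLOWS_UNALIGNED16)
      /* Keep the direct cast on CPUs where lzodefs.h says unaligned
         access is cheap; may_alias keeps it legal WRT strict aliasing. */
      typedef uint16_t __attribute__((__may_alias__)) u16a;
      return *(const u16a *)p;
  #else
      /* Otherwise drop the pointer casts (including the ones through
         alignment-1 char-array structs) and go byte-wise via memcpy,
         which the compiler can fold into a single load where safe. */
      uint16_t v;
      memcpy(&v, p, sizeof v);
      return v;
  #endif
  }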

The CPUs for which lzodefs.h uses those casts are amd64, arm*
conditional on target CPU (so armel but not armhf in Debian terms),
arm64, cris, i386, m68k conditional on target CPU (__mc68020__ but not
__mcoldfire__), powerpc* if big-endian, and s390*.
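
Expressed with the standard GCC target macros rather than lzodefs.h's
own spellings, and leaving the arm per-CPU subtlety aside since that is
exactly the contentious part, that set corresponds to something like:

  #if defined(__x86_64__) || defined(__i386__) || defined(__aarch64__) \
      || defined(__CRIS__) || defined(__s390__) \
      || (defined(__mc68020__) && !defined(__mcoldfire__)) \
      || (defined(__powerpc__) && defined(__BYTE_ORDER__) \
          && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
  #  define ARCH_ALLOWS_UNALIGNED16 1   /* plus arm, depending on target CPU */
  #endif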

    S

