Missed optimization with endian and alignment independent memory access on x64

Moritz Strübe Thu, 06 Feb 2020 09:36:26 -0800

Hey,

a pattern I see quite often in embedded libraries is to access an array byte-wise and shift the bits as needed (as this fixes endianness and alignment issues). If I read two consecutive bytes and left-shift the second by 8, I'd expect the compiler to optimize this to a word read on x64, as it is LE and supports unaligned reads.
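To make the pattern concrete, here is a minimal sketch of what I mean. The function names are made up for illustration and are not necessarily what is in the godbolt link; the 32-bit variant is just the same idea extended.

```c
#include <stdint.h>

/* Byte-wise little-endian 16-bit load. There is no alignment requirement
   on p, and the result is the same on little- and big-endian hosts. */
static inline uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | ((uint16_t)p[1] << 8));
}

/* Same idea for 32 bits: widen each byte before shifting so that no
   shift overflows a signed int. */
static inline uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

On a little-endian target with unaligned loads, each of these could in principle become a single load of the matching width.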

Clang does this as expected; gcc, however, misses it.


Here are the examples: https://godbolt.org/z/qvCCNs

Thus my questions:

Why is this so hard to optimize? As it's quite a common pattern, I'd expect that there would be at least some hand-coded special-case optimizer. (This isn't criticism - I'm honestly curious.) Or is there a reason gcc shouldn't optimize this, or a reason why it doesn't matter that this is missed?


Is there a way to write such code that gcc optimizes?

From a performance point of view: If I actually need two consecutive bytes, wouldn't it be better to load them as a word and split them at the register level?
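What I have in mind for that last question is roughly the following sketch. It assumes a little-endian host, so unlike the byte-wise version it is not endian-independent, and the names byte_pair and load_pair are hypothetical. I haven't measured whether it is actually faster than two byte loads.

```c
#include <stdint.h>
#include <string.h>

/* Two consecutive bytes, in memory order. */
struct byte_pair {
    uint8_t first;
    uint8_t second;
};

/* ASSUMPTION: little-endian host. The memcpy is the portable way to ask
   for one 16-bit load without alignment or aliasing problems; the two
   bytes are then split out of the register with a mask and a shift. */
static inline struct byte_pair load_pair(const uint8_t *p)
{
    uint16_t w;
    memcpy(&w, p, sizeof w);

    struct byte_pair r = { (uint8_t)(w & 0xff), (uint8_t)(w >> 8) };
    return r;
}
```

With this, load_pair(buf + 1) returns buf[1] in first and buf[2] in second on a little-endian machine, even though buf + 1 is not 2-byte aligned.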


Cheers
Morty

--
Redheads Ltd. Softwaredienstleistungen
Schillerstr. 14
90409 Nürnberg

Phone: +49 (0)911 180778-50
Email: moritz.stru...@redheads.de | Web: www.redheads.de

Managing Director: Andreas Hanke
Registered office: Lauf
District court: Amtsgericht Nürnberg HRB 22681
VAT ID: DE 249436843
