http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60145
           Bug ID: 60145
          Summary: [AVR] Suboptimal code for byte order shuffling using
                   shift and or
          Product: gcc
          Version: 4.8.2
           Status: UNCONFIRMED
         Severity: normal
         Priority: P3
        Component: other
         Assignee: unassigned at gcc dot gnu.org
         Reporter: matthijs at stdin dot nl

(Not sure what the component should be, so I just selected "other" for now.)

Using shifts and bitwise-or to compose multiple bytes into a bigger integer results in suboptimal code on AVR. For example, consider a few simple functions that take two or four bytes and compose them into (big-endian) integers. Since AVR is an 8-bit platform, this essentially just means moving the bytes from the argument registers to the return value registers. However, the emitted assembly is significantly bigger than that and contains obvious optimization opportunities.

The example below also contains a version that uses a union to compose the integer, which gets optimized as expected (but only works on little-endian systems, since it relies on the native endianness of uint16_t).
matthijs@grubby:~$ cat foo.c
#include <stdint.h>

uint16_t join2(uint8_t a, uint8_t b)
{
  return ((uint16_t)a << 8) | b;
}

uint16_t join2_efficient(uint8_t a, uint8_t b)
{
  union {
    uint16_t uint;
    uint8_t arr[2];
  } tmp = {.arr = {b, a}};
  return tmp.uint;
}

uint32_t join4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
  return ((uint32_t)a << 24) | ((uint32_t)b << 16) | ((uint32_t)c << 8) | d;
}

matthijs@grubby:~$ avr-gcc -c foo.c -O3 && avr-objdump -d foo.o

foo.o:     file format elf32-avr

Disassembly of section .text:

00000000 <join2>:
   0:	70 e0       	ldi	r23, 0x00	; 0
   2:	26 2f       	mov	r18, r22
   4:	37 2f       	mov	r19, r23
   6:	38 2b       	or	r19, r24
   8:	82 2f       	mov	r24, r18
   a:	93 2f       	mov	r25, r19
   c:	08 95       	ret

0000000e <join2_efficient>:
   e:	98 2f       	mov	r25, r24
  10:	86 2f       	mov	r24, r22
  12:	08 95       	ret

00000014 <join4>:
  14:	0f 93       	push	r16
  16:	1f 93       	push	r17
  18:	02 2f       	mov	r16, r18
  1a:	10 e0       	ldi	r17, 0x00	; 0
  1c:	20 e0       	ldi	r18, 0x00	; 0
  1e:	30 e0       	ldi	r19, 0x00	; 0
  20:	14 2b       	or	r17, r20
  22:	26 2b       	or	r18, r22
  24:	38 2b       	or	r19, r24
  26:	93 2f       	mov	r25, r19
  28:	82 2f       	mov	r24, r18
  2a:	71 2f       	mov	r23, r17
  2c:	60 2f       	mov	r22, r16
  2e:	1f 91       	pop	r17
  30:	0f 91       	pop	r16
  32:	08 95       	ret