AVR-gcc shift optimization

Asm Twiddler Thu, 01 Aug 2013 18:24:06 -0700

Hello all.

The current implementation produces suboptimal code for large shifts
that aren't a multiple of eight when operating on long integers (4
bytes).
All such shifts are compiled to a slow loop that shifts one bit per
iteration.
For example, a logical shift right by 17 results in a loop that
takes around 7 cycles per iteration, for ~119 cycles in total (17 x 7).
The loop takes at best 7 instruction words.


A more efficient implementation could be:
mov %B0,%D1
mov %A0,%C1
clr %C0
clr %D0
lsr %B0
ror %A0
This takes six cycles and six instruction words; both can be
reduced to five if movw is available.
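
The byte-level idea behind such a sequence can be modelled in plain C.
This is only a sketch to check the arithmetic, not what the compiler
emits; the function name lsr17_bytewise is made up for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: models "move the two high bytes down, clear the
   two high bytes, then shift the remaining 16 bits right by one". */
static uint32_t lsr17_bytewise(uint32_t val)
{
    /* Byte moves: the two high input bytes become the two low result bytes. */
    uint8_t lo = (uint8_t)(val >> 16);  /* input byte 2 -> result byte 0 */
    uint8_t hi = (uint8_t)(val >> 24);  /* input byte 3 -> result byte 1 */

    /* Result bytes 2 and 3 are cleared, so they are simply left out below. */

    /* lsr on the higher byte: bit 0 falls into the carry. */
    uint8_t carry = hi & 1u;
    hi >>= 1;

    /* ror on the lower byte: the carry enters at bit 7. */
    lo = (uint8_t)((lo >> 1) | (carry << 7));

    return (uint32_t)lo | ((uint32_t)hi << 8);
}
```

For any 32-bit input this should agree with val >> 17.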

There are several other shift counts where a more efficient
implementation is possible.

I'm just wondering why this functionality doesn't exist already.
It seems like this would probably be fairly easy to implement,
although a bit time consuming.
My guess is lack of interest, or that long integers are rarely used.

Lack of this functionality wouldn't be a problem if one could simply
split the shift by hand.
Sadly, my attempts to split the shift result in it being recombined.

unsigned long temp = val >> 16;
return temp >> 1;

gives the same assembly as

return val >> 17;
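
Another variant that might be worth trying is to narrow the
intermediate to 16 bits, so that the final one-bit shift works on a
two-byte value. This is only a sketch: the name lsr17_narrow is made
up, and whether avr-gcc still recombines it into a single 32-bit shift
has not been verified here.

```c
#include <stdint.h>

/* Hypothetical helper: splits the shift by 17 into a 16-bit extraction
   followed by a one-bit shift on a 16-bit value. */
static uint32_t lsr17_narrow(uint32_t val)
{
    uint16_t high = (uint16_t)(val >> 16);  /* upper two bytes of val */
    return (uint32_t)(high >> 1);           /* remaining shift by one */
}
```

The result is the same as val >> 17; only the generated code could differ.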


Thanks for any info.
