date:20130817

gcc-4.7-20130817 is now available

2013-08-17 Thread gccadmin

Snapshot gcc-4.7-20130817 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.7-20130817/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.7 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_7-branch 
revision 201818

You'll find:

 gcc-4.7-20130817.tar.bz2 Complete GCC

  MD5=201a8b3d0716844e1e74971bef04eb54
  SHA1=e6af1d4ceaff054e106d0fd617433e3688770bd3

Diffs from 4.7-20130810 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.7
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.

Inefficiencies in large integers

2013-08-17 Thread Asm Twiddler

Hello all,

I'm not sure whether this has been posted before, but gcc creates
slightly inefficient code for large integers in several cases:

unsigned long long val;

void example1() {
    val += 0x800000000000ULL;
}

On x86 this results in the following assembly:
addl $0, val
adcl $32768, val+4
ret

The first add is unnecessary, as it can neither modify val nor set the carry.
This isn't too bad, but compiling for something like AVR results in
eight byte loads, followed by three additions (of the high bytes),
followed by eight byte stores.
The compiler doesn't recognize that five of those loads and five of those
stores are unnecessary.
Replacing the addition with a bitwise or/xor also produces an
unnecessary instruction on x86, but produces optimal code on
AVR.
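
To illustrate why the low-word add is redundant, here is a minimal sketch in C.
It assumes a constant whose low 32 bits are all zero, like the 32768 that the
assembly above adds to val+4. The helper name add_high_only is hypothetical and
is not part of GCC; it only shows the arithmetic identity the compiler could
exploit.

```c
#include <stdint.h>

/* Hypothetical helper: add (high_addend << 32) to v while touching only
 * the high 32-bit word.  Adding zero to the low word cannot change it
 * and cannot produce a carry, so the low word is copied through as-is. */
uint64_t add_high_only(uint64_t v, uint32_t high_addend)
{
    uint32_t lo = (uint32_t)v;                        /* unchanged */
    uint32_t hi = (uint32_t)(v >> 32) + high_addend;  /* wraps mod 2^32 */
    return ((uint64_t)hi << 32) | lo;
}

/* Reference version: the plain 64-bit addition. */
uint64_t add_full(uint64_t v, uint32_t high_addend)
{
    return v + ((uint64_t)high_addend << 32);
}
```

Both functions return the same value for every input, so a single add on the
high word would be enough on x86.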


Here is another inefficiency for x86:

unsigned long long val = 0;
unsigned long small = 0;

unsigned long long example1() {
    return val | small;
}

unsigned long long example2() {
    return val & small;
}

This produces for example1 (bad):
movl small, %ecx
movl val, %eax
movl val+4, %edx
pushl %ebx
xorl %ebx, %ebx
orl %ecx, %eax
orl %ebx, %edx
popl %ebx
ret

For example2 (good):
movl small, %eax
xorl %edx, %edx
andl val, %eax
ret


The RTL generated for example1 and example2 is very similar until
the fwprop1 pass.
Since the largest word size on 32-bit x86 is 4 bytes, each operation is
actually split into two.
The forward propagator correctly realizes that ANDing the upper 4
bytes with zero results in zero.
However, it doesn't seem to recognize that ORing the upper 4 bytes
with zero should simply return val's high word.
The same problem occurs with the xor operation, and also when
subtracting (val - small).
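
The word-level identities involved can be sketched in C as follows. This is
only an illustration, assuming that small is 32 bits wide (as unsigned long is
on 32-bit x86) and is zero-extended to 64 bits. The helper names or_split,
xor_split and and_split are hypothetical and do not correspond to anything in
GCC.

```c
#include <stdint.h>

/* val | zext(small): the high word is (hi | 0), i.e. just hi. */
uint64_t or_split(uint64_t val, uint32_t small)
{
    uint32_t lo = (uint32_t)val | small;
    uint32_t hi = (uint32_t)(val >> 32);   /* no OR needed */
    return ((uint64_t)hi << 32) | lo;
}

/* val ^ zext(small): the high word is (hi ^ 0), again just hi. */
uint64_t xor_split(uint64_t val, uint32_t small)
{
    uint32_t lo = (uint32_t)val ^ small;
    uint32_t hi = (uint32_t)(val >> 32);   /* no XOR needed */
    return ((uint64_t)hi << 32) | lo;
}

/* val & zext(small): the high word is (hi & 0), i.e. zero. */
uint64_t and_split(uint64_t val, uint32_t small)
{
    return (uint64_t)((uint32_t)val & small);
}
```

In the or and xor cases the high word needs no instruction beyond the load, so
the zeroed scratch register (%ebx in the listing above) and its push/pop pair
would not be required.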

All programs were compiled with "-O2 -Wall" although I also tried -O3
and -Os with the same result.

Thanks for any help.
