On 05/12/2015 07:40 PM, Andrew Pinski wrote:
On Tue, May 12, 2015 at 6:36 PM, Fei Ding <fding...@gmail.com> wrote:
I think Thiago and Eric just want to know which code-gen is better and why...
You need to understand that for a complex processor (CISC ISAs) like x86,
there is sometimes no single right answer. You need to look at each
micro-architecture and understand its pipeline. Sometimes different code
streams will perform the same, but it also depends on the code size.
A good place to start is the Intel 64 and IA-32 Architectures
Optimization Reference Manual. It lists the throughput and
latencies of x86 instructions and gives guidance for which
ones might be more efficient on which processors. For example,
in the section titled Using LEA it discusses why the three
operand form of the instruction is slower on the Sandy Bridge
microarchitecture than on others:
http://www.intel.com/content/dam/doc/manual/64-ia-32-architectures-optimization-manual.pdf
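As a rough illustration of what the three operand form looks like in practice: the C function below adds a base, a scaled index and a constant, which an x86-64 compiler may (or may not) turn into a single LEA with all three address components, something like leal 16(%rdi,%rsi,4), %eax. The function name and the constants are made up for this sketch, and the actual instruction selection depends on the compiler, its version and the tuning options.

```c
/* Hypothetical example, not taken from the thread's testcase.
 * The expression has three address components (base, scaled index,
 * constant displacement), so a compiler targeting x86-64 may emit a
 * three-component LEA for it, or split it into a simpler LEA plus ADD. */
int scaled_sum(int base, int index)
{
    return base + index * 4 + 16;
}
```

Comparing the -S output with different -mtune values is one way to see whether the compiler avoids that form for a given microarchitecture.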
Martin
Thanks,
Andrew Pinski
2015-05-12 23:29 GMT+08:00 Eric Botcazou <ebotca...@libertysurf.fr>:
Note that at -O3 there is a difference still:
clang (3.6.0):
    addl %esi, %edi
    movl %edi, %eax
    retq
gcc (4.9.2):
    leal (%rdi,%rsi), %eax
    ret
Can't tell which is best, if any.
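The testcase itself is not quoted in this message. A function of roughly the following shape would be consistent with both listings, since the two arguments arrive in %edi and %esi and the sum is returned in %eax. This is an assumed reconstruction, including the name:

```c
/* Assumed reconstruction of the testcase; the name "add" is made up.
 * Both listings compute the sum of the first two integer arguments
 * and return it as a 32-bit value. */
int add(int a, int b)
{
    return a + b;
}
```

Compiling it with -O3 -S under each compiler lets you compare the generated assembly directly.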
But what's your point exactly here? You cannot expect different compilers to
generate exactly the same code on a given testcase for non-toy architectures.
Note that this kind of discussion is more appropriate for gcc-h...@gcc.gnu.org
--
Eric Botcazou