On 09.05.2012 21:48, Marc Glisse wrote:
On Wed, 9 May 2012, Daniel Marschall wrote:

1. I do not know my DisplayName/DisplayFamily (0f_2h or 0f_3h?).

Ask your processor (cpuid). Or your kernel (/proc/cpuinfo on linux).

/proc/cpuinfo says:

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 30
model name      : Intel(R) Xeon(R) CPU           X3440  @ 2.53GHz
stepping        : 5
...

But I do not know whether this is "0f_2h" or "0f_3h". That notation is cryptic to me.
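For what it is worth, labels of this kind appear to be just the family and model numbers written in hexadecimal. Here is a minimal sketch of that conversion, assuming the "cpu family" and "model" fields of /proc/cpuinfo already hold the combined (display) values. The function names and the two-digit padding of the model are my own choices for illustration, not anything taken from the manual.

```python
# Sketch: turn the decimal "cpu family" / "model" fields from /proc/cpuinfo
# into a hexadecimal FAMILY_MODEL label such as "06_1EH".
# Assumption: the kernel already reports the combined (display) values.

SAMPLE = """\
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 30
model name      : Intel(R) Xeon(R) CPU           X3440  @ 2.53GHz
stepping        : 5
"""


def parse_family_model(cpuinfo_text):
    """Return (family, model) from the first processor entry."""
    family = model = None
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key = key.strip()
        # Compare the whole key so that "model name" is not mistaken for "model".
        if key == "cpu family" and family is None:
            family = int(value)
        elif key == "model" and model is None:
            model = int(value)
    if family is None or model is None:
        raise ValueError("cpu family / model not found")
    return family, model


def display_signature(family, model):
    """Format family and model as two-digit hex, e.g. (6, 30) -> '06_1EH'."""
    return f"{family:02X}_{model:02X}H"


if __name__ == "__main__":
    print(display_signature(*parse_family_model(SAMPLE)))
```

With the values above (family 6, model 30) this gives 06_1EH. Both "0f_2h" and "0f_3h" belong to family 0FH (decimal 15), so under this reading neither of them describes this processor.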


3. Should I compare Latency or Throughput if I want to produce fast code? Or doesn't it matter which value I compare?

Both. And you also need to look at the code that is nearby, not just
this one instruction. In short, don't bother. If you really want to
know, benchmark both versions.

The nearby code is identical; the typecast only changes these two opcodes. Yes, I should run some benchmarks. It would have to be a long-running benchmark, since the speedup is very small.
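To make the latency/throughput distinction concrete, here is a rough back-of-the-envelope sketch. It assumes the simplest possible model: a chain of dependent instructions pays the full latency each time, while independent instructions can overlap and are limited by the throughput figure (read as cycles between successive issues). The function name and the numbers are hypothetical, and the model ignores everything else the processor does, which is why a real benchmark is still needed.

```python
# Sketch: crude cycle estimate for n copies of one instruction.
# Assumptions: "latency" is cycles until the result is usable, and
# "recip_throughput" is cycles between issuing two such instructions.
# Ports, caches and surrounding code are ignored entirely.

def estimate_cycles(n, latency, recip_throughput, dependent):
    """Estimate the cycles needed for n instructions of the same kind."""
    if n <= 0:
        return 0
    if dependent:
        # Each instruction waits for the result of the previous one.
        return n * latency
    # Independent instructions overlap; only the last one pays full latency.
    return latency + (n - 1) * recip_throughput


if __name__ == "__main__":
    # Hypothetical instruction: latency 3 cycles, one issued per cycle.
    print("dependent chain:", estimate_cycles(100, 3, 1, dependent=True))
    print("independent ops:", estimate_cycles(100, 3, 1, dependent=False))
```

In this model the same instruction costs about three times as much in a dependent chain as in independent code, so which table column matters depends on how the result is used.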

Daniel


