On 09.05.2012 20:30, Ian Lance Taylor wrote:
> Daniel Marschall <daniel-marsch...@viathinksoft.de> writes:
>
>> I did understand that the compiler used "signed" multiplication
>> instead of an unsigned one because char*char needs to be extended.
>> Maybe I am wrong, but couldn't the compiler "know" that the result
>> will be at least unsigned because unsigned * unsigned = unsigned ?
>
> Well, but the rules of C say that the unsigned char values are
> zero-extended to int, and then they are multiplied using a signed
> multiplication. So the result is not unsigned. The compiler really
> would have to do some sort of type or value based reasoning here to
> determine that an unsigned multiplication would work also.

Hello,
I was able to benchmark my code. The version without the typecast
(imull+movslq) needed 47 seconds for 12 work packages, while the version
with the typecast (imulq) needed only 38 seconds for the same 12 work
packages, roughly 19% less time. That is incredible!
Maybe you should still consider preferring imulq over imull+movslq in
this case?
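
To make the comparison concrete, here is a minimal sketch of the two
variants as I understand them. The function names are made up for
illustration, and this is not the exact code I benchmarked; I am also
assuming a 64-bit result type. In the first function the operands are
promoted to int and the int product is sign-extended afterwards; in the
second the cast forces a 64-bit multiplication. Which instructions GCC
actually emits depends on the version and the flags used.

```c
#include <stdint.h>

/* Hypothetical names, for illustration only. */

/* No cast: a and b are promoted to int, the product has type int, and
   that int is sign-extended to 64 bits when it is converted to the
   return type. */
static uint64_t mul_promoted(unsigned char a, unsigned char b)
{
    return a * b;
}

/* Cast: a is converted to a 64-bit unsigned type first, so b is
   converted as well and the multiplication is done in 64 bits. */
static uint64_t mul_cast(unsigned char a, unsigned char b)
{
    return (uint64_t)a * b;
}

/* Counts the input pairs for which the two variants disagree. */
static unsigned count_mismatches(void)
{
    unsigned mismatches = 0;
    for (unsigned a = 0; a <= 255; a++)
        for (unsigned b = 0; b <= 255; b++)
            if (mul_promoted((unsigned char)a, (unsigned char)b) !=
                mul_cast((unsigned char)a, (unsigned char)b))
                mismatches++;
    return mismatches;
}
```

Since the largest possible product is 255 * 255 = 65025, which fits
easily into an int, count_mismatches() finds no difference between the
two variants. So for unsigned char operands the end result is the same
either way, and only the generated code differs.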
I wonder whether GCC has an optimization pass that works on the machine
code itself, without knowledge of the underlying C code. For example, it
could eliminate unnecessary mov instructions when a register is not
used, or pick instructions with lower latency. I think such an
"assembler-only" optimization could still gain additional performance,
since the rules of the underlying programming language (e.g. the
promotion to signed int) can be ignored as long as the end result is the
same. But I fear that this is a rather hard task and maybe not possible.
Daniel