Bug#572746: libm: sinf/cosf performance is awful on amd64

Jerome Vizcaino Sun, 07 Mar 2010 15:22:25 -0800

OK, about the patches: there had to be a reason for them not being merged 
upstream.
Some of my co-workers noticed the performance improvement when bounding values 
between -pi and pi, but the thing is, this kind of trick is not needed 
on 32-bit systems... Libm's performance is not really consistent across 
architectures, which is a bit confusing (and I would expect 64-bit to be 
better than 32-bit when dealing with maths... :( ).
Have you tried pushing your code upstream? Maybe it would be useful for 
future versions.


Thanks for your help

Jerome

On Sunday 07 March 2010, Aurelien Jarno wrote:
> On Sun, Mar 07, 2010 at 07:43:46PM +0100, Jerome Vizcaino wrote:
> > Hi,
> >
> > I could not say for sure the difference between sin and sinf (for
> > example) on Suse but the performance ratio I had on 32 bits, stayed the
> > same on 64 bits. This is why I was surprised to get impressive slowness
> > when moving to debian :( Thanks for pointing out the Suse patch : as we
> > only have Suse or Debian at work I could not do more comparisons.
> >
> > How about including patches from OpenSuse ? Is it possible as a quick
> > workaround?
> 
> The patches from OpenSuse are ugly and very invasive, and they do not
> seem to include the recent errno changes for C99 compliance (though I
> haven't tested them). I am not really sure we want that. I have started
> to rewrite part of the functions in assembly.
> 
> While this new assembly code behaves correctly with your testcase, it is
> twice as slow as the current version when using normal arguments. I
> have modified your code a bit to stay within a reasonable range of
> arguments, and also to test the l version of the functions.
> 
> Here is the result with the original code (using C code for the f
> version):
> | Testing 10000000 sinf, cosf and tanf... Result: 19764686.000000, Duration: 0.516700 sec
> | Testing 10000000 sin, cos and tan (with float args)... Result: 19764686.000000, Duration: 1.056214 sec
> | Testing 10000000 sinl, cosl and tanl (with float args)... Result: 19764686.000000, Duration: 1.089871 sec
> 
> Here is the result with assembly code instead (using the FPU
> instructions), I get instead:
> | Testing 10000000 sinf, cosf and tanf... Result: 19764686.000000, Duration: 1.010248 sec
> | Testing 10000000 sin, cos and tan (with float args)... Result: 19764686.000000, Duration: 1.055434 sec
> | Testing 10000000 sinl, cosl and tanl (with float args)... Result: 19764686.000000, Duration: 1.095374 sec
> 
> As I expect most code to use values between -2pi and 2pi, I am not
> really sure we should change the current code.
> 



