Hi Stephen,
Thanks for this code; it's easy to experiment with.
Let me propose this simple update with a variation on your ncubic() function.
I noticed that all intermediate results stay far below 32 bits, so I wrote a
new version that is 30% faster on my Athlon and gives the same results. This is
be
Here is a better version of the benchmark code.
It includes the original code used in the 2.4 version of Cubic for comparison.
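
To illustrate the 32-bit observation, here is a rough sketch of an integer
cube root that keeps the Newton iterate in 32 bits. It is not the ncubic()
variant itself; the name icbrt32() and the starting estimate are assumptions
made for this example only.

```c
#include <stdint.h>

/* Hypothetical helper for illustration: floor(cbrt(a)) for a 64-bit input.
 * The cube root of a 64-bit value is below 2^22, so the iterate x and each
 * quotient a / x^2 fit in 32 bits; only x * x needs a 64-bit product. */
static uint32_t icbrt32(uint64_t a)
{
	uint32_t x, y;
	unsigned int bits = 0;
	uint64_t t;

	if (a == 0)
		return 0;

	/* Bit length of a, so that a < 2^bits. */
	for (t = a; t != 0; t >>= 1)
		bits++;

	/* Start at 2^ceil(bits / 3), which is strictly above cbrt(a). */
	x = (uint32_t)1 << ((bits + 2) / 3);

	/* Newton step x <- (2x + a / x^2) / 3. Starting from above, x
	 * decreases strictly until it reaches floor(cbrt(a)). */
	for (;;) {
		y = (2 * x + (uint32_t)(a / ((uint64_t)x * x))) / 3;
		if (y >= x)
			break;
		x = y;
	}
	return x;
}
```

For example, icbrt32(27) returns 3 and icbrt32(26) returns 2.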
---
/* Test and measure perf of cube root algorithms. */
#include
#include
#include
#include
#include
#ifdef __x86_64