Thanks, Peter.
I actually don't mind a larger code if the performance can be improved.
Also, we probably do computations more often with smaller integers (at
least in my applications); that's why I overlooked the case greater than
10...
But now I'm using your (+Michael Jones and others) suggestions:
// PowP computes real raised to positive integer xⁿ
func PowP(x float64, n uint32) (r float64) {
if n == 0 {
return 1.0
}
if n == 1 {
return x
}
if n == 2 {
return x * x
}
if n == 3 {
return x * x * x
}
if n == 4 {
r = x * x
return r * r
}
if n == 5 {
r = x * x
return r * r * x
}
if n == 6 {
r = x * x * x
return r * r
}
if n == 7 {
r = x * x * x
return r * r * x
}
if n == 8 {
r = x * x * x * x
return r * r
}
if n == 9 {
r = x * x * x
return r * r * r
}
if n == 10 {
r = x * x * x
return r * r * r * x
}
r = 1.0
for i := n; i > 0; i >>= 1 {
if i&1 == 1 {
r *= x
}
x *= x
}
return
}
The output of the benchmark is:
BenchmarkPowP10-32 50000000 26.4 ns/op
BenchmarkPowP10std-32 5000000 245 ns/op
BenchmarkPowP20-32 20000000 79.2 ns/op
BenchmarkPowP20std-32 3000000 556 ns/op
BenchmarkPowP50-32 5000000 255 ns/op
BenchmarkPowP50std-32 1000000 1566 ns/op
BenchmarkPowP100-32 2000000 616 ns/op
BenchmarkPowP100std-32 500000 3594 ns/op
BenchmarkPowP200-32 1000000 1365 ns/op
BenchmarkPowP200std-32 200000 9184 ns/op
So, the speedups for each N are (more or less):
N=10: 9x
N=20: 7x
N=50: 6x
N=100: 5x
N=200: 6x
(files
here: https://gist.github.com/cpmech/9f871df5b59fa8407221b0d3fb361a3d)
I'm thinking about looking at Go-ASM to see if I can improve this further.
I may try to optimize the math.Sin() function too...
Thanks also to Michael Jones.
On Friday, August 4, 2017 at 10:36:16 PM UTC+10, peterGo wrote:
>
> Dorival Pedroso,
>
> PowP has a lot of code. PowG is simpler and it is modestly slower for
> small values of n and much faster for larger values of n.
>
> func PowG(x float64, n uint32) float64 {
> y := 1.0
> for i := n; i > 0; i >>= 1 {
> if i&1 == 1 {
> y *= x
> }
> x *= x
> }
> return y
> }
>
>
> BenchmarkPowP1 1000000000 2.89 ns/op
> BenchmarkPowP5 100000000 13.6 ns/op
> BenchmarkPowP10 50000000 29.2 ns/op
> BenchmarkPowP20 20000000 89.6 ns/op
> BenchmarkPowP200 200000 11666 ns/op
> BenchmarkPowG1 1000000000 2.66 ns/op
> BenchmarkPowG5 100000000 17.9 ns/op
> BenchmarkPowG10 30000000 45.1 ns/op
> BenchmarkPowG20 20000000 97.7 ns/op
> BenchmarkPowG200 300000 3972 ns/op
>
> Peter
>
> On Friday, August 4, 2017 at 10:08:36 AM UTC, Dorival Pedroso wrote:
>>
>> Since my ASM skills are limited, for positive integers, I'm planning on
>> using:
>>
>> // PowP computes real raised to positive integer xⁿ
>> func PowP(x float64, n uint32) (r float64) {
>> if n == 0 {
>> return 1.0
>> }
>> if n == 1 {
>> return x
>> }
>> if n == 2 {
>> return x * x
>> }
>> if n == 3 {
>> return x * x * x
>> }
>> if n == 4 {
>> r = x * x
>> return r * r
>> }
>> if n == 5 {
>> r = x * x
>> return r * r * x
>> }
>> if n == 6 {
>> r = x * x * x
>> return r * r
>> }
>> if n == 7 {
>> r = x * x * x
>> return r * r * x
>> }
>> if n == 8 {
>> r = x * x * x * x
>> return r * r
>> }
>> if n == 9 {
>> r = x * x * x
>> return r * r * r
>> }
>> if n == 10 {
>> r = x * x * x
>> return r * r * r * x
>> }
>> r = x * x * x
>> r = r * r * r * x
>> var i uint32
>> for i = 11; i <= 20; i++ {
>> r *= x
>> if n == i {
>> return
>> }
>> }
>> return math.Pow(x, float64(n))
>> }
>>
>> which gives a speedup of 5-8 times for small positive integers (<20). the
>> case of negative integers could be handled too.
>>
>> output:
>> BenchmarkPowP10-32 50000000 30.4 ns/op
>> BenchmarkPowP10std-32 5000000 247 ns/op
>> BenchmarkPowP20-32 20000000 98.2 ns/op
>> BenchmarkPowP20std-32 3000000 561 ns/op
>> BenchmarkPowP200-32 200000 10145 ns/op
>> BenchmarkPowP200std-32 200000 9231 ns/op
>>
>> all files are here:
>> https://gist.github.com/cpmech/9f871df5b59fa8407221b0d3fb361a3d
>>
>> I'm also looking at the Cephes library: http://www.netlib.org/cephes/
>> for ideas
>>
>> maybe we could have a similar function in Go (and many more optimised
>> ones, especially for complex numbers...)
>>
>> cheers
>> d
>>
>>
>>
>> On Friday, August 4, 2017 at 4:20:41 PM UTC+10, Dorival Pedroso wrote:
>>>
>>> I've noticed that this C code:
>>>
>>> #include "math.h"
>>> int main() {
>>> double x = 2.5;
>>> int Nmax = 10000000;
>>> for (int N=0; N<Nmax; N++) {
>>> for (int i=0; i<20; i++) {
>>> pow(x, i);
>>> }
>>> }
>>> }
>>>
>>> can run up to 50x faster than this Go code:
>>>
>>> package main
>>>
>>> import "math"
>>>
>>> func main() {
>>> x := 2.5
>>> Nmax := 10000000
>>> for N := 0; N < Nmax; N++ {
>>> for i := 0; i < 20; i++ {
>>> math.Pow(x, float64(i))
>>> }
>>> }
>>> }
>>>
>>> The C code was compiled with: gcc -O2 ccode.c -o ccode -lm
>>> then run with time ./ccode
>>>
>>> The Go code was compiled with: go build gcode.go
>>> then run with time ./gcode
>>>
>>> I've used the time command on Linux (Ubuntu) to get some estimate.
>>>
>>> So the question is: how can we make the Go code faster?
>>>
>>
--
You received this message because you are subscribed to the Google Groups
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
For more options, visit https://groups.google.com/d/optout.