Thanks, I just tried using the random number generator in Incanter as
a replacement and it shaves another ~100 msec off the runtime of the
function. That's better but still noticeably slower than python.
I also tried applying what I've learned here to writing a function
that adds two arrays which is imperative and I used type hints to get
rid of the reflection warnings:
(defn sum-fields [^"[[D" arr1 ^"[[D" arr2 ^"[[D" result]
(let [L (alength arr1)]
(dotimes [i L]
(dotimes [j L]
(aset-double ^doubles (aget result i) j (+ (aget arr1 i j)
(aget arr2 i j)))))))
but it's surprisingly slow compared to numpy again.
On Sep 21, 2:26 pm, Jason Wolfe <[email protected]> wrote:
> FYI I fired up a profiler and more than 2/3 of the runtime of gaussian-
> matrix5 is going to the .nextGaussian method of java.util.Random. I
> think there are faster (and higher-quality) drop-in replacements for
> java.util.Random, if that is really important to you. I seem to
> recall there being a good Mersenne Twister implementation around,
> which might fit your bill.
>
> -Jason
>
> On Sep 21, 6:34 am, Ranjit <[email protected]> wrote:
>
> > Yeah, I spoke too soon. All the rows are identical. This is what I
> > meant to do:
>
> > (defn gaussian-matrix-for-the-nth-time [L]
> > (into-array (map double-array (repeatedly L #(repeatedly L next-
> > gaussian)))))
>
> > and on my computer takes ~800 msecs vs. ~400 msecs for gaussian-
> > matrix5 to make an array of 1000^2 gaussians.
>
> > But numpy only takes about 100 msecs to do the same thing on my
> > computer. I'm surprised we can't beat that or at least get close. But
> > maybe next-gaussian is the bottleneck as you say.
>
> > On Sep 21, 12:20 am, Jason Wolfe <[email protected]> wrote:
>
> > > On Sep 20, 4:43 pm, Ranjit <[email protected]> wrote:
>
> > > > I'm glad you think partition is the problem, because that was my guess
> > > > too. But I think I have the answer. This is the fastest version I've
> > > > seen so far:
>
> > > > (defn gaussian-matrix-final [L]
> > > > (into-array ^doubles (map double-array (repeat L (repeatedly L next-
> > > > gaussian)))))
>
> > > The ^doubles type hint is wrong (it means array of doubles, not seq of
> > > arrays of doubles); the compiler is just ignoring it.
>
> > > And the reason it's so fast is probably since you're repeating the
> > > same L gaussian values L times here. Doesn't matter how fast it runs
> > > if it returns the wrong answer :-). Anyway, that might indicate that
> > > the .nextGaussian is actually the bottleneck in the fastest versions.
>
> > > -Jason
>
> > > > If I understand what's going on now, then it looks like the only way
> > > > to make this any faster is if next-gaussian could return primitives.
>
> > > > The for, and doseq macros seems like they're pretty slow.
>
> > > > -Ranjit
>
> > > > On Sep 20, 3:30 pm, Jason Wolfe <[email protected]> wrote:
>
> > > > > I think partition is slowing you down (but haven't profiled to
> > > > > verify). Here's a functional version that's about 70% as fast as my
> > > > > "5":
>
> > > > > (defn gaussian-matrix6 [L]
> > > > > (to-array (for [i (range L)] (into-array Double/TYPE (for [j
> > > > > (range L)] (next-gaussian))))))
>
> > > > > and I'd guess that's about as good as you're going to get, given that
> > > > > this approach is necessarily going to box and unbox the doubles, and
> > > > > create intermediate sequences, rather than stuffing the primitive
> > > > > doubles directly into the result array.
>
> > > > > -Jason
>
> > > > > On Sep 20, 12:00 pm, Ranjit <[email protected]> wrote:
>
> > > > > > Replacing the doseq's with dotimes speeds it up a little more:
>
> > > > > > (defn gaussian-matrix5 [^"[[D" arr]
> > > > > > (dotimes [x (alength arr)]
> > > > > > (dotimes [y (alength (first arr))]
> > > > > > (aset-double ^doubles (aget arr (int x)) (int y) (next-
> > > > > > gaussian)))))
>
> > > > > > but I'm getting reflection warnings on alength. I guess it doesn't
> > > > > > cause a problem because they're only called once?
>
> > > > > > Also adding type hints to the more functional version of my first
> > > > > > attempt speeds things up quite a bit:
>
> > > > > > (defn gaussian-matrix2 [L]
> > > > > > (into-array ^doubles
> > > > > > (map double-array (partition L (repeatedly (* L L) next-
> > > > > > gaussian)))))
>
> > > > > > But it's still about 4x slower than gaussian-matrix5 above. There
> > > > > > must
> > > > > > be a way to improve on the inner loop here that doesn't require
> > > > > > using
> > > > > > indices, right?
>
> > > > > > On Sep 20, 12:32 pm, Jason Wolfe <[email protected]> wrote:
>
> > > > > > > Oops, I found aset-double2 with tab completion and figured it was
> > > > > > > build-in. Forgot it was a utility I built some time ago, a stub
> > > > > > > for a
> > > > > > > Java method that does the setting.
>
> > > > > > > Also, I got the type hint for the "arr" arg wrong, although it
> > > > > > > didn't
> > > > > > > seem to matter.
>
> > > > > > > Here's a fixed version in standard Clojure that's basically as
> > > > > > > fast:
>
> > > > > > > user> (defn gaussian-matrix4 [^"[[D" arr ^int L]
> > > > > > > (doseq [x (range L) y (range L)] (aset-double ^doubles
> > > > > > > (aget arr (int x)) (int y) (.nextGaussian ^Random r))))
> > > > > > > #'user/gaussian-matrix4
> > > > > > > user> (do (microbench (gaussian-matrix3 (make-array Double/TYPE
> > > > > > > 10
> > > > > > > 10) 10)) (microbench (gaussian-matrix4 (make-array Double/TYPE 10
> > > > > > > 10)
> > > > > > > 10)) )
> > > > > > > min; avg; max ms: 0.000 ; 0.033 ; 8.837 ( 56828 iterations)
> > > > > > > min; avg; max ms: 0.009 ; 0.038 ; 7.132 ( 50579 iterations)
>
> > > > > > > It seems like you should be able to just use aset-double with
> > > > > > > multiple
> > > > > > > indices (in place of aset-double2), but I can't seem to get the
> > > > > > > type
> > > > > > > hints right.
>
> > > > > > > -Jason
>
> > > > > > > On Sep 20, 7:36 am, Ranjit <[email protected]> wrote:
>
> > > > > > > > Thanks Jason, this is great.
>
> > > > > > > > I was confused earlier because I wasn't seeing reflection
> > > > > > > > warnings,
> > > > > > > > but it turns out that was only because I was evaluating the
> > > > > > > > function
> > > > > > > > definitions in the emacs buffer, and the warnings weren't
> > > > > > > > visible.
>
> > > > > > > > I have a question about gaussian-matrix3 though. What is "aset-
> > > > > > > > double2"? Is that a macro that has a type hint for an array of
> > > > > > > > doubles?
>
> > > > > > > > Thanks,
>
> > > > > > > > -Ranjit
> > > > > > > > On Sep 19, 5:37 pm, Jason Wolfe <[email protected]> wrote:
>
> > > > > > > > > Hi Ranjit,
>
> > > > > > > > > The big perf differences you're seeing are due to reflective
> > > > > > > > > calls.
> > > > > > > > > Getting the Java array bits properly type-hinted is
> > > > > > > > > especially tricky,
> > > > > > > > > since you don't always get good reflection warnings.
>
> > > > > > > > > Note that aset is only fast for reference types:
>
> > > > > > > > > user> (doc aset)
> > > > > > > > > -------------------------
> > > > > > > > > clojure.core/aset
> > > > > > > > > ([array idx val] [array idx idx2 & idxv])
> > > > > > > > > Sets the value at the index/indices. Works on Java arrays of
> > > > > > > > > reference types. Returns val.
>
> > > > > > > > > So, if you want to speed things up ... here's your starting
> > > > > > > > > point:
>
> > > > > > > > > user> (set! *warn-on-reflection* true)
> > > > > > > > > true
> > > > > > > > > user> (import java.util.Random)
> > > > > > > > > (def r (Random. ))
>
> > > > > > > > > (defn next-gaussian [] (.nextGaussian r))
>
> > > > > > > > > (defn gaussian-matrix1 [arr L]
> > > > > > > > > (doseq [x (range L) y (range L)] (aset arr x y
> > > > > > > > > (next-gaussian))))
>
> > > > > > > > > (defn gaussian-matrix2 [L]
> > > > > > > > > (into-array (map double-array (partition L (repeatedly
> > > > > > > > > (* L L)
> > > > > > > > > next-gaussian)))))
>
> > > > > > > > > Reflection warning, NO_SOURCE_FILE:1 - reference to field
> > > > > > > > > nextGaussian
> > > > > > > > > can't be resolved.
>
> > > > > > > > > user> (do (microbench (gaussian-matrix1 (make-array
> > > > > > > > > Double/TYPE 10
> > > > > > > > > 10) 10)) (microbench (gaussian-matrix2 10)) )
> > > > > > > > > min; avg; max ms: 2.944 ; 4.693 ; 34.643 ( 424
> > > > > > > > > iterations)
> > > > > > > > > min; avg; max ms: 0.346 ; 0.567 ; 11.006 ( 3491
> > > > > > > > > iterations)
>
> > > > > > > > > ;; Now, we can get rid of the reflection in next-guassian:
>
> > > > > > > > > user> (defn next-gaussian [] (.nextGaussian #^Random r))
> > > > > > > > > #'user/next-gaussian
> > > > > > > > > user> (do (microbench (gaussian-matrix1 (make-array
> > > > > > > > > Double/TYPE 10
> > > > > > > > > 10) 10)) (microbench (gaussian-matrix2 10)) )
> > > > > > > > > min; avg; max ms: 2.639 ; 4.194 ; 25.024 ( 475
> > > > > > > > > iterations)
> > > > > > > > > min; avg; max ms: 0.068 ; 0.130 ; 10.766 ( 15104
> > > > > > > > > iterations)
> > > > > > > > > nil
>
> > > > > > > > > ;; which has cut out the main bottleneck in gaussian-matrix2.
> > > > > > > > > ;; 1 is still slow because of its array handling.
> > > > > > > > > ;; here's a fixed version:
>
> > > > > > > > > user> (defn gaussian-matrix3 [^doubles arr ^int L]
> > > > > > > > > (doseq [x (range L) y (range L)] (aset-double2 arr (int
> > > > > > > > > x) (int
> > > > > > > > > y) (.nextGaussian ^Random r))))
> > > > > > > > > #'user/gaussian-matrix3
>
> > > > > > > > > user> (do (microbench (gaussian-matrix1 (make-array
> > > > > > > > > Double/TYPE 10
> > > > > > > > > 10) 10)) (microbench (gaussian-matrix2 10)) (microbench
> > > > > > > > > (gaussian-
> > > > > > > > > matrix3 (make-array Double/TYPE 10 10) 10)) )
> > > > > > > > > min; avg; max ms: 2.656 ; 4.164 ; 12.752 ( 479
> > > > > > > > > iterations)
> > > > > > > > > min; avg; max ms: 0.065 ; 0.128 ; 9.712 ( 15255
> > > > > > > > > iterations)
> > > > > > > > > min; avg; max ms: 0.000 ; 0.035 ; 10.180 ( 54618
> > > > > > > > > iterations)
> > > > > > > > > nil
>
> > > > > > > > > ;; which is 100x faster than where we started.
>
> > > > > > > > > A profiler is often a great way to figure out what's eating
> > > > > > > > > up time.
> > > > > > > > > Personally, I've never found the need to use a disassembler.
>
> > > > > > > > > Cheers, Jason
>
>
--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to [email protected]
Note that posts from new members are moderated - please be patient with your
first post.
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en