On my laptop (Mac) the biggest difference here has nothing to do with buffering
in slurp. It is whether you use System/in (fast) or *in* (slow). The latter is
a LineNumberingPushbackReader.
Can you check and confirm? When I slurp System/in it is more than twice as fast
as slurping *in*.
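
One way to check this without involving stdin at all is to slurp the same text through a plain Reader and through a LineNumberingPushbackReader, which is the class *in* is bound to. The following is only a sketch: the helper names and the sample text are made up for illustration, and timings from an in-memory StringReader will not match timings from a real System/in.

```clojure
(import '(java.io StringReader)
        '(clojure.lang LineNumberingPushbackReader))

;; Hypothetical sample input: 100,000 short lines held in memory.
(def sample (apply str (repeat 100000 "some words\n")))

(defn slurp-plain
  "Slurps s through a plain StringReader."
  [^String s]
  (slurp (StringReader. s)))

(defn slurp-line-numbering
  "Slurps s through a LineNumberingPushbackReader, the same class *in* uses."
  [^String s]
  (slurp (LineNumberingPushbackReader. (StringReader. s))))

;; Compare the two elapsed times on your own machine.
(time (count (slurp-plain sample)))
(time (count (slurp-line-numbering sample)))
```

If the second timing is consistently slower, the wrapper accounts for at least part of the gap. To reproduce the original comparison, pipe a file into the process and time (slurp System/in) against (slurp *in*) in separate runs.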
I believe the next-biggest perf issue is how StringBuilders grow. I suspect
that appending in 4096-char chunks lets them grow more efficiently than
appending one character at a time.
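
A rough way to test that suspicion is to count how often a StringBuilder's capacity changes when the same number of chars is appended one at a time versus in 4096-char chunks. This is a sketch under assumptions: count-resizes is a hypothetical helper, the growth policy it observes is specific to the JDK in use, and a resize count is only a proxy for time spent copying.

```clojure
(defn count-resizes
  "Appends n chars to a fresh StringBuilder in chunks of at most chunk-size
  and returns the final length plus how many times the capacity changed."
  [n chunk-size]
  (let [^StringBuilder sb (StringBuilder.)
        ^chars chunk (char-array (int chunk-size) \x)]
    (loop [remaining n
           resizes 0]
      (if (pos? remaining)
        (let [before (.capacity sb)
              len (min remaining chunk-size)]
          ;; append at most one chunk, then check whether the builder grew
          (.append sb chunk 0 (int len))
          (recur (- remaining len)
                 (if (= before (.capacity sb)) resizes (inc resizes))))
        {:length (.length sb) :resizes resizes}))))

;; Same total size, different chunk sizes.
(println (count-resizes 1000000 1))
(println (count-resizes 1000000 4096))
```

A small difference in resize counts would suggest that the gain from the 4096 buffer comes mostly from making fewer read and append calls rather than from the growth pattern itself.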
Stu
> Another example. I'm running this on an Ubuntu 10.04 laptop with this
> java:
>
> java version "1.6.0_18"
> OpenJDK Runtime Environment (IcedTea6 1.8) (6b18-1.8-0ubuntu1)
> OpenJDK 64-Bit Server VM (build 14.0-b16, mixed mode)
>
> and this command line:
> java -Xmx3G -server clojure.main cat2.clj
>
>
> (require '[clojure.java.io :as jio])
>
> (defn- normalize-slurp-opts
>   [opts]
>   (if (string? (first opts))
>     (do
>       (println "WARNING: (slurp f enc) is deprecated, use (slurp f :encoding enc).")
>       [:encoding (first opts)])
>     opts))
>
> (defn slurp2
>   "Reads the file named by f using the encoding enc into a string
>   and returns it."
>   {:added "1.0"}
>   ([f & opts]
>      (let [opts (normalize-slurp-opts opts)
>            data (StringBuffer.)
>            buffer (char-array 4096)]
>        (with-open [#^java.io.Reader r (apply jio/reader f opts)]
>          (loop [c (.read r buffer)]
>            (if (neg? c)
>              (str data)
>              (do
>                (.append data buffer 0 c)
>                (recur (.read r buffer)))))))))
>
> (time
>   (with-open [f (java.io.FileReader. "words")]
>     (println (count (slurp f)))))
>
> (time
>   (with-open [f (java.io.FileReader. "words")]
>     (println (count (slurp2 f)))))
>
> I get this output:
>
> $ java -Xmx3G -server clojure.main cat2.clj
> 279440100
> "Elapsed time: 17094.007487 msecs"
> 279440100
> "Elapsed time: 5233.097287 msecs"
>
> So at least in my environment there seems to be a big difference
> between slurp2 with an explicit buffer and the core/slurp one, which
> appears to be reading a character at a time from a BufferedReader
> stream.
>
> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to [email protected]
> Note that posts from new members are moderated - please be patient with your
> first post.
> To unsubscribe from this group, send email to
> [email protected]
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en