I've got an even faster version using memory-mapped file I/O. It also
simplifies the code a little bit.
(defn fast-read-file [#^String filename #^String charset]
  (with-open [cin (. (new FileInputStream filename) getChannel)]
    (let [size (. (new File filename) length)
          ;; char-buffer is bound here but never used in the body below
          char-buffer (. ByteBuffer (allocate size))
          decoder (. (. Charset (forName charset)) (newDecoder))]
      (str
        (. decoder
           (decode
             (.map cin (. FileChannel$MapMode READ_ONLY) 0 size)))))))
On Windows I get these results:
nio=> (time (dotimes [i 1000] (fast-read-file ".emacs" "ISO-8859-1")))
"Elapsed time: 247.9453 msecs"
nio=> (time (dotimes [i 1000] (fast-read-file ".emacs" "ISO-8859-1")))
"Elapsed time: 247.666881 msecs"
nio=> (time (dotimes [i 1000] (fast-read-file ".emacs" "ISO-8859-1")))
"Elapsed time: 247.895424 msecs"
Compared with the previous version with buffer size equal to the
length of the file:
nio=> (time (dotimes [i 1000] (read-file ".emacs" "ISO-8859-1" (. (new File ".emacs") length))))
"Elapsed time: 264.470276 msecs"
nio=> (time (dotimes [i 1000] (read-file ".emacs" "ISO-8859-1" (. (new File ".emacs") length))))
"Elapsed time: 265.775947 msecs"
nio=> (time (dotimes [i 1000] (read-file ".emacs" "ISO-8859-1" (. (new File ".emacs") length))))
"Elapsed time: 263.828204 msecs"
The API documentation for the map method isn't very exhaustive;
there's not much information on its limits. A lot of details are
implementation-dependent and left unspecified, though most problems
seem to be related to writing.
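
One limit the Javadoc for FileChannel.map does state is that the
size of the mapped region must not exceed Integer.MAX_VALUE. Here is
a minimal sketch of a read-only mapping that checks for that case
up front. The name mapped-read-file and the choice to throw an
IllegalArgumentException are my own assumptions for illustration;
this is not the fast-read-file that was timed above.

```clojure
(import '(java.io FileInputStream)
        '(java.nio.channels FileChannel$MapMode)
        '(java.nio.charset Charset))

;; Hypothetical helper, for illustration only.
(defn mapped-read-file [^String filename ^String charset]
  ;; Closing the channel also closes the underlying stream.
  (with-open [ch (.getChannel (FileInputStream. filename))]
    (let [size (.size ch)]
      ;; FileChannel.map only accepts sizes up to Integer/MAX_VALUE,
      ;; so fail early with a readable message for larger files.
      (when (> size Integer/MAX_VALUE)
        (throw (IllegalArgumentException.
                 (str "File too large to map in one piece: " size " bytes"))))
      (let [buf (.map ch FileChannel$MapMode/READ_ONLY 0 size)]
        ;; Charset.decode returns a CharBuffer; str turns it into a String.
        (str (.decode (Charset/forName charset) buf))))))
```

A file bigger than that would have to be mapped in several regions
(or read with ordinary buffered I/O), which this sketch does not
attempt.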