Er..
This version is better. It uses hasNext instead of catching the exception:
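(defn lazy-read-records [file regex]
  ;; file should be a java.io.File; Scanner treats a String argument as the
  ;; text to scan, not as a file name.
  (let [scanner (java.util.Scanner. file)
        get-next (fn get-next []
                   (if (not (.hasNext scanner))
                     ()   ; no more tokens: end the sequence
                     (cons (.next scanner)
                           (lazy-seq (get-next)))))]
    (.useDelimiter scanner regex)
    ;; wrap the first call too, so nothing is read until the seq is consumed
    (lazy-seq (get-next))))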
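
A quick usage sketch, in case it helps. The temp file, its contents, and the "," delimiter are made up purely for illustration, and the definition is repeated (with the outer call wrapped in lazy-seq, as in the earlier version quoted below) so the snippet runs on its own:

```clojure
;; Illustrative only: file name, contents, and delimiter are assumptions.
(defn lazy-read-records [file regex]
  (let [scanner (java.util.Scanner. file)
        get-next (fn get-next []
                   (if (not (.hasNext scanner))
                     ()
                     (cons (.next scanner)
                           (lazy-seq (get-next)))))]
    (.useDelimiter scanner regex)
    (lazy-seq (get-next))))

;; Write three comma-separated records to a temporary file.
(def tmp (java.io.File/createTempFile "records" ".txt"))
(spit tmp "alpha,beta,gamma")

;; Pass a java.io.File (not a path string) plus the delimiter regex.
(def records (lazy-read-records tmp #","))

(first records)   ; => "alpha"
(doall records)   ; => ("alpha" "beta" "gamma")
```

Note that def-ing the sequence like this holds onto its head, which is fine for a tiny demo. For a really big file you would consume it directly, e.g. inside a doseq, so the front part can be garbage collected.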
On Aug 17, 2010, at 2:29 PM, Jeff Palmucci wrote:
> I'm assuming your problem is with memory, and not multithreaded reading.
> Given that:
>
> I also work with files much too big to fit into memory.
>
> You could just use java.util.Scanner. That has a useDelimiter method, so you
> can set the pattern to break on:
>
> (defn lazy-read-records [file regex]
>   (let [scanner (java.util.Scanner. file)
>         get-next (fn get-next []
>                    (try
>                      (cons (.next scanner)
>                            (lazy-seq (get-next)))
>                      (catch java.util.NoSuchElementException e ())))]
>     (.useDelimiter scanner regex)
>     (lazy-seq (get-next))))
>
> The trick here is that the sequence is lazy. It won't read the file until it
> needs to in order to return the next element.
>
> If you don't hold onto the head of the sequence, the front part can be
> garbage collected while you are working further down.
>
> PS If, for some reason, you want the character indices rather than the actual
> records, replace (.next scanner) with:
>
> (do (.next scanner)
>     (.start (.match scanner)))
>
> On Aug 16, 2010, at 5:22 PM, cej38 wrote:
>
>> Hello,
>> I work with text files that are, at times, too large to read in all
>> at one time. In searching for a way to read in only part of the file
>> I came across
>> http://meshy.org/2009/12/13/widefinder-2-with-clojure.html
>>
>> I am only interested in the chunk-file and read-lines-range functions.
>>
>> My problem is that I would like to change chunk-file, so that instead
>> of looking for the next line break, it would look for some regular
>> expression (to be given as part of the function call), and would then
>> report the position of the first character of every instance of that
>> regular expression.
>>
>> After working on this for a couple of days I am raising the white
>> flag. Is there someone that can help me with this?
>>
>> Thanks.
>>
>>
>>
>> --
>> You received this message because you are subscribed to the Google
>> Groups "Clojure" group.
>> To post to this group, send email to [email protected]
>> Note that posts from new members are moderated - please be patient with your
>> first post.
>> To unsubscribe from this group, send email to
>> [email protected]
>> For more options, visit this group at
>> http://groups.google.com/group/clojure?hl=en
>