Well, I've dug around on this some more, and I'm unfortunately no
closer to finding an answer. I decided to try to whittle down the
source code to the minimal set which exhibits the problem, and post
the result here, so that at least there's a higher chance that if I'm
making a mistake, someone might be able to identify it and point it
out.
Without further ado, here it is:

(import '(org.apache.lucene.store FSDirectory)
        '(org.apache.lucene.index IndexReader)
        '(org.apache.lucene.search IndexSearcher))

(def *vendors* #{ "1211", "7784" })

(defn document-seq [index-path]
  (let [directory (. FSDirectory (getDirectory index-path))
        searcher (new IndexSearcher directory)
        reader (. searcher getIndexReader)
        numDocs (. reader numDocs)]
    (map (fn [i] (. reader document i)) (range 0 numDocs))))

(defn my-filter-pred [document]
  (let [item (. document get Constants/ITEM_ID)]
    (contains? *vendors* item)))

(defn splode [index-path]
  (with-local-vars [doc-count 0]
    (doseq [document (filter my-filter-pred (document-seq index-path))]
      (var-set doc-count (inc @doc-count)))
    'done))

I did the same heap analysis on this, and found exactly the same
results. That is to say that
clojure.core$filter__3364$fn__3367
is a stack local (which I think is what makes it a GC root), and it
has a reference to "coll", which is a variable used in the definition
of filter, which refers to the whole gigantic list of lazy-conses. You
can see that I'm operating on lucene indexes - I actually tried to
rewrite this all using something more fundamental (like a sequence of
random strings), but I could not come up with something that caused
the heap to blow out this way.
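
For anyone who wants to poke at this without a Lucene index, a
stand-in with the same filter/doseq shape as splode might look
something like the sketch below. The names random-string-seq, vendors
and count-matches are made up for illustration and are not part of my
real code; the random strings simply take the place of the Lucene
documents. As I said, nothing of this kind reproduced the problem for
me, so treat it as a starting point rather than a test case.

```clojure
;; Hypothetical stand-in for document-seq: a lazy sequence of n
;; random numeric strings instead of Lucene documents.
(defn random-string-seq [n]
  (map (fn [_] (str (rand-int 10000))) (range 0 n)))

;; Same ids as *vendors* in the original code.
(def vendors #{"1211" "7784"})

;; Same structure as splode: filter a lazy seq, walk it with doseq,
;; and count the matches in a local var. Returns the count.
(defn count-matches [n]
  (with-local-vars [doc-count 0]
    (doseq [s (filter (fn [x] (contains? vendors x))
                      (random-string-seq n))]
      (var-set doc-count (inc @doc-count)))
    @doc-count))
```

Calling (count-matches 1000000) walks a million generated strings
through the same pipeline and returns how many matched.
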
I would love to find that I'm making some mistake in how I've written
my sequences, but I'm running out of ideas.