Hi Luke, which database are you using? I had the same issue with MySQL recently. At the end I got it working with clojure.java.jdbc. I don't have the code at hand, but according to the MySQL docs you have to set the fetch size to Integer.MIN_VALUE.
https://dev.mysql.com/doc/connector-j/5.1/en/connector-j-reference-implementation-notes.html r0man On Saturday, June 17, 2017 at 3:15:13 AM UTC+2, Luke Burton wrote: > > > Riddle me this: > > https://gist.github.com/hagmonk/a75621b143501966c22f53ed1e2bc36e > > Wherein I synthesize a large table in Postgres, then attempt to lazily > load the table, discarding each row as I receive it. I tried *many* > permutations and experiments, but settled on these two tests to illustrate > my point. Which is that I simply can't get it to work with > clojure.java.jdbc. > > test1, according to all my research and reading of the source code > involved, should consume the query results lazily. It does not, and I can't > for the life of me figure out why. Traffic starts to stream in, and the > heap is overwhelmed almost immediately. I've deliberately set the heap to 1 > GB. > > test2 uses a technique I borrowed wholesale from Ghadi Shayban in JDBC-99 > <https://dev.clojure.org/jira/browse/JDBC-99?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#issue-tabs>, > > which is to have ResultSet implement IReduceInit. It consumes a nominal > amount of memory. I've verified it's actually doing something by putting > counters in, and using YourKit to watch about 20 MB/s of traffic streaming > into the JVM. It's brilliant, it doesn't even break 200 MB total heap usage. > > I used YourKit to track where the memory is being retained for test1. > Initially I made the mistake of not setting the fetchSize, so I saw an > ArrayList inside the driver holding the reference. The driver > documentation <https://jdbc.postgresql.org/documentation/head/query.html> > confirms > that autoCommit must be disabled and the fetchSize set to some non-zero > number. > > After making that change, YourKit confirmed that the GC root holding all > the memory was the stack local variable "rs". At least I think it did, as a > non-expert in this domain. I tried disassembling the functions using > no.disassemble and the IntelliJ decompiler but I'm not really at the point > where I understand what to look for. > > So my questions are: > > 1) what am I doing wrong with clojure.java.jdbc? > > Note some things I've already tried: > > * using row-fn instead of result-set-fn > * using prepared statements > * explicitly setting auto-commit false on the connection > * declaring my result-set-fn with (^{:once true} *fn […]) (I did not see a > change in the disassembly when using this) > * probably other things I am forgetting > > 2) in these situations where you suspect that the head of a lazy sequence > is being retained, how do you reason about it? I'm kind of lucky this one > blew the heap so quickly, who knows how much of my production code might > burning memory unnecessarily but not quite as fatally. Do you disassemble > the functions and observe some smoking gun? How do you peek under the > covers to see where the problem is? > > Luke. > -- You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to [email protected] Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to [email protected] For more options, visit this group at http://groups.google.com/group/clojure?hl=en --- You received this message because you are subscribed to the Google Groups "Clojure" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. For more options, visit https://groups.google.com/d/optout.
