Eclipse MAT (the Memory Analyzer Tool) is a program that groks heap dumps from jmap (not jstat, which only prints statistics).
Is that the kind of tool that you're looking for?

On Fri, Jul 31, 2009 at 10:45 AM, Andy Fingerhut wrote:
>
> I thought I'd follow up my own question with some programs that I
> should have already known about for memory profiling, which were
> already installed on my Mac as part of the standard Java installation
> from Apple (who are just passing them on from Sun, I'm sure), but I
> didn't know about them:
>
> jconsole -- Good for seeing how fast a java process is allocating
> memory, and garbage collecting.
>
> jmap -- Good for a quick summary of the above, or with the "-histo"
> option, a much more detailed list of what kinds of objects are taking
> up the most memory.
>
> I also learned that Clojure's cons and lazy cons structures take up 48
> bytes per element (at least on a Mac with 'java -client' and java
> 1.6.0_<foo>), which gets significant when your program has sequences
> of several millions of elements about.
>
> I've updated my github repo with a pretty decent version of the
> reverse-complement benchmark in Clojure. It isn't as sequence-y as it
> could be, but the more sequence-y version generates and collects
> garbage so fast that it really slows things down significantly. Same
> lesson from other flavors of Lisp, I guess -- you can write the
> straightforward easy-to-write-and-test-and-understand code that conses
> a lot (i.e. allocates memory quickly that typically becomes garbage
> quite soon), or you can write the more loopy code that doesn't, but
> typically starts to merge many things that you'd otherwise prefer to
> separate into different functions. Just compare revcomp.clj-5.clj and
> revcomp.clj-6.clj in my git repo for an example.
>
> The nice thing is that when you don't need the "uglier" code, Clojure
> and other Lisps usually let you write code much more concisely than
> lower level languages. Get it working first, then optimize it.
> Since I'm comparing run times of the Clojure programs versus those
> submitted to the language shootout benchmark web site, some of which
> appear quite contorted in order to gain performance, I wanted to do
> some optimizations that you wouldn't necessarily want to do otherwise.
>
> git://github.com/jafingerhut/clojure-benchmarks.git
>
> You can see my latest run time results here. I've got 4 benchmarks
> written in Clojure so far, with my current versions being 6x, 8x, 12x,
> and 15x more CPU time than the Java programs submitted to the language
> shootout benchmark web site.
>
> http://github.com/jafingerhut/clojure-benchmarks/blob/20d21bc169d52ca52d6a8281536838662c54e854/RESULTS
>
> I could make some of these significantly closer in speed to the Java
> versions, but I suspect that they will start looking more and more
> like the Java versions if I do, except with Clojure syntax for Java
> calls. I'm happy to be proved wrong on that, if someone finds better
> Clojure versions than I've got.
>
> Thanks,
> Andy
>
>
> On Jul 30, 11:00 am, Andy Fingerhut wrote:
>> I'm gradually adding a few more Clojure benchmark programs to my
>> repository here:
>>
>> git://github.com/jafingerhut/clojure-benchmarks.git
>>
>> The one I wrote for the "reverse-complement" benchmark is here:
>>
>> http://github.com/jafingerhut/clojure-benchmarks/blob/4ab4f41c6f96344...
>>
>> revcomp.clj-4.clj is the best I've got so far, but it runs out of
>> memory on the full size benchmark.
>>
>> If you clone the repository, and successfully run the init.sh script
>> to generate the big input and expected output files, the file
>> rcomp/long-input.txt contains 3 DNA sequences in FASTA format. The
>> first is 50,000,000 characters long, the second is 75,000,000
>> characters long, and the third is 125,000,000 characters long.
>> Each needs to be reversed, have each character replaced with a
>> different one, and printed out, so we need to store each of the
>> strings one at a time, but it is acceptable to
>> deallocate/garbage-collect the previous one when starting on the
>> next. I think my code should be doing that, but I don't know how to
>> verify that.
>>
>> I've read that a Java String takes 2 bytes per character, plus about
>> 38 bytes of overhead per string. That is about 250 Mbytes for the
>> longest one. I also read in a seq of lines, and these long strings are
>> split into lines with 60 characters (plus a newline) each. Thus the
>> string's data needs to be stored at least twice temporarily -- once
>> for the many 60-character strings, plus the final long one. Also, the
>> Java StringBuilder that Clojure's (str ...) function uses probably
>> needs to be copied and reallocated periodically as it outgrows its
>> current allocation. So I could imagine needing about 3 * 250 Mbytes
>> temporarily, but that doesn't explain why my 1536 Mbytes of JVM memory
>> are being exhausted.
>>
>> It would be possible to improve things by not creating all of the
>> separate strings, one for each line, and then concatenating them
>> together. But first I'd like to explain why it is using so much,
>> because I must be missing something.
>>
>> Thank,
>> Andy
>

--
Venlig hilsen / Kind regards,
Christian Vest Hansen.
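
P.S. A rough back-of-the-envelope sketch of the arithmetic in your first message, in case it helps narrow down where the memory is going. It uses only the figures you quoted (2 bytes per character, about 38 bytes of overhead per String, 60-character lines, 48 bytes per seq element). The class and method names are made up for illustration, and the last case (one seq element per character) is purely a guess on my part, not something I've checked against your code.

```java
// Back-of-the-envelope memory estimate for the longest FASTA sequence.
// All constants are the approximate figures quoted in the thread, not measurements.
class RevCompMemoryEstimate {
    static final long BYTES_PER_CHAR = 2;          // Java String: 2 bytes per character
    static final long STRING_OVERHEAD_BYTES = 38;  // approximate per-String overhead
    static final long BYTES_PER_SEQ_ELEMENT = 48;  // cons / lazy cons cell, as reported

    // Size of a single String holding `chars` characters.
    static long stringBytes(long chars) {
        return STRING_OVERHEAD_BYTES + BYTES_PER_CHAR * chars;
    }

    // Total size of the per-line Strings when `totalChars` characters are
    // split into lines of `lineLength` characters (last line may be shorter).
    static long lineStringsBytes(long totalChars, long lineLength) {
        long lines = (totalChars + lineLength - 1) / lineLength; // ceiling division
        return lines * STRING_OVERHEAD_BYTES + BYTES_PER_CHAR * totalChars;
    }

    // Size of the seq cells alone for a sequence of `elements` items.
    static long seqBytes(long elements) {
        return elements * BYTES_PER_SEQ_ELEMENT;
    }

    static long toMbytes(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        long chars = 125_000_000L;          // longest of the three sequences
        long lines = (chars + 59) / 60;     // number of 60-character lines

        System.out.println("one long String:           " + toMbytes(stringBytes(chars)) + " Mbytes");
        System.out.println("per-line Strings:          " + toMbytes(lineStringsBytes(chars, 60)) + " Mbytes");
        System.out.println("seq cells, one per line:   " + toMbytes(seqBytes(lines)) + " Mbytes");
        // Hypothetical case: something holds a seq with one element per character.
        System.out.println("seq cells, one per char:   " + toMbytes(seqBytes(chars)) + " Mbytes");
    }
}
```

The first three numbers together stay well under your 1536 Mbytes, which matches your own estimate. A seq with one element per character would not fit by itself, so if the "-histo" output from jmap shows millions of cons or lazy cons objects, that is where I would look first.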
