Re: Good memory profilers (program or human)?

Christian Vest Hansen Fri, 31 Jul 2009 03:50:52 -0700

Eclipse MAT (Memory Analyzer Tool) is a program that groks the heap
dumps written by jmap (not jstat, which only prints statistics).
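
If attaching jmap to the process is awkward, a dump in the same HPROF
format can also be written from inside the JVM. This is only a minimal
sketch, assuming a HotSpot-based JDK, version 7 or later; the class name
HeapDumper and the file name snapshot.hprof are made up for illustration.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of any tool mentioned in this thread.
final class HeapDumper {

    /** Writes an HPROF dump of the live (reachable) objects to target. */
    static Path dumpLiveHeap(Path target) throws IOException {
        // HotSpot-specific bean; assumed to be available on this JVM.
        HotSpotDiagnosticMXBean diag =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // 'true' asks for live objects only, so unreachable garbage is left out.
        diag.dumpHeap(target.toString(), true);
        return target;
    }

    public static void main(String[] args) throws IOException {
        // Use a fresh directory and a .hprof name: some JDK versions
        // reject other extensions or refuse to overwrite an existing file.
        Path dir = Files.createTempDirectory("heapdump");
        Path dump = dumpLiveHeap(dir.resolve("snapshot.hprof"));
        System.out.println("wrote " + dump + " (" + Files.size(dump) + " bytes)");
    }
}
```

The resulting file can be opened in MAT in the same way as one taken
with "jmap -dump:format=b,file=snapshot.hprof <pid>".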


Is that the kind of tool that you're looking for?

On Fri, Jul 31, 2009 at 10:45 AM, Andy Fingerhut <[email protected]> wrote:
>
> I thought I'd follow up my own question with some programs that I
> should have already known about for memory profiling, which were
> already installed on my Mac as part of the standard Java installation
> from Apple (who are just passing them on from Sun, I'm sure), but I
> didn't know about them:
>
> jconsole -- Good for seeing how fast a java process is allocating
> memory, and garbage collecting.
>
> jmap -- Good for a quick summary of the above, or with the "-histo"
> option, a much more detailed list of what kinds of objects are taking
> up the most memory.
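
The heap and garbage-collection figures that jconsole graphs can also be
read from inside the process through the standard java.lang.management
beans. A rough sketch, assuming nothing beyond the JDK itself; the class
name HeapWatch and its method names are invented for this example.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper that samples the same beans jconsole reads over JMX.
final class HeapWatch {

    /** Bytes of heap currently in use, as reported by the memory MXBean. */
    static long usedHeapBytes() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getUsed();
    }

    /** Collections so far, summed over every collector that reports a count. */
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 means "undefined"
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) throws InterruptedException {
        // Print a few samples; a fast-allocating program shows the used
        // figure climbing and the collection count ticking up.
        for (int i = 0; i < 3; i++) {
            System.out.printf("heap used: %d KB, collections: %d%n",
                    usedHeapBytes() / 1024, totalGcCount());
            Thread.sleep(500);
        }
    }
}
```
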
>
> I also learned that Clojure's cons and lazy cons structures take up 48
> bytes per element (at least on a Mac with 'java -client' and java
> 1.6.0_<foo>), which becomes significant when your program holds
> sequences of several million elements.
>
> I've updated my github repo with a pretty decent version of the
> reverse-complement benchmark in Clojure.  It isn't as sequence-y as it
> could be, but the more sequence-y version generates and collects
> garbage so fast that it really slows things down significantly.  Same
> lesson from other flavors of Lisp, I guess -- you can write the
> straightforward easy-to-write-and-test-and-understand code that conses
> a lot (i.e. allocates memory quickly that typically becomes garbage
> quite soon), or you can write the more loopy code that doesn't, but
> typically starts to merge many things that you'd otherwise prefer to
> separate into different functions.  Just compare revcomp.clj-5.clj and
> revcomp.clj-6.clj in my git repo for an example.
>
> The nice thing is that when you don't need the "uglier" code, Clojure
> and other Lisps usually let you write code much more concisely than
> lower level languages.  Get it working first, then optimize it.  Since
> I'm comparing run times of the Clojure programs versus those submitted
> to the language shootout benchmark web site, some of which appear
> quite contorted in order to gain performance, I wanted to do some
> optimizations that you wouldn't necessarily want to do otherwise.
>
> git://github.com/jafingerhut/clojure-benchmarks.git
>
> You can see my latest run time results at the link below.  I've got 4
> benchmarks written in Clojure so far, with my current versions using
> 6x, 8x, 12x, and 15x more CPU time than the Java programs submitted to
> the language shootout benchmark web site.
>
> http://github.com/jafingerhut/clojure-benchmarks/blob/20d21bc169d52ca52d6a8281536838662c54e854/RESULTS
>
> I could make some of these significantly closer in speed to the Java
> versions, but I suspect that they will start looking more and more
> like the Java versions if I do, except with Clojure syntax for Java
> calls.  I'm happy to be proved wrong on that, if someone finds better
> Clojure versions than I've got.
>
> Thanks,
> Andy
>
>
> On Jul 30, 11:00 am, Andy Fingerhut <[email protected]>
> wrote:
>> I'm gradually adding a few more Clojure benchmark programs to my
>> repository here:
>>
>> git://github.com/jafingerhut/clojure-benchmarks.git
>>
>> The one I wrote for the "reverse-complement" benchmark is here:
>>
>> http://github.com/jafingerhut/clojure-benchmarks/blob/4ab4f41c6f96344...
>>
>> revcomp.clj-4.clj is the best I've got so far, but it runs out of
>> memory on the full size benchmark.
>>
>> If you clone the repository, and successfully run the init.sh script
>> to generate the big input and expected output files, the file rcomp/
>> long-input.txt contains 3 DNA sequences in FASTA format. The first is
>> 50,000,000 characters long, the second is 75,000,000 characters long,
>> and the third is 125,000,000 characters long. Each needs to be
>> reversed, have each character replaced with a different one, and
>> printed out, so we need to store each of the strings one at a time,
>> but it is acceptable to deallocate/garbage-collect the previous one
>> when starting on the next. I think my code should be doing that, but I
>> don't know how to verify that.
>>
>> I've read that a Java String takes 2 bytes per character, plus about
>> 38 bytes of overhead per string. That is about 250 Mbytes for the
>> longest one. I also read in a seq of lines, and these long strings are
>> split into lines with 60 characters (plus a newline) each. Thus the
>> string's data needs to be stored at least twice temporarily -- once
>> for the many 60-character strings, plus the final long one.  Also, the
>> Java StringBuilder that Clojure's (str ...) function uses probably
>> needs to be copied and reallocated periodically as it outgrows its
>> current allocation. So I could imagine needing about 3 * 250 Mbytes
>> temporarily, but that doesn't explain why my 1536 Mbytes of JVM memory
>> are being exhausted.
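
The estimate above can be written out as arithmetic. This is only a
sketch: the class and method names are invented, the 38-byte overhead is
the figure quoted above, the 48 bytes per seq element is the number
reported elsewhere in this thread and is assumed here to apply to the
seq of lines, and newlines are assumed not to be stored. StringBuilder
growth copies are left out.

```java
// Hypothetical back-of-the-envelope calculator; all constants are the
// approximate figures quoted in this thread, not measured values.
final class RevcompMemoryEstimate {
    static final long BYTES_PER_CHAR = 2;    // a Java char is 16 bits
    static final long STRING_OVERHEAD = 38;  // per-String overhead, as quoted
    static final long SEQ_CELL_BYTES = 48;   // per seq element, assumed

    /** Bytes for one String holding the given number of chars. */
    static long stringBytes(long chars) {
        return STRING_OVERHEAD + BYTES_PER_CHAR * chars;
    }

    /** Number of lines when totalChars is split into lines of lineLength. */
    static long lineCount(long totalChars, long lineLength) {
        return (totalChars + lineLength - 1) / lineLength; // round up
    }

    /** Bytes for a seq of per-line Strings covering totalChars. */
    static long lineSeqBytes(long totalChars, long lineLength) {
        long lines = lineCount(totalChars, lineLength);
        return BYTES_PER_CHAR * totalChars
                + lines * (STRING_OVERHEAD + SEQ_CELL_BYTES);
    }

    public static void main(String[] args) {
        long chars = 125_000_000L; // the longest of the three sequences
        long asLines = lineSeqBytes(chars, 60);
        long asOneString = stringBytes(chars);
        System.out.println("lines:            " + lineCount(chars, 60));
        System.out.println("seq of lines:     " + asLines + " bytes");
        System.out.println("one long string:  " + asOneString + " bytes");
        System.out.println("both at once:     " + (asLines + asOneString) + " bytes");
    }
}
```

Under these assumptions the line seq plus the final string come to about
679 million bytes, so they do not by themselves account for 1536 Mbytes.
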
>>
>> It would be possible to improve things by not creating all of the
>> separate strings, one for each line, and then concatenating them
>> together. But first I'd like to explain why it is using so much,
>> because I must be missing something.
>>
>> Thanks,
>> Andy
>



-- 
Venlig hilsen / Kind regards,
Christian Vest Hansen.

