I'm not sure why you see 1.5 GB before the restart but 4 GB after. But a 26 MB tii file expanding to 200 MB of RAM is unfortunately expected: in 3.x, Lucene's in-RAM representation of the terms index is very inefficient. Each indexed term costs three separate object instances (TermInfo, Term, String), and each of those objects carries its own fields and per-object overhead.
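As a rough back-of-the-envelope check, the per-object cost can be read off the jmap histogram quoted below by dividing each row's bytes by its instance count. The class and method names here are made up for illustration, and the per-term total assumes each indexed term owns exactly one TermInfo, one Term, one String and one backing char[], which is a simplification:

```java
public class TermsIndexHeapEstimate {

    /** Average heap bytes per instance for one jmap histogram row. */
    public static double bytesPerInstance(long instances, long bytes) {
        return (double) bytes / instances;
    }

    /**
     * Rough heap cost of one indexed term. ASSUMPTION (not from Lucene's code):
     * each term owns exactly one TermInfo, one Term, one String and one char[].
     */
    public static double estimatePerTermBytes() {
        // Figures copied from the jmap -histo:live rows quoted in this thread.
        double termInfo = bytesPerInstance(6104929L, 244197160L);   // org.apache.lucene.index.TermInfo
        double term = bytesPerInstance(6104921L, 195357472L);       // org.apache.lucene.index.Term
        double string = bytesPerInstance(18332491L, 733299640L);    // java.lang.String
        // [C row: average over every char[] on the heap, used here as a
        // stand-in for the array backing each term's String.
        double chars = bytesPerInstance(18334714L, 1422732624L);
        return termInfo + term + string + chars;
    }

    public static void main(String[] args) {
        System.out.printf("TermInfo: %.1f bytes each%n", bytesPerInstance(6104929L, 244197160L));
        System.out.printf("Term:     %.1f bytes each%n", bytesPerInstance(6104921L, 195357472L));
        System.out.printf("String:   %.1f bytes each%n", bytesPerInstance(18332491L, 733299640L));
        System.out.printf("char[]:   %.1f bytes each (average)%n", bytesPerInstance(18334714L, 1422732624L));
        System.out.printf("Estimated heap per indexed term: %.1f bytes%n", estimatePerTermBytes());
    }
}
```

Under those assumptions the estimate comes to a little under 200 bytes of heap per indexed term, most of it object overhead rather than term text.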
This has been improved substantially in trunk with flexible indexing.

You can increase the terms index divisor when you open your IndexReader. For example, passing 2 (instead of the default 1) loads only every other indexed term, halving the required RAM (at the cost of more time to seek to a given term). I'm not sure how Solr exposes this configuration, though.

Mike

On Wed, Aug 18, 2010 at 1:54 PM, Rebecca Watson <bec.wat...@gmail.com> wrote:
> hi,
>
> I am running solr 1.4.1 and java 1.6 with 6GB heap and the following
> GC settings:
> gc_args="-XX:+UseConcMarkSweepGC
> -XX:+CMSClassUnloadingEnabled -XX:NewSize=2g -XX:MaxNewSize=2g
> -XX:CMSInitiatingOccupancyFraction=60"
>
> So 6GB total heap and 2GB allocated to eden space.
>
> I have caching, autocommit and auto-warming commented out of
> solrconfig.xml
>
> After I index 500k docs and call commit/optimize (via URL after indexing
> has completed) my RAM usage is only about 1.5GB, but then if I stop
> and restart my Solr server over the same data the RAM immediately
> jumps to about 4GB and I can't understand why there is a difference
> here? As this is close to the old gen limit -- i quickly find that Solr
> becomes unresponsive.
>
> The following shows that tii files are being loaded from 26MB
> files to consume over 200MB in RAM when I restart the server.
>
> is this expected?
>
> thanks for any help/advice in advance,
>
> bec :)
>
> -----------------
>
> Rebecca-Watsons-iMac:work iwatson$ jmap -histo:live 8992 | head -30
>
>  num     #instances         #bytes  class name
> ----------------------------------------------
>    1:      18334714     1422732624  [C
>    2:      18332491      733299640  java.lang.String
>    3:       6104929      244197160  org.apache.lucene.index.TermInfo
>    4:       6104929      244197160  org.apache.lucene.index.TermInfo
>    5:       6104929      244197160  org.apache.lucene.index.TermInfo
>    6:       6104921      195357472  org.apache.lucene.index.Term
>    7:       6104921      195357472  org.apache.lucene.index.Term
>    8:       6104921      195357472  org.apache.lucene.index.Term
>    9:           224      146527408  [J
>   10:            10       48839592  [Lorg.apache.lucene.index.TermInfo;
>   11:            10       48839592  [Lorg.apache.lucene.index.Term;
>   12:            10       48839592  [Lorg.apache.lucene.index.TermInfo;
>   13:            10       48839592  [Lorg.apache.lucene.index.TermInfo;
>   14:            10       48839592  [Lorg.apache.lucene.index.Term;
>   15:            10       48839592  [Lorg.apache.lucene.index.Term;
>   16:         41630        6264728  <constMethodKlass>
>   17:         41630        5005104  <methodKlass>
>   18:          4049        4596352  <constantPoolKlass>
>   19:          4049        3049984  <instanceKlassKlass>
>   20:          3129        2580040  <constantPoolCacheKlass>
>   21:         49713        2418496  <symbolKlass>
>   22:          4983        1067192  [B
>   23:          4381         806104  java.lang.Class
>   24:          5979         533064  [[I
>   25:          6124         438080  [S
>   26:          7951         381648  java.util.HashMap$Entry
>   27:          2071         375744  [Ljava.util.HashMap$Entry;
> Rebecca-Watsons-iMac:work iwatson$ ls
> ./mach-lcf/data/data-serv-lcf/artdoc1/index/*.tii
> -rw-r--r-- 1 iwatson staff 26M 18 Aug 23:44
> ./mach-lcf/data/data-serv-lcf/artdoc1/index/_36.tii
> -rw-r--r-- 1 iwatson staff 26M 19 Aug 00:06
> ./mach-lcf/data/data-serv-lcf/artdoc1/index/_69.tii
> -rw-r--r-- 1 iwatson staff 25M 19 Aug 00:26
> ./mach-lcf/data/data-serv-lcf/artdoc1/index/_9d.tii
> -rw-r--r-- 1 iwatson staff 24M 19 Aug 00:50
> ./mach-lcf/data/data-serv-lcf/artdoc1/index/_ch.tii
> -rw-r--r-- 1 iwatson staff 25M 19 Aug 01:11
> ./mach-lcf/data/data-serv-lcf/artdoc1/index/_fj.tii
> -rw-r--r-- 1 iwatson staff 3.1M 19 Aug 01:12
> ./mach-lcf/data/data-serv-lcf/artdoc1/index/_fq.tii
> -rw-r--r-- 1 iwatson staff 3.1M 19 Aug 01:12
> ./mach-lcf/data/data-serv-lcf/artdoc1/index/_g1.tii
> -rw-r--r-- 1 iwatson staff 167B 19 Aug 01:10
> ./mach-lcf/data/data-serv-lcf/artdoc1/index/_gb.tii
> -rw-r--r-- 1 iwatson staff 3.1M 19 Aug 01:11
> ./mach-lcf/data/data-serv-lcf/artdoc1/index/_gc.tii
> -rw-r--r-- 1 iwatson staff 223K 19 Aug 01:23
> ./mach-lcf/data/data-serv-lcf/artdoc1/index/_gd.tii
>
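
To make the terms index divisor from the reply above concrete, here is a toy model of the idea. It is not Lucene's implementation or API: the class, the method names and the synthetic term list are all invented for this sketch. It only shows that keeping every Nth indexed term shrinks the in-RAM index by a factor of N, and that a seek then lands on the nearest kept entry at or before the target and has to scan forward from there:

```java
import java.util.Arrays;

public class TermsIndexDivisorSketch {

    /** Keep every divisor-th indexed term (positions 0, divisor, 2*divisor, ...). */
    public static String[] subsample(String[] indexedTerms, int divisor) {
        if (divisor < 1) {
            throw new IllegalArgumentException("divisor must be >= 1");
        }
        int kept = (indexedTerms.length + divisor - 1) / divisor; // ceiling division
        String[] inRam = new String[kept];
        for (int i = 0; i < kept; i++) {
            inRam[i] = indexedTerms[i * divisor];
        }
        return inRam;
    }

    /**
     * Position of the last in-RAM entry that is <= target, or -1 if the target
     * sorts before every entry. A real seek would scan forward from here.
     */
    public static int floorIndex(String[] inRam, String target) {
        int pos = Arrays.binarySearch(inRam, target);
        // On a miss, binarySearch returns -(insertionPoint) - 1.
        return pos >= 0 ? pos : -pos - 2;
    }

    public static void main(String[] args) {
        // Synthetic, already-sorted stand-in for the indexed terms of one segment.
        String[] indexedTerms = new String[1000];
        for (int i = 0; i < indexedTerms.length; i++) {
            indexedTerms[i] = String.format("term%04d", i);
        }

        for (int divisor : new int[] {1, 2, 4}) {
            String[] inRam = subsample(indexedTerms, divisor);
            int start = floorIndex(inRam, "term0005");
            System.out.println("divisor=" + divisor
                    + " entries in RAM=" + inRam.length
                    + " seek(term0005) starts scanning at " + inRam[start]);
        }
    }
}
```

With a larger divisor the starting entry is further from the target, which is the extra seek time traded for the RAM saving.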