Hi,

Thanks for your reply.
I will work on your suggestion to use only one Solr instance.

I tried to merge the 15 indexes again and found that the new merged index (without optimization) was about 351 GB, but after I optimized it the size went back up to 411 GB. Why? I thought that optimization would decrease the index size, or at least leave it the same as before optimization.


Funtick wrote:
>
> Hi,
>
> Can you try to use a single SOLR instance with heavy RAM (so that
> ramBufferSizeMB=8192, for instance) and mergeFactor=10? A single SOLR
> instance is fast enough (> 100 client threads of Tomcat; configurable) - I
> usually prefer a single instance for a single "writable" box with heavy RAM
> allocation and good I/O.
>
> Merging 15 indexes and a 4-times larger size could happen, for instance,
> because of differences in SOLR Schema and Lucene; ensure that the schema is
> the same (using Luke, for instance). SOLR 1.4 has some new powerful features
> such as document->term cache stored somewhere (uninverted index) (Yonik),
> term vectors, stored=true, copyField, etc.
>
> Do not do commit per 100; do it once at the end...
>
>
>
> -----Original Message-----
> From: engy.ali [mailto:omeshm...@hotmail.com]
> Sent: August-25-09 3:31 PM
> To: solr-user@lucene.apache.org
> Subject: Solr index - Size and indexing speed
>
>
> Summary
> ===============
>
> I had about 120,000 objects with a total size of 71.2 GB; those objects
> are already indexed using Lucene. The index size is about 111 GB.
>
> I tried to use a Solr 1.4 nightly build to index the same collection. I
> divided the collection across three servers; each server had 5 Solr
> instances (not Solr cores) up and running.
>
> After the collection had been indexed, I merged the 15 indexes.
>
> Problems
> ==============
>
> 1. The new merged index size is about 411 GB (i.e., 4 times larger than the
> old index built using Lucene).
>
> I tried to index only one object using Lucene and the same object using
> Solr to verify the size, and the result was that the new index is about
> twice the size of the old index.
>
> Do you have any idea what might be the reason?
>
>
> 2. The indexing speed is slow. 100 objects on a single Solr instance were
> indexed in 1 hour, so I estimated that 1000 on a single instance could be
> done in 10 hours, but that was not the case; the indexing time exceeded the
> estimated time by about 12 hours.
>
> Might that be related to the growth of the index? If not, what might be
> the reason?
>
> Note: I do a commit per 100 objects and an optimize at the end of the whole
> operation. I also changed the mergeFactor from 10 to 15.
>
>
> 3. I googled and found out that Solr uses an inverted index, but I want
> to know what the internal structure of the Solr index is. For example, if I
> have a word and its stems, how will they be stored in the index?
>
> Thanks,
> Engy
> --
> View this message in context:
> http://www.nabble.com/Solr-index---Size-and-indexing-speed-tp25140702p25140702.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>
>
>

--
View this message in context: http://www.nabble.com/Solr-index---Size-and-indexing-speed-tp25140702p25201981.html
Sent from the Solr - User mailing list archive at Nabble.com.
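
For reference, the quoted advice "Do not do commit per 100; do it once at the end" could be sketched roughly as below. This is only an illustration under assumptions, not the actual indexing client used in this thread: the function names `add_message`, `update_messages` and `post`, the field names, the batch size and the update URL are all hypothetical. It only relies on Solr's XML update messages (`<add>` and `<commit/>`). Documents are still sent in batches of 100, but a single commit is issued after the last batch.

```python
import urllib.request
from xml.sax.saxutils import escape, quoteattr


def add_message(docs):
    """Build one Solr XML <add> message for a batch of documents (dicts)."""
    parts = ["<add>"]
    for doc in docs:
        parts.append("<doc>")
        for name, value in doc.items():
            parts.append(
                "<field name=%s>%s</field>" % (quoteattr(name), escape(str(value)))
            )
        parts.append("</doc>")
    parts.append("</add>")
    return "".join(parts)


def update_messages(docs, batch_size=100):
    """Yield <add> messages in batches, then exactly one <commit/> at the end."""
    batch = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == batch_size:
            yield add_message(batch)
            batch = []
    if batch:
        # Send the final, partially filled batch.
        yield add_message(batch)
    # Single commit after everything has been added (no commit per batch).
    yield "<commit/>"


def post(xml, url="http://localhost:8983/solr/update"):
    """POST one update message. The URL is an assumption; adjust to your setup."""
    request = urllib.request.Request(
        url,
        data=xml.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    return urllib.request.urlopen(request).read()


# Usage (assumes a running Solr instance at the URL above):
# for message in update_messages(my_docs, batch_size=100):
#     post(message)
```

An `<optimize/>` message could be posted the same way after the final commit, matching the "optimize at the end of the whole operation" step described in the original message.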