Thanks Simon and Jay, that was helpful. So what we are looking at during optimize is 2 or 3 times the index size in free disk space to recreate the index.

Regards,
Sujatha
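For concreteness, a minimal sketch (not from the thread) of checking that rule of thumb before kicking off an optimize. The index path is hypothetical, and the 3x factor is the upper end of the figure discussed below:

    import java.io.File;

    public class OptimizeHeadroomCheck {
        public static void main(String[] args) {
            // Hypothetical index directory; substitute your core's data/index path.
            File indexDir = new File("/var/solr/data/index");

            long indexBytes = 0;
            File[] files = indexDir.listFiles();
            if (files != null) {
                for (File f : files) {
                    indexBytes += f.length();
                }
            }

            // Rule of thumb from the thread: keep 2-3x the current index size free
            // so the merged copy (and CFS packing) fits alongside the old segments.
            long neededBytes = 3 * indexBytes;
            long freeBytes = indexDir.getUsableSpace();

            System.out.printf("Index size: %.2f GB, free: %.2f GB, suggested headroom (3x): %.2f GB%n",
                    indexBytes / 1e9, freeBytes / 1e9, neededBytes / 1e9);
            if (freeBytes < neededBytes) {
                System.out.println("Warning: less than 3x the index size free; optimize may run out of disk.");
            }
        }
    }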
On Wed, Oct 26, 2011 at 12:26 AM, Simon Willnauer <simon.willna...@googlemail.com> wrote:

> RAM costs during optimize / merge are generally low. Optimize is basically a merge of all segments into one, though there are exceptions. Lucene streams existing segments from disk and serializes the new segment on the fly. When you optimize, or in general whenever you merge segments, you need disk space for the "source" segments and the "target" (merged) segment.
>
> If you use the CompoundFileSystem (CFS) you need additional space once the merge is done and your files are packed into the CFS, which is basically the size of the "target" (merged) segment. Once the merge is done, Lucene can free the disk space unless you have an IndexReader open that references those segments (Lucene keeps track of these files and frees the disk space once possible).
>
> That said, I think you should use optimize very, very rarely. If your document collection rarely changes, an occasional optimize is useful and reasonable. If your collection is constantly changing, you should rely on the merge policy to balance the number of segments for you in the background. Lucene 3.4 has a nicely improved TieredMergePolicy that does a great job (previous versions are also good - just saying).
>
> A commit is basically flushing the segment you have in memory (IndexWriter memory) to disk. The compression ratio can be up to 30% of the RAM cost or even more, depending on your data. The actual commit doesn't need a notable amount of memory.
>
> hope this helps
>
> simon
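As an aside, here is a minimal sketch of what Simon describes, assuming the Lucene 3.4 API he mentions: configure TieredMergePolicy and let background merges plus commit do the work instead of calling optimize. The buffer size and policy settings are illustrative assumptions, not recommendations:

    import java.io.File;

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.index.TieredMergePolicy;
    import org.apache.lucene.store.FSDirectory;
    import org.apache.lucene.util.Version;

    public class MergePolicyExample {
        public static void main(String[] args) throws Exception {
            FSDirectory dir = FSDirectory.open(new File("/var/solr/data/index")); // hypothetical path

            TieredMergePolicy mergePolicy = new TieredMergePolicy();
            mergePolicy.setSegmentsPerTier(10.0);        // illustrative: segments allowed per tier before merging
            mergePolicy.setMaxMergedSegmentMB(5 * 1024); // illustrative: cap the size of any merged segment

            IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_34,
                    new StandardAnalyzer(Version.LUCENE_34));
            config.setMergePolicy(mergePolicy);
            config.setRAMBufferSizeMB(64.0); // in-memory segment is flushed once it reaches ~64 MB

            IndexWriter writer = new IndexWriter(dir, config);

            Document doc = new Document();
            doc.add(new Field("id", "1", Field.Store.YES, Field.Index.NOT_ANALYZED));
            writer.addDocument(doc);

            // Commit flushes the in-memory segment to disk; background merges governed by
            // TieredMergePolicy keep the segment count balanced without an explicit optimize.
            writer.commit();

            // writer.optimize(); // full merge into one segment -- needs "source" + "target" disk space
            writer.close();
        }
    }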
>
> On Mon, Oct 24, 2011 at 7:38 PM, Jaeger, Jay - DOT <jay.jae...@dot.wi.gov> wrote:
> > I have not spent a lot of time researching it, but one would expect the OS RAM requirement for optimization of an index to be minimal.
> >
> > My understanding is that during optimization an essentially new index is built. Once complete, it switches out the indexes and throws away the old one. (On Windows it may not throw away the old one until the next commit.)
> >
> > JRJ
> >
> > -----Original Message-----
> > From: Sujatha Arun [mailto:suja.a...@gmail.com]
> > Sent: Friday, October 21, 2011 12:10 AM
> > To: solr-user@lucene.apache.org
> > Subject: Re: Optimization /Commit memory
> >
> > Just one more thing: when we are talking about optimization, we are referring to HD free space for replicating the index (2 or 3 times the index size). What is the role of RAM (OS) here?
> >
> > Regards
> > Sujatha
> >
> > On Fri, Oct 21, 2011 at 10:12 AM, Sujatha Arun <suja.a...@gmail.com> wrote:
> >
> >> Thanks, that helps.
> >>
> >> Regards
> >> Sujatha
> >>
> >> On Thu, Oct 20, 2011 at 6:23 PM, Jaeger, Jay - DOT <jay.jae...@dot.wi.gov> wrote:
> >>
> >>> Well, since the OS RAM includes the JVM RAM, that is part of your requirement, yes? Aside from the JVM and normal OS requirements, all you need OS RAM for is file caching. Thus, for updates, the OS RAM is not a major factor. For searches, you want sufficient OS RAM to cache enough of the index to get the query performance you need, and to cache queries inside the JVM if you get a lot of repeat queries (see solrconfig.xml for the various caches; we have not played with them much). So the amount of RAM necessary for that is very much dependent upon the size of your index, and I cannot give you a simple number.
> >>>
> >>> You seem to believe that you have to have sufficient memory to hold the entire index in memory. Except where extremely high performance is required, I have not found that to be the case.
> >>>
> >>> This is just one of those "your mileage may vary" things. There is not a single answer or formula that fits every situation.
> >>>
> >>> JRJ
> >>>
> >>> -----Original Message-----
> >>> From: Sujatha Arun [mailto:suja.a...@gmail.com]
> >>> Sent: Wednesday, October 19, 2011 11:58 PM
> >>> To: solr-user@lucene.apache.org
> >>> Subject: Re: Optimization /Commit memory
> >>>
> >>> Thanks Jay,
> >>>
> >>> I was trying to compute the *OS RAM requirement*, *not JVM RAM*, for a 14 GB index [the cumulative index size of all instances]. And I put it thus:
> >>>
> >>> The operating system RAM requirement for an index of 14 GB is the index size + 3 times the maximum index size of an individual instance, for optimize.
> >>>
> >>> That is to say, I have several instances whose combined index size is 14 GB. The maximum individual index size is 2.5 GB, so my OS RAM requirement is 14 GB + 3 * 2.5 GB ~= 22 GB.
> >>>
> >>> Correct?
> >>>
> >>> Regards
> >>> Sujatha
> >>>
> >>> On Thu, Oct 20, 2011 at 3:45 AM, Jaeger, Jay - DOT <jay.jae...@dot.wi.gov> wrote:
> >>>
> >>> > Commit does not particularly spike disk or memory usage, unless you are adding a very large number of documents between commits. A commit can cause a need to merge indexes, which can increase disk usage temporarily. An optimize is *likely* to merge indexes, which will usually increase disk usage temporarily.
> >>> >
> >>> > How much disk space depends very much upon how big your index is in the first place. A 2 to 3 times factor of the sum of your peak index file sizes seems safe to me.
> >>> >
> >>> > Solr uses only modest amounts of JVM memory for this stuff.
> >>> >
> >>> > JRJ
> >>> >
> >>> > -----Original Message-----
> >>> > From: Sujatha Arun [mailto:suja.a...@gmail.com]
> >>> > Sent: Wednesday, October 19, 2011 4:04 AM
> >>> > To: solr-user@lucene.apache.org
> >>> > Subject: Optimization /Commit memory
> >>> >
> >>> > Do we require 2 or 3 times the OS RAM or hard disk space while performing a commit, an optimize, or both?
> >>> >
> >>> > What is the requirement in terms of RAM and HD size for commit and optimize?
> >>> >
> >>> > Regards
> >>> > Sujatha
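For reference, a small sketch of the sizing arithmetic Sujatha describes above (14 GB total, 2.5 GB largest instance, 3x optimize factor), keeping in mind Jay's caveat that caching the entire index in OS RAM is usually only needed when very high query performance is required:

    public class RamEstimate {
        public static void main(String[] args) {
            // Figures quoted in the thread (assumed to be in GB).
            double totalIndexGb = 14.0;      // combined index size across all instances
            double largestInstanceGb = 2.5;  // largest single instance that will be optimized
            double optimizeFactor = 3.0;     // upper end of the 2-3x headroom rule of thumb

            // Temporary disk headroom while the largest instance is optimized.
            double optimizeHeadroomGb = optimizeFactor * largestInstanceGb;

            // OS RAM target if the whole index is cached by the filesystem cache, plus the
            // optimize headroom: 14 + 3 * 2.5 = 21.5, i.e. roughly the 22 GB quoted above.
            double osRamGb = totalIndexGb + optimizeHeadroomGb;

            System.out.printf("Optimize headroom: ~%.1f GB%n", optimizeHeadroomGb);
            System.out.printf("OS RAM if fully caching the index: ~%.1f GB%n", osRamGb);
        }
    }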