RAM costs during optimize / merge are generally low. Optimize is basically a merge of all segments into one, though there are exceptions. Lucene streams existing segments from disk and serializes the new segment on the fly. When you optimize, or in general when you merge segments, you need disk space for the "source" segments and the "target" (merged) segment.
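
In rough numbers that means something like this (just a back-of-the-envelope sketch; the names are made up for illustration and are not Lucene API):

// Rough peak disk usage during a full merge (optimize).
// "indexSizeBytes" and "useCompoundFile" are hypothetical inputs.
public class MergeDiskEstimate {
    static long estimatePeakDiskBytes(long indexSizeBytes, boolean useCompoundFile) {
        long source = indexSizeBytes;   // existing segments stay on disk until the merge completes
        long target = indexSizeBytes;   // the merged segment is roughly the same size
        long cfsCopy = useCompoundFile ? indexSizeBytes : 0; // extra copy while packing into the .cfs (see below)
        return source + target + cfsCopy;   // ~2x the index size, or ~3x with the compound file format
    }

    public static void main(String[] args) {
        // e.g. a 2.5GB index -> roughly 5GB peak, or ~7.5GB with CFS
        System.out.println(estimatePeakDiskBytes(2500000000L, true));
    }
}

So with a 2.5GB index, for example, you should plan for roughly 5 to 7.5GB of free disk while the merge runs.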
If you use the compound file format (CFS) you need additional space once the merge is done and your files are packed into the CFS; that extra space is basically the size of the "target" (merged) segment. Once the merge is done Lucene can free the disk space, unless you have an IndexReader open that still references those segments (Lucene keeps track of these files and frees the disk space as soon as possible).

That said, I think you should use optimize very rarely. If your document collection rarely changes, an optimize once in a while is useful and reasonable. If your collection is constantly changing, you should rely on the merge policy to balance the number of segments for you in the background. Lucene 3.4 has a nicely improved TieredMergePolicy that does a great job (previous versions are also good - just saying).

A commit is basically flushing the segment you have in memory (IndexWriter memory) to disk. The compression ratio can be up to 30% of the RAM cost, or even more, depending on your data. The actual commit doesn't need a notable amount of memory. See the short sketch at the end of this mail for how these pieces fit together.

hope this helps

simon

On Mon, Oct 24, 2011 at 7:38 PM, Jaeger, Jay - DOT <jay.jae...@dot.wi.gov> wrote:

> I have not spent a lot of time researching it, but one would expect the
> OS RAM requirement for optimization of an index to be minimal.
>
> My understanding is that during optimization an essentially new index is
> built. Once complete, it switches out the indexes and throws away the old
> one. (In Windows it may not throw away the old one until the next commit.)
>
> JRJ
>
> -----Original Message-----
> From: Sujatha Arun [mailto:suja.a...@gmail.com]
> Sent: Friday, October 21, 2011 12:10 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Optimization /Commit memory
>
> Just one more thing: when we are talking about optimization, we are
> referring to HD free space for replicating the index (2 or 3 times
> the index size). What is the role of RAM (OS) here?
>
> Regards
> Sujatha
>
> On Fri, Oct 21, 2011 at 10:12 AM, Sujatha Arun <suja.a...@gmail.com> wrote:
>
>> Thanks, that helps.
>>
>> Regards
>> Sujatha
>>
>>
>> On Thu, Oct 20, 2011 at 6:23 PM, Jaeger, Jay - DOT <jay.jae...@dot.wi.gov> wrote:
>>
>>> Well, since the OS RAM includes the JVM RAM, that is part of your
>>> requirement, yes? Aside from the JVM and normal OS requirements, all you
>>> need OS RAM for is file caching. Thus, for updates, the OS RAM is not a
>>> major factor. For searches, you want sufficient OS RAM to cache enough of
>>> the index to get the query performance you need, and to cache queries inside
>>> the JVM if you get a lot of repeat queries (see solrconfig.xml for the
>>> various caches; we have not played with them much). So, the amount of RAM
>>> necessary for that is very much dependent upon the size of your index, so I
>>> cannot give you a simple number.
>>>
>>> You seem to believe that you have to have sufficient memory to hold the
>>> entire index in memory. Except where extremely high performance is
>>> required, I have not found that to be the case.
>>>
>>> This is just one of those "your mileage may vary" things. There is not a
>>> single answer or formula that fits every situation.
>>>
>>> JRJ
>>>
>>> -----Original Message-----
>>> From: Sujatha Arun [mailto:suja.a...@gmail.com]
>>> Sent: Wednesday, October 19, 2011 11:58 PM
>>> To: solr-user@lucene.apache.org
>>> Subject: Re: Optimization /Commit memory
>>>
>>> Thanks Jay,
>>>
>>> I was trying to compute the *OS RAM requirement*, *not JVM RAM*, for a 14GB
>>> index [the cumulative index size of all instances]. And I put it thus -
>>>
>>> the operating system RAM requirement for a 14GB index is: total index size
>>> + 3 times the maximum index size of an individual instance, for optimize.
>>>
>>> That is to say, I have several instances, and the combined index size is 14GB.
>>> The maximum individual index size is 2.5GB, so my requirement for OS RAM is
>>> 14GB + 3 * 2.5GB ~= 22GB.
>>>
>>> Correct?
>>>
>>> Regards
>>> Sujatha
>>>
>>> On Thu, Oct 20, 2011 at 3:45 AM, Jaeger, Jay - DOT <jay.jae...@dot.wi.gov> wrote:
>>>
>>> > Commit does not particularly spike disk or memory usage, unless you are
>>> > adding a very large number of documents between commits. A commit can cause
>>> > a need to merge indexes, which can increase disk space temporarily. An
>>> > optimize is *likely* to merge indexes, which will usually increase disk
>>> > space temporarily.
>>> >
>>> > How much disk space depends very much upon how big your index is in the
>>> > first place. A 2 to 3 times factor of the sum of your peak index file size
>>> > seems safe, to me.
>>> >
>>> > Solr uses only modest amounts of memory for the JVM for this stuff.
>>> >
>>> > JRJ
>>> >
>>> > -----Original Message-----
>>> > From: Sujatha Arun [mailto:suja.a...@gmail.com]
>>> > Sent: Wednesday, October 19, 2011 4:04 AM
>>> > To: solr-user@lucene.apache.org
>>> > Subject: Optimization /Commit memory
>>> >
>>> > Do we require 2 or 3 times the OS RAM or hard disk space while
>>> > performing a commit or an optimize, or both?
>>> >
>>> > What is the requirement in terms of RAM and HD size for commit and
>>> > optimize?
>>> >
>>> > Regards
>>> > Sujatha
>>> >
>>
>>
>
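
P.S. To make the advice at the top of this mail concrete, here is a minimal Lucene 3.4-style sketch (my own illustration; the directory path, field name, and RAM buffer size are just placeholders) of relying on TieredMergePolicy and periodic commits instead of calling optimize:

import java.io.File;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.TieredMergePolicy;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class BackgroundMergeExample {
    public static void main(String[] args) throws Exception {
        IndexWriterConfig cfg = new IndexWriterConfig(Version.LUCENE_34,
                new StandardAnalyzer(Version.LUCENE_34));
        cfg.setRAMBufferSizeMB(64.0);                 // size of the in-memory segment before it is flushed
        cfg.setMergePolicy(new TieredMergePolicy());  // let the merge policy balance segments in the background

        Directory dir = FSDirectory.open(new File("/path/to/index")); // placeholder path
        IndexWriter writer = new IndexWriter(dir, cfg);

        Document doc = new Document();
        doc.add(new Field("body", "some text", Field.Store.NO, Field.Index.ANALYZED));
        writer.addDocument(doc);

        writer.commit();      // flush the in-memory segment to disk; needs little extra RAM
        // writer.optimize(); // full merge into one segment: use rarely, needs ~2-3x the index size on disk
        writer.close();
        dir.close();
    }
}

The commented-out optimize() call just marks where a full merge would go if you ever really need one; as noted above, the merge policy normally makes it unnecessary.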