I'm sorry, you're right, I was thinking of the 2GB default value for maxMergeMB.
*Juan*

On Mon, Apr 18, 2011 at 3:16 PM, Burton-West, Tom <tburt...@umich.edu> wrote:

> >> As far as I know, Solr will never produce a segment file larger than 2GB,
> >> so this shouldn't be a problem.
>
> Solr can easily create a file larger than 2GB; it just depends on how much
> data you index and your particular Solr configuration, including your
> ramBufferSizeMB, your mergeFactor, and whether you optimize. For example, we
> index about a terabyte of full text and optimize our indexes, so we have a
> 300GB *prx file. If you really have a filesystem limit of 2GB, there is a
> parameter called maxMergeMB in Solr 3.1 that you can set. Unfortunately it
> is the maximum size of a segment that will be merged, rather than the maximum
> size of the resulting segment. So if you have a mergeFactor of 10, you could
> probably set it somewhere around (2GB / 10) = 200. Just to be cautious, you
> might want to set it to 100.
>
> <mergePolicy class="org.apache.lucene.index.LogByteSizeMergePolicy">
>   <double name="maxMergeMB">200</double>
> </mergePolicy>
>
> In the flexible indexing branch/trunk there is a new merge policy and
> parameter that lets you set the maximum size of the merged segment:
> https://issues.apache.org/jira/browse/LUCENE-854
>
> Tom Burton-West
> http://www.hathitrust.org/blogs/large-scale-search
>
> -----Original Message-----
> From: Juan Grande [mailto:juan.gra...@gmail.com]
> Sent: Friday, April 15, 2011 5:15 PM
> To: solr-user@lucene.apache.org
> Subject: Re: QUESTION: SOLR INDEX BIG FILE SIZES
>
> Hi John,
>
> > How can I split the Solr index file into multiple files?
>
> Actually, the index is organized as a set of files called segments. It's not
> just a single file, unless you tell Solr to make it one.
>
> > That's because some file systems support only a limited maximum file
> > size; for example, some UNIX file systems support a maximum of 2GB per
> > file.
>
> As far as I know, Solr will never produce a segment file larger than 2GB,
> so this shouldn't be a problem.
>
> > What is the recommended storage strategy for big Solr index files?
>
> I guess that it depends on the indexing/querying performance that you're
> getting, the performance that you want, and what "big" exactly means for you.
> If your index is so big that individual queries take too long, sharding may
> be what you're looking for.
>
> To better understand the index format, you can see
> http://lucene.apache.org/java/3_1_0/fileformats.html
>
> Also, you can take a look at my blog (http://juanggrande.wordpress.com); in
> my last post I talk about segment merging.
>
> Regards,
>
> *Juan*
>
> 2011/4/15 JOHN JAIRO GÓMEZ LAVERDE <jjai...@hotmail.com>
>
> > SOLR
> > USER SUPPORT TEAM
> >
> > I have a question about the maximum file size of the Solr index,
> > when I have a lot of data in the Solr index:
> >
> > - How can I split the Solr index file into multiple files?
> >
> > That's because some file systems support only a limited maximum file
> > size; for example, some UNIX file systems support a maximum of 2GB per
> > file.
> >
> > - What is the recommended storage strategy for big Solr index files?
> >
> > Thanks for the reply.
> >
> > JOHN JAIRO GÓMEZ LAVERDE
> > Bogotá - Colombia - South America
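
For anyone hitting the same 2GB filesystem limit, here is a rough sketch of what
the trunk-era configuration could look like. The TieredMergePolicy class name and
the maxMergedSegmentMB property are assumptions based on the new merge policy Tom
mentions (LUCENE-854), and the <double> element simply mirrors the maxMergeMB
example above, so verify both against the Solr version you are actually running:

  <!-- Sketch only: class and property names assumed, not verified against a
       released Solr version. Caps each merged segment well under the 2GB
       filesystem limit. -->
  <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
    <double name="maxMergedSegmentMB">1900</double>
  </mergePolicy>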