On 10/16/2011 12:01 PM, samarth s wrote:
Hi,
Is it safe to assume that with a mergeFactor of 10 the open file descriptors
required by Solr would be around (1 + 10) * 10 = 110?
ref: http://onjava.com/pub/a/onjava/2003/03/05/lucene.html#indexing_speed
Solr wiki:
http://wiki.apache.org/solr/SolrPerformanceFactors#Optimization_Considerations
states that the FDs required per segment are around 7.
Are these estimates appropriate? Do they in any way depend on the size of the
index and number of docs (assuming the same number of segments in any case) as
well?
My index has 10 files per normal segment (the usual 7 plus three more
for termvectors). Some of the segments also have a ".del" file, and
there is a segments_* file and a segments.gen file. Your servlet
container and other parts of the OS will also have to open files.
I have personally seen three levels of segment merging taking place
simultaneously on a slow filesystem during a full-import, while new
content was also coming in. With a mergeFactor of 10, each
merge involves 11 segments - the ten that are being merged and the merged
segment. If you have three going on at the same time, that's 33
segments, and you can have up to 10 more that are actively being built
by ongoing index activity, so that's 43 potential segments. If your
filesystem is REALLY slow, you might end up with even more segments as
existing merges are paused for new ones to start, but if you run into
that, you'll want to update your hardware, so I won't consider it.
Multiplying 43 segments by 11 files per segment (the 10 normal files plus
a ".del" file) yields a working theoretical maximum of 473 files. Add in
the two segments files, and you're up to 475.
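
The arithmetic above can be written out as a small Python sketch, which makes it easier to try other values. The function name and parameters are made up for illustration and are not part of Solr or Lucene. The defaults mirror the assumptions in this message: three concurrent merges, 11 files per segment, and two index-level segments files.

```python
def estimate_open_files(merge_factor=10, concurrent_merges=3,
                        files_per_segment=11, index_level_files=2):
    """Rough worst-case count of index files open at once (illustrative only)."""
    # Each merge touches merge_factor source segments plus the merged output.
    segments_in_merges = concurrent_merges * (merge_factor + 1)
    # Up to merge_factor more segments may be under construction by indexing.
    segments_being_built = merge_factor
    total_segments = segments_in_merges + segments_being_built
    # Add the segments_* and segments.gen files, which exist once per index.
    return total_segments * files_per_segment + index_level_files


print(estimate_open_files())  # 475
```

This covers the index files only. The servlet container and the rest of the OS need descriptors on top of that.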
Most operating systems have a default FD limit that's at least 1024. If
you only have one index (core) on your Solr server, Solr is the only
thing running on that server, and it's using the default mergeFactor of
10, you should be fine with the default. If you are going to have more
than one index on your Solr server (such as a build core and a live
core), you plan to run other things on the server, or you want to
increase your mergeFactor significantly, you might need to adjust the OS
configuration to allow more file descriptors.
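
To see how much room you have, you can compare an estimate like the one above against the FD limit of a process. Here is a minimal Python sketch, assuming a Unix-like system where the standard `resource` module is available. It reports the limit of the Python process itself, which is not necessarily the limit your servlet container runs with, and the helper name and the 475-per-core figure are only illustrative.

```python
import resource


def fits_within_fd_limit(estimated_files_per_core, cores=1):
    """Return True if the estimate fits under this process's soft FD limit."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # An unlimited soft limit always has room.
    if soft == resource.RLIM_INFINITY:
        return True
    return estimated_files_per_core * cores <= soft


soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("soft limit:", soft, "hard limit:", hard)
print("one core fits:", fits_within_fd_limit(475))
print("two cores fit:", fits_within_fd_limit(475, cores=2))
```

With a soft limit of 1024, one core at 475 files fits comfortably, while two cores at 950 leave very little for everything else on the box.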
Thanks,
Shawn