Have you timed how long it takes to copy the index files? Optimizing can never be faster than that, since it must read every byte and write a whole new set. Disc speed may be your bottleneck.
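One way to get that baseline number is a small script like the sketch below. It is only an illustration: the function name `time_index_copy` and both directory arguments are hypothetical, and it assumes the index is an ordinary directory of files that can be copied to a fresh destination. Files already in the OS page cache will make the read side look faster than the disc really is.

```python
import shutil
import time
from pathlib import Path


def time_index_copy(src_dir, dst_dir):
    """Copy src_dir to dst_dir and return (bytes_copied, seconds, mb_per_sec).

    Hypothetical helper: src_dir is assumed to be the index directory,
    dst_dir a path that does not exist yet.
    """
    src = Path(src_dir)
    # Total size of all regular files under the source directory.
    total_bytes = sum(p.stat().st_size for p in src.rglob("*") if p.is_file())

    start = time.perf_counter()
    shutil.copytree(src, dst_dir)  # raises if dst_dir already exists
    elapsed = time.perf_counter() - start

    # Decimal megabytes per second; guard against a zero-length interval.
    mb_per_sec = (total_bytes / 1e6) / elapsed if elapsed > 0 else float("inf")
    return total_bytes, elapsed, mb_per_sec
```

Calling `time_index_copy` on the index directory, with a destination on the same disc, gives a rough lower bound to compare against the optimize time, since the copy reads and writes the same volume of data.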
You could also look at disc access rates in a monitoring tool. Is there read contention between the master and slave for the same disc?

wunder

On 2/27/08 7:08 PM, "James Brady" <[EMAIL PROTECTED]> wrote:

> Hi all,
> Our current setup is a master and slave pair on a single machine,
> with an index size of ~50GB.
>
> Query and update times are still respectable, but commits are taking
> ~20% of time on the master, while our daily index optimise can take
> up to 4 hours...
> Here's the most relevant part of solrconfig.xml:
>     <useCompoundFile>true</useCompoundFile>
>     <mergeFactor>10</mergeFactor>
>     <maxBufferedDocs>1000</maxBufferedDocs>
>     <maxMergeDocs>10000</maxMergeDocs>
>     <maxFieldLength>10000</maxFieldLength>
>
> I've given both master and slave 2.5GB of RAM.
>
> Does an index optimise read and re-write the whole thing? If so,
> taking about 4 hours is pretty good! However, the documentation here:
> http://wiki.apache.org/solr/CollectionDistribution?highlight=%28ten+minutes%29#head-cf174eea2524ae45171a8486a13eea8b6f511f8b
> states "Optimizations can take nearly ten minutes to run..." which
> leads me to think that we've grossly misconfigured something...
>
> Firstly, we would obviously love any way to reduce this optimise time
> - I have yet to experiment extensively with the settings above, and
> optimise frequency, but some general guidance would be great.
>
> Secondly, this index size is increasing monotonically over time and as
> we acquire new users. We need to take action to ensure we can scale
> in the future. The approach we're favouring at the moment is
> horizontal partitioning of indices by user id as our data suits this
> scheme well. A given index would hold the indexed data for n users,
> where n would probably be between 1 and 100 users, and we will have
> multiple indices per search server.
>
> Running a server per index is impractical, especially for a small n, so
> is a single Solr instance capable of managing multiple searchers and
> writers in this way? Following on from that, does anyone know of
> limiting factors in Solr or Lucene that would influence our decision
> on the value of n - the number of users per index?
>
> Thanks!
> James