Have you timed how long it takes to copy the index files? Optimizing
can never be faster than that, since it must read every byte and write
a whole new set. Disc speed may be your bottleneck.
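
One rough way to get that baseline is to time a plain copy of the index directory and compare it with the optimize time. The script below is only a sketch: the source and destination paths are whatever you pass on the command line, the helper names are made up for illustration, and the OS page cache can make a copy of recently read files look faster than the disc really is.

```python
import shutil
import sys
import time
from pathlib import Path


def dir_size_bytes(path):
    """Total size in bytes of all regular files under path."""
    return sum(p.stat().st_size for p in Path(path).rglob("*") if p.is_file())


def time_copy(src, dst):
    """Copy directory src to dst (dst must not exist yet).

    Returns (elapsed_seconds, bytes_copied).
    """
    start = time.perf_counter()
    shutil.copytree(src, dst)
    elapsed = time.perf_counter() - start
    return elapsed, dir_size_bytes(dst)


def hours_at_rate(total_bytes, bytes_per_second):
    """Hours needed to move total_bytes at a sustained rate."""
    return total_bytes / bytes_per_second / 3600.0


# Usage (paths are placeholders): python time_copy.py <index_dir> <scratch_dir>
if __name__ == "__main__" and len(sys.argv) == 3:
    seconds, copied = time_copy(sys.argv[1], sys.argv[2])
    rate = copied / seconds if seconds > 0 else float("inf")
    print(f"copied {copied} bytes in {seconds:.1f} s ({rate / 1e6:.1f} MB/s)")
    # Project the measured rate onto a 50 GB index, the size from the post.
    print(f"50 GB at this rate: {hours_at_rate(50e9, rate):.2f} hours")
```

If the projected copy time is already measured in hours, no amount of Solr tuning will bring the optimize down to minutes on that disc.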

You could also look at disc access rates in a monitoring tool.

Is there read contention between the master and slave for the same disc?

wunder

On 2/27/08 7:08 PM, "James Brady" <[EMAIL PROTECTED]> wrote:

> Hi all,
> Our current setup is a master and slave pair on a single machine,
> with an index size of ~50GB.
> 
> Query and update times are still respectable, but commits are taking
> ~20% of time on the master, while our daily index optimise can take up to
> 4 hours...
> Here's the most relevant part of solrconfig.xml:
>      <useCompoundFile>true</useCompoundFile>
>      <mergeFactor>10</mergeFactor>
>      <maxBufferedDocs>1000</maxBufferedDocs>
>      <maxMergeDocs>10000</maxMergeDocs>
>      <maxFieldLength>10000</maxFieldLength>
> 
> I've given both master and slave 2.5GB of RAM.
> 
> Does an index optimise read and re-write the whole thing? If so,
> taking about 4 hours is pretty good! However, the documentation here:
> http://wiki.apache.org/solr/CollectionDistribution?highlight=%28ten
> +minutes%29#head-cf174eea2524ae45171a8486a13eea8b6f511f8b
> states "Optimizations can take nearly ten minutes to run..." which
> leads me to think that we've grossly misconfigured something...
> 
> Firstly, we would obviously love any way to reduce this optimise time
> - I have yet to experiment extensively with the settings above, and
> optimise frequency, but some general guidance would be great.
> 
> Secondly, this index size is increasing monotonically over time and as
> we acquire new users. We need to take action to ensure we can scale
> in the future. The approach we're favouring at the moment is
> horizontal partitioning of indices by user id as our data suits this
> scheme well. A given index would hold the indexed data for n users,
> where n would probably be between 1 and 100 users, and we will have
> multiple indices per search server.
> 
> Running a server per index is impractical, especially for a small n, so
> is a single Solr instance capable of managing multiple searchers and
> writers in this way? Following on from that, does anyone know of
> limiting factors in Solr or Lucene that would influence our decision
> on the value of n - the number of users per index?
> 
> Thanks!
> James
> 
> 
> 
