Hi There,

I have been building a Solr environment that indexes roughly 3 million
products. The current index is roughly 9 GB in size. We have run into
some performance issues with Solr's replication: while a snapshot is
being installed on a Solr slave, queries take longer and in some cases
time out.

Here are some of the details. Every 3 minutes, approximately 2000
updates are committed to the master Solr index and a snapshot is
taken. There are 4 Solr slaves (2-way quad-core / 32 GB RAM / 15k
SCSI) which poll every minute for a new snapshot and install it.
During the snapshot install on the slaves I'm seeing two things:
1. a disk I/O hit, and 2. the CPU load of the Java/Jetty/Solr process
jumps. I know the I/O is related to the transfer of the snapshot to
the local box. I believe the CPU load is related to cache warming,
which takes roughly 10-30 seconds to complete. My current cache and
warming settings are:

    <filterCache class="solr.LRUCache"
        size="512" initialSize="512" autowarmCount="0"/>
    <queryResultCache class="solr.LRUCache"
        size="524288" initialSize="256" autowarmCount="256"/>
    <documentCache class="solr.LRUCache"
        size="524288" initialSize="16384" autowarmCount="0"/>
    <maxWarmingSearchers>2</maxWarmingSearchers>

    <HashDocSet maxSize="3000" loadFactor="0.75"/>
    <queryResultWindowSize>50</queryResultWindowSize>
    <queryResultMaxDocsCached>200</queryResultMaxDocsCached>
    <maxBooleanClauses>1024</maxBooleanClauses>

    <enableLazyFieldLoading>true</enableLazyFieldLoading>
    <useColdSearcher>false</useColdSearcher>

I have thought about turning off cache warming completely and
measuring the effect on search performance. I would love to hear any
ideas or experiences that people have had in tuning Solr replication.
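
For what it's worth, the experiment I have in mind would look roughly
like the fragment below. This is an untested sketch: the only change
from the settings above is autowarmCount="0" on the queryResultCache,
so a new searcher would start with empty caches. It assumes there are
no newSearcher listener queries configured elsewhere in solrconfig.xml,
since those would still run on each snapshot install.

```xml
<!-- Sketch only: same caches as above, with autowarming disabled everywhere -->
<filterCache class="solr.LRUCache"
    size="512" initialSize="512" autowarmCount="0"/>
<!-- autowarmCount changed from 256 to 0: no entries from the old
     searcher's cache are re-populated into the new one -->
<queryResultCache class="solr.LRUCache"
    size="524288" initialSize="256" autowarmCount="0"/>
<documentCache class="solr.LRUCache"
    size="524288" initialSize="16384" autowarmCount="0"/>
```

The trade-off I would expect is that the first queries after each
install run against cold caches, so I would want to compare query
latency during and after an install with both settings.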

Thanks,
David
