On 8/2/2013 8:13 AM, Anca Kopetz wrote:
Then we optimized the index to 1 segment / 0 deleted docs and we got
+40% of QPS compared to the previous test.

Therefore we thought of optimizing the index every two hours, as our
index is evolving due to frequent commits (every 30 minutes) and thus
the performance results are degrading.

1. Is this a good practice?
2. Instead of executing an "optimize" many times a day, are there any
other parameters that we can tune and test in order to improve average QPS?

We want to avoid the solution of adding more servers to our SolrCloud
cluster.

Some details of our system:

SolrCloud cluster: 8 nodes on 8 dedicated servers; 2 shards / 4 replicas
Hardware configuration: 2 processors (16 CPU cores) per server; 24GB of
memory; 6GB allocated to JVM
Index: 13M documents, 15GB
Search algorithm : grouping, faceting, filter queries
Solr version 4.4

Please read and follow this note about thread hijacking:

http://people.apache.org/~hossman/#threadhijack

Optimizing that frequently with an index that large *might* cause more problems than it solves. You'd have to actually try it to see whether it works for you, though. Here's some information explaining why it may be a problem:

Optimizing a 15GB index is likely to take up to 15 minutes, depending on how fast the I/O subsystem on your servers is. It probably won't happen in less than 5 minutes unless you're running on SSD, which also mitigates some of the impact described in the next paragraph.
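
To put those numbers in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes the optimize rewrites the whole index once and simply divides the index size by the elapsed time. The function name is made up for illustration, and the result is an implied rate, not a measurement from any real server.

```python
def implied_throughput_mb_s(index_gb: float, minutes: float) -> float:
    """Effective rewrite rate in MB/s implied by optimizing index_gb in `minutes`."""
    return index_gb * 1024 / (minutes * 60)


INDEX_GB = 15  # index size taken from the original message

# The 5 and 15 minute figures are the estimates given above.
for minutes in (5, 15):
    rate = implied_throughput_mb_s(INDEX_GB, minutes)
    print(f"{minutes} min -> about {rate:.0f} MB/s effective")
```

The rate covers reading, merging, and writing together, so it is not directly comparable to a disk's raw sequential speed.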

Performance will be lower, potentially a LOT lower, for those few minutes while an optimize is occurring. Solr has to read the index, process each document, and write it back out. It does happen quite fast, but that's a lot of I/O. Because Solr is continually going back and forth between the old copy and the new copy, the OS disk cache will keep evicting critical data for the entire process, unless you have enough free RAM that *twice* the index can fit in the cache. From the stats you mentioned, you don't.
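
The cache arithmetic can be sketched in a few lines of Python. This is only an illustration using the figures from the original message, and it makes two assumptions: that the full 15GB index lives on each server, and that all RAM not given to the JVM heap is available as disk cache (in practice the OS and other processes take some of it). The function name is hypothetical.

```python
def cache_headroom_gb(total_ram_gb, jvm_heap_gb, index_gb, copies=2):
    """RAM left for the OS disk cache (upper bound) minus what `copies` of the index need."""
    available = total_ram_gb - jvm_heap_gb  # ignores OS and other process overhead
    needed = copies * index_gb
    return available - needed


# Figures from the original message: 24GB RAM, 6GB heap, 15GB index.
headroom = cache_headroom_gb(24, 6, 15)
print(f"headroom during optimize: {headroom} GB")
```

A negative result means the old and new copies of the index cannot both stay cached during the optimize.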

FYI, commits every 30 minutes are NOT frequent. Commits happening one or more times every *second* are frequent.

If you can share your solrconfig.xml, there might be some suggestions we can make so things will generally work better. The list doesn't accept attachments. It's better if you use a paste website like http://www.fpaste.org/, choose the proper language for highlighting, and set the "delete after" setting to something that will work for you. Making it a paste that never gets deleted will mean that your message will retain usefulness for others as long as archives exist, but you might not want it available that long.

Properly tuning your garbage collection is important. The default garbage collector is, risking a pun, garbage.

http://wiki.apache.org/solr/SolrPerformanceProblems#GC_pause_problems
http://wiki.apache.org/solr/ShawnHeisey#GC_Tuning

Thanks,
Shawn
