Glad to hear you have a solution....

Best,
Erick
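For anyone finding this thread later: the fix described in the quoted message below (sorting the filter constraints before building the fq) can be sketched roughly as follows. This is a minimal Python sketch, not the poster's actual code. The function names build_or_filter and escape_term are hypothetical, and the deduplication and quote escaping are assumptions added for illustration. It only builds the parameter string and does not talk to Solr.

```python
def escape_term(value: str) -> str:
    """Escape backslashes and double quotes so the value is safe inside a quoted phrase."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def build_or_filter(field: str, values) -> str:
    """Build a canonical fq value such as x:("1" OR "2" OR "3").

    Hypothetical helper: the same set of values always yields the same
    string, regardless of the order in which the values arrive.
    """
    # Assumption: duplicates carry no meaning, so drop them before sorting.
    canonical = sorted({str(v) for v in values})
    if not canonical:
        raise ValueError("at least one value is required")
    clause = " OR ".join(f'"{escape_term(v)}"' for v in canonical)
    return f"{field}:({clause})"


if __name__ == "__main__":
    # The three orderings from the thread all produce one identical fq.
    for values in (["1", "2", "3"], ["2", "3", "1"], ["3", "2", "1"]):
        print("fq=" + build_or_filter("x", values))
```

Values are sorted as strings, so "10" sorts before "2". That is fine here, because the only goal is that equal sets of values produce the identical fq, which per the thread is what lets them share one filterCache entry.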
On Thu, Nov 7, 2013 at 5:12 PM, Patanachai Tangchaisin <patanachai.tangchai...@wizecommerce.com> wrote:

> Hi Erick,
>
> About the size of the filter cache: previously we set it to 4,000.
> After we faced this problem, we changed it to 10,000.
> Still, at a size of 10,000 (always full), the hit ratio was 0.78 and
> "eviction" was as high as "insertion".
>
> About the 100% CPU: yes, it was Solr using it.
> I profiled the app, and it was "DisjunctionSumScorer" that took most of
> the CPU time.
> Since this is a required filter query, we set it for every request.
> My assumption is that because Solr cannot use the filter cache, the
> filter query has to be executed at the same time as the normal query.
>
> However, we fixed this problem by sorting our filter constraints before
> creating the filter query.
> So {"1","2","3"}, {"2","3","1"}, and {"3","2","1"} all become the same
> filter query, i.e. fq=x:("1" OR "2" OR "3").
>
> We ended up with a very small filter cache size (<1,000), and the hit
> ratio is now 0.99. There is no eviction at all.
> The median response time is now less than 200 ms at 25 QPS.
>
> Thanks,
> Patanachai
>
>
> On 11/07/2013 04:37 AM, Erick Erickson wrote:
>
> Yeah, Solr's fq cache is pretty simple-minded,
> order matters. There's no good way to improve
> that except to try to write your fq queries in the
> same order. It's actually quite tricky to
> disassemble/reassemble arbitrary queries to fix
> this problem.
>
> But in your case, you could write a custom query
> component that was able to handle this _specific_
> case relatively easily, I should think.
>
> bq: Our machine always uses 100% CPU
>
> This is strange. Are you sure Solr is using this?
> Are there any other processes on the server that
> might be using this? Top (*nix) might help here. If
> it's really all Solr, then you need another slave
> or two to handle the load. Do you get good responses
> when the QPS rate is, say, 10?
>
> How big is your filter cache?
>
> A hit ratio of .76 isn't actually too bad. It looks like
> you've been running for a long time, and if so the insert
> and eviction numbers will tend to the same number.
>
> Do beware of using NOW in your fq clauses, that can
> cause grief. See:
> http://searchhub.org/2012/02/23/date-math-now-and-filter-queries/
>
> This seems like really poor performance, I'm puzzled.
>
> Best,
> Erick
>
>
> On Mon, Nov 4, 2013 at 8:38 PM, Patanachai Tangchaisin <
> patanachai.tangchai...@wizecommerce.com> wrote:
>
> Hello,
>
> We are running our search system on Apache Solr 4.2.1 with a
> master/slave model.
> Our index has ~100M documents. The index size is ~20 GB.
> The machine has 24 CPUs and 48 GB of RAM.
>
> Our response time is pretty bad: the median is ~4 seconds at 25
> queries/second.
>
> We noticed a couple of things:
> - Our machine always uses 100% CPU.
> - There is a lot of room in the Java heap. We assign Xms12g and Xmx16g,
>   but the size of the heap is still only 12g.
> - Solr's filterCache hit ratio is only 0.76, and the numbers of
>   insertions and evictions are almost equal.
>
> The weird thing is:
> - Most items in Solr's filterCache (only the first 100) are specific to
>   only 1 field, which we filter on using an OR query. Note that every
>   request has this field constraint.
>
> For example, if the field name is x:
> fq=x:(1 OR 2 OR 3)&fq=y:'a'
> fq=x:(3 OR 2 OR 1)&fq=y:'b'
> fq=x:(2 OR 1 OR 3)&fq=y:'c'
>
> The order of the items differs since it is input from a different
> system.
>
> To me, it seems that Solr caches this field in a different entry if
> the order of the items is different, e.g. "(1 OR 2)" and "(2 OR 1)" are
> going to be different cache entries.
>
> Question:
> Is there another way to create an fq parameter using 'OR' and make Solr
> cache them as the same entry?
>
>
> Thanks,
> Patanachai Tangchaisin
>
> CONFIDENTIALITY NOTICE
> ======================
> This email message and any attachments are for the exclusive use of the
> intended recipient(s) and may contain confidential and privileged
> information. Any unauthorized review, use, disclosure or distribution is
> prohibited. If you are not the intended recipient, please contact the
> sender by reply email and destroy all copies of the original message along
> with any attachments, from your computer system. If you are the intended
> recipient, please be advised that the content of this message is subject to
> access, review and disclosure by the sender's Email System Administrator.