Hello, We have a scenario where we present users with results from two sources: one from Solr and the other from a real-time website search. The Solr data is available locally, so we are able to index it, but for the website search we don't host the data and the results are fetched in real time.
We are wondering whether we can use a federated search framework that can unify the two result sets into a single set with a combined relevancy ranking. Any thoughts?

Thanks & appreciate your help.
Susheel

-----Original Message-----
From: Patanachai Tangchaisin [mailto:patanachai.tangchai...@wizecommerce.com]
Sent: Monday, November 04, 2013 7:38 PM
To: solr-user@lucene.apache.org
Subject: Disjuctive Queries (OR queries) and FilterCache

Hello,

We are running our search system on Apache Solr 4.2.1 using a master/slave model. Our index has ~100M documents and is ~20 GB in size. The machine has 24 CPUs and 48 GB of RAM.

Our response time is pretty bad: the median is ~4 seconds at 25 queries/second.

We noticed a couple of things:
- Our machine always uses 100% CPU.
- There is a lot of room in the Java heap. We set Xms12g and Xmx16g, but the heap size is still only 12g.
- Solr's filterCache hit ratio is only 0.76, and the numbers of insertions and evictions are almost equal.

The weird thing is that most items in Solr's filterCache (looking at only the first 100) are specific to a single field, which we filter with an OR query. Note that every request has this field constraint.

For example, if the field name is x:

fq=x:(1 OR 2 OR 3)&fq=y:'a'
fq=x:(3 OR 2 OR 1)&fq=y:'b'
fq=x:(2 OR 1 OR 3)&fq=y:'c'

The order of the items differs because it is input from a different system. It seems to me that Solr caches this filter under a different entry whenever the order of the items differs, e.g. "(1 OR 2)" and "(2 OR 1)" end up as different cache entries.

Question: Is there another way to build an fq parameter using OR so that Solr caches these as the same entry?

Thanks,
Patanachai Tangchaisin
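
On the filterCache question: if Solr really is caching "(1 OR 2)" and "(2 OR 1)" as separate entries, one possible workaround is to put the OR terms into a canonical order on the client before the request is sent, so that the same set of values always produces the same fq string. The following is only a minimal sketch of that idea in Python. The helper name normalized_or_filter is hypothetical, and it assumes the values are simple tokens that need no query escaping.

```python
def normalized_or_filter(field, values):
    """Build an fq clause with the OR terms de-duplicated and sorted.

    Hypothetical helper: assumes the values are plain tokens that do not
    need Solr query escaping.
    """
    # Sort on the string form so every ordering of the same values
    # collapses to one canonical clause.
    terms = sorted({str(v) for v in values})
    return "{}:({})".format(field, " OR ".join(terms))


if __name__ == "__main__":
    # The three orderings from the example all yield the same clause.
    for incoming in ([1, 2, 3], [3, 2, 1], [2, 1, 3]):
        print(normalized_or_filter("x", incoming))
```

The sort is lexicographic on the string form, so "10" sorts before "2". That is fine here because the goal is only a stable order, not a numeric one. Whether this actually improves the hit ratio would need to be checked against the filterCache statistics.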