First, please start a new thread when changing topics; see "thread hijacking" here: http://people.apache.org/~hossman/#threadhijack
But do be aware that scores are NOT comparable between different queries on
the _same_ corpus. A score of .75 on one query has no relation to a score of
.75 on another. So "federated search" is hard, you usually have to figure out
a way to group the results in a way that's meaningful to a user. Don't quite
know how carrot handles that one...

FWIW,
Erick


On Mon, Nov 4, 2013 at 11:09 PM, Susheel Kumar <
susheel.ku...@thedigitalgroup.net> wrote:

> Hello,
>
> We have a scenario where we present results to users one from solr and
> other from real time web site search. The solr data we have locally
> available that we are able to index but other website search, we don't host
> data and it is real time.
>
> We are wondering if we can use some federated search framework which can
> unify the results into single set with relevancy and all.
>
> Any thoughts?
>
> Thanks & appreciate your help.
> Susheel
>
> -----Original Message-----
> From: Patanachai Tangchaisin [mailto:
> patanachai.tangchai...@wizecommerce.com]
> Sent: Monday, November 04, 2013 7:38 PM
> To: solr-user@lucene.apache.org
> Subject: Disjuctive Queries (OR queries) and FilterCache
>
> Hello,
>
> We are running our search system using Apache Solr 4.2.1 and using
> Master/Slave model.
> Our index has ~100M document. The index size is ~20gb.
> The machine has 24 CPU and 48gb rams.
>
> Our response time is pretty bad, median is ~4 seconds with 25
> queries/second.
>
> We noticed a couple of things
> - Our machine always use 100% CPU.
> - There is a lot of room for Java Heap. We assign Xms12g and Xmx16g, but
> the size of heap is still only 12g
> - Solr's filterCache hit ratio is only 0.76 and the number of insertion
> and eviction is almost equal.
>
> The weird thing is
> - most items in Solr's filterCache (only 100 first) are specify to only
> 1 field which we filter it by using an OR query for this field. Note that
> every request will have this field constraint.
>
> For example, if field name is x
> fq=x:(1 OR 2 OR 3)&fq=y:'a'
> fq=x:(3 OR 2 OR 1)&fq=y:'b'
> fq=x:(2 OR 1 OR 3)&fq=y:'c'
>
> An order of items is different since it is an input from a different
> system.
>
> To me, it seems that Solr do a cache on this field in different entry if
> an order of item is different. e.g. "(1 OR 2)" and "(2 OR 1)" is going to
> be a different cache entry.
>
> Question:
> Is there other way to create a fq parameter using 'OR' and make Solr cache
> them as a same entry?
>
>
> Thanks,
> Patanachai Tangchaisin
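
On the fq ordering question quoted above: one option is to normalize the
clause order on the client before the request is sent, so that the same set
of values always produces the same filter string. Here is a minimal sketch in
Python, assuming (as the poster observed) that Solr 4.2.1 treats differently
ordered OR clauses as distinct filterCache entries. The helper name
build_or_filter is made up for illustration, and the sketch does not escape
query-syntax special characters in the values.

```python
def build_or_filter(field, values):
    """Build an fq value whose OR terms are deduplicated and sorted.

    Hypothetical client-side helper, not part of Solr or any Solr client.
    """
    # Convert to strings, drop duplicates, and sort so that any input order
    # of the same values yields an identical filter string.
    terms = sorted({str(v) for v in values})
    return "{}:({})".format(field, " OR ".join(terms))


# The three orderings from the quoted example collapse to one string.
print(build_or_filter("x", [1, 2, 3]))  # x:(1 OR 2 OR 3)
print(build_or_filter("x", [3, 2, 1]))  # x:(1 OR 2 OR 3)
print(build_or_filter("x", [2, 1, 3]))  # x:(1 OR 2 OR 3)
```

The sort is lexicographic on the string form (so "10" sorts before "2"),
which should be fine here because only consistency matters, not numeric order.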