Hello,

We have a scenario where we present users with results from two sources: one 
from Solr and the other from a real-time website search. The Solr data is 
available locally, so we are able to index it, but for the other website search 
we don't host the data, and its results are only available in real time.

We are wondering whether there is a federated search framework we can use to 
unify the two result sets into a single set, ranked by relevancy.

Any thoughts?

Thanks & appreciate your help.
Susheel

-----Original Message-----
From: Patanachai Tangchaisin [mailto:patanachai.tangchai...@wizecommerce.com] 
Sent: Monday, November 04, 2013 7:38 PM
To: solr-user@lucene.apache.org
Subject: Disjunctive Queries (OR queries) and FilterCache

Hello,

We are running our search system on Apache Solr 4.2.1 with a master/slave 
model.
Our index has ~100M documents. The index size is ~20 GB.
The machine has 24 CPUs and 48 GB of RAM.

Our response time is pretty bad: the median is ~4 seconds at 25 queries/second.

We noticed a couple of things:
- Our machine is always at 100% CPU.
- There is a lot of room left in the Java heap. We set -Xms12g and -Xmx16g, but 
the heap size is still only 12g.
- Solr's filterCache hit ratio is only 0.76, and the numbers of insertions and 
evictions are almost equal.

The weird thing is:
- most items in Solr's filterCache (looking at only the first 100) are specific 
to a single field, which we filter on using an OR query. Note that every 
request has a constraint on this field.

For example, if the field name is x:
fq=x:(1 OR 2 OR 3)&fq=y:'a'
fq=x:(3 OR 2 OR 1)&fq=y:'b'
fq=x:(2 OR 1 OR 3)&fq=y:'c'

The order of the items differs because they are input from a different system.

To me, it seems that Solr caches this filter under a different entry whenever 
the order of the items is different, e.g. "(1 OR 2)" and "(2 OR 1)" end up as 
different cache entries.

Question:
Is there another way to build an fq parameter using 'OR' so that Solr caches 
these as the same entry?
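
One possible client-side workaround, sketched here under the assumption that 
the cache entry really does depend on the order of the OR terms: sort (and 
de-duplicate) the values before building the fq string, so every ordering 
produces the same clause. The helper name normalized_or_filter is hypothetical 
and not part of Solr or any client library; it only builds the parameter text.

```python
def normalized_or_filter(field, values):
    """Build an fq clause whose OR terms always appear in the same order.

    Hypothetical helper: it only formats the parameter string and does not
    talk to Solr. Values are assumed to need no query-syntax escaping.
    """
    # De-duplicate and sort the values as strings so the clause text is
    # deterministic regardless of the order the upstream system sends them in.
    terms = sorted({str(v) for v in values})
    return "{}:({})".format(field, " OR ".join(terms))


# The three orderings from the example above all yield the same fq text.
for ids in ([1, 2, 3], [3, 2, 1], [2, 1, 3]):
    print("fq=" + normalized_or_filter("x", ids))
```

The values are sorted as strings, so the order is lexicographic rather than 
numeric; that is enough here because the order only needs to be consistent.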


Thanks,
Patanachai Tangchaisin
