I haven't looked at the filtering code for quite a while, but if you're getting lots of similar queries, the filter caching can play a huge part in speeding them up. Even if the first query for "paris" is slow, subsequent queries from different users for the same terms will be sped up considerably (especially if you're using the FastLRUCache).

If filtering is slow for your queries, why not try simply using a boolean query (e.g., for the example below: "paris AND userId:123")? This would remove the cross-user usefulness of the caches, if I understand them correctly, but may speed up uncached searches.
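To make the two options concrete, here is a minimal sketch of how the two requests might be built. The host, port, and handler path are assumptions for illustration, as is the helper naming; only the `q` and `fq` request parameters and the `userId` field come from this thread. The search term is passed through as-is, so anything containing Solr query syntax characters would need escaping first.

```python
from urllib.parse import urlencode

# Assumed endpoint, for illustration only; adjust to your own install.
SOLR_SELECT = "http://localhost:8983/solr/select"


def filtered_query_url(term, user_id):
    """Search term in q, user restriction in a separate filter query (fq)."""
    params = [("q", term), ("fq", f"userId:{user_id}")]
    return f"{SOLR_SELECT}?{urlencode(params)}"


def boolean_query_url(term, user_id):
    """User restriction folded into the main query as a required clause."""
    params = [("q", f"{term} AND userId:{user_id}")]
    return f"{SOLR_SELECT}?{urlencode(params)}"


if __name__ == "__main__":
    print(filtered_query_url("paris", 123))
    print(boolean_query_url("paris", 123))
```

Timing both forms against your own index, with cold and warm caches, is probably the quickest way to see which one suits your query mix.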

Toby.


On 27 Jan 2010, at 15:48, Matthieu Labour wrote:

@Marc: Thank you, Marc. This is logic we had to implement in the client application. I will look into applying the patch to replace our own home-grown logic.

@Trey: I have 1000 users per machine, with 1 core per user. Each core holds 35000 documents. Documents are small; each core ranges from 100MB to 1.3GB at most. There are 7 types of documents. What I am trying to understand is the search/filter algorithm. If I have 1 core with all documents and I search for "Paris" for userId="123", is Lucene going to first search for all Paris documents and then apply a filter on the userId? If that is the case, then I am better off having a specific index for user="123" because it will be faster.





--- On Wed, 1/27/10, Marc Sturlese <marc.sturl...@gmail.com> wrote:

From: Marc Sturlese <marc.sturl...@gmail.com>
Subject: Re: Multiple Cores Vs. Single Core for the following use case
To: solr-user@lucene.apache.org
Date: Wednesday, January 27, 2010, 2:22 AM


In case you are going to use a core per user, take a look at this patch:
http://wiki.apache.org/solr/LotsOfCores

Trey-13 wrote:

Hi Matt,

In most cases you are going to be better off going with the userid method unless you have a very small number of users and a very large number of docs/user. The userid method will likely be much easier to manage, as you won't have to spin up a new core every time you add a new user. I would start here and see if the performance is good enough for your requirements before you start worrying about it not being efficient.

That being said, I really don't have any idea what your data looks like. How many users do you have? How many documents per user? Are any documents shared by multiple users?

-Trey



On Tue, Jan 26, 2010 at 7:27 PM, Matthieu Labour <matthieu_lab...@yahoo.com> wrote:

Hi



Should I set up multiple cores or a single core for the following use case?



I have X number of users.



When I do a search, I always know for which user I am doing a search



Shall I set up X cores, 1 for each user? Or shall I set up 1 core and add a userId field to each document?



If I choose the 1 core solution then I am concerned with performance. Let's say I search for "New York". If Lucene returns all "New York" matches for all users and then filters based on the userId, then this is going to be less efficient than if I have sharded per user and send the request for "New York" to the user's core.



Thank you for your help



matt
















--
Toby Cole
Senior Software Engineer, Semantico Limited
Registered in England and Wales no. 03841410, VAT no. GB-744614334.
Registered office Lees House, 21-23 Dyke Road, Brighton BN1 3FE, UK.

