I've not looked at the filtering for quite a while, but if you're
getting lots of similar queries, the filter's caching can play a huge
part in speeding up queries, so even if the first query for "paris"
was slow, subsequent queries from different users for the same terms
will be sped up considerably (especially if you're using the
FastLRUCache).
If filtering is slow for your queries, why not try simply using a
boolean query (i.e., for the example below: "paris AND userId:123")?
This would remove the cross-user usefulness of the caches, if I
understand them correctly, but may speed up uncached searches.
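For illustration, here is a minimal sketch of the two request shapes being
compared. It assumes a field named userId and the standard Solr q and fq
parameters; the helper function names are hypothetical, and only the query
string is built, so no Solr instance is needed to try it.

```python
from urllib.parse import urlencode


def filtered_request(term, user_id):
    """Main query plus a separate filter query (fq).

    The fq clause is the part Solr's filter cache can reuse.
    """
    return urlencode({"q": term, "fq": f"userId:{user_id}"})


def boolean_request(term, user_id):
    """Single boolean query: the user restriction is folded into q,
    so no separate filter entry is involved."""
    return urlencode({"q": f"{term} AND userId:{user_id}"})


if __name__ == "__main__":
    print(filtered_request("paris", 123))
    print(boolean_request("paris", 123))
```

Timing both forms against your own index is the only reliable way to tell
which one is faster for uncached searches.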
Toby.
On 27 Jan 2010, at 15:48, Matthieu Labour wrote:
@Marc: Thank you, Marc. This is logic we had to implement in the
client application. I will look into applying the patch to replace our
home-grown logic.
@Trey: I have 1000 users per machine, with 1 core per user. Each core
holds 35,000 documents. Documents are small; each core goes from 100MB
to 1.3GB at most. There are 7 types of documents.
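As a back-of-the-envelope check, the figures above can be scaled up to a
single combined core. This is only a rough sketch: it assumes every core
holds exactly 35,000 documents, uses decimal units (1 GB = 1000 MB), and
assumes a merged index would be about the sum of the per-core sizes, which
may not hold in practice.

```python
USERS_PER_MACHINE = 1000
DOCS_PER_CORE = 35_000
CORE_SIZE_MB = (100, 1300)  # smallest and largest core, in MB

# Documents a single shared core would have to hold.
total_docs = USERS_PER_MACHINE * DOCS_PER_CORE

# Bounds on total index size if every core were at the minimum or maximum.
total_size_gb = tuple(USERS_PER_MACHINE * mb / 1000 for mb in CORE_SIZE_MB)

print(total_docs, total_size_gb)
```

So a single core on one machine would hold about 35 million documents,
somewhere between 100GB and 1.3TB in total under these assumptions.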
What I am trying to understand is the search/filter algorithm. If I
have 1 core with all documents and I search for "Paris" for
userId="123", is Lucene going to first search for all "Paris"
documents and then apply a filter on the userId? If this is the
case, then I am better off having a specific index for
userId="123", because this will be faster.
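To make the question concrete, here is a toy model of a term lookup
combined with a cached userId filter. It is not Lucene's or Solr's actual
implementation; the index layout, the filter_cache dictionary and the
function names are all made up for illustration.

```python
from collections import defaultdict


def build_index(docs):
    """Map each (field, value) pair to the set of doc ids that contain it."""
    index = defaultdict(set)
    for doc_id, doc in docs.items():
        for word in doc["text"].lower().split():
            index[("text", word)].add(doc_id)
        index[("userId", doc["userId"])].add(doc_id)
    return index


# Toy stand-in for a filter cache: filter key -> set of matching doc ids.
filter_cache = {}


def search(index, term, user_id):
    """Intersect the term's postings with the (cached) user filter set."""
    key = ("userId", user_id)
    if key not in filter_cache:
        # First use of this filter: compute its doc set once and keep it.
        filter_cache[key] = index.get(key, set())
    return index.get(("text", term.lower()), set()) & filter_cache[key]


docs = {
    1: {"text": "Paris hotels", "userId": "123"},
    2: {"text": "Paris flights", "userId": "456"},
    3: {"text": "London hotels", "userId": "123"},
}
index = build_index(docs)
print(search(index, "Paris", "123"))
```

In this model the filter set for a given userId is computed once and then
reused for every later search by that user, whatever the search term is.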
--- On Wed, 1/27/10, Marc Sturlese <marc.sturl...@gmail.com> wrote:
From: Marc Sturlese <marc.sturl...@gmail.com>
Subject: Re: Multiple Cores Vs. Single Core for the following use case
To: solr-user@lucene.apache.org
Date: Wednesday, January 27, 2010, 2:22 AM
In case you are going to use a core per user, take a look at this patch:
http://wiki.apache.org/solr/LotsOfCores
Trey-13 wrote:
Hi Matt,
In most cases you are going to be better off going with the userid
method unless you have a very small number of users and a very large
number of docs/user. The userid method will likely be much easier to
manage, as you won't have to spin up a new core every time you add a
new user. I would start here and see if the performance is good enough
for your requirements before you start worrying about it not being
efficient.
That being said, I really don't have any idea what your data looks
like. How many users do you have? How many documents per user? Are any
documents shared by multiple users?
-Trey
On Tue, Jan 26, 2010 at 7:27 PM, Matthieu Labour
<matthieu_lab...@yahoo.com> wrote:
Hi
Shall I set up multiple cores or a single core for the following use
case?
I have X number of users.
When I do a search, I always know for which user I am doing the search.
Shall I set up X cores, 1 for each user? Or shall I set up 1 core and
add a userId field to each document?
If I choose the 1 core solution then I am concerned with performance.
Let's say I search for "New York". If Lucene returns all "New York"
matches for all users and then filters based on the userId, then this
is going to be less efficient than if I have sharded per user and send
the request for "New York" to the user's core.
Thank you for your help
matt
--
Toby Cole
Senior Software Engineer, Semantico Limited