I think I will try a hybrid version: one that uses my simple negation
for newly joined campaigns and your method to filter out campaigns
joined longer ago. A cron script will run every night and add all new
user IDs to the appropriate campaigns. That way I don't have to
re-index on the fly during the day, when the server is busiest, and
there should be fewer commits to the Solr instance too: at most one
per campaign, instead of one per join.
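A rough sketch of how the hybrid could look, in Python. The field names `id` and `enteredUsers` and both function names are assumptions made for illustration. The first function builds the search filter, the second groups a night's worth of joins so each campaign is updated only once.

```python
from collections import defaultdict


def build_hybrid_filter(user_id, recent_campaign_ids):
    """Exclude campaigns already indexed with this user in enteredUsers,
    plus campaigns joined since the last nightly run (negated by id)."""
    clauses = ["*:*", f"-enteredUsers:{user_id}"]
    if recent_campaign_ids:
        ids = " OR ".join(str(cid) for cid in recent_campaign_ids)
        clauses.append(f"-id:({ids})")
    return " ".join(clauses)


def group_joins_by_campaign(pending_joins):
    """Group (user_id, campaign_id) pairs so that the nightly job needs
    at most one document update per campaign."""
    grouped = defaultdict(list)
    for user_id, campaign_id in pending_joins:
        if user_id not in grouped[campaign_id]:
            grouped[campaign_id].append(user_id)
    return dict(grouped)


# User 42 joined campaigns 7 and 9 today; older joins are already indexed.
print(build_hybrid_filter(42, [7, 9]))
print(group_joins_by_campaign([(42, 7), (43, 7), (42, 9)]))
```

The negated id list only ever holds one day's joins, so it should stay short however many campaigns a user has joined in total.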
Thanks for your input on this, Rachel!
Alec
On Feb 13, 2008, at 2:01 PM, Rachel McConnell wrote:
We've been using this in production for at least six months. I have
never stress-tested this particular feature, but we usually do over
100k unique hits a day. Of those, most hit Solr for one thing or
another, but a much smaller percentage use this specific bit. It
isn't the fastest query, but the way we use it adds some complexities
of its own, so YMMV.
We aren't at risk for data loss from Solr, as we maintain all data in
our database backend; Solr is essentially a slave to that. So we have
a db field, enteredUsers, which has the usual JDBC failure checking
and any error is handled gracefully. And the Solr index is then
updated from the db periodically (we're optimized for faster search
results, over up-to-date-ness).
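As an illustration of the database as the source of truth, here is a minimal Python sketch using an in-memory SQLite table. The table `entered_users`, its columns and the function name are hypothetical, not the actual schema described above. The periodic job reads the rows and builds one document per campaign, ready to be sent to Solr.

```python
import sqlite3


def export_campaign_docs(conn):
    """Build one Solr-style document per campaign from the database rows."""
    rows = conn.execute(
        "SELECT campaign_id, user_id FROM entered_users "
        "ORDER BY campaign_id, user_id"
    )
    docs = {}
    for campaign_id, user_id in rows:
        doc = docs.setdefault(campaign_id, {"id": campaign_id, "enteredUsers": []})
        doc["enteredUsers"].append(user_id)
    return list(docs.values())


# Hypothetical data: user 42 joined campaigns 1 and 2, user 43 joined campaign 1.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entered_users (campaign_id INTEGER, user_id INTEGER)")
conn.executemany(
    "INSERT INTO entered_users VALUES (?, ?)", [(1, 42), (1, 43), (2, 42)]
)
docs = export_campaign_docs(conn)
print(docs)
```

If an index update fails, nothing is lost: the next run rebuilds the same documents from the database.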
R
On 2/13/08, alexander lind <[EMAIL PROTECTED]> wrote:
Have you done any stress tests on this setup? Is it working well for
you?
It sounds like something that could work quite well for me too, but I
would be a little worried that a commit could time out, and a unique
value could be lost for that user.
Thank you
Alec
On Feb 13, 2008, at 1:10 PM, Rachel McConnell wrote:
We do something similar in a different context. I don't know if our
way is necessarily better, but it would work like this:
1. add a field to campaign called something like enteredUsers
2. once a user adds a campaign, update the campaign, adding a value
unique to that user to enteredUsers
3. the negation can now be done by excluding the user's unique id from
the enteredUsers field, instead of excluding all the user's campaigns
The downside is it will increase the number of your commits, which may
or may not be OK.
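The three steps above could be sketched in Python roughly as follows. The document is modeled as a plain dict, and the function names are made up for illustration; only the `enteredUsers` field name comes from the description.

```python
def add_user_to_campaign(campaign_doc, user_id):
    """Return a copy of the campaign document with user_id recorded
    in its multi-valued enteredUsers field (steps 1 and 2)."""
    updated = dict(campaign_doc)
    entered = list(updated.get("enteredUsers", []))
    if user_id not in entered:
        entered.append(user_id)
    updated["enteredUsers"] = entered
    return updated


def build_user_filter(user_id):
    """Build a filter that hides every campaign this user has entered
    (step 3): a single negated term, however many campaigns they joined."""
    return f"*:* -enteredUsers:{user_id}"


campaign = {"id": 7, "title": "Spring campaign"}
campaign = add_user_to_campaign(campaign, 42)
print(campaign["enteredUsers"])
print(build_user_filter(42))
```

The updated document still has to be re-added to the index and committed, which is where the extra commits come from.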
Rachel
On 2/13/08, alexander lind <[EMAIL PROTECTED]> wrote:
Hi all
Say that I have a solr index with 5000 documents, each representing a
campaign that users of my site can join. The user can search and find
these campaigns in various ways, which is not a problem, but once a
user has found a campaign and joined it, I don't want that campaign to
ever show up again for that particular user.
After a while, a user can have built up a list of say 200 campaigns
that he has joined, and hence should never see in any search results
again.
I know this functionality could be achieved by simply building a
longer and longer negation query negating all the campaigns that a
user has already joined. I would assume that this would eventually
become slow and inefficient.
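For reference, the growing negation query described above might be built like this in Python. The unique key field is assumed to be called `id`, and the function name is made up for illustration.

```python
def build_exclusion_filter(joined_campaign_ids):
    """Build a filter that negates every campaign the user has joined.
    The clause list grows by one entry per joined campaign."""
    if not joined_campaign_ids:
        return "*:*"
    ids = " OR ".join(str(cid) for cid in joined_campaign_ids)
    return f"*:* -id:({ids})"


# A user with three joined campaigns; with 200 the list has 200 entries.
print(build_exclusion_filter([11, 12, 13]))
```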
My question is: is there a better way to do this?
Thanks
Alec