We have a similar use case and I have raised an issue for it (SOLR-880). Currently we are using an internal patch, and we hope to submit one soon.
We also use an LRU-based automatic loading/unloading feature: if a request comes in for a core that is 'STOPPED', the core is 'STARTED' and the request is served. We keep an upper limit on the number of cores kept loaded, and if the limit is crossed, the least recently used core is 'STOPPED'. (A rough sketch of that bookkeeping is at the end of this message.)

--Noble

On Fri, Feb 20, 2009 at 8:53 AM, Otis Gospodnetic <otis_gospodne...@yahoo.com> wrote:
>
> I've used a similar strategy for Simpy.com, but with raw Lucene and not Solr.
> The crucial piece is to close (inactive) user indices periodically and thus
> free the memory. Are you doing the same with your per-user Solr cores and
> still running into memory issues?
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
>
>
> ----- Original Message ----
>> From: Mark Ferguson <mark.a.fergu...@gmail.com>
>> To: solr-user@lucene.apache.org
>> Sent: Friday, February 20, 2009 1:14:15 AM
>> Subject: Near real-time search of user data
>>
>> Hi,
>>
>> I am trying to come up with a strategy for a Solr setup in which a user's
>> indexed data can be nearly immediately available to them for search. My
>> current strategy (which is starting to cause problems) is as follows:
>>
>> - each user has their own personal index (core), which gets committed
>>   after each update
>> - there is a main index which is basically an aggregate of all user
>>   indexes. This index gets committed every 5 minutes or so.
>>
>> In this way, I can search a user's personal index to get real-time results,
>> and concatenate the world results from the main index, which aren't as
>> important to be immediate.
>>
>> This multicore strategy worked well in test scenarios, but as the user
>> indexes get larger it is starting to fall apart as I run into memory issues
>> in maintaining too many cores. It's not realistic to dedicate a new machine
>> to every 5K-10K users, and I think this is what I will have to do to maintain
>> the multicore strategy.
>>
>> So I am hoping that someone will be able to provide some tips on how to
>> accomplish what I am looking for. One option is to simply send a commit to
>> the main index every couple of seconds, but I was hoping someone with
>> experience could shed some light on whether this is a viable option before I
>> attempt that route (i.e. can commits be sent that frequently on a large
>> index?). The indexes are distributed, but they could still be in the 2-100GB
>> range.
>>
>> Thanks very much for any suggestions!
>>
>> Mark
>
>

--
--Noble Paul
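
For anyone curious what the load/unload bookkeeping might look like, here is a rough, self-contained sketch of the LRU idea described above. It is not the SOLR-880 patch: the loadCore/unloadCore hooks are placeholders for whatever actually opens and closes a SolrCore, and the access-ordered LinkedHashMap is just one convenient way to track recency.

import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Sketch only: keep at most maxLoaded cores 'STARTED'; when the limit is
 * crossed, 'STOP' the least recently used one. The load/unload methods
 * below are hypothetical hooks, not real Solr APIs.
 */
public class LruCoreManager {

    private final int maxLoaded;

    // An access-ordered LinkedHashMap gives LRU ordering for free;
    // removeEldestEntry is consulted after every put.
    private final Map<String, Object> loadedCores =
        new LinkedHashMap<String, Object>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Object> eldest) {
                if (size() > maxLoaded) {
                    unloadCore(eldest.getKey()); // 'STOP' the LRU core
                    return true;                 // and drop it from the map
                }
                return false;
            }
        };

    public LruCoreManager(int maxLoaded) {
        this.maxLoaded = maxLoaded;
    }

    /** Called per request: returns a started core, loading it on demand. */
    public synchronized Object getCore(String coreName) {
        Object core = loadedCores.get(coreName); // also marks it recently used
        if (core == null) {
            core = loadCore(coreName);           // 'START' a stopped core
            loadedCores.put(coreName, core);     // may evict the LRU core
        }
        return core;
    }

    // Placeholder hooks: a real patch would open/close the underlying
    // SolrCore here (committing/flushing before close as needed).
    private Object loadCore(String coreName)  { return new Object(); }
    private void unloadCore(String coreName)  { }
}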