If you must have real-time search, you might look at systems that are designed 
to do that. MarkLogic isn't free, but it is fast and real-time. You can use 
their no-charge Express license for development and prototyping: 
http://developer.marklogic.com/express

OK, back to Solr.

wunder
Search Guy, Chegg
former MarkLogic engineer

On Mar 29, 2012, at 1:49 AM, Rafal Gwizdala wrote:

> That's bad news.
> If 5-7 seconds is not safe then what is the safe interval for updates?
> Near real-time is not for me as it works only when querying by document ID
> - this doesn't solve anything in my case. I just want the index to be
> updated in real-time, 30-40 seconds delay is acceptable but not much more
> than that. Is there anything that can be done, or should I start looking
> for some other indexing tool?
> I'm wondering why there's such terrible performance degradation over time -
> SOLR runs fine for the first 10-20 hours, updates are extremely fast and then
> they become slower and slower until eventually they stop executing at all.
> Is there any issue with garbage collection or index fragmentation or some
> internal data structures that can't manage their data effectively when
> updates are frequent?
> 
> Best regards
> RG
> 
> 
> On Thu, Mar 29, 2012 at 10:24 AM, Lance Norskog <goks...@gmail.com> wrote:
> 
>> 5-7 seconds: there's the problem. If you want to have documents
>> visible for search within that time, you want to use the trunk and
>> "near-real-time" search. A hard commit does several hard writes to the
>> disk (with the fsync() system call). It does not run smoothly at that
>> rate. It is no surprise that eventually you hit a thread-locking bug.
>> 
>> 
>> http://www.lucidimagination.com/search/link?url=http://wiki.apache.org/solr/RealTimeGet
>> 
>> http://www.lucidimagination.com/search/link?url=http://wiki.apache.org/solr/CommitWithin
>> 
>> On Wed, Mar 28, 2012 at 11:08 PM, Rafal Gwizdala
>> <rafal.gwizd...@gmail.com> wrote:
>>> Lance, I know there are many variables, that's why I'm asking where to start
>>> and what to check.
>>> Updates are sent every 5-7 seconds, each update contains between 1 and 50
>>> docs. Commit is done every time (on each update).
>>> Currently queries aren't very frequent - about 1 query every 3-5 seconds,
>>> but the system is going to handle much more (of course if the problem is
>>> fixed).
>>> The system has a 2-core CPU (virtualized) and 4 GB of memory (SOLR uses
>>> about 300 MB)
>>> 
>>> R
>>> 
>>> On Thu, Mar 29, 2012 at 1:53 AM, Lance Norskog <goks...@gmail.com> wrote:
>>> 
>>>> How often are updates? And when are commits? How many CPUs? How much
>>>> query load? There are so many variables.
>>>> 
>>>> Check the mailing list archives and Solr issues, there might be a
>>>> similar problem already discussed. Also, attachments do not work with
>>>> Apache mailing lists. (Well, ok, they work for direct subscribers, but
>>>> not for indirect subscribers and archive site users.)
>>>> 
>>>> --
>>>> Lance Norskog
>>>> goks...@gmail.com
>>>> 
>> 
>> 
>> 
>> --
>> Lance Norskog
>> goks...@gmail.com
>> 
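
For reference, the CommitWithin approach linked in the thread can be sketched
roughly as follows. This is a minimal sketch, not code from anyone in the
thread: it assumes a Solr instance at http://localhost:8983/solr/update (a
hypothetical URL), and the helper names build_add_message and send_update are
made up for illustration. Instead of issuing a hard commit after every update,
each batch carries a commitWithin value in milliseconds and Solr schedules the
commit itself. The 30000 ms default mirrors the 30-40 second delay described
above as acceptable.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Assumption: a local Solr instance; adjust to your own host and core.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update"


def build_add_message(docs, commit_within_ms):
    """Build a Solr XML <add> message carrying a commitWithin attribute.

    docs is a list of dicts mapping field names to values.
    """
    add = ET.Element("add", {"commitWithin": str(commit_within_ms)})
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            field = ET.SubElement(doc_el, "field", {"name": name})
            field.text = str(value)
    return ET.tostring(add, encoding="utf-8")


def send_update(docs, commit_within_ms=30000, url=SOLR_UPDATE_URL):
    """POST one batch of documents without sending an explicit commit."""
    body = build_add_message(docs, commit_within_ms)
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

A batch would then be sent with something like
send_update([{"id": "1", "title": "first doc"}]), with no separate commit
request afterwards. Whether this avoids the slowdown described in the thread
would still need to be checked against the actual setup.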



