If you must have real-time search, you might look at systems that are designed to do that. MarkLogic isn't free, but it is fast and real-time. You can use their no-charge Express license for development and prototyping: http://developer.marklogic.com/express
OK, back to Solr.

wunder
Search Guy, Chegg
former MarkLogic engineer

On Mar 29, 2012, at 1:49 AM, Rafal Gwizdala wrote:

> That's bad news.
> If 5-7 seconds is not safe then what is the safe interval for updates?
> Near real-time is not for me as it works only when querying by document Id
> - this doesn't solve anything in my case. I just want the index to be
> updated in real-time, 30-40 seconds delay is acceptable but not much more
> than that. Is there anything that can be done, or should I start looking
> for some other indexing tool?
> I'm wondering why there's such terrible performance degradation over time -
> SOLR runs fine for first 10-20 hours, updates are extremely fast and then
> they become slower and slower until eventually they stop executing at all.
> Is there any issue with garbage collection or index fragmentation or some
> internal data structures that can't manage their data effectively when
> updates are frequent?
>
> Best regards
> RG
>
>
> Thu, Mar 29, 2012 at 10:24 AM, Lance Norskog <goks...@gmail.com> wrote:
>
>> 5-7 seconds- there's the problem. If you want to have documents
>> visible for search within that time, you want to use the trunk and
>> "near-real-time" search. A hard commit does several hard writes to the
>> disk (with the fsync() system call). It does not run smoothly at that
>> rate. It is no surprise that eventually you hit a thread-locking bug.
>>
>>
>> http://www.lucidimagination.com/search/link?url=http://wiki.apache.org/solr/RealTimeGet
>>
>> http://www.lucidimagination.com/search/link?url=http://wiki.apache.org/solr/CommitWithin
>>
>> On Wed, Mar 28, 2012 at 11:08 PM, Rafal Gwizdala
>> <rafal.gwizd...@gmail.com> wrote:
>>> Lance, I know there are many variables that's why I'm asking where to start
>>> and what to check.
>>> Updates are sent every 5-7 seconds, each update contains between 1 and 50
>>> docs. Commit is done every time (on each update).
>>> Currently queries aren't very frequent - about 1 query every 3-5 seconds,
>>> but the system is going to handle much more (of course if the problem is
>>> fixed).
>>> The system has 2 core CPU (virtualized) and 4 GB memory (SOLR uses about
>>> 300 MB)
>>>
>>> R
>>>
>>> On Thu, Mar 29, 2012 at 1:53 AM, Lance Norskog <goks...@gmail.com> wrote:
>>>
>>>> How often are updates? And when are commits? How many CPUs? How much
>>>> query load? There are so many variables.
>>>>
>>>> Check the mailing list archives and Solr issues, there might be a
>>>> similar problem already discussed. Also, attachments do not work with
>>>> Apache mailing lists. (Well, ok, they work for direct subscribers, but
>>>> not for indirect subscribers and archive site users.)
>>>>
>>>> --
>>>> Lance Norskog
>>>> goks...@gmail.com
>>>>
>>
>>
>> --
>> Lance Norskog
>> goks...@gmail.com
>>
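
For the 30-40 second target in the quoted thread, the CommitWithin approach linked there could look roughly like the sketch below. Instead of sending an explicit commit with every update, each add message carries a commitWithin deadline in milliseconds and Solr schedules the commit itself. The field names, the 30000 ms value, the helper names, and the update URL are all assumptions for illustration, not details of the setup discussed in the thread.

```python
import urllib.request
import xml.etree.ElementTree as ET


def build_add_message(docs, commit_within_ms=30000):
    """Build a Solr XML <add> message carrying a commitWithin deadline.

    docs is a list of dicts mapping field name to value (hypothetical fields).
    commit_within_ms is an assumed value chosen to fit a 30-40 second budget.
    """
    add = ET.Element("add", {"commitWithin": str(commit_within_ms)})
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            field = ET.SubElement(doc_el, "field", {"name": name})
            field.text = str(value)
    return ET.tostring(add, encoding="unicode")


def post_update(xml_body, update_url="http://localhost:8983/solr/update"):
    """POST the message to Solr without an explicit commit.

    update_url is an assumed default; adjust host, port, and core path.
    """
    request = urllib.request.Request(
        update_url,
        data=xml_body.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status


if __name__ == "__main__":
    # Only builds and prints the message; call post_update() against a real server.
    print(build_add_message([{"id": "1", "title": "hello"}]))
```

The point of the design is that many small updates arriving every few seconds share one commit per deadline window, rather than each forcing its own hard commit and fsync().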