Have you tried using Solr 3.5 with RankingAlgorithm 1.4.1? It has NRT support and is very fast: it updates about 5000 documents in about 490 ms (while updating 1M docs in batches of 5k).

You can get more info from here:
http://solr-ra.tgels.com/wiki/en/Near_Real_Time_Search_ver_3.x


Regards,

Nagendra Nagarajayya
http://solr-ra.tgels.org
http://rankingalgorithm.tgels.org

On 3/29/2012 1:49 AM, Rafal Gwizdala wrote:
That's bad news.
If 5-7 seconds is not safe then what is the safe interval for updates?
Near real-time is not for me, as it works only when querying by document ID
- this doesn't solve anything in my case. I just want the index to be
updated in real-time, 30-40 seconds delay is acceptable but not much more
than that. Is there anything that can be done, or should I start looking
for some other indexing tool?
I'm wondering why there's such terrible performance degradation over time -
Solr runs fine for the first 10-20 hours, updates are extremely fast, and then
they become slower and slower until eventually they stop executing altogether.
Is there any issue with garbage collection or index fragmentation or some
internal data structures that can't manage their data effectively when
updates are frequent?

Best regards
RG


On Thu, Mar 29, 2012 at 10:24 AM, Lance Norskog <goks...@gmail.com> wrote:

5-7 seconds: there's the problem. If you want to have documents
visible for search within that time, you want to use the trunk and
"near-real-time" search. A hard commit does several hard writes to the
disk (with the fsync() system call). It does not run smoothly at that
rate. It is no surprise that eventually you hit a thread-locking bug.
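
One way to avoid a hard commit on every update is to throttle commits on the client side. The following is only a minimal sketch of that idea, not something from this thread: the class name CommitThrottle, its methods, and the 30-second interval are all hypothetical, and the code that actually sends updates and commits to Solr is left out.

```python
import time


class CommitThrottle:
    """Decides when a hard commit is due, so commits are not sent on every update.

    Hypothetical helper for illustration; it does not talk to Solr itself.
    """

    def __init__(self, min_interval_s=30.0, clock=time.monotonic):
        self.min_interval_s = min_interval_s  # assumed minimum gap between commits
        self.clock = clock                    # injectable so the logic can be tested
        self.last_commit = clock()
        self.pending = 0                      # documents sent since the last commit

    def record_update(self, doc_count):
        """Note that doc_count documents were sent; return True if a commit is due."""
        self.pending += doc_count
        return self.commit_due()

    def commit_due(self):
        """A commit is due only if something is pending and the interval has passed."""
        elapsed = self.clock() - self.last_commit
        return self.pending > 0 and elapsed >= self.min_interval_s

    def mark_committed(self):
        """Call after a commit has been sent successfully."""
        self.pending = 0
        self.last_commit = self.clock()
```

The caller would send each batch of documents without a commit, call record_update(), and issue a commit only when it returns True. If updates can stop arriving, commit_due() should also be polled on a timer so the last pending documents still become visible.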


http://www.lucidimagination.com/search/link?url=http://wiki.apache.org/solr/RealTimeGet

http://www.lucidimagination.com/search/link?url=http://wiki.apache.org/solr/CommitWithin
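
As a rough illustration of the CommitWithin approach linked above, here is a small sketch that builds a Solr XML update message carrying a commitWithin attribute, so that no explicit commit has to follow each update. The helper name build_add_message, the field names (id, title), and the 30000 ms window are assumptions made for the example, not values from this thread.

```python
import xml.etree.ElementTree as ET


def build_add_message(docs, commit_within_ms=30000):
    """Build a Solr XML <add> message that asks Solr to commit within the given window.

    docs is a list of dicts mapping field name to value (field names are assumed here).
    """
    add = ET.Element("add", {"commitWithin": str(commit_within_ms)})
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            field = ET.SubElement(doc_el, "field", {"name": name})
            field.text = str(value)
    # encoding="unicode" returns a str rather than bytes
    return ET.tostring(add, encoding="unicode")


if __name__ == "__main__":
    print(build_add_message([{"id": "doc-1", "title": "hello"}]))
```

The resulting message would be POSTed to the update handler as XML, with no separate commit message afterwards; Solr then decides when to commit within the requested window.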

On Wed, Mar 28, 2012 at 11:08 PM, Rafal Gwizdala
<rafal.gwizd...@gmail.com> wrote:
Lance, I know there are many variables; that's why I'm asking where to start
and what to check.
Updates are sent every 5-7 seconds, each update contains between 1 and 50
docs. Commit is done every time (on each update).
Currently queries aren't very frequent - about 1 query every 3-5 seconds,
but the system is going to handle much more (of course if the problem is
fixed).
The system has a 2-core CPU (virtualized) and 4 GB of memory (Solr uses about
300 MB).

R

On Thu, Mar 29, 2012 at 1:53 AM, Lance Norskog <goks...@gmail.com> wrote:
How often are updates? And when are commits? How many CPUs? How much
query load? There are so many variables.

Check the mailing list archives and Solr issues, there might be a
similar problem already discussed. Also, attachments do not work with
Apache mailing lists. (Well, ok, they work for direct subscribers, but
not for indirect subscribers and archive site users.)

--
Lance Norskog
goks...@gmail.com




