I've never seen a 1/2-second visibility guarantee mentioned in any
Lucene-based project. Are there any such projects?

You can get a 30-second garbage collection pause with 8-16 GB of RAM.


On 4/16/10, Peter Sturge <peter.stu...@googlemail.com> wrote:
> Hi Don,
>
> We've got a similar requirement in our environment - here's what we've
> found..
> Every time you commit, you're doing a relatively disk-I/O-intensive operation
> to flush the pending document(s) into the index and make them searchable.
>
> For very small indexes (say, <10,000 docs), the commit time is pretty short
> and you can get away with doing frequent commits. With large indexes,
> commits can take seconds to complete, and use a fair bit of CPU & disk
> resources along the way. This of course impacts search performance, and it
> won't get your docs searchable within your 500ms requirement.
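>
> Purely to illustrate where that cost shows up, here's a rough SolrJ-style
> sketch of the "explicit commit per update" pattern (the core URL and field
> names are made up, and the client class is from newer SolrJ releases, so
> adjust for your version):
>
>     import org.apache.solr.client.solrj.impl.HttpSolrClient;
>     import org.apache.solr.common.SolrInputDocument;
>
>     public class CommitPerUpdate {
>         public static void main(String[] args) throws Exception {
>             // Hypothetical core URL, for illustration only.
>             try (HttpSolrClient solr = new HttpSolrClient.Builder(
>                     "http://localhost:8983/solr/helpdesk").build()) {
>                 SolrInputDocument doc = new SolrInputDocument();
>                 doc.addField("id", "ticket-42");
>                 doc.addField("subject", "printer is on fire");
>                 solr.add(doc);
>                 // The expensive step: flushes segments, fsyncs them to disk,
>                 // and opens a new searcher on every call.
>                 solr.commit();
>             }
>         }
>     }
>
> Doing that commit() for every update is exactly the pattern that stops
> scaling once the index gets big.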
>
> The planned NRT (near real-time) feature (I believe scheduled for 1.5?) is
> probably what you need, where Lucene commits are done on a per-segment
> basis.
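>
> For what it's worth, the rough shape of this is a reader opened straight
> from the IndexWriter, so newly written segments are searchable without a
> full commit. The sketch below uses class names from later Lucene releases
> than what ships with Solr today, purely to show the idea:
>
>     import org.apache.lucene.analysis.standard.StandardAnalyzer;
>     import org.apache.lucene.document.Document;
>     import org.apache.lucene.document.Field;
>     import org.apache.lucene.document.StringField;
>     import org.apache.lucene.index.DirectoryReader;
>     import org.apache.lucene.index.IndexWriter;
>     import org.apache.lucene.index.IndexWriterConfig;
>     import org.apache.lucene.index.Term;
>     import org.apache.lucene.search.IndexSearcher;
>     import org.apache.lucene.search.TermQuery;
>     import org.apache.lucene.store.ByteBuffersDirectory;
>     import org.apache.lucene.store.Directory;
>
>     public class NrtSketch {
>         public static void main(String[] args) throws Exception {
>             Directory dir = new ByteBuffersDirectory();
>             IndexWriter writer = new IndexWriter(dir,
>                     new IndexWriterConfig(new StandardAnalyzer()));
>
>             Document doc = new Document();
>             doc.add(new StringField("id", "1", Field.Store.YES));
>             writer.addDocument(doc);
>
>             // NRT: open a reader directly from the writer; newly flushed
>             // segments become searchable without a full commit/fsync.
>             try (DirectoryReader reader = DirectoryReader.open(writer)) {
>                 IndexSearcher searcher = new IndexSearcher(reader);
>                 System.out.println(
>                         searcher.count(new TermQuery(new Term("id", "1"))));
>             }
>             writer.close();
>         }
>     }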
>
> You could also check out the Zoie plugin, but make sure you're not also
> committing to disk straightaway, and that you don't mind having to re-index
> some data if your server crashes (Zoie uses an in-memory lookup for new doc
> insertions).
>
> HTH
> Peter
>
>
> On Fri, Apr 16, 2010 at 10:13 AM, Don Werve <d...@madwombat.com> wrote:
>
>> We're using Solr as the backbone for our shiny new helpdesk application,
>> and
>> by and large it's been a big win... especially in terms of search
>> performance.  But before I pat myself on the back because the Solr devs
>> have
>> done a great job, I have a question regarding commit frequency.
>>
>> While our app doesn't need truly real-time search, documents get updated
>> and
>> replaced somewhat frequently, and those changes need to be visible in the
>> index within 500ms.  At the moment, I'm using autocommit to satisfy this,
>> but I've run across a few threads mentioning that frequent commits may
>> cause
>> some serious performance issues.
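>>
>> For concreteness, an update on our side looks roughly like the sketch
>> below (field names are placeholders, and the exact add(doc, commitWithinMs)
>> signature may vary by SolrJ version). The commitWithin parameter shown here
>> is the client-side alternative I'm weighing against the server-side
>> autocommit we use now:
>>
>>     import org.apache.solr.client.solrj.impl.HttpSolrClient;
>>     import org.apache.solr.common.SolrInputDocument;
>>
>>     public class UpdateTicket {
>>         public static void main(String[] args) throws Exception {
>>             try (HttpSolrClient solr = new HttpSolrClient.Builder(
>>                     "http://localhost:8983/solr/helpdesk").build()) {
>>                 SolrInputDocument doc = new SolrInputDocument();
>>                 doc.addField("id", "ticket-42");     // same id replaces the old doc
>>                 doc.addField("status", "resolved");
>>                 // Ask Solr to make this update visible within 500 ms
>>                 // without an explicit commit from the client.
>>                 solr.add(doc, 500);
>>             }
>>         }
>>     }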
>>
>> Our average document size is quite small (less than 10k), and I'm
>> expecting
>> that we're going to have a maximum of around 100k documents per day on any
>> given index; most of these will be replacing existing documents.
>>
>> So, rather than getting bitten by this down the road, I figure I may as
>> well
>> (a) ask if anybody else here is running a similar setup or has any input,
>> and then (b) do some heavy load testing via a fake data generator.
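>>
>> For (b), the generator will look something like this sketch (document
>> shape, sizes, and rates are invented for illustration; most adds reuse an
>> existing id so they exercise the replace path):
>>
>>     import java.util.UUID;
>>     import java.util.concurrent.ThreadLocalRandom;
>>     import org.apache.solr.client.solrj.impl.HttpSolrClient;
>>     import org.apache.solr.common.SolrInputDocument;
>>
>>     public class FakeLoad {
>>         public static void main(String[] args) throws Exception {
>>             try (HttpSolrClient solr = new HttpSolrClient.Builder(
>>                     "http://localhost:8983/solr/helpdesk").build()) {
>>                 for (int i = 0; i < 100_000; i++) {   // ~one day's volume
>>                     SolrInputDocument doc = new SolrInputDocument();
>>                     // Small id space => most adds replace an existing doc.
>>                     doc.addField("id", "ticket-"
>>                             + ThreadLocalRandom.current().nextInt(20_000));
>>                     doc.addField("body",
>>                             UUID.randomUUID().toString().repeat(100)); // a few KB
>>                     solr.add(doc, 500);               // commitWithin 500 ms
>>                 }
>>             }
>>         }
>>     }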
>>
>> Thanks-in-advance!
>>
>


-- 
Lance Norskog
goks...@gmail.com
