I've never seen mention of a 1/2 second guarantee in any Lucene project. Are there any such projects?
You can get a 30-second garbage collection pause with 8-16G of RAM.

On 4/16/10, Peter Sturge <peter.stu...@googlemail.com> wrote:
> Hi Don,
>
> We've got a similar requirement in our environment - here's what we've
> found: every time you commit, you're doing a relatively disk-I/O-intensive
> task to insert the document(s) into the index.
>
> For very small indexes (say, <10,000 docs), the commit time is pretty short
> and you can get away with doing frequent commits. With large indexes,
> commits can take seconds to complete, and use a fair bit of CPU & disk
> resource along the way. This of course impacts search performance, and it
> won't get your docs searchable within your 500ms requirement.
>
> The planned NRT (near-real-time) feature (I believe scheduled for 1.5?) is
> probably what you need, where Lucene commits are done on a per-segment
> basis.
>
> You could also check out the Zoie plugin, but make sure you're not also
> committing to disk straight away, and that you don't mind having to re-input
> some data if your server crashes (Zoie uses an in-memory lookup for new doc
> insertions).
>
> HTH
> Peter
>
>
> On Fri, Apr 16, 2010 at 10:13 AM, Don Werve <d...@madwombat.com> wrote:
>
>> We're using Solr as the backbone for our shiny new helpdesk application,
>> and by and large it's been a big win... especially in terms of search
>> performance. But before I pat myself on the back because the Solr devs
>> have done a great job, I had a question regarding commit frequency.
>>
>> While our app doesn't need truly realtime search, documents get updated
>> and replaced somewhat frequently, and those changes need to be visible in
>> the index within 500ms. At the moment, I'm using autocommit to satisfy
>> this, but I've run across a few threads mentioning that frequent commits
>> may cause some serious performance issues.
>>
>> Our average document size is quite small (less than 10k), and I'm
>> expecting that we're going to have a maximum of around 100k documents per
>> day on any given index; most of these will be replacing existing
>> documents.
>>
>> So, rather than getting bitten by this down the road, I figure I may as
>> well (a) ask if anybody else here is running a similar setup or has any
>> input, and then (b) do some heavy load testing via a fake data generator.
>>
>> Thanks-in-advance!
>>

--
Lance Norskog
goks...@gmail.com
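
[For reference, the autocommit behaviour Don describes lives in solrconfig.xml
under the update handler. A minimal sketch of that block follows; the 500ms and
10,000-doc thresholds are illustrative values for this thread's requirement, not
recommendations, and the right numbers depend on index size and hardware:

    <updateHandler class="solr.DirectUpdateHandler2">
      <!-- Commit pending documents automatically when either limit is hit. -->
      <autoCommit>
        <!-- maxTime is in milliseconds: commit at most ~500ms after an add arrives. -->
        <maxTime>500</maxTime>
        <!-- Also commit once this many uncommitted documents have accumulated. -->
        <maxDocs>10000</maxDocs>
      </autoCommit>
    </updateHandler>

As the thread notes, every autocommit here is a full on-disk Lucene commit, so a
500ms maxTime on a large index can keep the server busy committing and hurt search
performance; reducing that cost is exactly what the per-segment NRT work and the
Zoie in-memory approach are aimed at.]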