On 8/8/06, bo_b <[EMAIL PROTECTED]> wrote:
As mentioned in another post, I am trying to index a vBulletin database
containing roughly 7 million posts. The very first query that applies
sorting after a full indexing takes roughly <QTime>264998</QTime>
ms. Subsequent searches are fast.

I figure the reason is, as Chris explained here
(http://www.mail-archive.com/solr-user@lucene.apache.org/msg00457.html),
that

"Sorting on a field requires building a FieldCache for every document --
regardless of how many documents match your query.  This cache is reused
for all searches that sort on that field."

However my problem is that I would like to be able to incrementally add new
postings to the index, as they occur.

And it appears that if I add just 1
post and do a <commit>, Solr/Lucene rebuilds the FieldCaches for the entire
index, not just the newly added posts, thus rendering my index unsearchable
for the next roughly 264 seconds (at least for sorting queries).

Warming (either normal or auto-warming) will solve the problem of the
long first search.
Warming is done in the background, so no "real" live queries will see
that long delay.
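
For reference, warming of this kind is usually set up with a newSearcher
listener in solrconfig.xml. The fragment below is a minimal sketch, not a
tested configuration: the field name "postdate" and the query term are
hypothetical, and the separate sort parameter assumes a Solr version that
accepts one (older releases put the sort after a semicolon in q).

```xml
<!-- solrconfig.xml, typically inside the <query> section -->
<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <!-- A sorted query, run against the new searcher in the background,
         so the FieldCache for the sort field is built before live
         requests are routed to it. "postdate" is a hypothetical field. -->
    <lst>
      <str name="q">test</str>
      <str name="sort">postdate desc</str>
      <str name="rows">1</str>
    </lst>
  </arr>
</listener>
```

The query itself does not need to match much; what matters is that it sorts
on the same field the live queries sort on.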

That said, 264 seconds is a *long* time to build a FieldCache entry,
even for 6M documents.  Make sure that you have enough heap size and
that running out of memory isn't causing the GC to hog the CPU.

That doesn't solve the <commit> after every <add> problem though.
That's not the type of thing that Lucene (and Solr) are optimized for.
Most search collections can tolerate a few minutes of lag until new
documents become searchable.
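
As a rough illustration of batching, assuming the standard XML update
messages: new posts are sent as <add> requests when they arrive, and a
single <commit/> is sent periodically (say, every few minutes) instead of
after each one. The field names and values are hypothetical, and each
top-level element below is a separate POST to the update handler.

```xml
<!-- POST 1: add a new post; it is not searchable yet -->
<add>
  <doc>
    <field name="id">1001</field>
    <field name="body">first new post</field>
  </doc>
</add>

<!-- POST 2: another add, still no commit -->
<add>
  <doc>
    <field name="id">1002</field>
    <field name="body">second new post</field>
  </doc>
</add>

<!-- POST 3, sent on a timer: one commit makes all pending adds
     searchable and triggers a single round of warming -->
<commit/>
```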

-Yonik
