We've moved past this issue by reducing date precision -- thanks to all for the help. Now we've hit another problem.

There is relatively constant updating of the index -- new log entries are pumped in from several applications continuously. Obviously, new entries do not appear in searches until after a commit occurs.

The problem is, issuing a commit causes searches to come to a screeching halt for up to 2 minutes. We're up to around 80M docs. Index size is 27G. The number of docs will soon be 800M, which doesn't bode well for these "pauses" in search performance.

I'd appreciate any suggestions.

---
Alok K. Dhir
Symplicity Corporation
www.symplicity.com
(703) 351-0200 x 8080
[EMAIL PROTECTED]

On Oct 29, 2008, at 4:30 PM, Alok Dhir wrote:

Hi -- using solr 1.3 -- roughly 11M docs on a 64 gig 8 core machine.

Fairly simple schema -- no large text fields, standard request handler. 4 small facet fields.

The index is an event log -- a primary search/retrieval requirement is date range queries.

A simple query without a date range subquery is ridiculously fast - 2ms. The same query with a date range takes up to 30s (30,000ms).

Concrete example: this query just took 18s:

instance:client\-csm.symplicity.com AND dt:[2008-10-01T04:00:00Z TO 2008-10-30T03:59:59Z] AND label_facet:"Added to Position"

The exact same query without the date range took 2ms.

I saw a thread from Apr 2008 which explains the problem as being due to too much precision on the DateField type, with the range expansion leading to far too many elements being checked. The proposed solution appears to be a hack where you index date fields as strings and hack together date functions to generate proper queries and format results.

Does this remain the recommended solution to this issue?
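
To illustrate the precision issue, here is a rough sketch in Python (not Solr code) that counts how many distinct timestamp values the date range in the query above could span at one-second resolution versus after rounding to the hour. The hour granularity and the helper names (parse_solr, truncate_to_hour, max_distinct_values) are assumptions made for illustration only. The counts are upper bounds, and the number of values actually present depends on the data in the index.

```python
from datetime import datetime, timezone

# Timestamp layout used in the query above (UTC, second resolution).
SOLR_FMT = "%Y-%m-%dT%H:%M:%SZ"


def parse_solr(value):
    """Parse a timestamp such as 2008-10-01T04:00:00Z into an aware datetime."""
    return datetime.strptime(value, SOLR_FMT).replace(tzinfo=timezone.utc)


def truncate_to_hour(dt):
    """Drop minutes and seconds (hypothetical precision choice for indexing)."""
    return dt.replace(minute=0, second=0, microsecond=0)


def max_distinct_values(start, end, step_seconds):
    """Upper bound on distinct values in [start, end] at the given step size."""
    span = int((end - start).total_seconds())
    return span // step_seconds + 1


start = parse_solr("2008-10-01T04:00:00Z")
end = parse_solr("2008-10-30T03:59:59Z")

# Full precision: every second in the range is a potential distinct value.
per_second = max_distinct_values(start, end, 1)

# Hour precision: both endpoints and all indexed values are rounded down.
per_hour = max_distinct_values(truncate_to_hour(start), truncate_to_hour(end), 3600)

print(per_second)  # 2505600
print(per_hour)    # 696
```

Under these assumptions, a value would be rounded before indexing with truncate_to_hour(dt).strftime(SOLR_FMT), at the cost of no longer being able to search at finer than hourly resolution on that field.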

Thanks

