On Mon, Mar 18, 2013 at 7:08 PM, Jack Krupansky <j...@basetechnology.com> wrote:
> Hmmm... if query by your unique key field is killing your performance, maybe
> you have some larger problem to address.

This is almost certainly true.  I'm well outside the use cases
targeted by Solr/Lucene, and it's a testament to the quality of the
product that it works at all.  Among other things, I'm implementing a
graph database on top of Solr (it being easier to build a graph
database on top of Solr than it is to implement Solr on top of a graph
database).

Which is the problem: you might think that 60ms unique-key accesses
(what I'm seeing) are more than good enough, and for most use cases
you'd be right.  But it's not unusual for a single web-page hit to
generate many dozens, if not low hundreds, of get-document-by-id
calls.  At that point, 60ms hits pile up fast - a hundred calls is
six seconds of lookup time alone.

The current plan is to just cache the documents as files in the local
file system (or possibly other systems), and have the get document
calls go there instead, while complicated searches still go to Solr.
Fortunately, this isn't complicated.
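For what it's worth, the plan really is simple.  A minimal sketch of the
read path (names, cache location, and the Solr URL are all placeholders;
the Solr fetch assumes the stock /select handler with a string "id"
unique-key field):

```python
import json
import os
import tempfile
from urllib.parse import quote
from urllib.request import urlopen

# Hypothetical cache location and Solr core; adjust for a real deployment.
CACHE_DIR = tempfile.mkdtemp(prefix="doc-cache-")
SOLR_URL = "http://localhost:8983/solr/collection1"

def fetch_from_solr(doc_id):
    """Query Solr by unique key and return the single matching document."""
    url = '%s/select?q=id:%s&wt=json' % (SOLR_URL, quote('"%s"' % doc_id))
    with urlopen(url) as resp:
        docs = json.load(resp)["response"]["docs"]
    return docs[0] if docs else None

def get_document(doc_id, fetch=fetch_from_solr):
    """Serve get-document-by-id from the local file cache; fall back to Solr."""
    path = os.path.join(CACHE_DIR, "%s.json" % doc_id)
    if os.path.exists(path):
        # Cache hit: read the document straight from the file system.
        with open(path) as f:
            return json.load(f)
    # Cache miss: fetch from Solr and store the document for next time.
    doc = fetch(doc_id)
    if doc is not None:
        with open(path, "w") as f:
            json.dump(doc, f)
    return doc
```

Complicated searches still go through Solr directly; only the id lookups
take this path.  (Cache invalidation on document update is the part this
sketch leaves out, of course.)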

> How bad is it? Are you using the
> string field type? How long are your ids?

My ids start at 100 million and go up like a kite from there - thus
the string representation.

>
> The only thing the real-time GET API gives you is more immediate access to
> recently added, uncommitted data. Accessing older, committed data will be no
> faster. But if accessing that recent data is what you are after, real-time
> GET may do the trick.

OK, so this is good to know.  This answers question #1: GET isn't the
function I should be calling.  Thanks.

Brian
