Re: Performance problems for OR-queries

Jörg Kiegeland Thu, 22 Nov 2007 06:02:44 -0800

1. Does Solr support this kind of index access with better performance ?
Is there anything special to define in schema.xml?


No... Solr uses Lucene at its core, and all matching documents for a
query are scored.

So it is not possible to get "Google-like" performance with Solr, i.e. to
search for a set of keywords and have only the 10 best documents listed,
without touching the other millions of (web) documents that match fewer
keywords. In fact, I would not know how to program such an index; however,
Google has done it somehow...

2. Can one switch off this ordering and just return any 100 documents
fulfilling the query (though getting the best-matching documents would be
a nice feature if it were fast)?


A feature like this could be developed... but what is the use case for
this?  What are you trying to accomplish where either relevancy or
complete matching doesn't matter?  There may be an easier workaround
for your specific case.

This is not an actual use case for my project; I just wanted to
know whether it would be possible.

Because of the performance results, we designed a new type of query. I
would like to know how fast it would be before I implement it. The query
is the following:


I have N keywords and execute a query of the form

keyword1 AND keyword2 AND .. AND keywordN

There may again be some millions of matching documents, and I want to get
the first 100 documents.
As an ordering criterion, each Solr document has a field named "REV" which
holds a natural number. The 100 returned documents shall be those with
the lowest numbers in the "REV" field.
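
As a rough illustration of the request described above, the following sketch
builds the URL for such a query. The parameters q, sort and rows are standard
Solr request parameters; the host, port and /solr/select path, as well as the
helper name build_select_url, are assumptions made only for this example.

```python
from urllib.parse import urlencode


def build_select_url(keywords, base_url="http://localhost:8983/solr/select", rows=100):
    """Build a Solr select URL for 'kw1 AND kw2 AND ...' sorted by REV ascending.

    base_url is a placeholder; adjust it to the actual Solr installation.
    """
    params = {
        "q": " AND ".join(keywords),  # all keywords must match
        "sort": "REV asc",            # lowest REV values first
        "rows": rows,                 # return only the first `rows` documents
    }
    return base_url + "?" + urlencode(params)


url = build_select_url(["keyword1", "keyword2", "keyword3"])
print(url)
```

Building the URL says nothing about how much work the server does to answer
it, which is exactly what the questions below are about.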

My questions now are:

(1) Will the query perform in O(100) or in O(all possible matches)?

(2) If the answer to (1) is O(all possible matches), what will the
performance be if I don't order by the "REV" field? Will Solr then order the
results by the time at which a document was created/modified? What do I have
to do to finally get O(100) complexity?
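
To make the two cases in question (1) concrete, here is a toy sketch in plain
Python, independent of Solr and Lucene. It only contrasts the two complexity
classes and is not a description of how Solr executes the query. The function
names and the (rev, doc_id) tuples are made up for the illustration.

```python
import heapq
from itertools import islice


def top_k_scan_all(matches, k):
    """Matches arrive in arbitrary order, so every match must be inspected.

    Cost grows with the total number of matches M (roughly O(M log k)).
    """
    return heapq.nsmallest(k, matches)


def top_k_presorted(matches_sorted_by_rev, k):
    """Matches already arrive ordered by REV, so we can stop after k of them.

    Cost is O(k), but only if the input really is sorted by REV.
    """
    return list(islice(matches_sorted_by_rev, k))


# (rev, doc_id) pairs standing in for the matching documents
matches = [(5, "a"), (1, "b"), (3, "c"), (2, "d")]
print(top_k_scan_all(matches, 2))
print(top_k_presorted(sorted(matches), 2))
```

The second variant is the O(100) behaviour I am hoping for; it would require
the matches to be delivered in REV order in the first place.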


Thanks

Jörg
