My first suggestion would be to find something semantic to select on
other than arbitrary ids. For instance, are they all of a set of
categories, results from another query, etc? It is odd to have a
list of 10k completely arbitrary items.
Other than that, I don't think there is any way to handle that
efficiently with Solr out of the box. There is probably some way to
build custom logic inside Solr to handle this efficiently, but it
would require some fancy code in the internals.
-Mike
On 17-Jan-08, at 1:08 PM, [EMAIL PROTECTED] wrote:
Sorry, my fault, please disregard my previous email. It always
returns the right number of documents; I sent duplicate IDs after
the first 1000.
Sorry to bother you, I'll go get some coffee :)
But any suggestions for solving this problem of running big queries
are still welcome.
Thank you.
----- Original Message ----
From: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
To: solr-user@lucene.apache.org
Sent: Thursday, January 17, 2008 4:00:58 PM
Subject: Re: Big number of conditions of the search
I still want to try the simple idea of combining all those IDs in
one query.
Again, let's say I have a list of IDs in some file. I need to get
all docs from Solr with those IDs.
I run the query:
id:(123 234 456 ***more IDs go like this, I was planning to have
about 10K IDs***)
Solr has a property that limits the number of Boolean clauses in one
query. The default value is 1024. I increased it to 10,000. I also
increased the HTTP header size in Tomcat.
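An alternative to raising the clause limit is to split the ID list
into batches that stay under it. The sketch below is only an
illustration: the function name build_id_queries is hypothetical, the
1024 default is the clause limit mentioned above, and the
space-separated id:(...) form assumes the default query operator is
OR. It also drops duplicate IDs before batching.

```python
from typing import Iterable, List


def build_id_queries(ids: Iterable[str], max_clauses: int = 1024) -> List[str]:
    """Return one `id:(...)` query string per batch of at most max_clauses IDs.

    Hypothetical helper: it only builds query strings and sends nothing to Solr.
    """
    # dict.fromkeys drops duplicates while keeping first-seen order
    unique_ids = list(dict.fromkeys(ids))
    queries = []
    for start in range(0, len(unique_ids), max_clauses):
        batch = unique_ids[start:start + max_clauses]
        # Space-separated terms rely on the default operator being OR (assumption)
        queries.append("id:(" + " ".join(batch) + ")")
    return queries


if __name__ == "__main__":
    # Tiny demonstration with a limit of 2 clauses per query
    for query in build_id_queries(["123", "234", "123", "456"], max_clauses=2):
        print(query)
```

Each batch is then a separate request, so per-batch results have to be
combined on the client side.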
I created a query with 2,000 IDs, but Solr always shows in the
response that it found only 1,000 documents.
<result name="response" numFound="1000" start="0">
If the number of my IDs is greater than 1000, Solr always shows
numFound="1000".
If I send fewer than 1000 IDs, it shows the right total number.
Are any other settings related to this?
Could somebody give any other suggestions on how to solve this
problem of big queries? Any advice is helpful.
Thank you
Gene
----- Original Message ----
From: Otis Gospodnetic <[EMAIL PROTECTED]>
To: solr-user@lucene.apache.org
Sent: Friday, January 11, 2008 12:26:14 AM
Subject: Re: Big number of conditions of the search
Evgeniy - sounds like a problem best suited for an RDBMS, really.
You can run such an OR query, but you'll have to manually increase
the max number of clauses allowed (in one of the configs) and make
sure the JVM has plenty of memory. But again, this is best done in
an RDBMS with some count(*) and GROUP BY selects.
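As a rough illustration of the count(*) and GROUP BY approach, here is
a minimal sketch using Python's built-in sqlite3 module. The table
names (addresses, wanted), the column names, and the helper
count_by_state are all assumptions made up for the example, not
anything from the original setup.

```python
import sqlite3


def count_by_state(conn: sqlite3.Connection, wanted_ids) -> dict:
    """Count addresses per state, restricted to the given IDs.

    Assumes a hypothetical table addresses(id INTEGER PRIMARY KEY, state TEXT).
    """
    # Load the ID list into a temporary table so the database can join on it
    conn.execute("CREATE TEMP TABLE IF NOT EXISTS wanted (id INTEGER PRIMARY KEY)")
    conn.execute("DELETE FROM wanted")
    # OR IGNORE silently skips duplicate IDs in the input list
    conn.executemany(
        "INSERT OR IGNORE INTO wanted (id) VALUES (?)",
        [(i,) for i in wanted_ids],
    )
    rows = conn.execute(
        "SELECT a.state, COUNT(*) "
        "FROM addresses a JOIN wanted w ON w.id = a.id "
        "GROUP BY a.state ORDER BY a.state"
    ).fetchall()
    return dict(rows)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE addresses (id INTEGER PRIMARY KEY, state TEXT)")
    conn.executemany(
        "INSERT INTO addresses VALUES (?, ?)",
        [(1, "NJ"), (2, "NY"), (3, "NJ"), (4, "CA")],
    )
    print(count_by_state(conn, [1, 3, 4, 4]))
```

Loading the IDs into a temporary table avoids building a 10K-term
WHERE clause and lets the database do the join and the grouping.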
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
----- Original Message ----
From: Evgeniy Strokin <[EMAIL PROTECTED]>
To: Solr User <solr-user@lucene.apache.org>
Sent: Thursday, January 10, 2008 4:39:44 PM
Subject: Big number of conditions of the search
Hello, I don't know how to formulate this right, so I'll give an
example:
I have 20 million documents indexed, each with a unique ID.
I have a list of IDs stored somewhere. I need to run a query that
takes the documents whose IDs are in my list and gives me some
statistics.
For example: my documents are addresses with unique IDs. I have a
list which contains 10 thousand IDs of some addresses. I need to find
how many addresses from my list are in NJ. Or another scenario: give
me all the states my addresses are from and how many addresses are in
each state (only addresses from my list).
So I was thinking I could run a facet search on the field "State",
but my query would look like this: ID:123 OR ID:23987 OR ID:294343
.... with 10K such OR conditions in a row, which is ridiculous and,
I think, not even possible.
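For illustration only, a facet request over a batch of IDs and the
merging of per-batch counts might be sketched as below. The helper
names facet_params and merge_facet_counts are hypothetical, and the
merge step assumes the facet counts come back as a flat
[value, count, value, count, ...] list, which is what the JSON
response writer produces by default as far as I know. The field name
"State" is taken from the message above.

```python
from collections import Counter
from urllib.parse import urlencode


def facet_params(id_query: str, facet_field: str = "State") -> str:
    """Build a URL query string asking only for facet counts."""
    return urlencode({
        "q": id_query,
        "rows": 0,               # no documents needed, only the counts
        "facet": "true",
        "facet.field": facet_field,
        "facet.limit": -1,       # return every facet value, not just the top ones
        "wt": "json",
    })


def merge_facet_counts(flat_lists) -> dict:
    """Sum per-value counts from several flat [value, count, ...] lists."""
    totals = Counter()
    for flat in flat_lists:
        # Even positions hold values, odd positions hold their counts
        for value, count in zip(flat[0::2], flat[1::2]):
            totals[value] += count
    return dict(totals)


if __name__ == "__main__":
    print(facet_params("ID:(123 23987 294343)"))
    print(merge_facet_counts([["NJ", 2, "NY", 1], ["NJ", 1, "CA", 3]]))
```

Summing is only correct when the batches contain disjoint IDs, so the
ID list should be deduplicated before it is split.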
Could somebody suggest some solution for this?
Thank you
Gene