My first suggestion would be to find something semantic to select on other than arbitrary ids. For instance, are they all of a set of categories, results from another query, etc? It is odd to have a list of 10k completely arbitrary items.

Other than that, I don't think there is any way to handle that efficiently with Solr out of the box. There is probably some way to add custom logic inside Solr to handle this efficiently, but it would require some fancy code in the internals.

-Mike

On 17-Jan-08, at 1:08 PM, [EMAIL PROTECTED] wrote:

Sorry, my fault, please disregard my previous email. It always returns the right number of documents. I sent duplicated IDs after the first 1000.
Sorry to bother you. I'll go get some coffee...))))

But any suggestions for solving this problem of running big queries are still welcome)))
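
One possible workaround, given the duplicate-ID mix-up described above, is to de-duplicate the list first and then send it in batches that stay under the Boolean clause limit. The sketch below is only an illustration: the helper names (unique_ids, batches, id_query, merge_counts) are made up for this example, the batch size of 1000 is an arbitrary value below the default limit of 1024, and it assumes the default query operator is OR so that id:(1 2) matches either ID. It only builds query strings and merges counts; sending the requests is left out.

```python
from collections import Counter
from itertools import islice


def unique_ids(ids):
    """Drop duplicate IDs while keeping first-seen order."""
    return list(dict.fromkeys(ids))


def batches(ids, size=1000):
    """Yield successive lists of at most `size` IDs."""
    it = iter(ids)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def id_query(chunk):
    """Build a query of the form id:(123 234 456) for one batch."""
    return "id:(" + " ".join(str(i) for i in chunk) + ")"


def merge_counts(per_batch_counts):
    """Sum per-batch facet counts, each given as a dict of value -> count."""
    total = Counter()
    for counts in per_batch_counts:
        total.update(counts)
    return dict(total)
```

Because the batches are disjoint after de-duplication and each ID is unique in the index, per-batch facet counts can simply be added together to get the totals for the whole list.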

Thank you...



----- Original Message ----
From: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
To: solr-user@lucene.apache.org
Sent: Thursday, January 17, 2008 4:00:58 PM
Subject: Re: Big number of conditions of the search

I still want to try the simple idea of combining all those IDs in one query. Again, let's say I have a list of IDs in some file. I need to get all docs from Solr with those IDs.
I run this query:
id:(123 234 456 ***more IDs go like this, I was planning to have about 10K IDs***)
Solr has a property which sets a limit on the number of Boolean conditions in one query. The default value is 1024. I increased it to 10,000. I increased the HTTP header size in Tomcat. I created a query with 2,000 IDs. But Solr always shows me in the response that it found only 1,000 documents:
<result name="response" numFound="1000" start="0">
If the number of IDs is greater than 1000, Solr always shows numFound="1000".
If I send fewer than 1000 IDs, it shows the right total number.
Are any other settings related to this?

Could somebody give any other suggestions on how to solve this problem of big queries? Any advice is helpful.

Thank you
Gene



----- Original Message ----
From: Otis Gospodnetic <[EMAIL PROTECTED]>
To: solr-user@lucene.apache.org
Sent: Friday, January 11, 2008 12:26:14 AM
Subject: Re: Big number of conditions of the search

Evgeniy - sounds like a problem best suited for an RDBMS, really.

You can run such an OR query, but you'll have to manually increase the max number of clauses allowed (maxBooleanClauses in solrconfig.xml) and make sure the JVM has plenty of memory. But again, this is best done in an RDBMS with some count(*) and GROUP BY selects.
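
To illustrate the count(*) and GROUP BY approach, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (addresses, id, state, wanted) and the sample rows are assumptions made up for the example, not anything from the actual data. The ID list is loaded into a temporary table and joined against the addresses, so duplicates in the list are ignored and no giant OR clause is needed.

```python
import sqlite3


def counts_by_state(conn, wanted_ids):
    """Return [(state, count), ...] for addresses whose id is in wanted_ids."""
    # Hypothetical temp table holding the ID list; duplicates are ignored.
    conn.execute("CREATE TEMP TABLE IF NOT EXISTS wanted (id INTEGER PRIMARY KEY)")
    conn.execute("DELETE FROM wanted")
    conn.executemany(
        "INSERT OR IGNORE INTO wanted (id) VALUES (?)",
        ((i,) for i in wanted_ids),
    )
    return conn.execute(
        "SELECT a.state, COUNT(*) FROM addresses AS a "
        "JOIN wanted AS w ON w.id = a.id "
        "GROUP BY a.state ORDER BY a.state"
    ).fetchall()


# Tiny in-memory data set standing in for the real address table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE addresses (id INTEGER PRIMARY KEY, state TEXT)")
conn.executemany(
    "INSERT INTO addresses (id, state) VALUES (?, ?)",
    [(1, "NJ"), (2, "NY"), (3, "NJ"), (4, "PA")],
)
print(counts_by_state(conn, [1, 3, 4, 3]))  # [('NJ', 2), ('PA', 1)]
```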

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
From: Evgeniy Strokin <[EMAIL PROTECTED]>
To: Solr User <solr-user@lucene.apache.org>
Sent: Thursday, January 10, 2008 4:39:44 PM
Subject: Big number of conditions of the search

Hello, I don't know how to formulate this right, so I'll give an example:
I have 20 million documents indexed, each with a unique ID.
I have a list of IDs stored somewhere. I need to run a query which will
take the documents with IDs from my list and give me some statistics.
For example: my documents are addresses with unique IDs. I have a list
which contains 10 thousand IDs of some addresses. I need to find how many
addresses from my list are in NJ. Or another scenario: give me all the
states my addresses are from and how many addresses are in each state (only
addresses from my list).

So I was thinking I could run a facet search on the field "State", but my
query would be like this: ID:123 OR ID:23987 OR ID:294343 .... 10K such OR
conditions in a row, which is ridiculous and not even possible, I think.
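
For reference, the facet request described above might be built roughly as follows. This is only a sketch: the function name facet_url and the base URL are placeholders for illustration, and the field names ID and State are taken from the example in this message. It uses the standard Solr request parameters q, rows, facet and facet.field, with rows=0 so that only the facet counts come back.

```python
from urllib.parse import urlencode


def facet_url(base_url, ids, facet_field="State"):
    """Build a select URL that ORs the given IDs and facets on one field."""
    query = " OR ".join(f"ID:{i}" for i in ids)
    params = {
        "q": query,
        "rows": 0,  # no documents needed, only the facet counts
        "facet": "true",
        "facet.field": facet_field,
    }
    return base_url + "?" + urlencode(params)


# Placeholder host and path; adjust to the actual Solr instance.
print(facet_url("http://localhost:8983/solr/select", [123, 23987, 294343]))
```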

Could somebody suggest some solution for this?

Thank you
Gene
