Hi

We retrieve ALL matching documents for every query; the index size is about
50k. We use a number of stored fields. The result set is often several
thousand docs.

We performed the following things to make it faster:

1. Use EmbeddedSolrServer
2. Patch Solr to avoid unnecessary marshalling while using
EmbeddedSolrServer (there's an issue in Solr JIRA)
3. Patch Solr to cache SolrDocument instances instead of Lucene's Document
instances. I was going to share this patch, but then decided that our usage
of Solr is not common and this functionality is useless in most cases.
4. We have all documents in cache
5. In fact, our index is stored in a data grid, not a file system. But as
our tests showed, this does not matter: the standard FSDirectory is faster
if you have enough free RAM for OS caches.
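
To illustrate the idea behind points 3 and 4, here is a minimal sketch in
plain Java. It is not the patch itself: the class name ConvertedDocCache and
the loader function are hypothetical, and nothing here uses Solr's own
classes. The point is only that the already-converted object is cached, so
the conversion cost is paid once per document rather than once per request.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.IntFunction;

/**
 * Hypothetical LRU cache of already-converted documents, keyed by doc id.
 * V stands for whatever response-ready representation you return to callers.
 */
public class ConvertedDocCache<V> {
    private final Map<Integer, V> cache;
    private final IntFunction<V> loader; // assumed: reads and converts one doc
    private int loads = 0;               // how many times the loader was called

    public ConvertedDocCache(final int maxEntries, IntFunction<V> loader) {
        this.loader = loader;
        // accessOrder=true makes iteration order least-recently-used first,
        // so removeEldestEntry evicts the LRU entry once the cap is exceeded.
        this.cache = new LinkedHashMap<Integer, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /** Returns the cached converted doc, loading and converting it on a miss. */
    public synchronized V get(int docId) {
        V doc = cache.get(docId);
        if (doc == null) {
            doc = loader.apply(docId);
            loads++;
            cache.put(docId, doc);
        }
        return doc;
    }

    public synchronized int loads() {
        return loads;
    }
}
```

If maxEntries is at least the number of documents in the index, every
document stays cached after its first load, which is the situation described
in point 4.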

These changes improved performance considerably, so in the end our
performance is comparable (about 3-5 times slower) to the "proper" Solr usage
(obtaining the first 20 documents).

To get more details on how the different Solr components perform, we injected
perf4j statements at key points in the code. A profiler was helpful too.
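
As a rough sketch of that kind of instrumentation, the class below times
named stages using only the JDK. It is a stand-in, not the perf4j API: the
StageTimer class and the tag names are made up for illustration. Wrapping the
search step and the document retrieval step separately should show which of
them accounts for the response time.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical per-stage timer; a plain-JDK stand-in for perf4j-style timing. */
public class StageTimer {
    // Per tag: [0] = accumulated nanoseconds, [1] = number of calls.
    private final Map<String, long[]> stats = new LinkedHashMap<>();

    /** Runs the work, records its elapsed time under the tag, returns its result. */
    public <T> T time(String tag, Supplier<T> work) {
        long start = System.nanoTime();
        try {
            return work.get();
        } finally {
            long elapsed = System.nanoTime() - start;
            synchronized (stats) {
                long[] s = stats.computeIfAbsent(tag, k -> new long[2]);
                s[0] += elapsed;
                s[1]++;
            }
        }
    }

    /** Number of recorded calls for the tag, or 0 if the tag was never used. */
    public long calls(String tag) {
        synchronized (stats) {
            long[] s = stats.get(tag);
            return s == null ? 0 : s[1];
        }
    }

    /** One line per tag with call count and average time in milliseconds. */
    public String report() {
        StringBuilder sb = new StringBuilder();
        synchronized (stats) {
            for (Map.Entry<String, long[]> e : stats.entrySet()) {
                long[] s = e.getValue();
                sb.append(String.format("%s: calls=%d, avg=%.3f ms%n",
                        e.getKey(), s[1], s[0] / 1e6 / s[1]));
            }
        }
        return sb.toString();
    }
}
```

Usage would look like timer.time("search", () -> runQuery(q)) followed by
timer.time("fetchDocs", () -> loadDocs(ids)), where runQuery and loadDocs are
placeholders for your own code.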

Hope it helps somehow.

On Thu, Nov 26, 2009 at 8:48 PM, Raghuveer Kancherla <
raghuveer.kanche...@aplopio.com> wrote:

> Hi,
> I am using Solr1.4 for searching through half a million documents. The
> problem is, I want to retrieve nearly 200 documents for each search query.
> The query time in Solr logs is showing 0.02 seconds and I am fairly happy
> with that. However Solr is taking a long time (4 to 5 secs) to return the
> results (I think it is because of the number of docs I am requesting). I
> tried returning only the id's (unique key) without any other stored fields,
> but it is not helping me improve the response times (time to return the
> id's
> of matching documents).
> I understand that retrieving 200 documents for each search term is
> impractical in most scenarios but I dont have any other option. Any
> pointers
> on how to improve the response times will be a great help.
>
> Thanks,
>  Raghu
>



-- 
Andrew Klochkov
Senior Software Engineer,
Grid Dynamics
