Hi,

We retrieve ALL documents for every query; the index size is about 50k documents. We use a number of stored fields. Often the result set is several thousand docs.
We did the following to make it faster:

1. Use EmbeddedSolrServer.
2. Patch Solr to avoid unnecessary marshalling when using EmbeddedSolrServer (there's an issue for this in the Solr JIRA).
3. Patch Solr to cache SolrDocument instances instead of Lucene's Document instances. I was going to share this patch, but then decided that our usage of Solr is not common and this functionality would be useless in most cases.
4. Keep all documents in the cache.
5. In fact our index is stored in a data grid, not a file system. But as tests showed, this is not important, because the standard FSDirectory is faster if you have enough free RAM for the OS caches.

These changes improved performance a great deal, so in the end our performance is comparable (about 3-5 times slower) to the "proper" Solr usage (obtaining the first 20 documents).

To get more details on how the different Solr components perform, we injected perf4j statements into key points in the code. A profiler was helpful too.

Hope it helps somehow.

On Thu, Nov 26, 2009 at 8:48 PM, Raghuveer Kancherla <raghuveer.kanche...@aplopio.com> wrote:

> Hi,
> I am using Solr1.4 for searching through half a million documents. The
> problem is, I want to retrieve nearly 200 documents for each search query.
> The query time in Solr logs is showing 0.02 seconds and I am fairly happy
> with that. However Solr is taking a long time (4 to 5 secs) to return the
> results (I think it is because of the number of docs I am requesting). I
> tried returning only the id's (unique key) without any other stored fields,
> but it is not helping me improve the response times (time to return the
> id's of matching documents).
> I understand that retrieving 200 documents for each search term is
> impractical in most scenarios but I dont have any other option. Any
> pointers on how to improve the response times will be a great help.
>
> Thanks,
> Raghu

--
Andrew Klochkov
Senior Software Engineer,
Grid Dynamics
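
P.S. To illustrate the idea behind caching already-converted documents (item 3 in the list of changes), here is a minimal, self-contained Java sketch. It is not the actual patch and does not use Solr's API: the class name ConvertedDocCache, the loader callback, and the LRU eviction policy are all assumptions made for illustration. The point is only that the expensive load-and-convert step runs once per document id, and later requests get the converted object straight from memory.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.IntFunction;

/**
 * Hypothetical sketch: an LRU cache of already-converted documents,
 * keyed by internal document id. Not Solr's cache implementation.
 */
class ConvertedDocCache<D> {
    private final Map<Integer, D> cache;
    private final IntFunction<D> loader; // expensive load + conversion step
    private int loads = 0;               // how many times the loader ran

    ConvertedDocCache(final int maxEntries, IntFunction<D> loader) {
        this.loader = loader;
        // accessOrder = true keeps entries in least-recently-used order,
        // so the "eldest" entry is the one that was touched longest ago.
        this.cache = new LinkedHashMap<Integer, D>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, D> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /** Returns the converted document, loading and converting it on a miss. */
    synchronized D get(int docId) {
        D doc = cache.get(docId);
        if (doc == null) {
            doc = loader.apply(docId);
            loads++;
            cache.put(docId, doc);
        }
        return doc;
    }

    synchronized int loadCount() {
        return loads;
    }

    public static void main(String[] args) {
        // Stand-in loader: a real one would read stored fields and convert them.
        ConvertedDocCache<Map<String, Object>> docs =
            new ConvertedDocCache<>(50_000, id -> {
                Map<String, Object> fields = new HashMap<>();
                fields.put("id", id);
                return fields;
            });

        // Two passes over the same ids; only the first pass calls the loader.
        for (int pass = 0; pass < 2; pass++) {
            for (int id = 0; id < 3; id++) {
                docs.get(id);
            }
        }
        System.out.println("loader calls: " + docs.loadCount());
    }
}
```

If the cache is sized to hold the whole index (as in item 4), every request after warm-up is served from memory; with a smaller limit, the least recently used entries are dropped first.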