On Dec 17, 2009, at 4:52 AM, Raghuveer Kancherla wrote:

> Hi,
> With help from the group here, I have been able to set up a search
> application with payloads enabled. However, there is a noticeable increase
> in query response times with payloads as compared to the same queries
> without payloads. I am also seeing a lot more disk I/O (I have a 7200 rpm
> disk) and comparatively lower CPU usage.
> 
> I am guessing this is because of the use of PayloadTermQuery and
> PayloadNearQuery, both of which extend SpanQuery. SpanQueries read
> the positions index, which will be much larger than the index accessed by a
> simple TermQuery.
> 
> Is there any way of making this system faster without having to distribute
> the index? My index size is hardly 1GB (~200k documents and only one field
> to search in). I am experiencing query times as high as 2 seconds (average).
> 
> Any indications on the direction in which I can experiment will also be very
> helpful.
> 

Yeah, payloads are going to be slower, but how much slower are they for you? 
Are you warming up those queries?  

Also, have you done any profiling?
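
To make the warm-up question concrete, here is a minimal sketch of timing a
query only after a few discarded warm-up runs, so the numbers reflect a warm
OS file cache and JIT-compiled code rather than cold disk reads. QueryTimer
and its methods are hypothetical helpers written for this illustration; they
are not part of Lucene, and the Callable stands in for whatever code runs
your search.

```java
import java.util.Arrays;
import java.util.concurrent.Callable;

/** Hypothetical helper (not a Lucene class): times a query after warm-up runs. */
final class QueryTimer {

    /** Returns per-run timings in milliseconds for the measured runs only. */
    static double[] measure(Callable<?> query, int warmupRuns, int measuredRuns)
            throws Exception {
        // Warm-up runs are executed but not timed: they pull index files into
        // the OS cache and give the JIT a chance to compile the hot paths.
        for (int i = 0; i < warmupRuns; i++) {
            query.call();
        }
        double[] millis = new double[measuredRuns];
        for (int i = 0; i < measuredRuns; i++) {
            long start = System.nanoTime();
            query.call();
            millis[i] = (System.nanoTime() - start) / 1_000_000.0;
        }
        return millis;
    }

    /** Median of the timings; less sensitive to one slow outlier than the mean. */
    static double median(double[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("no timings to summarise");
        }
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1
                ? sorted[mid]
                : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }
}
```

Running the same payload query and its plain TermQuery equivalent through
something like QueryTimer.measure(query, 20, 100), where the run counts are
arbitrary, and comparing the medians would show how much of the 2 seconds is
cold-cache disk I/O and how much remains once the index is warm.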


> I looked at the HathiTrust digital library articles. The methods indicated there
> talk about avoiding reading the positions index (converting PhraseQueries to
> TermQueries). That will not work in my case because I still have to read
> the positions index to get the payload information during scoring. Let me
> know if my understanding is incorrect.
> 
> 
> Thanks,
> -Raghu

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using 
Solr/Lucene:
http://www.lucidimagination.com/search
