Re: payload queries running slow

Grant Ingersoll Mon, 21 Dec 2009 06:37:49 -0800

On Dec 20, 2009, at 3:41 AM, Raghuveer Kancherla wrote:

> Hi Grant,
> My queries are about 5 times slower when using payloads as compared to
> queries that don't use payloads on the same index. I have not done any
> profiling yet; I am trying out LucidGaze now.


How do they compare to just doing SpanQueries?  It would be interesting to see
all three:
1. "Normal" queries
2. Span Queries
3. Payloads
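
A three-way comparison like the one listed above could be timed with a small harness along these lines. This is only a sketch under stated assumptions: the `QueryTimer` class, its method names, and the `simulatedQuery` stand-in are all hypothetical and not part of Lucene. In a real test, each `Supplier` would wrap your own search call for a TermQuery, a SpanTermQuery, and a PayloadTermQuery against the same warmed index.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

public class QueryTimer {

    /** Runs the query warmupRuns times untimed, then returns the mean latency in ms over timedRuns runs. */
    public static double averageMillis(Supplier<?> query, int warmupRuns, int timedRuns) {
        for (int i = 0; i < warmupRuns; i++) {
            query.get(); // warm-up: results and timings are discarded
        }
        long start = System.nanoTime();
        for (int i = 0; i < timedRuns; i++) {
            query.get();
        }
        return (System.nanoTime() - start) / 1_000_000.0 / timedRuns;
    }

    /** Times each named query with the same warm-up and run counts, preserving insertion order. */
    public static Map<String, Double> compare(Map<String, Supplier<?>> queries,
                                              int warmupRuns, int timedRuns) {
        Map<String, Double> results = new LinkedHashMap<>();
        for (Map.Entry<String, Supplier<?>> e : queries.entrySet()) {
            results.put(e.getKey(), averageMillis(e.getValue(), warmupRuns, timedRuns));
        }
        return results;
    }

    /** Hypothetical stand-in for a real search call; it only burns CPU proportional to 'work'. */
    public static Integer simulatedQuery(int work) {
        int hits = 0;
        for (int i = 0; i < work * 100_000; i++) {
            hits += (i % 7 == 0) ? 1 : 0;
        }
        return hits;
    }

    public static void main(String[] args) {
        // Assumption: replace each lambda with a search using the matching query type.
        Map<String, Supplier<?>> queries = new LinkedHashMap<>();
        queries.put("normal", () -> simulatedQuery(1));
        queries.put("span", () -> simulatedQuery(2));
        queries.put("payload", () -> simulatedQuery(3));

        compare(queries, 50, 200).forEach(
            (name, ms) -> System.out.printf("%-8s %.3f ms/query%n", name, ms));
    }
}
```

Running all three variants through the same warm-up and run counts keeps the numbers comparable; if span queries are already close to the payload timings, the cost is more likely in reading positions than in the payloads themselves.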


> I do all the load testing after warming up.
> Since my index is small (~1 GB), I was wondering whether a RAMDirectory
> would help instead of the default Directory implementation for the
> IndexReader?
> 

I suppose so, but it probably won't make that big of a difference on a properly
warmed index.


> Thanks,
> Raghu
> 
> 
> 
> On Thu, Dec 17, 2009 at 6:58 PM, Grant Ingersoll <gsing...@apache.org> wrote:
> 
>> 
>> On Dec 17, 2009, at 4:52 AM, Raghuveer Kancherla wrote:
>> 
>>> Hi,
>>> With help from the group here, I have been able to set up a search
>>> application with payloads enabled. However, there is a noticeable increase
>>> in query response times with payloads as compared to the same queries
>>> without payloads. I am also seeing a lot more disk I/O (I have a 7200 rpm
>>> disk) and comparatively lower CPU usage.
>>> 
>>> I am guessing this is because of the use of PayloadTermQuery and
>>> PayloadNearQuery, both of which extend SpanQuery. SpanQueries read
>>> the positions index, which will be much larger than the index accessed by a
>>> simple TermQuery.
>>> 
>>> Is there any way of making this system faster without having to distribute
>>> the index? My index size is hardly 1 GB (~200k documents and only one field
>>> to search in). I am experiencing query times as high as 2 seconds
>>> (average).
>>> 
>>> Any pointers on directions in which I can experiment would also be very
>>> helpful.
>>> 
>> 
>> Yeah, payloads are going to be slower, but how much slower are they for
>> you? Are you warming up those queries?
>> 
>> Also, have you done any profiling?
>> 
>> 
>>> I looked at the HathiTrust digital library articles. The methods indicated
>>> there talk about avoiding reading the positions index (converting
>>> PhraseQueries to TermQueries). That will not work in my case because I
>>> still have to read the positions index to get the payload information
>>> during scoring. Let me know if my understanding is incorrect.
>>> 
>>> 
>>> Thanks,
>>> -Raghu
>> 
>> --------------------------
>> Grant Ingersoll
>> http://www.lucidimagination.com/
>> 
>> Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using
>> Solr/Lucene:
>> http://www.lucidimagination.com/search
>> 
>>
