Hi,

With help from the group here, I have been able to set up a search application with payloads enabled. However, query response times are noticeably higher with payloads than for the same queries without payloads. I am also seeing a lot more disk IO (I have a 7200 rpm disk) and comparatively lower CPU usage.
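As a rough sanity check on whether the disk alone could explain these times, here is a back-of-envelope sketch of how many random reads fit into a 2-second query on a 7200 rpm disk. The rotational latency follows directly from the spindle speed; the 9 ms average seek time is an assumed typical figure for a desktop drive, not something I measured, and the function names are just for illustration.

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency in ms: half of one full revolution."""
    return 0.5 * 60_000.0 / rpm


def random_reads_in_budget(budget_ms: float, rpm: float, avg_seek_ms: float) -> int:
    """Number of random reads (seek + rotational latency) that fit in budget_ms.

    Ignores transfer time and the OS page cache, so it is a worst-case
    estimate for a cold index.
    """
    per_read_ms = avg_seek_ms + avg_rotational_latency_ms(rpm)
    return int(budget_ms // per_read_ms)


if __name__ == "__main__":
    rot = avg_rotational_latency_ms(7200)
    # avg_seek_ms=9.0 is an ASSUMED value, not a measurement of my disk.
    reads = random_reads_in_budget(2000, 7200, avg_seek_ms=9.0)
    print(f"rotational latency: {rot:.2f} ms, random reads in 2 s: {reads}")
```

Under that assumption, roughly 150 uncached random reads per query would be enough to account for 2 seconds, which would fit the pattern of high disk IO and low CPU usage.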
I am guessing this is because of the use of PayloadTermQuery and PayloadNearQuery, both of which are built on SpanQuery classes. SpanQueries read the positions index, which will be much larger than the index accessed by a simple TermQuery.

Is there any way of making this system faster without having to distribute the index? My index is barely 1 GB (~200k documents and only one field to search in), yet I am seeing average query times as high as 2 seconds. Any pointers on directions in which I can experiment would also be very helpful.

I looked at the HathiTrust digital library articles. The methods described there focus on avoiding reads of the positions index (converting PhraseQueries to TermQueries). That will not work in my case, because I still have to read the positions index to get the payload information during scoring. Let me know if my understanding is incorrect.

Thanks,
-Raghu
