On Dec 17, 2009, at 4:52 AM, Raghuveer Kancherla wrote:

> Hi,
> With help from the group here, I have been able to set up a search
> application with payloads enabled. However, there is a noticeable increase
> in query response times with payloads as compared to the same queries
> without payloads. I am also seeing a lot more disk IO (I have a 7200 rpm
> disk) and comparatively lesser cpu usage.
>
> I am guessing this is because of the use of payloadTermQuery and
> payloadNearQuery both of which extend SpanQuery formats. SpanQueries read
> the positions index which will be much larger than the index accessed by a
> simple TermQuery.
>
> Is there any way of making this system faster without having to distribute
> the index. My index size is hardly 1GB (~200k documents and only one field
> to search in). I am experiencing query times as high as 2 seconds (average).
>
> Any indications on the direction in which I can experiment will also be very
> helpful.
>
Yeah, payloads are going to be slower, but how much slower are they for you? Are you warming up those queries? Also, have you done any profiling?

> I looked at HathiTrust digital library articles. The methods indicated there
> talk about avoiding reading the positions index (converting PhraseQueries to
> TermQueries). That will not work in my case because, I still have to read
> the positions index to get the payload information during scoring. Let me
> know if my understanding is incorrect.
>
>
> Thanks,
> -Raghu

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using Solr/Lucene:
http://www.lucidimagination.com/search