Turns out it was a case of an oversite. My warming queries weren't setting the sort order and as a result don't successfully complete. After setting the sort order things appear to be responding quickly.
Thanks for the help. On Mon, Jun 17, 2013 at 9:45 AM, Shane Perry <thry...@gmail.com> wrote: > Using 4.3.1-SNAPSHOT I have identified where the issue is occurring. For > a query in the format (it returns one document, sorted by field4) > > +(field0:UUID0) -field1:string0 +field2:string1 +field3:text0 > +field4:"text1" > > > with the field types > > <fieldType name="uuid" class="solr.UUIDField" indexed="true"/> > > <fieldType name="string" class="solr.StrField" sortMissingFirst="true" > omitNorms="true"/> > > <fieldType name="text" class="solr.TextField" positionIncrementGap="100"> > <analyzer> > <charFilter class="solr.HTMLStripCharFilterFactory"/> > <tokenizer class="solr.StandardTokenizerFactory"/> > <filter class="solr.StandardFilterFactory"/> > <filter class="solr.TrimFilterFactory"/> > <filter class="solr.ICUFoldingFilterFactory"/> > </analyzer> > </fieldType> > > > the method FieldCacheImpl$SortedDocValuesCache#createValue, the reader > reports 2640449 terms. As a result, the loop on line 1198 is > executed 2640449 and the inner loop is executed a total of 658310778. My > index contains 56180128 documents. > > My configuration file sets the <queries> for the newSearcher and > firstSearcher listeners to the value > > <lst> > <str name="q">static firstSearcher warming in solrconfig.xml</str> > <str name="sort">field4</str> > </lst> > > > which does not appear to affect the speed. I'm not sure how replication > plays into the equation outside the fact that we are relatively aggressive > on the replication (every 60 seconds). I fear I may be at the end of my > knowledge without really getting into the code so any help at this point > would be greatly appreciated. > > Shane > > On Thu, Jun 13, 2013 at 4:11 PM, Shane Perry <thry...@gmail.com> wrote: > >> I've dug through the code and have narrowed the delay down >> to TopFieldCollector$OneComparatorNonScoringCollector.setNextReader() at >> the point where the comparator's setNextReader() method is called (line 98 >> in the lucene_solr_4_3 branch). That line is actually two method calls so >> I'm not yet certain which path is the cause. I'll continue to dig through >> the code but am on thin ice so input would be great. >> >> Shane >> >> >> On Thu, Jun 13, 2013 at 7:56 AM, Shane Perry <thry...@gmail.com> wrote: >> >>> Erick, >>> >>> We do have soft commits turned. Initially, autoCommit was set at 15000 >>> and autoSoftCommit at 1000. We did up those to 1200000 and 600000 >>> respectively. However, since the core in question is a slave, we don't >>> actually do writes to the core but rely on replication only to populate the >>> index. In this case wouldn't autoCommit and autoSoftCommit essentially be >>> no-ops? I thought I had pulled out all hard commits but a double check >>> shows one instance where it still occurs. >>> >>> Thanks for your time. >>> >>> Shane >>> >>> On Thu, Jun 13, 2013 at 5:19 AM, Erick Erickson <erickerick...@gmail.com >>> > wrote: >>> >>>> Shane: >>>> >>>> You've covered all the config stuff that I can think of. There's one >>>> other possibility. Do you have the soft commits turned on and are >>>> they very short? Although soft commits shouldn't invalidate any >>>> segment-level caches (but I'm not sure whether the sorting buffers >>>> are low-level or not). >>>> >>>> About the only other thing I can think of is that you're somehow >>>> doing hard commits from, say, the client but that's really >>>> stretching. >>>> >>>> All I can really say at this point is that this isn't a problem I've >>>> seen >>>> before, so it's _likely_ some innocent-seeming config has changed. >>>> I'm sure it'll be obvious once you find it <G>... >>>> >>>> Erick >>>> >>>> On Wed, Jun 12, 2013 at 11:51 PM, Shane Perry <thry...@gmail.com> >>>> wrote: >>>> > Erick, >>>> > >>>> > I agree, it doesn't make sense. I manually merged the solrconfig.xml >>>> from >>>> > the distribution example with my 3.6 solrconfig.xml, pulling out what >>>> I >>>> > didn't need. There is the possibility I removed something I >>>> shouldn't have >>>> > though I don't know what it would be. Minus removing the dynamic >>>> fields, a >>>> > custom tokenizer class, and changing all my fields to be stored, the >>>> > schema.xml file should be the same as well. I'm not currently in the >>>> > position to do so, but I'll double check those two files. Finally, >>>> the >>>> > data was re-indexed when I moved to 4.3. >>>> > >>>> > My statement about field values wasn't stated very well. What I >>>> meant is >>>> > that the 'text' field has more unique terms than some of my other >>>> fields. >>>> > >>>> > As for this being an edge case, I'm not sure why it would manifest >>>> itself >>>> > in 4.3 but not in 3.6 (short of me having a screwy configuration >>>> setting). >>>> > If I get a chance, I'll see if I can duplicate the behavior with a >>>> small >>>> > document count in a sandboxed environment. >>>> > >>>> > Shane >>>> > >>>> > On Wed, Jun 12, 2013 at 5:14 PM, Erick Erickson < >>>> erickerick...@gmail.com>wrote: >>>> > >>>> >> This doesn't make much sense, particularly the fact >>>> >> that you added first/new searchers. I'm assuming that >>>> >> these are sorting on the same field as your slow query. >>>> >> >>>> >> But sorting on a text field for which >>>> >> "Overall, the values of the field are unique" >>>> >> is a red-flag. Solr doesn't sort on fields that have >>>> >> more than one term, so you might as well use a >>>> >> string field and be done with it, it's possible you're >>>> >> hitting some edge case. >>>> >> >>>> >> Did you just copy your 3.6 schema and configs to >>>> >> 4.3? Did you re-index? >>>> >> >>>> >> Best >>>> >> Erick >>>> >> >>>> >> On Wed, Jun 12, 2013 at 5:11 PM, Shane Perry <thry...@gmail.com> >>>> wrote: >>>> >> > Thanks for the responses. >>>> >> > >>>> >> > Setting first/newSearcher had no noticeable effect. I'm sorting >>>> on a >>>> >> > stored/indexed field named 'text' who's fieldType is >>>> solr.TextField. >>>> >> > Overall, the values of the field are unique. The JVM is only >>>> using about >>>> >> > 2G of the available 12G, so no OOM/GC issue (at least on the >>>> surface). >>>> >> The >>>> >> > server is question is a slave with approximately 56 million >>>> documents. >>>> >> > Additionally, sorting on a field of the same type but with >>>> significantly >>>> >> > less uniqueness results quick response times. >>>> >> > >>>> >> > The following is a sample of *debugQuery=true* for a query which >>>> returns >>>> >> 1 >>>> >> > document: >>>> >> > >>>> >> > <lst name="process"> >>>> >> > <double name="time">61458.0</double> >>>> >> > <lst name="query"> >>>> >> > <double name="time">61452.0</double> >>>> >> > </lst> >>>> >> > <lst name="facet"> >>>> >> > <double name="time">0.0</double> >>>> >> > </lst> >>>> >> > <lst name="mlt"> >>>> >> > <double name="time">0.0</double> >>>> >> > </lst> >>>> >> > <lst name="highlight"> >>>> >> > <double name="time">0.0</double> >>>> >> > </lst> >>>> >> > <lst name="stats"> >>>> >> > <double name="time">0.0</double> >>>> >> > </lst> >>>> >> > <lst name="debug"> >>>> >> > <double name="time">6.0</double> >>>> >> > </lst> >>>> >> > </lst> >>>> >> > >>>> >> > >>>> >> > -- Update -- >>>> >> > >>>> >> > Out of desperation, I turned off replication by commenting out the >>>> *<list >>>> >> > name="slave">* element in the replication requestHandler block. >>>> After >>>> >> > restarting tomcat I was surprised to find that the replication >>>> admin UI >>>> >> > still reported the core as replicating. Search queries were still >>>> slow. >>>> >> I >>>> >> > then disabled replication via the UI and the display updated to >>>> report >>>> >> the >>>> >> > core was no longer replicating. Queries are now fast so it >>>> appears that >>>> >> > the sorting may be a red-herring. >>>> >> > >>>> >> > It's may be of note to also mention that the slow queries don't >>>> appear to >>>> >> > be getting cached. >>>> >> > >>>> >> > Thanks again for the feed back. >>>> >> > >>>> >> > On Wed, Jun 12, 2013 at 2:33 PM, Jack Krupansky < >>>> j...@basetechnology.com >>>> >> >wrote: >>>> >> > >>>> >> >> Rerun the sorted query with &debugQuery=true and look at the >>>> module >>>> >> >> timings. See what stands out >>>> >> >> >>>> >> >> Are you actually sorting on a "text" field, as opposed to a >>>> "string" >>>> >> field? >>>> >> >> >>>> >> >> Of course, it's always possible that maybe you're hitting some odd >>>> >> OOM/GC >>>> >> >> condition as a result of Solr growing between releases. >>>> >> >> >>>> >> >> -- Jack Krupansky >>>> >> >> >>>> >> >> -----Original Message----- From: Shane Perry >>>> >> >> Sent: Wednesday, June 12, 2013 3:00 PM >>>> >> >> To: solr-user@lucene.apache.org >>>> >> >> Subject: Sorting by field is slow >>>> >> >> >>>> >> >> >>>> >> >> In upgrading from Solr 3.6.1 to 4.3.0, our query response time has >>>> >> >> increased exponentially. After testing in 4.3.0 it appears the >>>> same >>>> >> query >>>> >> >> (with 1 matching document) returns after 100 ms without sorting >>>> but >>>> >> takes 1 >>>> >> >> minute when sorting by a text field. I've looked around but >>>> haven't yet >>>> >> >> found a reason for the degradation. Can someone give me some >>>> insight or >>>> >> >> point me in the right direction for resolving this? In most >>>> cases, I >>>> >> can >>>> >> >> change my code to do client-side sorting but I do have a couple of >>>> >> >> situations where pagination prevents client-side sorting. Any >>>> help >>>> >> would >>>> >> >> be greatly appreciated. >>>> >> >> >>>> >> >> Thanks, >>>> >> >> >>>> >> >> Shane >>>> >> >> >>>> >> >>>> >>> >>> >> >