Re: Degraded performance between Solr 4 and Solr 5

Erick Erickson Wed, 27 Apr 2016 18:52:15 -0700

This is rather strange. I'm not sure what's going on here, so I'll have
to leave it to others to speculate, I'm afraid...

Although I do wonder what a profiling tool would show.

Best,
Erick

On Wed, Apr 27, 2016 at 8:51 AM, Jaroslaw Rozanski
<s...@jarekrozanski.com> wrote:
> OK, so here is an interesting find.
>
> As my setup requires frequent (soft) commits, the cache brings little value.
> I tested the following on Solr 5.5.0:
>
> q={!cache=false}*:*&
> fq={!cache=false}query1 /* not expensive */&
> fq={!cache=false cost=200}query2 /* expensive! */&
>
> Only with the above set-up (and forcing Solr post filtering for the
> expensive query, hence cost=200) was I able to return to Solr 4.10.3
> performance.
>
> By Solr 4 performance I mean:
> - not only Solr 4 response times (roughly) for queries returning values,
> but also
> - very fast response for queries that have 0 results
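The request described above can be assembled programmatically. What follows is only a minimal sketch: the two filter strings are placeholders standing in for query1 and query2 (the field name `type` is hypothetical), and the helper name is an assumption, not anything from Solr. Note also that, as far as I know, Solr runs a filter as a true post filter only when the query type implements the PostFilter interface; for other query types a cost on an uncached filter mainly controls evaluation order.

```python
from urllib.parse import urlencode

def build_uncached_params(cheap_fq, expensive_fq, expensive_cost=200):
    """Return Solr request parameters as (name, value) pairs.

    Every clause carries {!cache=false}; the expensive filter also carries
    a cost so that Solr evaluates it after the cheaper one.
    """
    expensive_prefix = "{!cache=false cost=%d}" % expensive_cost
    return [
        ("q", "{!cache=false}*:*"),
        ("fq", "{!cache=false}" + cheap_fq),
        ("fq", expensive_prefix + expensive_fq),
    ]

# Placeholder filters; substitute the real query1 and query2.
params = build_uncached_params("type:message", "field1:[* TO NOW-14DAYS]")
query_string = urlencode(params)  # percent-encodes braces, colons, brackets
print(query_string)
```

Passing the parameters as a list of pairs rather than a dict keeps both fq entries, since a dict would collapse the repeated key.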
>
> I wonder what could be the underlying cause.
>
> Thanks,
> Jarek
>
> On Wed, 27 Apr 2016, at 09:13, Jaroslaw Rozanski wrote:
>> Hi Erick,
>>
>> I am measuring by running queries via JMeter. The values provided are
>> rounded medians of multiple samples. Medians are only slightly better
>> than the 99th percentile for all samples.
>>
>> The filter cache is useless, as you mentioned; it is effectively not used.
>> There is auto-warming through cache autoWarm but there are no
>> auto-warming queries.
>>
>> A small experiment with passing &NOW=... does not seem to make any
>> difference, which is not surprising given that caches are barely involved.
>>
>> Thanks for the suggestion on IO. After stopping indexing, the response
>> time barely changed on Solr 5. On Solr 4, with indexing running, it is
>> still fast. So effectively, Solr 4 under indexing load is faster than
>> idle Solr 5. Both set-ups have the same heap size and available RAM on
>> the machine (2x heap).
>>
>> One other thing I am testing is issuing requests to a specific core, with
>> distrib=false. No significant improvements there.
>>
>> Now what is interesting is that the aforementioned query takes the same
>> amount of time to execute regardless of the number of documents found.
>> - Whether it is 0 or 10k, it takes a couple of seconds on Solr 5.5.0.
>> - Meanwhile, on Solr 4.10.3, the response time depends on the result
>> size. With no results, Solr 4 returns in a few ms; with a couple of
>> thousand results, it takes a few seconds.
>> (query used {!cache=false}q=...)
>>
>>
>> Thanks,
>> Jarek
>>
>> On Wed, 27 Apr 2016, at 04:39, Erick Erickson wrote:
>> > Well, the first question is always "how are you measuring this"?
>> > Measuring a few queries is almost completely uninformative,
>> > especially if the two systems have differing warmups. The only
>> > meaningful measurements are when throwing away the first bunch
>> > of queries then measuring a meaningful sample.
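The measurement approach described above (discard the first batch of queries, then summarize a meaningful sample) could be sketched as follows. The timings are made-up numbers for illustration only, and the function name and the nearest-rank percentile definition are assumptions, not something taken from JMeter or Solr.

```python
import math
import statistics

def summarize(latencies_ms, warmup=0):
    """Drop the first `warmup` samples, then return (median, p99)."""
    samples = sorted(latencies_ms[warmup:])
    if not samples:
        raise ValueError("no samples left after discarding warm-up")
    # Nearest-rank 99th percentile: the smallest sample such that at
    # least 99% of all samples are less than or equal to it.
    rank = math.ceil(99 * len(samples) / 100)
    return statistics.median(samples), samples[rank - 1]

# Made-up timings: two slow warm-up queries, then 100 steady-state ones.
timings = [1000, 900] + list(range(1, 101))
median_ms, p99_ms = summarize(timings, warmup=2)
print(median_ms, p99_ms)
```

Reporting the median alongside a high percentile shows whether a few slow outliers are hiding behind a reasonable-looking typical value.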
>> >
>> > The setup you describe will be very sensitive to disk access
>> > with the soft commit interval of 1 second, so if there's much at all
>> > in the way of differences in I/O that would be a red flag.
>> >
>> > From here on down doesn't really respond to the question, but
>> > I thought I'd mention it.
>> >
>> > And you don't have to worry about disabling your filterCache since
>> > any filter query of the form fq=field:[mention NOW in here without
>> > rounding] will never be re-used. So you might as well use
>> > {!cache=false}. Here's the background:
>> >
>> > https://lucidworks.com/blog/2012/02/23/date-math-now-and-filter-queries/
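To illustrate why an unrounded NOW defeats filter reuse, here is a minimal sketch that resolves the upper bound of the range for two requests one second apart. It only imitates the date arithmetic in plain Python (the function name and timestamps are assumptions) and does not touch Solr itself. With a 1-second soft commit the filter cache is discarded anyway, so in this particular setup rounding would probably change little.

```python
from datetime import datetime, timedelta, timezone

def upper_bound(now, round_to_day):
    """Resolve NOW-14DAYS (or NOW/DAY-14DAYS) for a given request time."""
    if round_to_day:
        # Equivalent of NOW/DAY: truncate to the start of the UTC day.
        now = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return now - timedelta(days=14)

first = datetime(2016, 4, 27, 9, 13, 0, 123000, tzinfo=timezone.utc)
second = first + timedelta(seconds=1)

# Unrounded bounds differ between requests, so a cached filter keyed on
# the resolved range could never match a later request.
unrounded_reusable = upper_bound(first, False) == upper_bound(second, False)
# Rounded bounds are identical for every request made on the same day.
rounded_reusable = upper_bound(first, True) == upper_bound(second, True)
print(unrounded_reusable, rounded_reusable)

# The rounded form of the filter from this thread, in Solr date math.
rounded_fq = "field1:[* TO NOW/DAY-14DAYS]"
```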
>> >
>> > And your soft commit is probably throwing out all the filter caches
>> > anyway.
>> >
>> > I doubt you're doing any autowarming at all given the autocommit interval
>> > of 1 second and continuously updating documents and your reported
>> > query times. So you can pretty much forget what I said about throwing
>> > out your first N queries since you're (probably) not getting any benefit
>> > out of caches anyway.
>> >
>> > On Tue, Apr 26, 2016 at 10:34 AM, Jaroslaw Rozanski
>> > <s...@jarekrozanski.com> wrote:
>> > > Hi all,
>> > >
>> > > I am migrating a large Solr Cloud cluster from Solr 4.10 to Solr 5.5.0
>> > > and I observed a big difference in query execution time.
>> > >
>> > > First a setup summary:
>> > > - multiple collections - 6
>> > > - each has multiple shards - 6
>> > > - same/similar hardware
>> > > - indexing tens of messages per second
>> > > - autoSoftCommit at 1s; hard commit every few tens of seconds
>> > > - Java 8
>> > >
>> > > The query has the following form: field1:[* TO NOW-14DAYS] OR (-field1:[* TO
>> > > *] AND field2:[* TO NOW-14DAYS])
>> > >
>> > > The fields field1 & field2 are of date type:
>> > > <fieldType name="date" class="solr.TrieDateField" precisionStep="0"
>> > > positionIncrementGap="0"/>
>> > >
>> > > As query (q={!cache=false}...)
>> > > Solr 4.10 -> 5s
>> > > Solr 5.5.0 -> 12s
>> > >
>> > > As filter query (q={!cache=false}*:*&fq=...)
>> > > Solr 4.10 -> 9s
>> > > Solr 5.5.0 -> 11s
>> > >
>> > > The query itself is bad, but its optimization aside, I am wondering
>> > > whether there is anything in Lucene/Solr that would have such an impact
>> > > on query execution time between versions.
>> > >
>> > > Originally I thought it might be related to
>> > > https://issues.apache.org/jira/browse/SOLR-8251 and testing on a small
>> > > scale proved that there is a difference in performance. However, the
>> > > upgraded version is already 5.5.0.
>> > >
>> > >
>> > >
>> > > Thanks,
>> > > Jarek
>> > >
