Re: optimize requests that fetch 1000 rows

Matteo Grolla Fri, 12 Feb 2016 01:58:04 -0800

Hi Jack,
     correct me if I'm wrong, but QTime accounts for search time excluding
the fetch of stored fields (I see a 90 ms QTime, yet it takes ~30 s to
obtain the results on the client, over a LAN, for a 300 kB response).
debugQuery only explains how much of QTime is used by each search component.
For me 90 ms is OK and I wouldn't spend time trying to bring it down to
50 ms; it's the ~30 s to obtain the response that I'd like to tackle.
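
To make the gap concrete, here is a minimal sketch of how the two numbers
could be compared on the client. It assumes the response is requested with
wt=json (not javabin, which the application uses) so that QTime can be read
from the responseHeader; `measure` and `fetch` are names made up for this
illustration, not part of any Solr client.

```python
import json
import time
from typing import Callable, Dict


def measure(fetch: Callable[[], bytes]) -> Dict[str, float]:
    """Time one request and compare wall-clock time with Solr's reported QTime.

    `fetch` is a hypothetical zero-argument callable that performs the HTTP
    request and returns the raw body of a wt=json response.
    """
    start = time.perf_counter()
    body = fetch()  # everything the client waits for, not just the search
    wall_ms = (time.perf_counter() - start) * 1000.0

    # Solr reports QTime (in milliseconds) in the responseHeader.
    # Parsing happens after the timer stops, so it is not counted.
    qtime_ms = float(json.loads(body)["responseHeader"]["QTime"])
    return {
        "qtime_ms": qtime_ms,
        "wall_ms": wall_ms,
        "outside_qtime_ms": wall_ms - qtime_ms,
        "response_bytes": float(len(body)),
    }
```

It could be called with something like
`measure(lambda: urllib.request.urlopen(url).read())`, where `url` points at
your own select handler. A large `outside_qtime_ms` would point at time spent
after the search itself, which is the part I'd like to reduce.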



2016-02-12 5:42 GMT+01:00 Jack Krupansky <jack.krupan...@gmail.com>:

> Again, first things first... debugQuery=true and see which Solr search
> components are consuming the bulk of qtime.
>
> -- Jack Krupansky
>
> On Thu, Feb 11, 2016 at 11:33 AM, Matteo Grolla <matteo.gro...@gmail.com>
> wrote:
>
> > virtual hardware, 200ms is taken on the client until response is
> > written to disk
> > qtime on solr is ~90ms
> > not great but acceptable
> >
> > Is it possible that the method FilenameUtils.splitOnTokens is really so
> > heavy when requesting a lot of rows on slow hardware?
> >
> > 2016-02-11 17:17 GMT+01:00 Jack Krupansky <jack.krupan...@gmail.com>:
> >
> > > Good to know. Hmmm... 200ms for 10 rows is not outrageously bad, but
> > > still relatively bad. Even 50ms for 10 rows would be considered
> > > barely okay. But... again it depends on query complexity - simple
> > > queries should be well under 50 ms for decent modern hardware.
> > >
> > > -- Jack Krupansky
> > >
> > > On Thu, Feb 11, 2016 at 10:36 AM, Matteo Grolla
> > > <matteo.gro...@gmail.com> wrote:
> > >
> > > > Hi Jack,
> > > >       response time scales with rows. The relationship doesn't seem
> > > > linear, but below 400 rows times are much faster.
> > > > I view query times from solr logs and they are fast
> > > > the same query with rows = 1000 takes 8s
> > > > with rows = 10 takes 0.2s
> > > >
> > > >
> > > > 2016-02-11 16:22 GMT+01:00 Jack Krupansky <jack.krupan...@gmail.com>:
> > > >
> > > > > Are queries scaling linearly - does a query for 100 rows take
> > > > > 1/10th the time (1 sec vs. 10 sec or 3 sec vs. 30 sec)?
> > > > >
> > > > > Does the app need/expect exactly 1,000 documents for the query or
> > > > > is that just what this particular query happened to return?
> > > > >
> > > > > What does the query look like? Is it complex or use wildcards or
> > > > > function queries, or is it very simple keywords? How many
> > > > > operators?
> > > > >
> > > > > Have you used the debugQuery=true parameter to see which search
> > > > > components are taking the time?
> > > > >
> > > > > -- Jack Krupansky
> > > > >
> > > > > On Thu, Feb 11, 2016 at 9:42 AM, Matteo Grolla
> > > > > <matteo.gro...@gmail.com> wrote:
> > > > >
> > > > > > Hi Yonik,
> > > > > >      after the first query I find 1000 docs in the document
> > > > > > cache.
> > > > > > I'm using curl to send the request and requesting javabin format
> > > > > > to mimic the application.
> > > > > > GC activity is low.
> > > > > > I managed to load the entire 50GB index into the filesystem
> > > > > > cache; after that, queries don't cause disk activity anymore.
> > > > > > Times improve: queries that took ~30s now take <10s, but I had
> > > > > > hoped for better.
> > > > > > I'm going to use jvisualvm's sampler to analyze where time is
> > > > > > spent.
> > > > > >
> > > > > >
> > > > > > 2016-02-11 15:25 GMT+01:00 Yonik Seeley <ysee...@gmail.com>:
> > > > > >
> > > > > > > On Thu, Feb 11, 2016 at 7:45 AM, Matteo Grolla
> > > > > > > <matteo.gro...@gmail.com> wrote:
> > > > > > > > Thanks Toke, yes, they are long times, and solr qtime (to
> > > > > > > > execute the query) is a fraction of a second.
> > > > > > > > The response in javabin format is around 300k.
> > > > > > >
> > > > > > > OK, That tells us a lot.
> > > > > > > And if you actually tested so that all the docs would be in the
> > > > > > > cache (can you verify this by looking at the cache stats after
> > > > > > > you re-execute?) then it seems like the slowness is down to any
> > > > > > > of:
> > > > > > > a) serializing the response (it doesn't seem like a 300K
> > > > > > > response should take *that* long to serialize)
> > > > > > > b) reading/processing the response (how fast the client can do
> > > > > > > something with each doc is also a factor...)
> > > > > > > c) other (GC, network, etc)
> > > > > > >
> > > > > > > You can try taking client processing out of the equation by
> > > > > > > trying a curl request.
> > > > > > >
> > > > > > > -Yonik
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
