Re: optimize requests that fetch 1000 rows

Matteo Grolla Thu, 11 Feb 2016 08:34:15 -0800

Virtual hardware. 200ms elapses on the client until the response is written
to disk.
QTime on Solr is ~90ms.
Not great, but acceptable.


Is it possible that the method FilenameUtils.splitOnTokens is really so
heavy when requesting a lot of rows on slow hardware?
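
A rough way to check where the time goes as rows grows is to compare the wall-clock time of each request with the QTime Solr reports, since QTime covers query execution but (as far as I understand) not writing the response. The sketch below is only an illustration: the host, the core name "mycore", the "*:*" query and the helper names are assumptions, and it uses wt=json instead of javabin so that QTime is easy to parse.

```python
import json
import time
import urllib.parse
import urllib.request


def build_select_url(base_url, query, rows, wt="json"):
    """Build a Solr /select URL. base_url (host and core) is a placeholder."""
    params = urllib.parse.urlencode({"q": query, "rows": rows, "wt": wt})
    return f"{base_url}/select?{params}"


def overhead_ms(elapsed_ms, qtime_ms):
    """Time spent outside query execution (doc retrieval, serialization, transfer)."""
    return elapsed_ms - qtime_ms


def time_request(base_url, query, rows):
    """Send one request and return (elapsed_ms, qtime_ms)."""
    url = build_select_url(base_url, query, rows)
    start = time.perf_counter()
    with urllib.request.urlopen(url) as resp:
        body = resp.read()
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    qtime_ms = json.loads(body)["responseHeader"]["QTime"]
    return elapsed_ms, qtime_ms


def report(base_url, query="*:*", row_counts=(10, 100, 400, 1000)):
    """Print elapsed time, QTime and their difference for each rows value."""
    for rows in row_counts:
        elapsed, qtime = time_request(base_url, query, rows)
        print(rows, round(elapsed), qtime, round(overhead_ms(elapsed, qtime)))
```

Calling report("http://localhost:8983/solr/mycore") against a running instance would print one line per rows value. If the difference grows much faster than QTime, the time is going into producing or transferring the response rather than into the search itself.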

2016-02-11 17:17 GMT+01:00 Jack Krupansky <[email protected]>:

> Good to know. Hmmm... 200ms for 10 rows is not outrageously bad, but still
> relatively bad. Even 50ms for 10 rows would be considered barely okay.
> But... again it depends on query complexity - simple queries should be well
> under 50 ms for decent modern hardware.
>
> -- Jack Krupansky
>
> On Thu, Feb 11, 2016 at 10:36 AM, Matteo Grolla <[email protected]>
> wrote:
>
> > Hi Jack,
> >       response time scales with rows. The relationship doesn't seem linear,
> > but below 400 rows times are much faster.
> > I checked the query times in the Solr logs and they are fast.
> > The same query with rows = 1000 takes 8s;
> > with rows = 10 it takes 0.2s.
> >
> >
> > 2016-02-11 16:22 GMT+01:00 Jack Krupansky <[email protected]>:
> >
> > > Are queries scaling linearly - does a query for 100 rows take 1/10th the
> > > time (1 sec vs. 10 sec or 3 sec vs. 30 sec)?
> > >
> > > Does the app need/expect exactly 1,000 documents for the query or is that
> > > just what this particular query happened to return?
> > >
> > > What does the query look like? Is it complex or use wildcards or function
> > > queries, or is it very simple keywords? How many operators?
> > >
> > > Have you used the debugQuery=true parameter to see which search components
> > > are taking the time?
> > >
> > > -- Jack Krupansky
> > >
> > > On Thu, Feb 11, 2016 at 9:42 AM, Matteo Grolla <[email protected]> wrote:
> > >
> > > > Hi Yonik,
> > > >      after the first query I find 1000 docs in the document cache.
> > > > I'm using curl to send the request and requesting javabin format to
> > > > mimic the application.
> > > > GC activity is low.
> > > > I managed to load the entire 50GB index into the filesystem cache; after
> > > > that, queries don't cause disk activity anymore.
> > > > Times improve: queries that took ~30s now take <10s, but I had hoped for
> > > > better.
> > > > I'm going to use jvisualvm's sampler to analyze where the time is spent.
> > > >
> > > >
> > > > 2016-02-11 15:25 GMT+01:00 Yonik Seeley <[email protected]>:
> > > >
> > > > > On Thu, Feb 11, 2016 at 7:45 AM, Matteo Grolla <[email protected]> wrote:
> > > > > > Thanks Toke, yes, they are long times, and solr qtime (to execute the
> > > > > > query) is a fraction of a second.
> > > > > > The response in javabin format is around 300k.
> > > > >
> > > > > OK, That tells us a lot.
> > > > > And if you actually tested so that all the docs would be in the cache
> > > > > (can you verify this by looking at the cache stats after you
> > > > > re-execute?) then it seems like the slowness is down to any of:
> > > > > a) serializing the response (it doesn't seem like a 300K response
> > > > > should take *that* long to serialize)
> > > > > b) reading/processing the response (how fast the client can do
> > > > > something with each doc is also a factor...)
> > > > > c) other (GC, network, etc)
> > > > >
> > > > > You can try taking client processing out of the equation by trying a
> > > > > curl request.
> > > > >
> > > > > -Yonik
> > > > >
> > > >
> > >
> >
>
