Hi Jack,
tell me if I'm wrong, but qtime accounts for search time and excludes the fetch of stored fields (I see a 90ms qtime, yet the client needs ~30s to obtain a 300kB response over a LAN). debugQuery=true explains how much of qtime is used by each search component. For me 90ms is fine and I wouldn't spend time trying to bring it down to 50ms; it's the ~30s to obtain the response that I'd like to tackle.
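To make the gap between qtime and end-to-end time concrete, here is a minimal Python sketch that times one request on the client and compares it with the QTime Solr reports in the response header. The base URL, core name, and helper names are hypothetical, and it requests wt=json for readability rather than the javabin format the application uses, so the serialization cost it measures will differ from the real case.

```python
import json
import time
import urllib.parse
import urllib.request


def build_select_url(base_url, query, rows):
    """Build a /select URL asking for JSON plus debug output (debugQuery=true)."""
    params = urllib.parse.urlencode(
        {"q": query, "rows": rows, "wt": "json", "debugQuery": "true"}
    )
    return f"{base_url}/select?{params}"


def latency_breakdown(qtime_ms, wall_ms):
    """Split client wall-clock time into QTime and everything outside it."""
    return {
        "qtime_ms": qtime_ms,
        "wall_ms": wall_ms,
        # Stored-field fetch, response serialization, network and client
        # parsing all end up in this remainder.
        "outside_qtime_ms": wall_ms - qtime_ms,
    }


def timed_query(base_url, query, rows):
    """Run one query and report QTime versus client-side wall-clock time."""
    url = build_select_url(base_url, query, rows)
    start = time.perf_counter()
    with urllib.request.urlopen(url) as resp:
        body = resp.read()  # includes transfer time of the full response
    wall_ms = (time.perf_counter() - start) * 1000.0
    qtime_ms = json.loads(body)["responseHeader"]["QTime"]
    return latency_breakdown(qtime_ms, wall_ms)


if __name__ == "__main__":
    # Hypothetical core URL; adjust to the actual Solr instance.
    base = "http://localhost:8983/solr/collection1"
    for rows in (10, 100, 1000):
        print(rows, timed_query(base, "*:*", rows))
```

Running it for several values of rows shows whether the time outside qtime grows with the number of documents returned, which is the part I want to reduce.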

2016-02-12 5:42 GMT+01:00 Jack Krupansky <jack.krupan...@gmail.com>:

> Again, first things first... debugQuery=true and see which Solr search
> components are consuming the bulk of qtime.
>
> -- Jack Krupansky
>
> On Thu, Feb 11, 2016 at 11:33 AM, Matteo Grolla <matteo.gro...@gmail.com>
> wrote:
>
> > virtual hardware, 200ms is taken on the client until response is
> > written to disk
> > qtime on solr is ~90ms
> > not great but acceptable
> >
> > Is it possible that the method FilenameUtils.splitOnTokens is really so
> > heavy when requesting a lot of rows on slow hardware?
> >
> > 2016-02-11 17:17 GMT+01:00 Jack Krupansky <jack.krupan...@gmail.com>:
> >
> > > Good to know. Hmmm... 200ms for 10 rows is not outrageously bad, but
> > > still relatively bad. Even 50ms for 10 rows would be considered barely
> > > okay. But... again it depends on query complexity - simple queries
> > > should be well under 50 ms for decent modern hardware.
> > >
> > > -- Jack Krupansky
> > >
> > > On Thu, Feb 11, 2016 at 10:36 AM, Matteo Grolla
> > > <matteo.gro...@gmail.com> wrote:
> > >
> > > > Hi Jack,
> > > > response times scale with rows. The relationship doesn't seem
> > > > linear, but below 400 rows times are much faster.
> > > > I view query times from solr logs and they are fast
> > > > the same query with rows = 1000 takes 8s
> > > > with rows = 10 takes 0.2s
> > > >
> > > > 2016-02-11 16:22 GMT+01:00 Jack Krupansky <jack.krupan...@gmail.com>:
> > > >
> > > > > Are queries scaling linearly - does a query for 100 rows take
> > > > > 1/10th the time (1 sec vs. 10 sec or 3 sec vs. 30 sec)?
> > > > >
> > > > > Does the app need/expect exactly 1,000 documents for the query or
> > > > > is that just what this particular query happened to return?
> > > > >
> > > > > What does the query look like? Is it complex or use wildcards or
> > > > > function queries, or is it very simple keywords? How many
> > > > > operators?
> > > > >
> > > > > Have you used the debugQuery=true parameter to see which search
> > > > > components are taking the time?
> > > > >
> > > > > -- Jack Krupansky
> > > > >
> > > > > On Thu, Feb 11, 2016 at 9:42 AM, Matteo Grolla
> > > > > <matteo.gro...@gmail.com> wrote:
> > > > >
> > > > > > Hi Yonik,
> > > > > > after the first query I find 1000 docs in the document cache.
> > > > > > I'm using curl to send the request and requesting javabin
> > > > > > format to mimic the application.
> > > > > > gc activity is low
> > > > > > I managed to load the entire 50GB index in the filesystem
> > > > > > cache, after that queries don't cause disk activity anymore.
> > > > > > Time improves: queries that took ~30s now take <10s. But I
> > > > > > hoped for better.
> > > > > > I'm going to use jvisualvm's sampler to analyze where time is
> > > > > > spent
> > > > > >
> > > > > > 2016-02-11 15:25 GMT+01:00 Yonik Seeley <ysee...@gmail.com>:
> > > > > >
> > > > > > > On Thu, Feb 11, 2016 at 7:45 AM, Matteo Grolla
> > > > > > > <matteo.gro...@gmail.com> wrote:
> > > > > > > > Thanks Toke, yes, they are long times, and solr qtime (to
> > > > > > > > execute the query) is a fraction of a second.
> > > > > > > > The response in javabin format is around 300k.
> > > > > > >
> > > > > > > OK, That tells us a lot.
> > > > > > > And if you actually tested so that all the docs would be in
> > > > > > > the cache (can you verify this by looking at the cache stats
> > > > > > > after you re-execute?) then it seems like the slowness is
> > > > > > > down to any of:
> > > > > > > a) serializing the response (it doesn't seem like a 300K
> > > > > > > response should take *that* long to serialize)
> > > > > > > b) reading/processing the response (how fast the client can
> > > > > > > do something with each doc is also a factor...)
> > > > > > > c) other (GC, network, etc)
> > > > > > >
> > > > > > > You can try taking client processing out of the equation by
> > > > > > > trying a curl request.
> > > > > > >
> > > > > > > -Yonik