[image: inline image 1 — profiler screenshot]

2016-02-11 16:05 GMT+01:00 Matteo Grolla <matteo.gro...@gmail.com>:

> I see a lot of time spent in splitOnTokens
>
> which is called by (last part of stack trace)
>
> BinaryResponseWriter$Resolver.writeResultsBody()
> ...
> solr.search.ReturnFields.wantsField()
> commons.io.FilenameUtils.wildcardMatch()
> commons.io.FilenameUtils.splitOnTokens()
>
>
>
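That stack points at wildcard field matching: when the fl parameter contains a glob pattern, each wantsField() check can end up re-splitting the pattern and running a wildcard match per field, per document. A minimal sketch (not Solr's or commons-io's actual code; the glob-to-regex translation and the cache are assumptions for illustration) of the repeated work and the obvious mitigation, caching the per-field decision:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class WantsFieldSketch {

    // Simplified stand-in for a glob matcher like wildcardMatch:
    // translates "*" and "?" to regex. Done on every call, so it is
    // repeated for every field of every returned document.
    static boolean wildcardMatch(String name, String pattern) {
        String regex = pattern
                .replace(".", "\\.")
                .replace("*", ".*")
                .replace("?", ".");
        return name.matches(regex);
    }

    // Caching the per-field-name decision means the glob work is paid
    // once per distinct field name instead of once per field per doc.
    static final Map<String, Boolean> cache = new HashMap<>();

    static boolean wantsField(String name, List<String> flPatterns) {
        return cache.computeIfAbsent(name,
                n -> flPatterns.stream().anyMatch(p -> wildcardMatch(n, p)));
    }

    public static void main(String[] args) {
        List<String> fl = Arrays.asList("id", "title_*");
        System.out.println(wantsField("title_en", fl)); // matches title_*
        System.out.println(wantsField("body", fl));     // matches nothing
    }
}
```

With 1000 docs and dozens of fields each, the uncached path does tens of thousands of glob matches for a single response, which is consistent with splitOnTokens dominating the sampler output.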
> 2016-02-11 15:42 GMT+01:00 Matteo Grolla <matteo.gro...@gmail.com>:
>
>> Hi Yonik,
>>      after the first query I find 1000 docs in the document cache.
>> I'm using curl to send the request and requesting javabin format to mimic
>> the application.
>> gc activity is low
>> I managed to load the entire 50GB index in the filesystem cache, after
>> that queries don't cause disk activity anymore.
>> Times improve: queries that took ~30s now take <10s, but I hoped for
>> better. I'm going to use jvisualvm's sampler to analyze where time is
>> spent.
>>
>>
>> 2016-02-11 15:25 GMT+01:00 Yonik Seeley <ysee...@gmail.com>:
>>
>>> On Thu, Feb 11, 2016 at 7:45 AM, Matteo Grolla <matteo.gro...@gmail.com>
>>> wrote:
>>> > Thanks Toke, yes, they are long times, and solr qtime (to execute the
>>> > query) is a fraction of a second.
>>> > The response in javabin format is around 300k.
>>>
>>> OK, that tells us a lot.
>>> And if you actually tested so that all the docs would be in the cache
>>> (can you verify this by looking at the cache stats after you
>>> re-execute?) then it seems like the slowness is down to any of:
>>> a) serializing the response (it doesn't seem like a 300K response
>>> should take *that* long to serialize)
>>> b) reading/processing the response (how fast the client can do
>>> something with each doc is also a factor...)
>>> c) other (GC, network, etc)
>>>
>>> You can try taking client processing out of the equation by trying a
>>> curl request.
>>>
>>> -Yonik
>>>
>>
>>
>
