Are you requesting all 100K results in one request? If so, that is pretty fast.

If you are doing that, don't do that. Page the results.
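
For what it's worth, here is a minimal sketch of paging in Python, using Solr's standard start and rows parameters. The host, collection name, page size of 1000, and the page_urls helper are all assumptions for illustration, not anything taken from your setup.

```python
from urllib.parse import urlencode


def page_urls(base_url, query, total, rows=1000, fields="id,score"):
    """Yield one Solr select URL per page of results.

    base_url, rows and fields are illustrative assumptions; start, rows,
    fl and wt are standard Solr request parameters.
    """
    for start in range(0, total, rows):
        params = {
            "q": query,
            "start": start,                     # offset of the first result on this page
            "rows": min(rows, total - start),   # do not ask past the end
            "fl": fields,                       # only the fields you need
            "wt": "json",
        }
        yield f"{base_url}/select?{urlencode(params)}"


# Hypothetical host and collection name: 100K results in pages of 1000.
urls = list(page_urls("http://localhost:8983/solr/collection1", "*:*", total=100_000))
```

Each URL can then be fetched in turn, so no single response has to carry all 100K documents.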

wunder

On Jul 16, 2013, at 9:30 AM, Daniel Collins wrote:

> You have a 20 GB collection, but is that per machine, or is it the total
> (so 10 GB per machine)?  How much memory is available on those 2 machines?
> Is it enough to fit the collection in the disk cache?
> What OS is it (Linux, Windows, etc.)?
> What heap size does your JVM have?
> Is it a static collection or are you updating it as well?
> 
> A 4s query time against a 25s end-to-end time seems like a large disparity
> to me; I'd be curious where the time is going. SolrCloud distributes the
> initial query out to the shards (but with fl=<uniquekey>,score), then it
> sends a second request once it has the list of documents, with
> fl=<whatever you asked for>, to get the stored fields.  If the query takes
> 4s, it would be interesting to see how long the stored field request takes
> (if it's long, you might want to consider docValues or ask for less!).
> 
> If you are using SolrCloud, you should be able to see the distributed
> requests (we see 3 per "user request": distributed (on each shard),
> storedfields (on each shard that returned something) and then the "user
> request" on the machine you sent the request to).  See if that gives you
> any indication of where the time is going.
> 
> On 16 July 2013 16:12, Michael Della Bitta <
> michael.della.bi...@appinions.com> wrote:
> 
>> Have you looked at cache utilization?
>> Have you checked the IO and CPU load to see what the bottlenecks are?
>> Are you sure things like your heap and servlet container threads are tuned?
>> 
>> After you look at those issues, I'd probably think about adding HTTP
>> caching and more replicas.
>> 
>> Michael Della Bitta
>> 
>> On Tue, Jul 16, 2013 at 10:42 AM, adfel70 <adfe...@gmail.com> wrote:
>> 
>>> Hi
>>> I need to create a Solr cluster that contains geospatial information and
>>> can handle a few hundred queries per second, where each query should
>>> retrieve around 100k results.
>>> The data is around 100k documents, around 300 GB total.
>>> 
>>> I started with a 2-shard cluster (replicationFactor 1) and a portion of
>>> the data - 20 GB.
>>> 
>>> I ran some load tests and see that when 100 requests are sent in one
>>> second, the average QTime is around 4 seconds, but the average total
>>> response time (measured from sending the request to Solr until getting a
>>> response) reaches 20-25 seconds, which is very bad.
>>> 
>>> Currently I load-balance between the 2 Solr servers myself (requests
>>> alternate between the two servers).
>>> 
>>> Any advice on which resources I need and what my Solr cluster should
>>> look like?
>>> More shards? More replicas? Another web server?
>>> 
>>> Thanks.
>>> 
>>> --
>>> View this message in context:
>>> http://lucene.472066.n3.nabble.com/Need-advice-on-performing-300-queries-per-second-on-solr-index-tp4078353.html
>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>> 
>> 

--
Walter Underwood
wun...@wunderwood.org


