Re: Multiple solr instances per host vs Multiple cores in same solr instance

Erick Erickson Tue, 28 Aug 2018 11:38:04 -0700

Bernd:

If you only knew how many times I've had the conversation "No, I can't
tell you what's best, you have to test with _your_ data on _your_
hardware with _your_ queries"  ;)


I suspect, but have no real proof, that GC is the biggest difference.
Solr has what we call "the laggard problem": since one replica from each
shard _must_ respond (twice) before the query returns, the slowest
replica to respond governs the total response time for any individual
query. But that's a guess. The CPU utilization might give a clue, but
if it is GC then some of the CPU cycles are being used for GC, so that
isn't definitive.
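
A rough way to picture the laggard effect, as a sketch only: the latency
numbers below are invented, and the model (every shard answers in a fixed
base time, with an occasional GC-like pause) is an assumption made for
illustration, not a measurement of Solr.

```python
import random


def query_latency(shard_latencies):
    """A distributed query finishes only when the slowest shard has answered."""
    return max(shard_latencies)


def simulate(num_shards, num_queries=10_000, base_ms=20.0, pause_ms=200.0,
             pause_prob=0.02, seed=42):
    """Mean query latency when each shard occasionally hits a long pause.

    All parameter values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_queries):
        # One latency per shard; a pause adds pause_ms on top of base_ms.
        shard_latencies = [
            base_ms + (pause_ms if rng.random() < pause_prob else 0.0)
            for _ in range(num_shards)
        ]
        total += query_latency(shard_latencies)
    return total / num_queries


if __name__ == "__main__":
    for shards in (1, 3, 5):
        print(shards, "shards:", round(simulate(shards), 1), "ms mean")
```

The more shards a query fans out to, the more likely it is that at least
one of them is pausing, so the mean creeps up even though each shard on
its own looks fine.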

Best,
Erick

On Tue, Aug 28, 2018 at 12:37 AM Bernd Fehling
<bernd.fehl...@uni-bielefeld.de> wrote:
>
> Yes, I tested many cases.
> As I already mentioned, 3 servers as a 3x3 SolrCloud cluster.
> - 12 million data records from our big single index
> - always the same queries (SWD, German keyword norm data)
> - Apache JMeter 3.1 for the load (separate server)
> - HAProxy 1.6.11 with roundrobin (separate server)
> - no autowarming in Solr
> - with every setup, always one first (cold) run (to see how the system
>   behaves with empty caches)
> - afterwards two (warm) runs with filled caches from the first and second run
> - all this with preferLocalShards set to true and false
> - and all this with single instance multicore and multi instance multinode.
> That was a lot of testing, starting, stopping, loading test data...
>
> The difference between single instance and multi instance was that
> single instance per server got 12GB Java heap (because it had to handle
> 3 cores) and multi instance got 4GB Java heap per instance (because each
> instance had to handle just 1 core).
>
> No real difference in CPU/memory utilization, but I used different
> heap sizes for single instance and multi instance (see above).
> But the response time with multi instance is much better and gives
> higher performance.
> Between 30 and 60 QPS multi instance is about 1.5 times better than
> single instance in my test case with my test data ... and so on, but
> the Cloud is much more complex.
>
> preferLocalShards really gives an advantage in a 3x3 or 5x5 SolrCloud, but
> I don't know how it would compare to, say, 5x3 (5 servers, 5 shards,
> 3 replicas).
>
> Servers in total:
> - 3 VM servers on 3 different Xen hosts connected with 2 Gigabit networks
>    (the discs were not SSD as in our production system, just 15rpm
>    spinning discs)
>    3 ZooKeeper, one on each server but separate instances (not the Solr
>    internal ones)
> - 1 extra server for HAProxy
> - 1 extra server for Apache JMeter
>
> It's hard to tell where the bottleneck is, at least not with 60QPS and
> with spinning discs.
> SSD as storage and separate physical server boxes will increase performance.
>
> I think what matters is how complex your data in the index, your query,
> and your query analysis are.
> My query is not very simple: rows=100, facet.limit=100, 9 facet.fields
> and a boost with bq.
> If you have rows=10 and facet=false without bq you will get higher
> performance.
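
For readers who want to reproduce a request of that shape, here is a
minimal sketch of how such a query URL could be assembled. The host,
collection name, field names and the boost query are placeholders, not
the ones used in the test above; only the parameter names (rows, facet,
facet.field, facet.limit, bq, preferLocalShards) are standard Solr
request parameters.

```python
from urllib.parse import urlencode


def build_query_url(base_url, q, facet_fields, rows=100, facet_limit=100,
                    bq=None, prefer_local_shards=True):
    """Assemble a Solr select URL with faceting and an optional boost query."""
    params = [
        ("q", q),
        ("rows", rows),
        ("facet", "true"),
        ("facet.limit", facet_limit),
    ]
    # facet.field is a multi-valued parameter: repeat it once per field.
    params += [("facet.field", field) for field in facet_fields]
    if bq:
        # bq is honored by the dismax/edismax query parsers.
        params.append(("bq", bq))
    params.append(("preferLocalShards",
                   "true" if prefer_local_shards else "false"))
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical host, collection, fields and boost, for illustration only.
    url = build_query_url(
        "http://localhost:8983/solr/mycollection/select",
        q="title:physics",
        facet_fields=["field_%d" % i for i in range(1, 10)],
        bq="type:book^2",
    )
    print(url)
```

Dropping rows to 10 and leaving out the facet fields and bq gives the
cheaper request mentioned above, which makes it easy to compare both
shapes under the same load.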
>
> Regards
> Bernd
>
>
> On 27.08.2018 at 22:45, Wei wrote:
> > Thanks Bernd.  Do you have preferLocalShards=true in both cases? Do you
> > notice CPU/memory utilization difference between the two deployments? How
> > many servers did you use in total?  I am curious what's the bottleneck for
> > the one instance and 3 cores configuration.
> >
> > Thanks,
> > Wei
> >
> > On Mon, Aug 27, 2018 at 1:45 AM Bernd Fehling <
> > bernd.fehl...@uni-bielefeld.de> wrote:
> >
> >> My tests with many combinations (instance, node, core) on a 3 server
> >> cluster with SolrCloud pointed out that the highest performance is with
> >> multiple solr instances and shards and replicas placed by rules so that
> >> you get the advantage from preferLocalShards=true.
> >>
> >> The disadvantage is the handling of the system, which means setup,
> >> starting and stopping, setting up the shards and replicas with rules
> >> and so on.
> >>
> >> I tested with 3x3 SolrCloud (3 shards, 3 replicas).
> >> A 3x3 system with one instance and 3 cores per host could handle up to
> >> 30QPS.
> >> A 3x3 system with multi instance (different ports, single core and shard
> >> per instance) could handle 60QPS on same hardware with same data.
> >>
> >> Also, the single instance per server setup has spikes in the response
> >> time graph which are not seen with a multi instance setup.
> >>
> >> Tested about 2 months ago with SolrCloud 6.4.2.
> >>
> >> Regards,
> >> Bernd
> >>
> >>
> >> On 26.08.2018 at 08:00, Wei wrote:
> >>> Hi,
> >>>
> >>> I have a question about the deployment configuration in solr cloud.  When
> >>> we need to increase the number of shards in solr cloud, there are two
> >>> options:
> >>>
> >>> 1.  Run multiple solr instances per host, each with a different port and
> >>> hosting a single core for one shard.
> >>>
> >>> 2.  Run one solr instance per host, and have multiple cores(shards) in
> >>> the same solr instance.
> >>>
> >>> Which would be better performance wise? For the first option I think JVM
> >>> size for each solr instance can be smaller, but deployment is more
> >>> complicated? Are there any differences for cpu utilization?
> >>>
> >>> Thanks,
> >>> Wei
> >>>
> >>
> >
