Hi Tomás,
here are some more details:

   - The fragment field contains XML messages of about 3 KB each.
   - The query used for the tests is the following (only the word searched
   for inside the fragment field changes between requests; a rough sketch of
   how such a run can be driven follows this list):

   curl "http://localhost:8983/solr/sepa/select?q=+fragment%3A*A*+&fq=marked%3AT&fq=-fragmentContentType%3ABULK&start=0&rows=100&sort=creationTimestamp+desc%2Cid+asc"

   - All the tests were executed inside VMs on dedicated hardware; in detail:

2 ESX 5.5 hypervisors running on:


   - Server PowerEdge T420 - dual Xeon E5-2420 with 128 GB of RAM
   - RAID10 local storage, 4x Near Line SAS 7,200 rpm drives (about 100 MB/s
   guaranteed bandwidth)
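
For reference, the load can be reproduced with something along these lines (a
rough sketch, not the exact harness I used; the request counts and the fixed
search word are placeholders - in the real tests a different word is
substituted into the fragment clause for each request):

   URL='http://localhost:8983/solr/sepa/select?q=+fragment%3A*A*+&fq=marked%3AT&fq=-fragmentContentType%3ABULK&start=0&rows=100&sort=creationTimestamp+desc%2Cid+asc'

   # sequential: issue the requests one after the other, time the whole run
   time for i in $(seq 1 20); do curl -s "$URL" -o /dev/null; done

   # 10 parallel: the same requests, with at most 10 in flight at a time
   # (the numbers from seq only drive the invocation count; {} is unused)
   time seq 1 20 | xargs -P 10 -I{} curl -s "$URL" -o /dev/null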


I have executed another test with a new configuration (CONF4): 8 shards of
35M documents on VM1 and 8 empty shards on VM2. This configuration has no
replicas.
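
Per-shard hit counts and per-shard query times can also be checked by adding
shards.info=true to a request, which makes it easy to confirm that the shards
on VM2 really stay empty (just a sketch; rows=0 only suppresses the returned
documents):

   curl "http://localhost:8983/solr/sepa/select?q=*%3A*&rows=0&shards.info=true"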

We can now compare the response times (in seconds) for CONF2 and CONF4:


   - without indexing operations

      - CONF2
         - sequential:  12.3   17.4
         - 5 parallel:  32.5   34.2
         - 10 parallel: 45.4   49
         - 20 parallel: 64.6   74
      - CONF4
         - sequential:  5      9.1
         - 5 parallel:  25     31
         - 10 parallel: 41     49
         - 20 parallel: 60     73

   - with indexing operations

      - CONF2
         - sequential:  12.3   19
         - 5 parallel:  39     40.8
         - 10 parallel: 56.6   62.9
         - 20 parallel: 79     116
      - CONF4
         - sequential:  15.5   17.5
         - 5 parallel:  30.7   38.3
         - 10 parallel: 57.5   64.2
         - 20 parallel: 60     81.4

During the tests:

   - CONF2: the 8 cores on VM1 and the 8 cores on VM2 were 100% used (except
   for the sequential test without indexing operations, where usage was about
   80%).
   - CONF4: the 8 cores on VM1 were 100% used.


As you can see, performance is similar for the tests with 5 and 10 parallel
requests, both with and without concurrent indexing operations, but very
different for the sequential tests and for the tests with 20 parallel
requests. I don't understand why.

Thanks,
Luca

On Fri, Jan 8, 2016 at 6:47 PM, Tomás Fernández Löbbe <tomasflo...@gmail.com> wrote:

> Hi Luca,
> It looks like your queries are complex wildcard queries. My theory is that
> you are CPU-bounded, for a single query one CPU core for each shard will be
> at 100% for the duration of the sub-query. Smaller shards make these
> sub-queries faster which is why 16 shards is better than 8 in your case.
> * In your 16x1 configuration, you have exactly one shard per CPU core, so
> in a single query, 16 subqueries will go to both nodes evenly and use one
> of the CPU cores.
> * In your 8x2 configuration, you still get to use one CPU core per shard,
> but the shards are bigger, so maybe each subquery takes longer (for the
> single query thread and 8x2 scenario I would expect CPU utilization to be
> lower?).
> * In your 16x2 case, 16 subqueries will be distributed unevenly, and some
> node will get more than 8 subqueries, which means that some of the
> subqueries will have to wait for their turn for a CPU core. In addition,
> more Solr cores will be competing for resources.
> If this theory is correct, adding more replicas won't speed up your queries;
> you need to either get faster CPUs or simplify your queries/configuration in
> some way. Adding more replicas should improve your query throughput, but
> only if you add them in more HW, not the same one.
>
> ...anyway, just a theory
>
> Tomás
>
> On Fri, Jan 8, 2016 at 7:40 AM, Shawn Heisey <apa...@elyograg.org> wrote:
>
> > On 1/8/2016 7:55 AM, Luca Quarello wrote:
> > > I used Solr 5.3.1 and I honestly expected response times with the
> > > replica configuration to be close to response times without the replica
> > > configuration.
> > >
> > > Do you agree with me?
> > >
> > > I read here
> > >
> >
> http://lucene.472066.n3.nabble.com/Solr-Cloud-Query-Scaling-td4110516.html
> > > that "Queries do not need to be routed to leaders; they can be handled
> by
> > > any replica in a shard. Leaders are only needed for handling update
> > > requests. "
> > >
> > > I haven't observed this behaviour. In my case CONF2 and CONF3 have all
> > > the replicas on VM2, but when analyzing core utilization during a
> > > request it is 100% on both machines. Why?
> >
> > Indexing is a little bit slower with replication -- the update must
> > happen on all replicas.
> >
> > If your index is sharded (which I believe you did indicate in your
> > initial message), you may find that all replicas get used even for
> > queries.  It is entirely possible that some of the shard subqueries will
> > be processed on one replica and some of them will be processed on other
> > replicas.  I do not know if this commonly happens, but I would not be
> > surprised if it does.  If the machines are sized appropriately for the
> > index, this separation should speed up queries, because you have the
> > resources of multiple machines handling one query.
> >
> > That phrase "sized appropriately" is very important.  Your initial
> > message indicated that you have a 90GB index, and that you are running
> > in virtual machines.  Typically VMs have fairly small memory sizes.  It
> > is very possible that you simply don't have enough memory in the VM for
> > good performance with an index that large.  With 90GB of index data on
> > one machine, I would hope for at least 64GB of RAM, and I would prefer
> > to have 128GB.  If there is more than 90GB of data on one machine, then
> > even more memory would be needed.
> >
> > Thanks,
> > Shawn
> >
> >
>
