Streaming expressions will utilize all replicas of a cluster when the
number of workers >= the number of replicas.

For example, suppose there are 40 workers, 40 shards, and 5 replicas per shard.

For a single parallel request:

Each worker will send 1 query to a random replica in each shard. This is
40 x 40 = 1600 requests. Because the replica is chosen at random, the 1600
requests will be spread roughly evenly across all 200 nodes in the cluster
(40 shards x 5 replicas), with each node handling about 8 requests. Each
request will return 1/1600 of the result set.

If you add another row of replicas (a sixth replica for each shard), the
same 1600 requests will be handled by 240 nodes.
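
The arithmetic above can be checked with a small sketch. This is only an
illustration of the numbers in this example; the function name fan_out and
its parameters are made up for the sketch and are not part of Solr.

```python
def fan_out(workers, shards, replicas_per_shard):
    """Return (total requests, total nodes, average requests per node)."""
    # Each worker sends one query to one replica in every shard.
    requests = workers * shards
    # Every replica of every shard is a candidate to receive a request.
    nodes = shards * replicas_per_shard
    # Replicas are picked at random, so this is an average, not a guarantee.
    return requests, nodes, requests / nodes


print(fan_out(40, 40, 5))
print(fan_out(40, 40, 6))
```

With 5 replicas per shard this gives 1600 requests over 200 nodes; with a
sixth replica per shard the same 1600 requests are spread over 240 nodes.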

-----

In streaming expressions you use the parallel function to send requests to
workers.
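
As a rough sketch of what such an expression can look like, the Python
below only builds the expression string. The collection name "collection1",
the field "a_s", and the helper name parallel_expr are placeholders for
illustration; check the parameters against the streaming expression docs
for your Solr version before relying on them.

```python
def parallel_expr(collection, workers, key):
    """Build a parallel(...) streaming expression string (illustrative only)."""
    # The inner search is partitioned on `key` so each worker gets its own slice.
    inner = (
        f'search({collection}, q="*:*", fl="id,{key}", '
        f'sort="{key} asc", partitionKeys="{key}", qt="/export")'
    )
    # The outer parallel() wraps the search and sets the number of workers.
    return f'parallel({collection}, {inner}, workers="{workers}", sort="{key} asc")'


expr = parallel_expr("collection1", 40, "a_s")
print(expr)
```

The workers value here is the number that has to be large enough for all
replicas to be used.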

In SQL you specify aggregationMode=map_reduce and workers=X. The SQL
interface only goes into parallel mode for GROUP BY and SELECT DISTINCT
queries.
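
A minimal sketch of such a SQL request, built as a URL in Python. The
aggregationMode and workers parameters are the ones named above; the base
URL, the collection name, the /sql path, the stmt parameter name, and the
helper sql_request_url are assumptions for illustration and should be
checked against your Solr version.

```python
from urllib.parse import urlencode


def sql_request_url(base_url, collection, stmt, workers):
    """Build a URL for a parallel (map_reduce) SQL query (illustrative only)."""
    params = {
        "stmt": stmt,                      # assumed name of the SQL statement parameter
        "aggregationMode": "map_reduce",   # switches aggregation to map_reduce
        "workers": workers,                # number of workers to use
    }
    return f"{base_url}/{collection}/sql?{urlencode(params)}"


url = sql_request_url(
    "http://localhost:8983/solr",          # hypothetical Solr base URL
    "collection1",                         # hypothetical collection
    "SELECT a_s, count(*) FROM collection1 GROUP BY a_s",
    40,
)
print(url)
```

A GROUP BY statement is used because, as noted above, only GROUP BY and
SELECT DISTINCT queries go into parallel mode.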


Joel Bernstein
http://joelsolr.blogspot.com/

On Mon, May 23, 2016 at 7:17 PM, Joel Bernstein <joels...@gmail.com> wrote:

> The image is the correct flow. Are you using workers?
>
>
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
> On Mon, May 23, 2016 at 7:16 PM, Timothy Potter <thelabd...@gmail.com>
> wrote:
>
>> This image from the wiki kind of gives that impression to me:
>>
>>
>> https://cwiki.apache.org/confluence/download/attachments/61311194/cluster.png?version=1&modificationDate=1447365789000&api=v2
>>
>> On Mon, May 23, 2016 at 11:50 AM, Erick Erickson
>> <erickerick...@gmail.com> wrote:
>> > I _think_ this is a distinction between
>> > serving the query and processing the results. The
>> > query is the standard Solr processing returning
>> > results from one replica per shard.
>> >
>> > Those results can be partitioned out to N Solr instances
>> > for sub-processing, where N is  however many worker
>> > nodes you specified that may or may not be host
>> > to any replicas of that collection.
>> >
>> > At least I think that's what's up, but then again this is
>> > new to me too.
>> >
>> > Which bits of the doc anyway? Sounds like some
>> > clarification is in order.
>> >
>> > Best,
>> > Erick
>> >
>> > On Mon, May 23, 2016 at 9:32 AM, Timothy Potter <thelabd...@gmail.com>
>> > wrote:
>> >> I've seen docs and diagrams that seem to indicate a streaming
>> >> expression can utilize all replicas of a shard but I'm seeing only 1
>> >> replica per shard (I have 2) being queried.
>> >>
>> >> All replicas are on the same host for my experimentation, could that
>> >> be the issue? What are the circumstances where all replicas will be
>> >> utilized?
>> >>
>> >> Or is this a mis-understanding of the docs?
>>
>
>
