Hello!

I'm wondering what information you can provide on the performance
implications of implicit routing vs. composite ID routing. In particular,
I'm curious about the horizontal scaling behavior of implicit routing and
of composite ID routing with and without the "/<bits>" param appended.
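
For concreteness, here is a minimal sketch of the two composite ID shapes I
mean. The "shard_key!doc_id" and "shard_key/bits!doc_id" formats are the ones
described in the Solr documentation; the helper name make_composite_id and
the sample values are hypothetical and only there for illustration.

```python
def make_composite_id(shard_key, doc_id, bits=None):
    """Build a compositeId-style document ID (hypothetical helper).

    Without bits: "shard_key!doc_id"
    With bits:    "shard_key/bits!doc_id"
    """
    if bits is None:
        return f"{shard_key}!{doc_id}"
    return f"{shard_key}/{bits}!{doc_id}"


# Sample values are made up for illustration.
print(make_composite_id("tenantA", "doc42"))          # tenantA!doc42
print(make_composite_id("tenantA", "doc42", bits=2))  # tenantA/2!doc42
```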

I've been following this documentation
<https://lucene.apache.org/solr/guide/6_6/shards-and-indexing-data-in-solrcloud.html>,
and a few other blogs and articles from around the web (including on
Lucidworks). Many of these discuss the techniques and general philosophy
behind the different document routing options, but I haven't been able to
find a "Big O" assessment so far by searching online. I'm aware that any
particular workload really needs a Sizing Exercise
<https://lucidworks.com/blog/2012/07/23/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/>
to fully understand its implications, but I'm hoping to plan a high-level
architecture beyond the scale I can currently foresee.

A relatively simple assessment I've done below leads me to believe the
following is likely the case: if we have S shards and B as our "/bits"
param, then resource usage would scale in Big O terms as follows (note:
I've previously received the advice that any shard should be capped at a
max of 120M documents, which is where the cap on docs per shard key comes
from):

   - Implicit routing:
      - One Read: O(S)
         - hits horizontal scaling limit eventually as S grows
         - No cap on docs per shard key (no shard key)
   - Composite ID routing, no bits param:
      - One Read: O(1)
         - no horizontal scaling limit as S grows
         - Docs on a shard key capped at 120 million
   - Composite ID routing with bits param:
      - One Read: O(2^B)
         - no horizontal scaling limit as S grows for fixed B
         - Docs on a shard key capped at 120 million * 2^B
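
The estimates in the list can be written out as a small model. This is only
a sketch that restates my assumptions above (S shards, B bits, a 120M docs
per shard cap); it does not measure or reproduce actual Solr behavior, and
the function names and routing labels are hypothetical.

```python
MAX_DOCS_PER_SHARD = 120_000_000  # the per-shard cap I was advised to use


def shards_per_read(routing, num_shards, bits=0):
    """Shards one read touches under the assumed model (not measured)."""
    if routing == "implicit":
        return num_shards   # O(S): assumed fan-out to every shard
    if routing == "composite":
        return 1            # O(1): assumed single shard per shard key
    if routing == "composite_bits":
        return 2 ** bits    # O(2^B): assumed spread of one shard key
    raise ValueError(f"unknown routing: {routing}")


def max_docs_per_shard_key(routing, bits=0):
    """Assumed cap on docs per shard key; None means no shard key, so no cap."""
    if routing == "implicit":
        return None
    if routing == "composite":
        return MAX_DOCS_PER_SHARD
    if routing == "composite_bits":
        return MAX_DOCS_PER_SHARD * 2 ** bits
    raise ValueError(f"unknown routing: {routing}")


# Example: S = 64 shards, B = 2
for routing in ("implicit", "composite", "composite_bits"):
    print(routing,
          shards_per_read(routing, 64, bits=2),
          max_docs_per_shard_key(routing, bits=2))
```

With S = 64 and B = 2, the model gives 64 shards per read for implicit
routing, 1 for plain composite IDs, and 4 (with a 480M docs-per-key cap)
for composite IDs with the bits param.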


So my questions: Is this "Big O" analysis about correct? Is Solr able to
scale horizontally with implicit routing despite what my simple analysis
would suggest? Are there other considerations here you can enlighten me
on?

I would guess the answer to the second question is "no" because otherwise
it wouldn't seem to me that composite ID routing would add much concrete
value. But perhaps there are some other factors I've yet to consider.

Thanks for your time and help! Looking forward to hearing back :)

Cheers,
Stephen
