Walter...
Thanks for the additional data points. Clearly we're a long way from
needing anything too complex.
Cheers!
...scott
On 3/14/18 1:12 PM, Walter Underwood wrote:
That would be my recommendation for a first setup. One Solr instance per host,
one shard per collection. We run 5 million document cores with 8 GB of heap for
the JVM. We size the RAM so that all the indexes fit in OS filesystem buffers.
Our big cluster is 32 hosts, 21 million documents in four
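
The sizing rule above (JVM heap plus enough free RAM for the OS to cache the whole index) can be sketched as a quick back-of-the-envelope check. This is only an illustration: the function names and the 1 GB allowance for the OS itself are assumptions, not anything from Solr, and the index size should be the on-disk index size, which can differ from the raw document size.

```python
def min_host_ram_gb(index_size_gb, jvm_heap_gb, os_overhead_gb=1.0):
    """Rough minimum RAM: JVM heap, plus the full index in the OS
    filesystem cache, plus a small allowance for the OS (assumed 1 GB)."""
    return index_size_gb + jvm_heap_gb + os_overhead_gb


def index_fits_in_cache(host_ram_gb, index_size_gb, jvm_heap_gb, os_overhead_gb=1.0):
    """True if the host has enough RAM left over to cache the whole index."""
    return host_ram_gb >= min_host_ram_gb(index_size_gb, jvm_heap_gb, os_overhead_gb)


# Figures taken from this thread: 8 GB heap, about 13 GB of index data.
needed = min_host_ram_gb(index_size_gb=13, jvm_heap_gb=8)
fits_16 = index_fits_in_cache(host_ram_gb=16, index_size_gb=13, jvm_heap_gb=8)
fits_32 = index_fits_in_cache(host_ram_gb=32, index_size_gb=13, jvm_heap_gb=8)
print(needed, fits_16, fits_32)
```

With those numbers a 16 GB host comes up short and a 32 GB host has headroom, though a smaller heap would change the picture.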
Erick...
Thanks. Yes. I think we were just going shard-happy without really
understanding the purpose. I think we'll start by keeping things simple
.. no shards, fewer replicas, maybe a bit more RAM. Then we can assess
the performance and make adjustments as needed.
Yes, that's the main reas
Scott:
Eventually you'll hit the limit of your hardware, regardless of VMs.
I've seen multiple VMs help a lot when you have really beefy hardware,
as in 32 cores, 128G memory and the like. Otherwise it's iffier.
re: sharding or not. As others wrote, sharding is only useful when a
single collectio
Emir...
Thanks for the input. Our larger collections are localized content, so
it may make sense to shard those so we can target the specific index.
I'll need to confirm how it's being used, if queries are always within a
language or if they are cross-language.
Thanks also for the link .. ve
Greg...
Thanks. That's very helpful, and is in line with what I've been seeing.
So, to be clear, you're saying that the size of all collections on a
server should be less than the available RAM. It looks like we've got
about 13GB of documents in all (and growing), so, if we're restricted to
16
Hi Scott,
There is no definite answer - it depends on your documents and query patterns.
Sharding does come with an overhead but also allows Solr to parallelise search.
Query latency is usually something that tells you if you need to split a
collection into multiple shards or not. If you are
A single shard is much simpler conceptually and also cheaper to query. I
would say that even your 1.2M collection can be a single shard. I'm running
a single shard setup 4X that size. You can still have replicas of this
shard for redundancy / availability purposes.
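
A single shard with replicas for redundancy might be requested through the Collections API roughly like this. It is a minimal sketch that only builds the request URL: the host, the collection name "products", and the replica count of 2 are placeholders, and depending on your setup you may also need to name a config set.

```python
from urllib.parse import urlencode


def create_collection_url(base_url, name, replicas=2):
    """Build a Collections API CREATE URL for a one-shard collection.

    base_url and name are placeholders for your own Solr host and
    collection; replicas sets replicationFactor (copies of the shard).
    """
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": 1,                 # single shard, no distributed search
        "replicationFactor": replicas,  # extra copies for availability
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


url = create_collection_url("http://localhost:8983/solr", "products", replicas=2)
print(url)
```

If query latency later becomes a problem, the same call with a higher numShards is the place to start experimenting.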
I'm not an expert, but I think o