On Fri, Dec 16, 2016 at 11:31 AM, Toke Eskildsen <t...@statsbiblioteket.dk> wrote:
> On Fri, 2016-12-16 at 11:19 +0100, Dorian Hoxha wrote:
> > On Fri, Dec 16, 2016 at 10:45 AM, Toke Eskildsen
> > <t...@statsbiblioteket.dk> wrote:
> > > We try hard to stay below 32GB, but for some setups the penalty of
> > > crossing the boundary is worth it. If, for example, having
> > > everything in 1 shard means a heap requirement of 50GB, it can be a
> > > better solution than a multi-shard setup with 2*25GB heap.
> > >
> > The heap is for the instance, not for each shard. Yeah, having less
> > shards is ~more efficient since terms-dictionary,cache etc have lower
> > duplication.
>
> True, but that was not my point. What I tried to communicate is that
> there can be a huge difference between having 1 shard in the collection
> and having more than 1 shard. Not for document searches, but for
> aggregations such as grouping and especially String faceting.
>
> - Toke Eskildsen, State and University Library, Denmark
>

Yes, that makes sense. As I remember it, cross-shard aggregations may require
more than 1 call per shard: 1 call to get each shard's top(x), and another to
verify that those really are the top(x) across all shards. So fewer shards
means fewer merges to get the final values.
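
To make the two-call pattern concrete, here is a rough Python sketch. It is not Solr's actual code: the function names, the dict-per-shard model and the call counter are all made up for illustration. Each shard first reports its local top(x), then the coordinator asks each shard for exact counts of any candidate terms that shard did not report.

```python
from collections import Counter


def shard_top_terms(shard, k):
    """Phase 1 (hypothetical): a shard returns its local top-k (term, count) pairs."""
    # Sort by count descending, then by term, so ties are deterministic.
    return sorted(shard.items(), key=lambda tc: (-tc[1], tc[0]))[:k]


def shard_counts_for(shard, terms):
    """Phase 2 (hypothetical): a shard returns exact counts for the given terms."""
    return {t: shard.get(t, 0) for t in terms}


def distributed_top_terms(shards, k):
    """Merge per-shard facet counts; returns (top-k list, number of shard calls)."""
    calls = 0
    totals = Counter()
    reported = []  # the set of terms each shard returned in phase 1

    # Phase 1: collect the local top-k from every shard.
    for shard in shards:
        top = shard_top_terms(shard, k)
        calls += 1
        totals.update(dict(top))
        reported.append({term for term, _ in top})

    # Phase 2: for every candidate term, fetch the counts from the shards
    # that did not report it, so that the merged totals are exact.
    candidates = set(totals)
    for shard, seen in zip(shards, reported):
        missing = candidates - seen
        if missing:
            totals.update(shard_counts_for(shard, missing))
            calls += 1

    merged_top = sorted(totals.items(), key=lambda tc: (-tc[1], tc[0]))[:k]
    return merged_top, calls


# Toy data: the same documents split over 2 shards, or kept in 1 shard.
shard_a = {"red": 10, "blue": 9, "green": 2}
shard_b = {"green": 10, "blue": 9, "red": 1}
single = {"red": 11, "blue": 18, "green": 12}

print(distributed_top_terms([shard_a, shard_b], 2))  # 4 shard calls
print(distributed_top_terms([single], 2))            # 1 shard call
```

With the toy data above, both layouts give the same top 2, but the 2-shard layout needs 4 shard calls instead of 1. After phase 1 alone, "red" and "green" are tied at 10; only the second call separates them. Note that this sketch only makes the counts of the candidate terms exact. A term that is in no shard's local top(x) is never considered, which is why real systems usually over-request in phase 1.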