On Fri, Dec 16, 2016 at 11:31 AM, Toke Eskildsen <t...@statsbiblioteket.dk>
wrote:

> On Fri, 2016-12-16 at 11:19 +0100, Dorian Hoxha wrote:
> > On Fri, Dec 16, 2016 at 10:45 AM, Toke Eskildsen
> > <t...@statsbiblioteket.dk> wrote:
> > > We try hard to stay below 32GB, but for some setups the penalty of
> > > crossing the boundary is worth it. If, for example, having
> > > everything in 1 shard means a heap requirement of 50GB, it can be a
> > > better solution than a multi-shard setup with 2*25GB heap.
> > >
> > The heap is for the instance, not for each shard. Yeah, having less
> > shards is ~more efficient since terms-dictionary,cache etc have lower
> > duplication.
>
> True, but that was not my point. What I tried to communicate is that
> there can be a huge difference between having 1 shard in the collection
> and having more than 1 shard. Not for document searches, but for
> aggregations such as grouping and especially String faceting.
>
> - Toke Eskildsen, State and University Library, Denmark
>
Yes, that makes sense. I remember that cross-shard aggregations may require
more than one call (one call to get the top(x) from each shard, and another
to verify that they really are the top(x) across shards). So fewer shards
means fewer merges to get the final values.
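
A toy sketch of that two-call pattern, in Python. This is not Solr's actual
code: the function names (local_top, distributed_top) and the sample counts
are made up for illustration, and each shard is just assumed to be a dict of
term -> count.

```python
from collections import Counter


def local_top(shard, x):
    """Call 1: a shard reports only its own top-x terms with counts."""
    return dict(Counter(shard).most_common(x))


def distributed_top(shards, x):
    """Merge per-shard top-x lists, then refine the counts (hypothetical helper)."""
    # Call 1: every shard returns its local top-x.
    partial = [local_top(shard, x) for shard in shards]
    candidates = set().union(*partial)

    # Call 2: for each candidate a shard did not report, ask that shard
    # for its exact count so the merged totals are comparable.
    totals = Counter()
    refinements = 0
    for shard, reported in zip(shards, partial):
        for term in candidates:
            if term in reported:
                totals[term] += reported[term]
            else:
                refinements += 1
                totals[term] += shard.get(term, 0)
    return totals.most_common(x), refinements


# Illustrative data only: two shards versus the same documents in one shard.
shard_1 = {"a": 10, "b": 9, "c": 8}
shard_2 = {"c": 10, "d": 9, "b": 8}
one_shard = dict(Counter(shard_1) + Counter(shard_2))

two_shard_result = distributed_top([shard_1, shard_2], 2)
one_shard_result = distributed_top([one_shard], 2)
```

With one shard the local top(x) is already the global top(x), so the second
call never happens. Note that this toy version only refines terms that some
shard reported in call 1; a term that is globally frequent but outside every
shard's local top(x) would still be missed.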
