bq: "That is really a job for streaming, not simple faceting."

True, it’s the next step to improve our performance (right now we are using 
JSON facets), and 6.3.0 has a lot of useful tools for working with streaming 
expressions. The release we ran before 6.3 was 5.3.1, where streaming 
expressions were buggy in some scenarios.
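
As a rough illustration of that next step, a high-cardinality aggregation can be written as a rollup over a sorted /export stream instead of a facet with limit=-1. This sketch only builds the expression and the request URL, it does not contact Solr. The collection name `events`, the field `url`, and the host are hypothetical, and it assumes the field has docValues so that /export can stream it.

```python
from urllib.parse import urlencode


def rollup_count_expr(collection: str, field: str) -> str:
    """Build a streaming expression that counts documents per value of `field`."""
    # search(..., qt="/export") streams the whole result set sorted by `field`
    # (the /export handler needs docValues on the field).
    # rollup(...) then aggregates the sorted stream, so the full set of
    # distinct values does not have to be held in memory for a final merge.
    return (
        f'rollup(search({collection}, q="*:*", fl="{field}", '
        f'sort="{field} asc", qt="/export"), '
        f'over="{field}", count(*))'
    )


def stream_url(base_url: str, collection: str, expr: str) -> str:
    """Build the URL of the collection's /stream handler for `expr`."""
    return f"{base_url}/{collection}/stream?" + urlencode({"expr": expr})


# Hypothetical names, for illustration only.
expr = rollup_count_expr("events", "url")
print(stream_url("http://localhost:8983/solr", "events", expr))
```

The response is a stream of tuples, one per distinct URL, which the client can consume incrementally.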

bq: "Okay. You could create a new collection with the wanted amount of shards 
and do a full re-index into that."

True, you are right, but we are trying to avoid that (this point falls under 
“keep management low”).
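
For reference, the re-index route starts with a Collections API CREATE call for the new collection. A minimal sketch that only builds the request URL follows. The collection name `events_v2`, the shard count, and the host are hypothetical, and the re-indexing itself is not shown.

```python
from urllib.parse import urlencode


def create_collection_url(base_url: str, name: str, num_shards: int,
                          replication_factor: int = 1) -> str:
    """Build a Collections API CREATE request URL for a new collection."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
        "wt": "json",
    }
    return f"{base_url}/admin/collections?" + urlencode(params)


# Hypothetical target: a smaller shard count for an oversharded collection.
print(create_collection_url("http://localhost:8983/solr", "events_v2", 2))
```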

Solr is an amazing tool, but it lacks automagic management features. You have 
all the power and therefore all the work :p

Following your advice, I will review the topology of my collections and 
try to pinpoint the oversharded ones.

--

/Yago Riveiro

On 27 Dec 2016 21:54 +0000, Toke Eskildsen <t...@statsbiblioteket.dk>, wrote:
> Yago Riveiro <yago.rive...@gmail.com> wrote:
> > One thing that I forgot to mention is that my clients can aggregate
> > by any field in the schema with limit=-1. This is not a problem with
> > 99% of the fields, but 2 or 3 of them are URLs. URLs have very
> > high cardinality, and one of the reasons for sharding collections is
> > to lower the memory footprint so as not to blow up the node, and to do
> > the last merge on a big machine.
>
> That is really a job for streaming, not simple faceting.
>
> Even if you insist on faceting, the problem remains that your merger needs to 
> be powerful enough to process the full result set. Using that machine with a 
> single shard collection instead would eliminate the excessive overhead of 
> doing distributed faceting on millions of values, sparing a lot of hardware 
> allocation, which could be used to beef up the single-shard hardware even 
> more.
>
> [Toke: You can always split later]
>
> > Every time I run the SPLITSHARD command, the command fails
> > in a different way. IMHO right now Solr doesn’t have an efficient
> > way to rebalance a collection’s shards.
>
> Okay. You could create a new collection with the wanted amount of shards and 
> do a full re-index into that.
>
> [Toke: "And yes, more logistics on your part as one size no longer fits all"]
>
> > The key point of this deployment is to reduce the amount of management
> > as much as possible,
>
> That is your prerogative. I hope my suggestions can be used by other people 
> with similar challenges then.
>
> - Toke Eskildsen
