bq: "That is really a job for streaming, not simple faceting." True, that is the next step to improve our performance (right now we are using JSON facets), and 6.3.0 has a lot of useful tools for working with streaming expressions. Our last release before 6.3 was 5.3.1, and streaming expressions were buggy there in some scenarios.
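To make the facet-versus-streaming distinction concrete, here is a minimal sketch of the two request shapes for counting documents per value of a high-cardinality field. The host, the collection name `logs`, and the field name `url` are assumptions for illustration, and the `/export` handler requires docValues on the exported field.

```python
import json
from urllib.parse import urlencode


def json_facet_params(field, limit=-1):
    """Query params for a JSON terms facet over `field` (limit=-1 asks for every bucket)."""
    facet = {"by_field": {"type": "terms", "field": field, "limit": limit}}
    return {"q": "*:*", "rows": 0, "json.facet": json.dumps(facet)}


def rollup_expression(collection, field):
    """Streaming expression: export tuples sorted by `field`, then count per distinct value."""
    return (
        f'rollup(search({collection}, q="*:*", fl="{field}", '
        f'sort="{field} asc", qt="/export"), over="{field}", count(*))'
    )


if __name__ == "__main__":
    base = "http://localhost:8983/solr/logs"  # hypothetical host and collection
    # Facet request: every shard returns its buckets and one node merges them in memory.
    print(base + "/select?" + urlencode(json_facet_params("url")))
    # Streaming request: sorted tuples are rolled up as they stream through.
    print(base + "/stream?" + urlencode({"expr": rollup_expression("logs", "url")}))
```

Because `rollup` works on a stream already sorted by the grouping field, it only has to hold one group at a time, which is why it suits fields such as URLs better than a `limit=-1` facet.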
bq: "Okay. You could create a new collection with the wanted amount of shards and do a full re-index into that." True, you are right, but we are trying to avoid that (this point falls under "keep management low"). Solr is an amazing tool, but it lacks automagic management features. You have all the power and therefore all the work :p

Following your advice, I will review the topology of my collections and try to pinpoint the oversharded ones.

--

/Yago Riveiro

On 27 Dec 2016 21:54 +0000, Toke Eskildsen <t...@statsbiblioteket.dk>, wrote:
> Yago Riveiro <yago.rive...@gmail.com> wrote:
> > One thing that I forget to mention is that my clients can aggregate
> > by any field in the schema with limit=-1, this is not a problem with
> > 99% of the fields, but 2 or 3 of them are URLs. URLs has very
> > high cardinality and one of the reasons to sharding collections is
> > to lower the memory footprint to not blow the node and do the
> > last merge in a big machine.
>
> That is really a job for streaming, not simple faceting.
>
> Even if you insist on faceting, the problem remains that your merger needs to
> be powerful enough to process the full result set. Using that machine with a
> single shard collection instead would eliminate the excessive overhead of
> doing distributed faceting on millions of values, sparing a lot of hardware
> allocation, which could be used to beef up the single-shard hardware even
> more.
>
> [Toke: You can always split later]
>
> > Every time I run the SPLITSHARD command, the command fails
> > in a different way. IMHO right now Solr doesn’t have an efficient
> > way to rebalance collection’s shard.
>
> Okay. You could create a new collection with the wanted amount of shards and
> do a full re-index into that.
>
> [Toke: "And yes, more logistics on your part as one size no longer fits all"]
>
> > The key point of this deploy is reduce the amount of management
> > as much as possible,
>
> That is your prerogative. I hope my suggestions can be used by other people
> with similar challenges then.
>
> - Toke Eskildsen
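
For the "new collection plus full re-index" route discussed in this thread, the first step could look roughly like the sketch below, which only builds the Collections API CREATE request. The base URL, the collection name `logs_v2`, and the shard count are hypothetical, and the re-index itself is not shown.

```python
from urllib.parse import urlencode


def create_collection_url(solr_base, name, num_shards, replication_factor=1, config_name=None):
    """Build a Collections API CREATE URL for a collection with the wanted shard count."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
        "wt": "json",
    }
    if config_name is not None:
        # Reuse an existing configset instead of relying on the default lookup.
        params["collection.configName"] = config_name
    return solr_base.rstrip("/") + "/admin/collections?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical values: a two-shard replacement for an oversharded collection.
    print(create_collection_url("http://localhost:8983/solr", "logs_v2", 2))
```

Once the new collection is populated, a collection alias can be repointed at it so clients do not need to change the name they query.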