Re: Solr-Cloud, join and collection collocation

2019-10-16 Thread Nicolas Paris
> Note: adding score=none as a local param turns on another algorithm,
> driven by the from side of the join.
Indeed, with the score=none local param, the query time is correlated with the size of the joined collection subset. For a subset of 100k documents, the query time is 1 second, 4 sec for 1M I get client

Re: Solr-Cloud, join and collection collocation

2019-10-16 Thread Mikhail Khludnev
Note: adding score=none as a local param turns on another algorithm, driven by the from side of the join.
On Wed, Oct 16, 2019 at 11:37 AM Nicolas Paris wrote:
> Sadly, the join performance is poor.
> The joined collection has 12M documents, and queries take 6k ms
> versus 60ms when I compare to th
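To make the suggestion concrete, here is a minimal sketch of how a join filter with the score=none local param could be assembled. The `{!join from=... to=... fromIndex=... score=none}` syntax is Solr's join query parser; the field names, collection name, and inner query (`patient_id`, `patients`, `group:cohort_a`) and the helper `build_join_params` are hypothetical and do not come from the thread.

```python
from urllib.parse import urlencode


def build_join_params(from_field, to_field, from_index, inner_query, score=None):
    """Build Solr request parameters for a cross-collection join filter.

    All argument values are supplied by the caller; nothing here is
    specific to the collections discussed in the thread.
    """
    local_params = [
        f"from={from_field}",
        f"to={to_field}",
        f"fromIndex={from_index}",
    ]
    # Passing a score local param (even "none") selects the alternative
    # join implementation mentioned in the message above.
    if score is not None:
        local_params.append(f"score={score}")
    fq = "{!join " + " ".join(local_params) + "}" + inner_query
    return {"q": "*:*", "fq": fq}


# Hypothetical field and collection names, for illustration only.
params = build_join_params(
    "patient_id", "patient_id", "patients", "group:cohort_a", score="none"
)
print(params["fq"])
print(urlencode(params))  # query string to append to a /select request
```

Leaving `score` unset produces the plain join filter, which makes it easy to time both variants against the same subset.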

Re: Solr-Cloud, join and collection collocation

2019-10-16 Thread Nicolas Paris
Sadly, the join performance is poor. The joined collection has 12M documents, and queries take 6k ms versus 60ms when I compare to the denormalized field. Apparently, the performance does not change when the filter on the joined collection is changed. It is still 6k ms when the subset is

Re: Solr-Cloud, join and collection collocation

2019-10-15 Thread Nicolas Paris
> You can certainly replicate the joined collection to every shard. It
> must fit in one shard and a replica of that shard must be co-located
> with every replica of the “to” collection.
Yes, I found this in the documentation, with a clear example just after this mail. I will test it today. I also

Re: Solr-Cloud, join and collection collocation

2019-10-15 Thread Erick Erickson
You can certainly replicate the joined collection to every shard. It must fit in one shard and a replica of that shard must be co-located with every replica of the “to” collection. Have you looked at streaming and “streaming expressions”? It does not have the same problem, although it does have
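As a rough illustration of the co-location requirement described above, the sketch below builds Collections API URLs that create a single-shard collection on one node and then add a replica of that shard on each remaining node. The `CREATE` and `ADDREPLICA` actions and the `numShards`, `replicationFactor`, `createNodeSet`, `collection`, `shard`, and `node` parameters belong to Solr's Collections API; the base URL, node names, collection name, and the helper functions are assumptions made for the example. The snippet only prints the URLs and does not contact a cluster.

```python
from urllib.parse import urlencode


def collections_api_url(base_url, **params):
    """Return a Collections API URL for the given parameters."""
    return f"{base_url}/admin/collections?{urlencode(params)}"


def colocated_collection_urls(base_url, name, nodes):
    """URLs that place one replica of a single-shard collection on every node."""
    # Create the collection with one shard, pinned to the first node.
    urls = [
        collections_api_url(
            base_url,
            action="CREATE",
            name=name,
            numShards=1,
            replicationFactor=1,
            createNodeSet=nodes[0],
        )
    ]
    # Add one replica of that single shard on each remaining node.
    # "shard1" is the default name of the only shard (assumed here).
    for node in nodes[1:]:
        urls.append(
            collections_api_url(
                base_url,
                action="ADDREPLICA",
                collection=name,
                shard="shard1",
                node=node,
            )
        )
    return urls


# Hypothetical cluster layout: the nodes hosting replicas of the "to" collection.
nodes = ["host1:8983_solr", "host2:8983_solr", "host3:8983_solr"]
urls = colocated_collection_urls("http://localhost:8983/solr", "patients", nodes)
for url in urls:
    print(url)
```

The node list would need to cover every node that hosts a replica of the “to” collection, otherwise some shards would have no local copy to join against.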

Solr-Cloud, join and collection collocation

2019-10-15 Thread Nicolas Paris
Hi, I have several large collections that cannot fit in a standalone Solr instance. They are split over multiple shards in SolrCloud mode. Those collections are supposed to be joined to another collection to retrieve a subset. Because I am using distributed collections, I am not able to use the so