My gut instinct is that you're considering a hard path. There are the
logistics of sharding by document similarity on both the indexing side and
the query side. Even if you pull that off, it would be extremely difficult to
know if you're getting good results and really hard to fix if you're not
get
Hi Joel,
Right now, we are (web) crawling almost 85 million documents, and this
could double. The collection is simply divided into shards, so
a search goes across all shards.
If it is possible for a system to distribute documents into shards based
on documents similarit
I don't know of any contrib or module that does this. Can you describe why
you'd want to route documents to shards based on similarity? What
advantages would you get by using this approach?
Joel Bernstein
http://joelsolr.blogspot.com/
On Wed, Apr 6, 2016 at 1:36 PM, davidphilip cherian <
davidphi
Any thoughts?
On Tue, Apr 5, 2016 at 9:05 PM, davidphilip cherian <
davidphilipcher...@gmail.com> wrote:
> Hi,
>
> Is there any contribution (open source contrib module) that routes
> documents to shards based on a document similarity technique? Or any
> suggestions that integrates mahout to solr f
Hi,
Is there any contribution (open source contrib module) that routes documents
to shards based on a document similarity technique? Or any suggestions for
integrating Mahout with Solr for this use case?
From what I know, there are currently two document routing strategies, as
explained here
https://luc
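
To make the idea in this thread concrete, here is a minimal client-side sketch
of similarity-based routing. It is not an existing Solr contrib or Mahout
integration. It assumes a collection created with the implicit router, where
the client names the target shard itself, and it assumes each shard already
has a representative "centroid" term vector. The shard names, centroid texts,
and helper functions below are all hypothetical and only illustrate the idea.

```python
import math
from collections import Counter


def term_vector(text):
    """Bag-of-words term counts; a stand-in for real analysis/tokenization."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def pick_shard(text, centroids):
    """Return the name of the shard whose centroid is most similar to text.

    Ties (including "no overlap with any centroid") go to the
    alphabetically first shard name, so the choice is deterministic.
    """
    vec = term_vector(text)
    return max(sorted(centroids), key=lambda name: cosine(vec, centroids[name]))


# Hypothetical shard names and centroids, for illustration only.
CENTROIDS = {
    "shard_sports": term_vector("football match goal league team score"),
    "shard_finance": term_vector("stock market bank interest rate bond"),
}

if __name__ == "__main__":
    doc = "the team scored a late goal in the league match"
    print(pick_shard(doc, CENTROIDS))
```

The same function would have to run at query time to decide which shards to
search, which is the logistics problem on both the indexing and query sides
raised earlier in the thread. It also gives no guarantee that shards stay
balanced or that relevant documents are not left on a shard that was skipped.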