Re: Merging results from Shards - relevancy and performance

2012-01-01 Thread Erick Erickson
1> Yes. Note that distributed tf/idf is an issue: if your documents are statistically very different across shards, the scores aren't really comparable. This is changing, but I don't think it's committed yet. 2> Well, you're mixing apples and orang
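
To make the tf/idf point in 1> concrete, here is a minimal sketch of why per-shard scores can drift apart. It assumes the classic Lucene-style formula idf = 1 + ln(numDocs / (docFreq + 1)); the shard names and counts are made up for illustration and are not taken from any real index.

```python
import math

def idf(num_docs, doc_freq):
    # Classic Lucene-style inverse document frequency (assumed here for illustration).
    return 1.0 + math.log(num_docs / (doc_freq + 1))

# Hypothetical shards: the same term is rare in shard_a and common in shard_b.
shards = {
    "shard_a": {"num_docs": 1000, "doc_freq": 9},
    "shard_b": {"num_docs": 1000, "doc_freq": 499},
}

# Each shard scores with its own local statistics.
local_idf = {name: idf(s["num_docs"], s["doc_freq"]) for name, s in shards.items()}

# What a single, unsharded index would use for the same term.
total_docs = sum(s["num_docs"] for s in shards.values())
total_df = sum(s["doc_freq"] for s in shards.values())
global_idf = idf(total_docs, total_df)

for name, value in local_idf.items():
    print(f"{name}: local idf = {value:.3f}")
print(f"global idf = {global_idf:.3f}")
```

With these numbers, a match on shard_a is weighted well above the global value and a match on shard_b below it, so merging by raw score favors shard_a. When documents are spread evenly across shards, the local values converge toward the global one and the effect mostly disappears.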

Merging results from Shards - relevancy and performance

2012-01-01 Thread shlomi java
Hello, 1) When distributing search across several Shards, does the merged result reflect the overall ranking across Shards? I'm talking about stuff like "document frequency". I guess it does, otherwise distributed search wouldn't have overhead. Talking about overhead, 2) is there a known ratio of t