Is there any difference in the relevancy score between a document that was added directly to an index and the same document that got into the index through a merge?

In other words, I'd like to build my index in pieces (since people in different cities will be working on parts of it), but I want the search results to be as if it were one index.

My first thought was to keep the indexes separate and use multicore shards to search both indexes. I decided against that because of two things:

1) It is slower.
2) The relevancy scores are wrong, since word frequencies differ a lot between the two indexes.
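
To show what I mean by point 2, here is a toy sketch in Python (not Solr code). It assumes the classic Lucene-style idf formula, 1 + ln(numDocs / (docFreq + 1)); the document counts and the idea of a single term being common in one index and rare in the other are made-up numbers for illustration only.

```python
import math

def idf(num_docs, doc_freq):
    # Assumed classic Lucene-style idf: 1 + ln(numDocs / (docFreq + 1)).
    return 1.0 + math.log(num_docs / (doc_freq + 1))

# Hypothetical statistics for one term: (total docs, docs containing the term).
index_a = (1000, 200)  # the term is common in the first index
index_b = (1000, 4)    # the term is rare in the second index

idf_a = idf(*index_a)
idf_b = idf(*index_b)

# A single combined index would see the summed statistics.
idf_combined = idf(index_a[0] + index_b[0], index_a[1] + index_b[1])

print(f"idf in index A:        {idf_a:.3f}")
print(f"idf in index B:        {idf_b:.3f}")
print(f"idf in combined index: {idf_combined:.3f}")
```

The same term gets a different weight in each separate index, and neither matches the weight it would get if all the documents lived in one index. That is the mismatch I am worried about.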

My second thought is to have the people work on separate indexes, and merge them together just before going to production. That would definitely solve the first problem, but I don't know if it solves the second.

I also don't know how to test that myself. I want to build my index both ways then do a search and compare the results, but how decisive that is depends on the particular words I use in the search. Is there a way to dump everything about a particular document, so I could compare the two indexes? Are there other tools available that would help?
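
For the comparison itself, I was picturing something like the sketch below. It is only an illustration: `compare_results` is a hypothetical helper of my own, and it assumes I have already pulled the ranked (doc id, score) pairs for the same query out of each index by some other means.

```python
def compare_results(results_a, results_b, tolerance=1e-6):
    """Return entries whose rank or score differs between two result lists.

    Each argument is a ranked list of (doc_id, score) pairs for one query.
    Each returned entry is (doc_id, (rank, score) or None, (rank, score) or None).
    """
    by_id_a = {doc_id: (rank, score) for rank, (doc_id, score) in enumerate(results_a)}
    by_id_b = {doc_id: (rank, score) for rank, (doc_id, score) in enumerate(results_b)}

    diffs = []
    for doc_id in sorted(set(by_id_a) | set(by_id_b)):
        a = by_id_a.get(doc_id)
        b = by_id_b.get(doc_id)
        if a is None or b is None:
            # Document appears in only one of the two result lists.
            diffs.append((doc_id, a, b))
        elif a[0] != b[0] or abs(a[1] - b[1]) > tolerance:
            # Same document, but its rank or score changed.
            diffs.append((doc_id, a, b))
    return diffs

# Made-up results for one query against the directly built and the merged index.
direct = [("doc1", 2.5), ("doc2", 1.0)]
merged = [("doc1", 2.5), ("doc2", 0.8)]
print(compare_results(direct, merged))
```

Running this over a large batch of queries would make the test less dependent on which words I happen to pick, but it still would not tell me why a score changed.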

Thanks for any insight.
