bq: I have a sharded index. When I re-index a document (vs new index, which is
different process), I need to delete the old one first to avoid dup

No, you do not need to issue the delete in a sharded collection,
_assuming_ that the doc has the same <uniqueKey>. Why
do you think you do? If some documentation says so, we need
to fix it.

Docs are routed by a hash on the <uniqueKey> in the default
case. Since the re-indexed doc goes to the same shard, the fact
that it's a new version will be detected and it'll replace the old version.
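
To make the routing argument concrete, here's a toy sketch in Python. It is
not Solr code: the shard count, the shard_for() and add_doc() helpers, and
the use of CRC32 as the hash are all assumptions made up for illustration
(Solr's router uses its own hash function). The only point is that a
deterministic hash of the <uniqueKey> always picks the same shard, so a
re-added doc overwrites the old one there.

```python
import zlib

NUM_SHARDS = 4  # hypothetical shard count, chosen for illustration

def shard_for(unique_key: str, num_shards: int = NUM_SHARDS) -> int:
    """Pick a shard from a deterministic hash of the uniqueKey.

    CRC32 is a stand-in here; it is NOT the hash Solr actually uses.
    """
    return zlib.crc32(unique_key.encode("utf-8")) % num_shards

# Model each shard as a dict keyed by uniqueKey.
shards = [dict() for _ in range(NUM_SHARDS)]

def add_doc(doc: dict) -> int:
    """Route the doc by its id and store it; return the shard index used."""
    target = shard_for(doc["id"])
    # Same key on the same shard: the new version replaces the old one.
    shards[target][doc["id"]] = doc
    return target

# Index a doc, then re-index it with the same id and new content.
first = add_doc({"id": "doc-42", "title": "old version"})
second = add_doc({"id": "doc-42", "title": "new version"})

# Count docs across all shards to check that no duplicate was created.
total = sum(len(s) for s in shards)
```

Both calls land on the same shard because the id is the same, and only one
copy exists afterwards. This breaks down if the id changes between versions
or if docs were placed on cores by some other scheme.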

Are you seeing anything different?

Best,
Erick

On Wed, Sep 2, 2015 at 1:24 PM, Renee Sun <renee_...@mcafee.com> wrote:
> Shawn,
> thanks for the reply.
>
> I have a sharded index. When I re-index a document (vs new index, which is
> different process), I need to delete the old one first to avoid dup. We all
> know that if there is only one core, the newly added document will replace
> the old one, but with multiple core indexes, we will have to issue delete
> command first to ALL shards since we do NOT know/remember which core the old
> document was indexed to ...
>
> I also wanted to know if there is a better way handling this efficiently.
>
> Anyways, we are sending delete to all cores of this customer; one of them
> hit, others did not.
>
> But consequently, when I need to decide about commit, I do NOT want blindly
> commit to all cores, I want to know which one actually had the old doc so I
> only send commit to that core.
>
> I could alternatively use query first and skip if it did not hit, but delete
> if it does, and I can't short circuit since we have dups :-( based on a
> historical reason.
>
> any suggestion how to make this more efficient?
>
> thanks!
>
>
>
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/is-there-any-way-to-tell-delete-by-query-actually-deleted-anything-tp4226776p4226788.html
> Sent from the Solr - User mailing list archive at Nabble.com.
