bq: I have a sharded index. When I re-index a document (vs new index, which is different process), I need to delete the old one first to avoid dup
No, you do not need to issue the delete in a sharded collection _assuming_ that the doc has the same <uniqueKey>. Why do you think you do? If it's in some doc somewhere we need to fix it.

Docs are routed by a hash on the <uniqueKey> in the default case. So since it goes to the same shard, the fact that it's a new version will be detected and it'll replace the old version.

Are you seeing anything different?

Best,
Erick

On Wed, Sep 2, 2015 at 1:24 PM, Renee Sun <renee_...@mcafee.com> wrote:
> Shawn,
> thanks for the reply.
>
> I have a sharded index. When I re-index a document (vs new index, which is
> different process), I need to delete the old one first to avoid dup. We all
> know that if there is only one core, the newly added document will replace
> the old one, but with multiple core indexes, we will have to issue delete
> command first to ALL shards since we do NOT know/remember which core the old
> document was indexed to ...
>
> I also wanted to know if there is a better way handling this efficiently.
>
> Anyways, we are sending delete to all cores of this customer, one of them
> hit, others did not.
>
> But consequently, when I need to decide about commit, I do NOT want blindly
> commit to all cores, I want to know which one actually had the old doc so I
> only send commit to that core.
>
> I could alternatively use query first and skip if it did not hit, but delete
> if it does, and I can't short circuit since we have dups :-( based on a
> historical reason.
>
> any suggestion how to make this more efficiently?
>
> thanks!
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/is-there-any-way-to-tell-delete-by-query-actually-deleted-anything-tp4226776p4226788.html
> Sent from the Solr - User mailing list archive at Nabble.com.
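
To make the routing argument in the reply concrete, here is a toy sketch of why re-adding a doc with the same <uniqueKey> replaces the old copy instead of creating a dup. This is not Solr code: the ToyShardedIndex class, its method names, and the CRC32-modulo hash are all made up for illustration (Solr's default router uses its own hash function and per-shard hash ranges). The only point it demonstrates is that a deterministic hash of the key always lands on the same shard, where the newer version overwrites the older one.

```python
import zlib


class ToyShardedIndex:
    """Hypothetical stand-in for a sharded collection; NOT Solr's implementation."""

    def __init__(self, num_shards):
        # Each shard is modelled as a dict keyed by the document's unique key.
        self.shards = [dict() for _ in range(num_shards)]

    def shard_for(self, unique_key):
        # Assumption: CRC32 modulo shard count stands in for the real router.
        # What matters is that the same key always maps to the same shard.
        return zlib.crc32(unique_key.encode("utf-8")) % len(self.shards)

    def add(self, doc):
        # Adding a doc whose key already exists on its shard replaces it.
        shard = self.shards[self.shard_for(doc["id"])]
        shard[doc["id"]] = doc

    def copies_of(self, unique_key):
        # Count how many shards hold a doc with this key.
        return sum(1 for shard in self.shards if unique_key in shard)


index = ToyShardedIndex(num_shards=4)
index.add({"id": "doc-1", "title": "old version"})
index.add({"id": "doc-1", "title": "new version"})  # re-index, no delete first

target = index.shards[index.shard_for("doc-1")]
print(index.copies_of("doc-1"), target["doc-1"]["title"])
```

Note the sketch only covers the case the reply describes, where the key is the same and routing is by hash of that key. If docs were placed on cores by some other means (as the quoted message suggests may have happened historically), the old copy could sit on a different core and would not be replaced.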