atris commented on PR #10702:
URL: https://github.com/apache/pinot/pull/10702#issuecomment-1528974176
> > This won't work. We need to keep 2 bitmaps, one for valid docs and one for queryable docs. If we only keep valid docs, and count delete as removing it from the valid doc, the next event won't work as expected because we lost track of the delete event.
> >
> > This is precisely what we realized during our previous attempt (#10035). Your solution isn't any different from what was attempted before.
>
> There are many subtle changes from the last PR -- and this actually works for most cases except the snapshot load, which is easy to fix using the queryable map. The last PR was unable to handle out of order records and had no concept of denying partial updates on deleted records but allowing new records with the same primary key to be updated. To answer your specific question, we are *not* losing the deleted event in this PR. That is clearly demonstrated in the unit tests. For further details, please compare the two PRs.

> @atris : You had reached out to me via DM (on slack) to collaborate and agreed to review the existing design. Yet, you went ahead and did your own work. This is duplicated effort and not setting the right standard for committers of the Apache Pinot project.

Thank you for highlighting this. I do not wish to go into details of our 1:1 conversation here -- but my understanding was that:

1. You were planning to work on this after May 10th.
2. You mentioned that it would be hard for us to collaborate since we are in different timezones.
3. I had mentioned that I have a patch I have been playing with -- and have a hard timeline.

In all honesty, I feel it is against the spirit of OSS if we "block" issues that we have not yet written code for. A good example is the issue on adding support for GROUP BY for filter aggregations -- where the issue was assigned to me, and I had a half-baked PR that I did not get around to finishing. Since Evan posted a PR, I was happy to let my patch take a step back.
In this case, I finished my PR only because you mentioned that you would work on it after May 10th and would find it hard to collaborate with me -- which led me to think that our internal timelines might not match.

If you have any further comments, I would be happy to talk 1:1. Also, please feel free to review the PR and leave any feedback -- I would be happy to address it.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org