tibrewalpratik17 opened a new pull request, #14506:
URL: https://github.com/apache/pinot/pull/14506

   After https://github.com/apache/pinot/pull/14406, deepstore backups now 
succeed, but we see a lot of UpsertCompactionTask failures with the 
following error:
   
   ```
   java.lang.IllegalStateException: Crc mismatched between ZK and deepstore copy of segment: rta_cadence_visibility_gs_production__0__641__20240923T2154Z. Expected crc from ZK: 2934158065, crc from deepstore: 2050894616
        at org.apache.pinot.plugin.minion.tasks.upsertcompaction.UpsertCompactionTaskExecutor.convert(UpsertCompactionTaskExecutor.java:69)
        at org.apache.pinot.plugin.minion.tasks.BaseSingleSegmentConversionExecutor.executeTask(BaseSingleSegmentConversionExecutor.java:132)
        at org.apache.pinot.plugin.minion.tasks.BaseSingleSegmentConversionExecutor.executeTask(BaseSingleSegmentConversionExecutor.java:60)
        at org.apache.pinot.minion.taskfactory.TaskFactoryRegistry$1.runInternal(TaskFactoryRegistry.java:157)
        at org.apache.pinot.minion.taskfactory.TaskFactoryRegistry$1.run(TaskFactoryRegistry.java:118)
        at org.apache.helix.task.TaskRunner.run(TaskRunner.java:75)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
        at java.base/java.util.concurrent
   ```
   
   It seems that the deepstore-upload retry task can take a backup from an 
arbitrary replica, not necessarily the one whose CRC matches the value in 
ZK. This patch fixes that by allowing the deepstore upload only from a 
replica whose CRC matches the CRC in the ZK metadata. If no such replica is 
found, we still fall back to taking the deepstore backup from an arbitrary 
replica.
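
   The selection rule described above can be sketched roughly as follows. 
This is an illustration only, not the code in this patch: the class 
`ReplicaSelector`, the method `selectUploadReplica`, and the map of server 
name to local segment CRC are all hypothetical names chosen for the example.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical helper for illustration; not part of Pinot's API.
final class ReplicaSelector {
  private ReplicaSelector() {
  }

  /**
   * Picks the replica to upload to deepstore from.
   *
   * @param zkCrc       CRC stored in the segment's ZK metadata
   * @param replicaCrcs server name -> CRC of that server's local segment copy
   * @return a replica whose CRC matches ZK if one exists, otherwise the first
   *         replica in iteration order, or empty if there are no replicas
   */
  static Optional<String> selectUploadReplica(long zkCrc, Map<String, Long> replicaCrcs) {
    String fallback = null;
    for (Map.Entry<String, Long> entry : replicaCrcs.entrySet()) {
      Long crc = entry.getValue();
      if (crc != null && crc.longValue() == zkCrc) {
        // Preferred case: this copy agrees with the ZK metadata.
        return Optional.of(entry.getKey());
      }
      if (fallback == null) {
        // Remember the first replica in case no CRC matches.
        fallback = entry.getKey();
      }
    }
    return Optional.ofNullable(fallback);
  }
}
```

   With the CRCs from the error above, a replica reporting 2934158065 would 
be preferred over one reporting 2050894616. The fallback branch mirrors the 
"take the backup anyway" behavior, so a later CRC check against ZK can still 
fail for segments where no replica matches.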
   
   One likely cause of divergent CRCs across replicas is the use of text 
indexes. This has been discussed in multiple issues: 
https://github.com/apache/pinot/issues/13491, 
https://github.com/apache/pinot/issues/11004


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

