szehon-ho commented on issue #10312:
URL: https://github.com/apache/iceberg/issues/10312#issuecomment-2261085249

   Yeah, I think @RussellSpitzer is right: we should rely on the validation error to 
prevent this scenario, i.e. T1 should not be able to commit successfully.
   
   I need to understand one thing:
   
   > When I set use-starting-sequence-number = false for rewriteDataFiles, Thread 1 failed to compact data files at t4. Stack trace:
   > 
   > Caused by: org.apache.iceberg.exceptions.ValidationException: Cannot commit, found new delete for replaced data file: GenericDataFile{content=data, file_path=/var/folders/5z/dqrlv_ts0wqf36vd39bb384h0000gn/T/junit17491575750166086656/9f77fae8-d62a-426d-971f-a342b6775c44/test_schema/test_table/data/00000-2-52ae94aa-b796-4c42-bf9c-92d36c52e522-00001.parquet, file_format=PARQUET, spec_id=0, partition=PartitionData{}, record_count=1, file_size_in_bytes=407, column_sizes=null, value_counts=org.apache.iceberg.util.SerializableMap@0, null_value_counts=org.apache.iceberg.util.SerializableMap@1, nan_value_counts=org.apache.iceberg.util.SerializableMap@0, lower_bounds=org.apache.iceberg.SerializableByteBufferMap@e1782, upper_bounds=org.apache.iceberg.SerializableByteBufferMap@e1782, key_metadata=null, split_offsets=[4], equality_ids=null, sort_order_id=null}
   >    at org.apache.iceberg.exceptions.ValidationException.check(ValidationException.java:50)
   >    at org.apache.iceberg.MergingSnapshotProducer.validateNoNewDeletesForDataFiles(MergingSnapshotProducer.java:418)
   >    at org.apache.iceberg.MergingSnapshotProducer.validateNoNewDeletesForDataFiles(MergingSnapshotProducer.java:367)
   >    at org.apache.iceberg.BaseRewriteFiles.validate(BaseRewriteFiles.java:108)
   >    at org.apache.iceberg.SnapshotProducer.apply(SnapshotProducer.java:175)
   >    at org.apache.iceberg.SnapshotProducer.lambda$commit$2(SnapshotProducer.java:296)
   >    at org.apache.iceberg.util.Tasks$Builder.runTaskWithRetry(Tasks.java:404)
   >    at org.apache.iceberg.util.Tasks$Builder.runSingleThreaded(Tasks.java:214)
   >    at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:198)
   >    at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:190)
   >    at org.apache.iceberg.SnapshotProducer.commit(SnapshotProducer.java:295)
   >    at org.apache.iceberg.actions.RewriteDataFilesCommitManager.commitFileGroups(RewriteDataFilesCommitManager.java:89)
   >    at org.apache.iceberg.actions.RewriteDataFilesCommitManager.commitOrClean(RewriteDataFilesCommitManager.java:110)
   >    at org.apache.iceberg.spark.actions.RewriteDataFilesSparkAction.doExecute(RewriteDataFilesSparkAction.java:291)
   >    ... 8 more
   
   > Is your process using use-starting-sequence-number = true?
   > I tested with use-starting-sequence-number = true and compaction failed (Apache Iceberg 1.4.3):
   > Exception in thread "main" org.apache.iceberg.exceptions.ValidationException: Cannot commit, found new delete for replaced data file: GenericDataFile ...
   
   From the above conversation, it seems we get the ValidationException in both 
code paths, don't we?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

