pvary commented on PR #10200:
URL: https://github.com/apache/iceberg/pull/10200#issuecomment-2076931744

   > After taking a closer look at `BaseTaskWriter`, I think we may have a 
correctness issue when encoding changes if the table contains multiple specs. 
Our current implementation of `BaseTaskWriter` assumes all writes are encoded 
against the current spec. However, what if there are some matching keys in 
other specs? Written deletes will be scoped to the current partition spec and 
will not apply to data in other specs, potentially failing to upsert some 
records. I think we would eventually want to migrate Flink to new writers that 
inherit `PartitioningWriter`. This is out of scope of this PR, however.
   
   I agree that we need to fix this, and I also agree that it should be done in a different PR.
   
   > I am also not sure we need `ContinuousFileScopedPositionDeleteWriter`. I 
understand we want to solve the companion issue before Flink migrates to 
`PartitioningWriter` so we have to come up with a fix for `TaskWriter`. What 
about directly using `SortingPositionOnlyDeleteWriter` with file granularity in 
`BaseEqualityDeltaWriter`? We only need to pass a closure to create new 
position delete writers and that class should already sort deletes for us on 
the fly. We never needed to persist deleted rows in position deletes so no 
behaviour change.
   
   Updated to use the `SortingPositionOnlyDeleteWriter`. I needed a small "fix" 
to prevent writing an empty delete file, but I would like to ask you to 
double-check this.
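   
   For reference, here is a minimal sketch of the wiring this implies, not the actual diff: it assumes the task writer already holds a `FileAppenderFactory`, an `OutputFileFactory`, the current partition and file format, and it uses the `SortingPositionOnlyDeleteWriter(Supplier, DeleteGranularity)` constructor. The helper class and field names below are illustrative only.
   
   ```java
   import java.io.IOException;
   import java.util.function.Supplier;
   import org.apache.iceberg.FileFormat;
   import org.apache.iceberg.StructLike;
   import org.apache.iceberg.deletes.DeleteGranularity;
   import org.apache.iceberg.deletes.PositionDelete;
   import org.apache.iceberg.deletes.SortingPositionOnlyDeleteWriter;
   import org.apache.iceberg.io.DeleteWriteResult;
   import org.apache.iceberg.io.FileAppenderFactory;
   import org.apache.iceberg.io.FileWriter;
   import org.apache.iceberg.io.OutputFileFactory;
   
   // Illustrative sketch, not the PR's code: shows how file-scoped position deletes
   // could be produced through SortingPositionOnlyDeleteWriter using a closure.
   class FileScopedDeleteSketch<T> {
     private final FileAppenderFactory<T> appenderFactory;
     private final OutputFileFactory fileFactory;
     private final FileFormat format;
     private final StructLike partition;
   
     FileScopedDeleteSketch(
         FileAppenderFactory<T> appenderFactory,
         OutputFileFactory fileFactory,
         FileFormat format,
         StructLike partition) {
       this.appenderFactory = appenderFactory;
       this.fileFactory = fileFactory;
       this.format = format;
       this.partition = partition;
     }
   
     // Sorts and writes the buffered position deletes with FILE granularity and
     // returns the collected DeleteWriteResult.
     DeleteWriteResult writeDeletes(Iterable<PositionDelete<T>> bufferedDeletes) throws IOException {
       // Closure handed to the sorting writer so it can open fresh delete files on demand.
       Supplier<FileWriter<PositionDelete<T>, DeleteWriteResult>> newDeleteWriter =
           () -> appenderFactory.newPosDeleteWriter(
               fileFactory.newOutputFile(partition), format, partition);
   
       SortingPositionOnlyDeleteWriter<T> writer =
           new SortingPositionOnlyDeleteWriter<>(newDeleteWriter, DeleteGranularity.FILE);
   
       try {
         for (PositionDelete<T> delete : bufferedDeletes) {
           writer.write(delete);
         }
       } finally {
         // Deletes are sorted and flushed when the writer is closed.
         writer.close();
       }
   
       // Callers should only register delete files from non-empty results
       // (the "empty delete file" concern mentioned above).
       return writer.result();
     }
   }
   ```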
   
   There is also a behavioural change: the previous writer rolled delete files at 
the `DEFAULT_RECORDS_NUM_THRESHOLD`=100_000L threshold (no config, just a 
hard-coded value). We can't do this with the new writer, but I think it is highly 
unlikely that anyone has hit this threshold in a production scenario. If we need 
this feature back, we either need to use the `SortedPosDeleteWriter`, or we can 
provide a similar feature via `RollingFileWriter`. The issue is that 
`RollingFileWriter` uses a size threshold instead of a record-count threshold, 
and we would need to rewrite/enhance `RollingFileWriter` to use a closure to 
create a writer instead of the factory (as it does today).
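   
   To make the behaviour change concrete, the sketch below shows roughly what rolling by record count means here: after every 100_000 buffered deletes the current delete file is closed and a new one is opened through the supplied closure. The wrapper class and its names are hypothetical, not Iceberg code.
   
   ```java
   import java.io.IOException;
   import java.util.ArrayList;
   import java.util.List;
   import java.util.function.Supplier;
   import org.apache.iceberg.DeleteFile;
   import org.apache.iceberg.deletes.PositionDelete;
   import org.apache.iceberg.io.DeleteWriteResult;
   import org.apache.iceberg.io.FileWriter;
   
   // Hypothetical wrapper, for illustration only: rolls the underlying position
   // delete writer after a fixed number of records, like the old hard-coded behaviour.
   class RecordCountRollingDeleteSketch<T> {
     private static final long RECORDS_NUM_THRESHOLD = 100_000L; // old hard-coded default
   
     private final Supplier<FileWriter<PositionDelete<T>, DeleteWriteResult>> newWriter;
     private final List<DeleteFile> completedFiles = new ArrayList<>();
     private FileWriter<PositionDelete<T>, DeleteWriteResult> current = null;
     private long recordCount = 0;
   
     RecordCountRollingDeleteSketch(
         Supplier<FileWriter<PositionDelete<T>, DeleteWriteResult>> newWriter) {
       this.newWriter = newWriter;
     }
   
     void write(PositionDelete<T> delete) throws IOException {
       if (current == null) {
         current = newWriter.get();
         recordCount = 0;
       }
   
       current.write(delete);
       recordCount++;
   
       // Roll to a fresh delete file once the record-count threshold is reached.
       if (recordCount >= RECORDS_NUM_THRESHOLD) {
         closeCurrent();
       }
     }
   
     // Finishes the in-progress file and returns every delete file produced so far.
     List<DeleteFile> complete() throws IOException {
       closeCurrent();
       return completedFiles;
     }
   
     private void closeCurrent() throws IOException {
       if (current != null) {
         current.close();
         completedFiles.addAll(current.result().deleteFiles());
         current = null;
       }
     }
   }
   ```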
   
   I see 3 options:
   1. Accept that there is no longer a 100_000L record-count threshold for 
rolling positional delete files
   2. Use the old `SortedPosDeleteWriter` for PARTITION granularity. FILE 
granularity is a new feature and most probably will not create big files (data 
files tend to be bigger than delete files, and when a data file is rolled, the 
delete file rolls automatically), so the behaviour change would affect only very 
rare edge cases
   3. Invest in implementing the `RollingFileWriter` changes
   
   I would vote for the 1st option, and would accept the 2nd if that is the 
consensus. I do not think the 3rd is worth the investment.
   
   What do you think @stevenzwu, @aokolnychyi?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

