github-actions[bot] commented on PR #9323:
URL: https://github.com/apache/iceberg/pull/9323#issuecomment-2623222682
This pull request has been closed due to lack of activity. This is not a
judgement on the merit of the PR in any way. It is just a way of keeping the PR
queue manageable. If y
github-actions[bot] closed pull request #9323: API: New API For sequential /
streaming updates
URL: https://github.com/apache/iceberg/pull/9323
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the sp
github-actions[bot] commented on PR #9323:
URL: https://github.com/apache/iceberg/pull/9323#issuecomment-2594192610
This pull request has been marked as stale due to 30 days of inactivity. It
will be closed in 1 week if no further activity occurs. If you think that’s
incorrect or this pull
jasonf20 commented on PR #9323:
URL: https://github.com/apache/iceberg/pull/9323#issuecomment-2250437040
I rebased this PR again so it's mergeable. Would appreciate reviving this PR.
@rdblue Could you please have a look or let me know if there is someone else
who could help?
jasonf20 commented on PR #9323:
URL: https://github.com/apache/iceberg/pull/9323#issuecomment-1944914307
@rdblue Based on our discussions, could you have another look at this?
rdblue commented on code in PR #9323:
URL: https://github.com/apache/iceberg/pull/9323#discussion_r1477125444
##
core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java:
##
@@ -221,34 +223,52 @@ protected boolean addsDeleteFiles() {
/** Add a data file to the new s
rdblue commented on PR #9323:
URL: https://github.com/apache/iceberg/pull/9323#issuecomment-1906444781
@jasonf20, explicitly setting the sequence number isn't safe. Sequence
numbers are assigned when the client attempts to commit and must be updated if
the client has to retry. You could mak
jasonf20 commented on PR #9323:
URL: https://github.com/apache/iceberg/pull/9323#issuecomment-1876954557
@rdblue Sure. I added support for setting the sequence number explicitly per
file in `MergingSnapshotProducer`. This was almost supported already (it didn't
support per file level for ad
rdblue commented on PR #9323:
URL: https://github.com/apache/iceberg/pull/9323#issuecomment-1875787341
@jasonf20, to make that work, I think you'd need to keep track of a base
sequence number and update the metadata for each new manifest with the correct
sequence number when the manifest li
jasonf20 commented on PR #9323:
URL: https://github.com/apache/iceberg/pull/9323#issuecomment-1863068142
@rdblue Correct, we need multiple (new) sequence numbers since each
batch has deletes that need to apply to prior batches, but not newer batches.
Committing more than once wo
rdblue commented on PR #9323:
URL: https://github.com/apache/iceberg/pull/9323#issuecomment-1861669693
@jasonf20, I don't quite understand the use case. It looks like the purpose
is to commit multiple batches of data at the same time. Why would you not just
use a single operation? Do you ne
jasonf20 commented on PR #9323:
URL: https://github.com/apache/iceberg/pull/9323#issuecomment-1859197491
**Benchmark**
The following test was run locally just to demonstrate that the difference
in IO performance is very significant. While the transaction approach IO grows
linearly with t
jasonf20 opened a new pull request, #9323:
URL: https://github.com/apache/iceberg/pull/9323
**Explanation**
Certain data production patterns can result in a bunch of micro-batch
updates that need to be applied to the table sequentially. If these batches
include updates they need to be c