mzzz-zzm commented on code in PR #982:
URL: https://github.com/apache/iceberg-go/pull/982#discussion_r3202177808


##########
table/snapshot_producers.go:
##########
@@ -903,9 +943,103 @@ func (sp *snapshotProducer) commit(ctx context.Context) 
(_ []Update, _ []Require
                })
        }
 
+       // Build the manifest-list rebuild closure. It is called by doCommit
+       // on each OCC retry to regenerate the manifest list so it correctly
+       // inherits all data files committed by concurrent writers since the
+       // original snapshot was built.
+       formatVersion := sp.txn.meta.formatVersion
+       snapshotID := sp.snapshotID
+       commitUUID := sp.commitUuid
+       capturedSnapshot := snapshot // copy the value so the closure is 
self-contained
+       processManifestsFn := func(m []iceberg.ManifestFile) 
([]iceberg.ManifestFile, error) {
+               return sp.processManifests(m)
+       }
+
+       rebuildFn := func(_ context.Context, freshParent *Snapshot, fio 
iceio.WriteFileIO, attempt int) (_ *Snapshot, retErr error) {
+               // Load inherited manifests from the fresh parent.
+               var inherited []iceberg.ManifestFile
+               if freshParent != nil {
+                       inherited, retErr = freshParent.Manifests(fio)
+                       if retErr != nil {
+                               return nil, fmt.Errorf("rebuild manifest list: 
load parent manifests: %w", retErr)
+                       }
+               }
+
+               // Combine own manifests with inherited ones, applying any
+               // producer-specific processing (no-op for fast/merge-append).
+               combined, procErr := 
processManifestsFn(slices.Concat(ownManifests, inherited))
+               if procErr != nil {
+                       return nil, fmt.Errorf("rebuild manifest list: process 
manifests: %w", procErr)
+               }
+
+               // Derive the sequence number. When there is a fresh parent, 
use its
+               // sequence number + 1 so the rebuilt snapshot is strictly 
greater than
+               // any committed peer. When there is no fresh parent (first 
snapshot in
+               // the table or unknown parent), preserve the original sequence 
number
+               // from the initial build.
+               var newSeq int64
+               if freshParent != nil && formatVersion >= 2 {
+                       newSeq = freshParent.SequenceNumber + 1

Review Comment:
   `newSeq` is now always derived from the table-wide `LastSequenceNumber` in 
the fresh metadata, not from the branch-local parent snapshot's 
`SequenceNumber`:
   
   ```go
   var newSeq int64
   if formatVersion >= 2 {
       newSeq = freshMeta.LastSequenceNumber() + 1
   }
   ```
   
   Both the `freshParent != nil` and `freshParent == nil` branches collapse to 
this single expression. For v1 tables, `newSeq` stays `0` per spec.
   
   `freshMeta` is threaded into the closure as a parameter (same change that 
also fixes Comments 1 and 2), so it reflects the metadata state at retry time, 
never at attempt 0.
   
   Tests added: `TestRebuildFn_SeqNumDerivedFromFreshMeta` (peer bumps 
`lastSeq` to 100 on another branch; asserts rebuilt `SequenceNumber == 101`), 
`TestRebuildFn_SeqNumV1TableIsZero` (v1 table; asserts `SequenceNumber == 0`).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to