ForeverAngry commented on issue #2409:
URL: 
https://github.com/apache/iceberg-python/issues/2409#issuecomment-3251396079

   @QlikFrederic thanks for reporting this! I'm sorry you ran into this bug :/
   
   I believe I reproduced it and found the cause.
   
   I think the issue is here:
   
   ```python
   class ExpireSnapshots(UpdateTableMetadata["ExpireSnapshots"]):
       _snapshot_ids_to_expire: Set[int] = set()  # ❌ SHARED ACROSS ALL INSTANCES!
       _updates: Tuple[TableUpdate, ...] = ()
       _requirements: Tuple[TableRequirement, ...] = ()
   ```
   
   Here `_snapshot_ids_to_expire` is a **class-level attribute**, not an
   instance-level one, so when Thread 1 calls
   `table1.expire_snapshots().by_id(1001)` and Thread 2 calls
   `table2.expire_snapshots().by_id(2001)`, both are adding to the
   **same shared set**.
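
To make the sharing concrete, here is a minimal sketch using a hypothetical stand-in class (`SharedExpire` is not the PyIceberg class, and `_ids` is an illustrative name). It only assumes that `by_id` mutates the set in place, which is how the behavior above reads:

```python
from typing import Set


class SharedExpire:
    """Hypothetical stand-in: the set is defined on the class itself."""

    _ids: Set[int] = set()  # one set object, visible through every instance

    def by_id(self, snapshot_id: int) -> "SharedExpire":
        # `self._ids` resolves to the class attribute, so this mutates
        # the single shared set rather than a per-instance one.
        self._ids.add(snapshot_id)
        return self


a = SharedExpire().by_id(1001)  # stands in for table1
b = SharedExpire().by_id(2001)  # stands in for table2

print(a._ids is b._ids)  # True
print(sorted(b._ids))    # [1001, 2001]
```

In this sketch the IDs mix even without threads; concurrent use across tables is just where it becomes visible.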
   
   
   The fix seems trivial, I think... 🤞 I moved those attributes into the
   `__init__` method:
   
   ```python
   def __init__(self, transaction: Transaction) -> None:
       super().__init__(transaction)
       # ✅ Instance-level now - each table gets its own!
       self._snapshot_ids_to_expire: Set[int] = set()
       self._updates: Tuple[TableUpdate, ...] = ()
       self._requirements: Tuple[TableRequirement, ...] = ()
   ```
   
   I wrote tests to reproduce the bug (and got exactly the same error
   message as in your issue), applied the fix, and thread safety seemed to be
   restored! No more snapshot ID mix-ups between tables.
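
A rough sketch of what such a check could look like, again with a hypothetical stand-in (`FixedExpire`, `run`, and `results` are illustrative names, not the actual PyIceberg test code). It assumes each thread builds its own operation object, as `table.expire_snapshots()` would:

```python
import threading
from typing import Dict, List, Set


class FixedExpire:
    """Hypothetical stand-in with per-instance state set up in __init__."""

    def __init__(self) -> None:
        self._ids: Set[int] = set()  # a fresh set for every instance

    def by_id(self, snapshot_id: int) -> "FixedExpire":
        self._ids.add(snapshot_id)
        return self


def run(snapshot_ids: List[int], out: Dict[str, Set[int]], key: str) -> None:
    # Each thread creates its own operation and records what it collected.
    op = FixedExpire()
    for sid in snapshot_ids:
        op.by_id(sid)
    out[key] = set(op._ids)


results: Dict[str, Set[int]] = {}
threads = [
    threading.Thread(target=run, args=([1001, 1002], results, "table1")),
    threading.Thread(target=run, args=([2001, 2002], results, "table2")),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Each "table" only sees the snapshot IDs it asked to expire.
assert results["table1"] == {1001, 1002}
assert results["table2"] == {2001, 2002}
```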
   
   
   That being said, if I'm right, the same issue exists in the 
`ManageSnapshots` class as well.
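
One way to look for the same pattern in other classes is to list class-level attributes that are mutable containers. This is a sketch with a hypothetical helper (`mutable_class_attrs`) and a made-up `Suspect` class; it is not run against PyIceberg here:

```python
from typing import Any, Dict


def mutable_class_attrs(cls: type) -> Dict[str, Any]:
    """Return attributes defined directly on `cls` that are mutable containers."""
    return {
        name: value
        for name, value in vars(cls).items()
        # Skip dunder entries such as __annotations__, which is itself a dict.
        if not name.startswith("__") and isinstance(value, (set, list, dict))
    }


class Suspect:
    _ids: set = set()     # mutable: shared by all instances
    _updates: tuple = ()  # immutable: cannot be mutated in place


print(mutable_class_attrs(Suspect))  # {'_ids': set()}
```

Tuples are not flagged because they cannot be changed in place; assigning `self._updates = ...` creates an instance attribute instead of touching the class-level one.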
   
   I'm traveling at the moment; once I'm home, I can push a branch for you to
   test (sometime in the next 24-48 hours). Let me know what you think!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

