lirui-apache commented on issue #10006:
URL: https://github.com/apache/iceberg/issues/10006#issuecomment-2011212269

   @szehon-ho  @pvary I'm using this test case to demonstrate the issue:
   ```java
       final int parallel = 2;
       ThreadPoolExecutor pool = (ThreadPoolExecutor) 
Executors.newFixedThreadPool(
               parallel,
               new 
ThreadFactoryBuilder().setDaemon(true).setNameFormat("committer-%d").build());
       pool.prestartAllCoreThreads();
       try {
         List<Future<Void>> futures = Lists.newArrayListWithCapacity(parallel);
         futures.add(pool.submit(() -> {
           Table table = catalog.loadTable(tableId);
           table.updateProperties().set(HIVE_LOCK_ENABLED, "false").commit();
           return null;
         }));
         futures.add(pool.submit(() -> {
           Table table = catalog.loadTable(tableId);
           table.updateProperties().set(GC_ENABLED, "false").commit();
           return null;
         }));
         for(Future<Void> future : futures) {
           future.get();
         }
         Table table = catalog.loadTable(tableId);
         assertEquals("false", table.properties().get(HIVE_LOCK_ENABLED));
         assertEquals("false", table.properties().get(GC_ENABLED));
       }
   ```
   There's a chance the test fails because HIVE_LOCK_ENABLED was not set. BTW 
the test connects to a local standalone HMS that has the required changes in 
Hive.
   
   Let's call the threads setting HIVE_LOCK_ENABLED and GC_ENABLED `thread-1` 
and `thread-2` respectively. I think the issue happens like this:
   
   1. `thread-1` decides to use NoLock because its new table properties have 
`(HIVE_LOCK_ENABLED, "false")`
   2. `thread-2` decides to use MetastoreLock
   3. `thread-2` acquires the HMS lock and verifies the metadata location has 
not been changed
   4. `thread-1` made the commit via the no-lock path
   5. `thread-2` made the commit, and the change made by `thread-1` is lost
   
   So my question is should we use the old table metadata (i.e. the `base`) to 
decide which lock to use? So that both threads will use MetastoreLock in the 
above example. Or do you think this use case is a known limitation and should 
be avoided in the first place?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Reply via email to