lirui-apache commented on issue #10006: URL: https://github.com/apache/iceberg/issues/10006#issuecomment-2011212269
@szehon-ho @pvary I'm using this test case to demonstrate the issue: ```java final int parallel = 2; ThreadPoolExecutor pool = (ThreadPoolExecutor) Executors.newFixedThreadPool( parallel, new ThreadFactoryBuilder().setDaemon(true).setNameFormat("committer-%d").build()); pool.prestartAllCoreThreads(); try { List<Future<Void>> futures = Lists.newArrayListWithCapacity(parallel); futures.add(pool.submit(() -> { Table table = catalog.loadTable(tableId); table.updateProperties().set(HIVE_LOCK_ENABLED, "false").commit(); return null; })); futures.add(pool.submit(() -> { Table table = catalog.loadTable(tableId); table.updateProperties().set(GC_ENABLED, "false").commit(); return null; })); for(Future<Void> future : futures) { future.get(); } Table table = catalog.loadTable(tableId); assertEquals("false", table.properties().get(HIVE_LOCK_ENABLED)); assertEquals("false", table.properties().get(GC_ENABLED)); } ``` There's a chance the test fails because HIVE_LOCK_ENABLED was not set. BTW the test connects to a local standalone HMS that has the required changes in Hive. Let's call the threads setting HIVE_LOCK_ENABLED and GC_ENABLED `thread-1` and `thread-2` respectively. I think the issue happens like this: 1. `thread-1` decides to use NoLock because its new table properties have `(HIVE_LOCK_ENABLED, "false")` 2. `thread-2` decides to use MetastoreLock 3. `thread-2` acquires the HMS lock and verifies the metadata location has not been changed 4. `thread-1` made the commit via the no-lock path 5. `thread-2` made the commit, and the change made by `thread-1` is lost So my question is should we use the old table metadata (i.e. the `base`) to decide which lock to use? So that both threads will use MetastoreLock in the above example. Or do you think this use case is a known limitation and should be avoided in the first place? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org