[I] Cannot get the correct UGI; always get the initial UGI in the Spark master (after updating the UGI in the Spark master) [iceberg]

via GitHub Sun, 21 Sep 2025 20:08:39 -0700


Yao-MR opened a new issue, #14146:
URL: https://github.com/apache/iceberg/issues/14146


   ### Apache Iceberg version
   
   None
   
   ### Query engine
   
   None
   
   ### Please describe the bug 🐞
   
   2025-09-22 10:16:49,787 [INFO] [SparkSQLSessionManager-exec-pool: 
Thread-489] Created broadcast 13 from broadcast at SparkBatch.java:79 
(org.apache.spark.SparkContext(org.apache.spark.internal.Logging.logInfo:60))
   2025-09-22 10:16:49,796 [WARN] [iceberg-worker-pool-64] Exception 
encountered while connecting to the server  
(org.apache.hadoop.ipc.Client(org.apache.hadoop.ipc.Client$Connection$1.run:787))
   org.apache.hadoop.ipc.RemoteException: token (token for hadoop: 
HDFS_DELEGATION_TOKEN owner=hadoop, renewer=hadoop, 
realUser=hadoop/*****-{$IP}@*****-9ZHNN93W, issueDate=1758251993978, 
maxDate=1758856793978, sequenceNumber=64812, masterKeyId=108) can't be found in 
cache
        at 
org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:495) 
~[hadoop-client-api-3.2.2-*****-5.3.1.3.jar:?]
        at 
org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:636) 
~[hadoop-client-api-3.2.2-*****-5.3.1.3.jar:?]
        at org.apache.hadoop.ipc.Client$Connection.access$2300(Client.java:420) 
~[hadoop-client-api-3.2.2-*****-5.3.1.3.jar:?]
        at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:838) 
~[hadoop-client-api-3.2.2-*****-5.3.1.3.jar:?]
        at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:834) 
~[hadoop-client-api-3.2.2-*****-5.3.1.3.jar:?]
        at java.security.AccessController.doPrivileged(Native Method) ~[?:?]
        at javax.security.auth.Subject.doAs(Subject.java:423) ~[?:?]
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:2065)
 ~[hadoop-client-api-3.2.2-*****-5.3.1.3.jar:?]
        at 
org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:834) 
~[hadoop-client-api-3.2.2-*****-5.3.1.3.jar:?]
        at org.apache.hadoop.ipc.Client$Connection.access$3900(Client.java:420) 
~[hadoop-client-api-3.2.2-*****-5.3.1.3.jar:?]
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:1682) 
~[hadoop-client-api-3.2.2-*****-5.3.1.3.jar:?]
        at org.apache.hadoop.ipc.Client.call(Client.java:1498) 
~[hadoop-client-api-3.2.2-*****-5.3.1.3.jar:?]
        at org.apache.hadoop.ipc.Client.call(Client.java:1451) 
~[hadoop-client-api-3.2.2-*****-5.3.1.3.jar:?]
        at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
 ~[hadoop-client-api-3.2.2-*****-5.3.1.3.jar:?]
        at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
 ~[hadoop-client-api-3.2.2-*****-5.3.1.3.jar:?]
        at com.sun.proxy.$Proxy28.getBlockLocations(Unknown Source) ~[?:?]
        at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:232)
 ~[hadoop-client-api-3.2.2-*****-5.3.1.3.jar:?]
        at jdk.internal.reflect.GeneratedMethodAccessor63.invoke(Unknown 
Source) ~[?:?]
        at 
jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[?:?]
        at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?]
        at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:434)
 ~[hadoop-client-api-3.2.2-*****-5.3.1.3.jar:?]
        at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:166)
 ~[hadoop-client-api-3.2.2-*****-5.3.1.3.jar:?]
        at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:158)
 ~[hadoop-client-api-3.2.2-*****-5.3.1.3.jar:?]
        at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:96)
 ~[hadoop-client-api-3.2.2-*****-5.3.1.3.jar:?]
        at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:362)
 ~[hadoop-client-api-3.2.2-*****-5.3.1.3.jar:?]
        at com.sun.proxy.$Proxy29.getBlockLocations(Unknown Source) ~[?:?]
        at 
org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:973) 
~[hadoop-client-api-3.2.2-*****-5.3.1.3.jar:?]
        at 
org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:962) 
~[hadoop-client-api-3.2.2-*****-5.3.1.3.jar:?]
        at 
org.apache.hadoop.hdfs.DFSClient.getBlockLocations(DFSClient.java:1021) 
~[hadoop-client-api-3.2.2-*****-5.3.1.3.jar:?]
        at 
org.apache.hadoop.hdfs.DistributedFileSystem$2.doCall(DistributedFileSystem.java:297)
 ~[hadoop-client-api-3.2.2-*****-5.3.1.3.jar:?]
        at 
org.apache.hadoop.hdfs.DistributedFileSystem$2.doCall(DistributedFileSystem.java:294)
 ~[hadoop-client-api-3.2.2-*****-5.3.1.3.jar:?]
        at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
 ~[hadoop-client-api-3.2.2-*****-5.3.1.3.jar:?]
        at 
org.apache.hadoop.hdfs.DistributedFileSystem.getFileBlockLocations(DistributedFileSystem.java:304)
 ~[hadoop-client-api-3.2.2-*****-5.3.1.3.jar:?]
        at 
org.apache.iceberg.hadoop.HadoopInputFile.getBlockLocations(HadoopInputFile.java:210)
 ~[iceberg-spark-runtime-3.5_2.12-1.6.1-*****-5.3.1_2025p4-SNAPSHOT.jar:?]
        at org.apache.iceberg.hadoop.Util.blockLocations(Util.java:111) 
~[iceberg-spark-runtime-3.5_2.12-1.6.1-*****-5.3.1_2025p4-SNAPSHOT.jar:?]
        at org.apache.iceberg.hadoop.Util.blockLocations(Util.java:84) 
~[iceberg-spark-runtime-3.5_2.12-1.6.1-*****-5.3.1_2025p4-SNAPSHOT.jar:?]
        at 
org.apache.iceberg.spark.source.SparkPlanningUtil.lambda$fetchBlockLocations$0(SparkPlanningUtil.java:49)
 ~[iceberg-spark-runtime-3.5_2.12-1.6.1-*****-5.3.1_2025p4-SNAPSHOT.jar:?]
        at 
org.apache.iceberg.util.Tasks$Builder.runTaskWithRetry(Tasks.java:413) 
~[iceberg-spark-runtime-3.5_2.12-1.6.1-*****-5.3.1_2025p4-SNAPSHOT.jar:?]
        at org.apache.iceberg.util.Tasks$Builder.access$300(Tasks.java:69) 
~[iceberg-spark-runtime-3.5_2.12-1.6.1-*****-5.3.1_2025p4-SNAPSHOT.jar:?]
        at org.apache.iceberg.util.Tasks$Builder$1.run(Tasks.java:315) 
~[iceberg-spark-runtime-3.5_2.12-1.6.1-*****-5.3.1_2025p4-SNAPSHOT.jar:?]
        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) ~[?:?]
        at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) 
~[?:?]
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) 
~[?:?]
        at java.lang.Thread.run(Thread.java:829) ~[?:?]
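
The warning above is raised on thread `iceberg-worker-pool-64`, while the broadcast line before it comes from `SparkSQLSessionManager-exec-pool: Thread-489`. One possible explanation, which this report does not confirm, is that a long-lived pool thread keeps whatever identity was in effect when the thread was created. The sketch below illustrates only that general Java behavior, using `InheritableThreadLocal` as a stand-in. `PooledIdentityDemo`, `IDENTITY`, and the string values are hypothetical; the code does not use Hadoop's `UserGroupInformation` or Iceberg's worker pool.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PooledIdentityDemo {
    // Hypothetical stand-in for a per-caller identity (NOT Hadoop's UGI).
    static final InheritableThreadLocal<String> IDENTITY = new InheritableThreadLocal<>();

    // Returns {value seen by 1st pooled task, value seen by 2nd pooled task, caller's value}.
    static String[] run() {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        Callable<String> read = () -> IDENTITY.get();
        try {
            IDENTITY.set("initial");
            // The single worker thread is created here and copies "initial".
            String first = pool.submit(read).get();

            // The caller switches identity; the existing worker is reused as-is.
            IDENTITY.set("refreshed");
            String second = pool.submit(read).get();

            return new String[] {first, second, IDENTITY.get()};
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        String[] seen = run();
        System.out.println("pooled #1=" + seen[0] + ", pooled #2=" + seen[1] + ", caller=" + seen[2]);
    }
}
```

Both pooled reads return the value that was current when the worker thread was created, even though the submitting thread has since changed its own value.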
   
   
   
   
   Also, we already update the UGI in the Spark master automatically:
   
   2025-09-22 10:16:49,690 [INFO] [SparkSQLSessionManager-exec-pool: 
Thread-489] Scanning table 
spark_catalog.test_icebergdb_001.test_icebergtb_000000 snapshot 
6013412532893968251 created at 2025-06-18T02:41:03.245+00:00 with filter true 
(org.apache.iceberg.SnapshotScan(org.apache.iceberg.SnapshotScan.planFiles:124))
   2025-09-22 10:16:49,694 [INFO] [SparkSQLSessionManager-exec-pool: 
Thread-489] Debug for UserGroupInformation <====================> 
(org.apache.iceberg.BaseDistributedDataScan(org.apache.iceberg.BaseDistributedDataScan.lambda$processTokens$2:278))
   2025-09-22 10:16:49,694 [INFO] [SparkSQLSessionManager-exec-pool: 
Thread-489] Token Kind: HDFS_DELEGATION_TOKEN 
(org.apache.iceberg.BaseDistributedDataScan(org.apache.iceberg.BaseDistributedDataScan.lambda$processTokens$2:279))
   2025-09-22 10:16:49,694 [INFO] [SparkSQLSessionManager-exec-pool: 
Thread-489] Token Identifier: [0, 8, 119, 101, 100, 97, 116, 97, 45, 97, 8, 
119, 101, 1*************************, 83, 45, 57, 90, 72, 78, 78, 57, 51, 87, 
-118, 1, -103, 111, 53, -96, -113, -118, 1, -103, -109, 66, 36, -113, -115, 1, 
4, -9, 111] 
(org.apache.iceberg.BaseDistributedDataScan(org.apache.iceberg.BaseDistributedDataScan.lambda$processTokens$2:280))
   2025-09-22 10:16:49,694 [INFO] [SparkSQLSessionManager-exec-pool: 
Thread-489] Token Service: ha-hdfs:HDFS1111111 
(org.apache.iceberg.BaseDistributedDataScan(org.apache.iceberg.BaseDistributedDataScan.lambda$processTokens$2:281))
   2025-09-22 10:16:49,694 [INFO] [SparkSQLSessionManager-exec-pool: 
Thread-489] Token Password: [-95, 85, 81,*************************1, -38, -64, 
97, -123, -78] 
(org.apache.iceberg.BaseDistributedDataScan(org.apache.iceberg.BaseDistributedDataScan.lambda$processTokens$2:282))
   2025-09-22 10:16:49,694 [INFO] [SparkSQLSessionManager-exec-pool: 
Thread-489] Token toString(): Kind: HDFS_DELEGATION_TOKEN, Service: 
ha-hdfs:HDFS1111111, Ident: (token for wedata-a: HDFS_DELEGATION_TOKEN 
owner=wedata-a, renewer=wedata-a, realUser=hadoop/***-{$ip}@***-9ZHNN93W, 
issueDate=1758507409551, maxDate=1759112209551, sequenceNumber=66807, 
masterKeyId=111) 
(org.apache.iceberg.BaseDistributedDataScan(org.apache.iceberg.BaseDistributedDataScan.lambda$processTokens$2:283))
   2025-09-22 10:16:49,694 [INFO] [SparkSQLSessionManager-exec-pool: 
Thread-489] <====================> 
(org.apache.iceberg.BaseDistributedDataScan(org.apache.iceberg.BaseDistributedDataScan.lambda$processTokens$2:284))
   2025-09-22 10:16:49,694 [INFO] [SparkSQLSessionManager-exec-pool: 
Thread-489] Debug for UserGroupInformation <====================> 
(org.apache.iceberg.BaseDistributedDataScan(org.apache.iceberg.BaseDistributedDataScan.lambda$processTokens$2:278))
   2025-09-22 10:16:49,694 [INFO] [SparkSQLSessionManager-exec-pool: 
Thread-489] Token Kind: HIVE_DELEGATION_TOKEN 
(org.apache.iceberg.BaseDistributedDataScan(org.apache.iceberg.BaseDistributedDataScan.lambda$processTokens$2:279))
   2025-09-22 10:16:49,694 [INFO] [SparkSQLSessionManager-exec-pool: 
Thread-489] Token Identifier: [0, 8, 119, 101, 100, 97, 116, 97, 45, 
97,****************** 57, 51, 87, -118, 1, -103, 111, 53, -96, -112, -118, 1, 
-103, -109, 66, 36, -112, -114, 9, -72, -113, -49] 
(org.apache.iceberg.BaseDistributedDataScan(org.apache.iceberg.BaseDistributedDataScan.lambda$processTokens$2:280))
   2025-09-22 10:16:49,694 [INFO] [SparkSQLSessionManager-exec-pool: 
Thread-489] Token Service:  
(org.apache.iceberg.BaseDistributedDataScan(org.apache.iceberg.BaseDistributedDataScan.lambda$processTokens$2:281))
   2025-09-22 10:16:49,694 [INFO] [SparkSQLSessionManager-exec-pool: 
Thread-489] Token Password: [-16, -16, 27, -64, 126, 45, -27, 27, -103, 104, 
-112, -19, -8, -67, 121, 41, 34, 125, -26, 69] 
(org.apache.iceberg.BaseDistributedDataScan(org.apache.iceberg.BaseDistributedDataScan.lambda$processTokens$2:282))
   2025-09-22 10:16:49,694 [INFO] [SparkSQLSessionManager-exec-pool: 
Thread-489] Token toString(): Kind: HIVE_DELEGATION_TOKEN, Service: , Ident: 00 
08********************** a0 90 8a 01 99 93 42 24 90 8e 09 b8 8f cf 
(org.apache.iceberg.BaseDistributedDataScan(org.apache.iceberg.BaseDistributedDataScan.lambda$processTokens$2:283))
   2025-09-22 10:16:49,694 [INFO] [SparkSQLSessionManager-exec-pool: 
Thread-489] <====================> 
(org.apache.iceberg.BaseDistributedDataScan(org.apache.iceberg.BaseDistributedDataScan.lambda$processTokens$2:284))
   2025-09-22 10:16:49,694 [INFO] [SparkSQLSessionManager-exec-pool: 
Thread-489] ==================================Planning file tasks locally for 
table spark_catalog.test_icebergdb_001.test_icebergtb_000000 
(org.apache.iceberg.BaseDistributedDataScan(org.apache.iceberg.BaseDistributedDataScan.planFileTasksLocally:295))
   2025-09-22 10:16:49,720 [INFO] [SparkSQLSessionManager-exec-pool: 
Thread-489] Reporting UnknownPartitioning with 1 partition(s) for table 
spark_catalog.test_icebergdb_001.test_icebergtb_000000 
(org.apache.iceberg.spark.source.SparkPartitioningAwareScan(org.apache.iceberg.spark.source.SparkPartitioningAwareScan.outputPartitioning:102))
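
To make the two tokens easier to compare, the epoch-millisecond `issueDate` and `maxDate` fields from the logs above can be decoded to UTC. This is a small sketch using only `java.time`; the class name `TokenTimes` and its helper methods are hypothetical, and the constants are copied from the `RemoteException` message and from the `Token toString()` line.

```java
import java.time.Duration;
import java.time.Instant;

public class TokenTimes {
    // Values copied from the logs in this report.
    static final long REJECTED_ISSUE = 1758251993978L; // token in the RemoteException (owner=hadoop)
    static final long REJECTED_MAX   = 1758856793978L;
    static final long CURRENT_ISSUE  = 1758507409551L; // token printed by the debug log (owner=wedata-a)
    static final long CURRENT_MAX    = 1759112209551L;

    // Formats epoch milliseconds as an ISO-8601 UTC instant.
    static String toUtc(long epochMillis) {
        return Instant.ofEpochMilli(epochMillis).toString();
    }

    // Whole days between a token's issueDate and maxDate.
    static long lifetimeDays(long issueMillis, long maxMillis) {
        return Duration.ofMillis(maxMillis - issueMillis).toDays();
    }

    public static void main(String[] args) {
        System.out.println("rejected token issued: " + toUtc(REJECTED_ISSUE));
        System.out.println("rejected token max:    " + toUtc(REJECTED_MAX));
        System.out.println("current token issued:  " + toUtc(CURRENT_ISSUE));
        System.out.println("current token max:     " + toUtc(CURRENT_MAX));
        System.out.println("lifetime (days):       " + lifetimeDays(CURRENT_ISSUE, CURRENT_MAX));
    }
}
```

Decoded this way, the rejected token was issued on 2025-09-19 (UTC), about three days before the query, while the token printed by the debug log was issued at 2025-09-22T02:16:49.551Z. If the log timestamps are in UTC+8 (an assumption; the log does not state its time zone), that is the same second as the scan. Both tokens have a seven-day span between `issueDate` and `maxDate`.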
   
   ### Willingness to contribute
   
   - [ ] I can contribute a fix for this bug independently
   - [ ] I would be willing to contribute a fix for this bug with guidance from 
the Iceberg community
   - [ ] I cannot contribute a fix for this bug at this time


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

