risey-yimu opened a new issue, #13781:
URL: https://github.com/apache/iceberg/issues/13781

   ### Apache Iceberg version
   
   1.8.1
   
   ### Query engine
   
   Spark
   
   ### Please describe the bug 🐞
   
   Running `remove_orphan_files` with Spark 3.5.3 fails. The Spark configuration is as follows:
   ```scala
   sparkConf
         .set("spark.ui.enabled", "true")
         .set("spark.task.maxFailures", ls0Config.sparkTaskMaxFailures)
         .set("spark.rpc.message.maxSize", ls0Config.sparkRpcMessageMaxSize)
         .set("spark.sql.iceberg.handle-timestamp-without-timezone", "true")

         .set("spark.hadoop.fs.s3a.access.key", ls0Config.jdbcCatalogS3AccessKey)
         .set("spark.hadoop.fs.s3a.secret.key", ls0Config.jdbcCatalogS3SecretKey)
         .set("spark.hadoop.fs.s3a.endpoint", ls0Config.jdbcCatalogS3Endpoint)
         .set("spark.hadoop.fs.s3a.path.style.access", "true")
         .set("spark.hadoop.fs.s3a.region", "cn-east-1")
         .set("spark.hadoop.fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")
         .set("spark.hadoop.fs.defaultFS", "s3a://warehouse")
         .set("spark.hadoop.fs.s3.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")
         .set("spark.hadoop.fs.AbstractFileSystem.s3.impl", "org.apache.hadoop.fs.s3a.S3A")

         .set("spark.sql.extensions", "org.projectnessie.spark.extensions.NessieSparkSessionExtensions,org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
         .set(s"spark.sql.catalog.${ls0Config.restCatalogName}", "org.apache.iceberg.spark.SparkCatalog")
         .set(s"spark.sql.catalog.${ls0Config.restCatalogName}.type", "rest")
         .set(s"spark.sql.catalog.${ls0Config.restCatalogName}.uri", ls0Config.restCatalogURL)
         .set(s"spark.sql.catalogImplementation", "in-memory")

         .set(s"spark.sql.catalog.${ls0Config.jdbcCatalogName}", "org.apache.iceberg.spark.SparkCatalog")
         .set(s"spark.sql.catalog.${ls0Config.jdbcCatalogName}.type", "jdbc")
         .set(s"spark.sql.catalog.${ls0Config.jdbcCatalogName}.uri", ls0Config.jdbcCatalogJDBCURL)
         .set(s"spark.sql.catalog.${ls0Config.jdbcCatalogName}.jdbc.user", ls0Config.jdbcCatalogJDBCUser)
         .set(s"spark.sql.catalog.${ls0Config.jdbcCatalogName}.jdbc.password", ls0Config.jdbcCatalogJDBCPwd)
         .set(s"spark.sql.catalog.${ls0Config.jdbcCatalogName}.warehouse", ls0Config.jdbcCatalogWarehousePath)
         .set(s"spark.sql.catalog.${ls0Config.jdbcCatalogName}.io-impl", "org.apache.iceberg.aws.s3.S3FileIO")
         .set(s"spark.sql.catalog.${ls0Config.jdbcCatalogName}.s3.endpoint", ls0Config.jdbcCatalogS3Endpoint)
         .set(s"spark.sql.catalog.${ls0Config.jdbcCatalogName}.s3.access-key-id", ls0Config.jdbcCatalogS3AccessKey)
         .set(s"spark.sql.catalog.${ls0Config.jdbcCatalogName}.s3.secret-access-key", ls0Config.jdbcCatalogS3SecretKey)
         .set(s"spark.sql.catalog.${ls0Config.jdbcCatalogName}.client.region", ls0Config.jdbcCatalogClientRegion)

         .set("spark.sql.defaultCatalog", ls0Config.restCatalogName)

   SparkSession.builder.config(sparkConf).getOrCreate
   ``` 
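
   The configuration above repeats the `spark.sql.catalog.<name>.` prefix for every catalog property. A minimal sketch of a helper that builds those keys from a short property map follows. `CatalogConf.scoped`, the catalog name `rest_cat`, and the URI are hypothetical placeholders, not values from this report.

   ```scala
   object CatalogConf {
     // Hypothetical helper: returns the Spark keys for one catalog.
     // `impl` becomes the value of "spark.sql.catalog.<name>" itself,
     // and every entry in `props` is prefixed with "spark.sql.catalog.<name>.".
     def scoped(name: String, impl: String, props: Map[String, String]): Map[String, String] = {
       val prefix = s"spark.sql.catalog.$name"
       props.map { case (key, value) => s"$prefix.$key" -> value } + (prefix -> impl)
     }

     def main(args: Array[String]): Unit = {
       // Placeholder catalog name and URI, chosen only for illustration.
       val restCatalog = scoped(
         "rest_cat",
         "org.apache.iceberg.spark.SparkCatalog",
         Map("type" -> "rest", "uri" -> "http://localhost:8181"))

       // Print the generated keys in a stable order.
       restCatalog.toSeq.sortBy(_._1).foreach { case (k, v) => println(s"$k=$v") }
     }
   }
   ```

   The resulting map could be passed to `sparkConf.setAll(...)`, which keeps each catalog's properties in one place and makes it easier to see which catalog a given `s3.*` setting applies to.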
   I call the Spark SQL procedure `remove_orphan_files` to delete the orphan files, using this code:
   ```scala
   spark.conf.set("spark.sql.autoBroadcastJoinThreshold", "-1")
   spark.sparkContext.setLogLevel("INFO")
   spark.sql("ALTER TABLE nessie.demo.ods_source_nome_t4 SET TBLPROPERTIES ('gc.enabled' = 'true')")
   spark.sql(
     s"""
        |CALL nessie.system.remove_orphan_files(
        |  table => 'nessie.demo.ods_source_nome_test',
        |  location => 's3a://warehouse/demo/ods_source_nome_test_e0341727-b242-434a-ad20-d6b120d6fd59/data/dt=2025-07-31/',
        |  older_than => TIMESTAMP '2025-08-03 00:00:00.000')
        |""".stripMargin)
   ``` 
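
   In the snippet above, the `ALTER TABLE` statement names `ods_source_nome_t4` while the `CALL` names `ods_source_nome_test`. Whether that is intentional is not stated. As a sketch, both statements can be rendered from a single table value so they cannot drift apart. The helper names below are hypothetical, and `dry_run => true` is added so the procedure only reports candidate files instead of deleting them.

   ```scala
   object OrphanFileStatements {
     // Hypothetical helper: renders the statement that enables gc for a table.
     def enableGc(table: String): String =
       s"ALTER TABLE $table SET TBLPROPERTIES ('gc.enabled' = 'true')"

     // Hypothetical helper: renders the procedure call with the same named
     // arguments used in the report, plus dry_run.
     def removeOrphanFiles(
         catalog: String,
         table: String,
         location: String,
         olderThan: String,
         dryRun: Boolean): String =
       s"""CALL $catalog.system.remove_orphan_files(
          |  table => '$table',
          |  location => '$location',
          |  older_than => TIMESTAMP '$olderThan',
          |  dry_run => $dryRun)""".stripMargin

     def main(args: Array[String]): Unit = {
       // Values copied from the report; the table name is defined once.
       val table = "nessie.demo.ods_source_nome_test"
       val location =
         "s3a://warehouse/demo/ods_source_nome_test_e0341727-b242-434a-ad20-d6b120d6fd59/data/dt=2025-07-31/"

       println(enableGc(table))
       println(removeOrphanFiles("nessie", table, location, "2025-08-03 00:00:00.000", dryRun = true))
     }
   }
   ```

   Each rendered string would then be passed to `spark.sql(...)`. A dry run lists the files the procedure would remove, which separates the question of which files are selected from the question of why the delete request itself fails.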
   but I found the following in the logs:
   ```
   java.lang.IllegalArgumentException: Invalid S3 URI: 'http://los.uisee.com/warehouse?delete'
        at org.apache.iceberg.rest.ErrorHandlers$DefaultErrorHandler.accept(ErrorHandlers.java:206) ~[iceberg-spark-runtime-3.5_2.12-1.8.1.jar:?]
        at org.apache.iceberg.rest.ErrorHandlers$DefaultErrorHandler.accept(ErrorHandlers.java:188) ~[iceberg-spark-runtime-3.5_2.12-1.8.1.jar:?]
        at org.apache.iceberg.rest.HTTPClient.throwFailure(HTTPClient.java:224) ~[iceberg-spark-runtime-3.5_2.12-1.8.1.jar:?]
        at org.apache.iceberg.rest.HTTPClient.execute(HTTPClient.java:308) ~[iceberg-spark-runtime-3.5_2.12-1.8.1.jar:?]
        at org.apache.iceberg.rest.BaseHTTPClient.post(BaseHTTPClient.java:100) ~[iceberg-spark-runtime-3.5_2.12-1.8.1.jar:?]
        at org.apache.iceberg.aws.s3.signer.S3V4RestSignerClient.sign(S3V4RestSignerClient.java:351) ~[iceberg-spark-runtime-3.5_2.12-1.8.1.jar:?]
        at software.amazon.awssdk.core.internal.http.pipeline.stages.SigningStage.lambda$signRequest$4(SigningStage.java:154) ~[iceberg-aws-bundle-1.8.1.jar:?]
        at software.amazon.awssdk.core.internal.util.MetricUtils.measureDuration(MetricUtils.java:63) ~[iceberg-aws-bundle-1.8.1.jar:?]
        at software.amazon.awssdk.core.internal.http.pipeline.stages.SigningStage.signRequest(SigningStage.java:153) ~[iceberg-aws-bundle-1.8.1.jar:?]
        at software.amazon.awssdk.core.internal.http.pipeline.stages.SigningStage.execute(SigningStage.java:72) ~[iceberg-aws-bundle-1.8.1.jar:?]
        at software.amazon.awssdk.core.internal.http.pipeline.stages.SigningStage.execute(SigningStage.java:50) ~[iceberg-aws-bundle-1.8.1.jar:?]
        at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206) ~[iceberg-aws-bundle-1.8.1.jar:?]
        at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206) ~[iceberg-aws-bundle-1.8.1.jar:?]
        at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206) ~[iceberg-aws-bundle-1.8.1.jar:?]
        at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206) ~[iceberg-aws-bundle-1.8.1.jar:?]
        at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206) ~[iceberg-aws-bundle-1.8.1.jar:?]
        at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptTimeoutTrackingStage.execute(ApiCallAttemptTimeoutTrackingStage.java:74) ~[iceberg-aws-bundle-1.8.1.jar:?]
        at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptTimeoutTrackingStage.execute(ApiCallAttemptTimeoutTrackingStage.java:43) ~[iceberg-aws-bundle-1.8.1.jar:?]
        at software.amazon.awssdk.core.internal.http.pipeline.stages.TimeoutExceptionHandlingStage.execute(TimeoutExceptionHandlingStage.java:79) ~[iceberg-aws-bundle-1.8.1.jar:?]
        at software.amazon.awssdk.core.internal.http.pipeline.stages.TimeoutExceptionHandlingStage.execute(TimeoutExceptionHandlingStage.java:41) ~[iceberg-aws-bundle-1.8.1.jar:?]
        at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptMetricCollectionStage.execute(ApiCallAttemptMetricCollectionStage.java:55) ~[iceberg-aws-bundle-1.8.1.jar:?]
        at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptMetricCollectionStage.execute(ApiCallAttemptMetricCollectionStage.java:39) ~[iceberg-aws-bundle-1.8.1.jar:?]
        at software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage2.executeRequest(RetryableStage2.java:93) ~[iceberg-aws-bundle-1.8.1.jar:?]
        at software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage2.execute(RetryableStage2.java:56) ~[iceberg-aws-bundle-1.8.1.jar:?]
        at software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage2.execute(RetryableStage2.java:36) ~[iceberg-aws-bundle-1.8.1.jar:?]
        at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206) ~[iceberg-aws-bundle-1.8.1.jar:?]
        at software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:53) ~[iceberg-aws-bundle-1.8.1.jar:?]
        at software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:35) ~[iceberg-aws-bundle-1.8.1.jar:?]
        at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.executeWithTimer(ApiCallTimeoutTrackingStage.java:82) ~[iceberg-aws-bundle-1.8.1.jar:?]
        at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:62) ~[iceberg-aws-bundle-1.8.1.jar:?]
        at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:43) ~[iceberg-aws-bundle-1.8.1.jar:?]
        at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:50) ~[iceberg-aws-bundle-1.8.1.jar:?]
        at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:32) ~[iceberg-aws-bundle-1.8.1.jar:?]
        at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206) ~[iceberg-aws-bundle-1.8.1.jar:?]
        at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206) ~[iceberg-aws-bundle-1.8.1.jar:?]
        at software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:37) ~[iceberg-aws-bundle-1.8.1.jar:?]
        at software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:26) ~[iceberg-aws-bundle-1.8.1.jar:?]
        at software.amazon.awssdk.core.internal.http.AmazonSyncHttpClient$RequestExecutionBuilderImpl.execute(AmazonSyncHttpClient.java:210) ~[iceberg-aws-bundle-1.8.1.jar:?]
        at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.invoke(BaseSyncClientHandler.java:103) ~[iceberg-aws-bundle-1.8.1.jar:?]
        at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.doExecute(BaseSyncClientHandler.java:173) ~[iceberg-aws-bundle-1.8.1.jar:?]
        at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.lambda$execute$1(BaseSyncClientHandler.java:80) ~[iceberg-aws-bundle-1.8.1.jar:?]
        at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.measureApiCallSuccess(BaseSyncClientHandler.java:182) ~[iceberg-aws-bundle-1.8.1.jar:?]
        at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.execute(BaseSyncClientHandler.java:74) ~[iceberg-aws-bundle-1.8.1.jar:?]
        at software.amazon.awssdk.core.client.handler.SdkSyncClientHandler.execute(SdkSyncClientHandler.java:45) ~[iceberg-aws-bundle-1.8.1.jar:?]
        at software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler.execute(AwsSyncClientHandler.java:53) ~[iceberg-aws-bundle-1.8.1.jar:?]
        at software.amazon.awssdk.services.s3.DefaultS3Client.deleteObjects(DefaultS3Client.java:3626) ~[iceberg-aws-bundle-1.8.1.jar:?]
        at org.apache.iceberg.aws.s3.S3FileIO.deleteBatch(S3FileIO.java:281) ~[iceberg-spark-runtime-3.5_2.12-1.8.1.jar:?]
        at org.apache.iceberg.aws.s3.S3FileIO.lambda$deleteFiles$3(S3FileIO.java:219) ~[iceberg-spark-runtime-3.5_2.12-1.8.1.jar:?]
        at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
        at java.lang.Thread.run(Thread.java:829) [?:?]
   17:52:36.791 [main] WARN  org.apache.iceberg.aws.s3.S3FileIO - Failed to delete object at path s3://warehouse/demo/ods_source_nome_test_e0341727-b242-434a-ad20-d6b120d6fd59/data/dt=2025-07-31/00000-0-cd989951-6115-4da4-bb6e-b8942a69ba31-00177.parquet
   17:52:36.791 [main] WARN  org.apache.iceberg.aws.s3.S3FileIO - Failed to delete object at path s3://warehouse/demo/ods_source_nome_test_e0341727-b242-434a-ad20-d6b120d6fd59/data/dt=2025-07-31/00000-0-cd989951-6115-4da4-bb6e-b8942a69ba31-00191.parquet
   17:52:36.791 [main] WARN  org.apache.iceberg.aws.s3.S3FileIO - Failed to delete object at path s3://warehouse/demo/ods_source_nome_test_e0341727-b242-434a-ad20-d6b120d6fd59/data/dt=2025-07-31/00001-0-baf52e17-063f-4a42-a02a-18b1286b8a77-00186.parquet
   17:52:36.791 [main] WARN  org.apache.iceberg.aws.s3.S3FileIO - Failed to delete object at path s3://warehouse/demo/ods_source_nome_test_e0341727-b242-434a-ad20-d6b120d6fd59/data/dt=2025-07-31/00000-0-cd989951-6115-4da4-bb6e-b8942a69ba31-00186.parquet
   17:52:36.791 [main] WARN  org.apache.iceberg.aws.s3.S3FileIO - Failed to delete object at path s3://warehouse/demo/ods_source_nome_test_e0341727-b242-434a-ad20-d6b120d6fd59/data/dt=2025-07-31/00001-0-baf52e17-063f-4a42-a02a-18b1286b8a77-00190.parquet
   17:52:36.791 [main] WARN  org.apache.iceberg.aws.s3.S3FileIO - Failed to delete object at path s3://warehouse/demo/ods_source_nome_test_e0341727-b242-434a-ad20-d6b120d6fd59/data/dt=2025-07-31/00001-0-baf52e17-063f-4a42-a02a-18b1286b8a77-00176.parquet
   17:52:36.791 [main] WARN  org.apache.iceberg.aws.s3.S3FileIO - Failed to delete object at path s3://warehouse/demo/ods_source_nome_test_e0341727-b242-434a-ad20-d6b120d6fd59/data/dt=2025-07-31/00001-0-baf52e17-063f-4a42-a02a-18b1286b8a77-00195.parquet
   17:52:36.791 [main] WARN  org.apache.iceberg.aws.s3.S3FileIO - Failed to delete object at path s3://warehouse/demo/ods_source_nome_test_e0341727-b242-434a-ad20-d6b120d6fd59/data/dt=2025-07-31/00000-0-cd989951-6115-4da4-bb6e-b8942a69ba31-00182.parquet
   17:52:36.791 [main] WARN  org.apache.iceberg.aws.s3.S3FileIO - Failed to delete object at path s3://warehouse/demo/ods_source_nome_test_e0341727-b242-434a-ad20-d6b120d6fd59/data/dt=2025-07-31/00001-0-baf52e17-063f-4a42-a02a-18b1286b8a77-00177.parquet
   17:52:36.791 [main] WARN  org.apache.iceberg.aws.s3.S3FileIO - Failed to delete object at path s3://warehouse/demo/ods_source_nome_test_e0341727-b242-434a-ad20-d6b120d6fd59/data/dt=2025-07-31/00000-0-cd989951-6115-4da4-bb6e-b8942a69ba31-00173.parquet
   17:52:36.792 [main] WARN  org.apache.iceberg.aws.s3.S3FileIO - Failed to delete object at path s3://warehouse/demo/ods_source_nome_test_e0341727-b242-434a-ad20-d6b120d6fd59/data/dt=2025-07-31/00001-0-baf52e17-063f-4a42-a02a-18b1286b8a77-00182.parquet
   17:52:36.792 [main] WARN  org.apache.iceberg.spark.actions.DeleteOrphanFilesSparkAction - Deleted only 0 of 11 files using bulk deletes
   17:52:36.809 [main] INFO  org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator - Code generated in 3.935872 ms
   17:52:36.826 [main] INFO  org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator - Code generated in 9.080008 ms
   [s3a://warehouse/demo/ods_source_nome_test_e0341727-b242-434a-ad20-d6b120d6fd59/data/dt=2025-07-31/00000-0-cd989951-6115-4da4-bb6e-b8942a69ba31-00173.parquet]
   [s3a://warehouse/demo/ods_source_nome_test_e0341727-b242-434a-ad20-d6b120d6fd59/data/dt=2025-07-31/00000-0-cd989951-6115-4da4-bb6e-b8942a69ba31-00177.parquet]
   [s3a://warehouse/demo/ods_source_nome_test_e0341727-b242-434a-ad20-d6b120d6fd59/data/dt=2025-07-31/00000-0-cd989951-6115-4da4-bb6e-b8942a69ba31-00182.parquet]
   
   ``` 
   Why does deleting the orphan files fail, and how can I solve this problem?
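
   The frames `S3V4RestSignerClient.sign` and `HTTPClient.throwFailure` in the trace indicate that the exception surfaced while the client asked the REST catalog to sign the S3 request, and the rejected value is an HTTP URL with the bucket in the path. The following sketch only decomposes that URL with `java.net.URI` to make its parts explicit. It does not reproduce whatever validation the signing endpoint applies, and `SignedUriParts` is a hypothetical name.

   ```scala
   import java.net.URI

   object SignedUriParts {
     final case class Parts(host: String, firstPathSegment: String, query: String)

     // Splits a request URL into host, first path segment, and query string.
     // A missing query is returned as an empty string instead of null.
     def parse(url: String): Parts = {
       val uri = new URI(url)
       val segment = uri.getPath.stripPrefix("/").takeWhile(_ != '/')
       Parts(uri.getHost, segment, Option(uri.getQuery).getOrElse(""))
     }

     def main(args: Array[String]): Unit = {
       // URL copied from the exception message in the log.
       val parts = parse("http://los.uisee.com/warehouse?delete")
       println(s"host               = ${parts.host}")
       println(s"first path segment = ${parts.firstPathSegment}")
       println(s"query              = ${parts.query}")
     }
   }
   ```

   Here the first path segment equals the bucket name `warehouse` seen in the file paths of the warnings, and the `delete` query matches the bulk delete call (`DefaultS3Client.deleteObjects`) in the trace. That is consistent with a path-style request against a custom endpoint, so the signing side's handling of path-style URLs is one place to look.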
   
   ### Willingness to contribute
   
   - [ ] I can contribute a fix for this bug independently
   - [x] I would be willing to contribute a fix for this bug with guidance from the Iceberg community
   - [x] I cannot contribute a fix for this bug at this time


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


