SanjayKhoros commented on issue #10907:
URL: https://github.com/apache/iceberg/issues/10907#issuecomment-2353234819

   Thanks for the quick reply @RussellSpitzer 
   
   Sharing a few more details:
   Flink version - 1.20.0
   Iceberg version - 1.6.1
   
   ` long cutoffDateMillis = LocalDateTime.now()
                   .minusDays(Long.parseLong(flinkConfig.dataCleanup.retentionPeriod))
                   .toInstant(ZoneOffset.UTC)
                   .toEpochMilli();`
   
   I printed my cutoffDateMillis -> **1726313530137**
   I'm currently testing the issue in my dev environment, so I changed the
   retention period to 2 days.
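
For reference, here is a minimal sketch of how the same cutoff could be computed from `Instant` instead of `LocalDateTime`. `LocalDateTime.now()` reads the system default time zone, and `toInstant(ZoneOffset.UTC)` then treats that local time as UTC, so on a host that is not running in UTC the cutoff would be shifted by the zone offset. The class name `CutoffSketch` and its methods are hypothetical and not part of my service; the sketch only shows the arithmetic and a way to print the cutoff in a readable form.

```java
import java.time.Clock;
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper, for illustration only (not part of Iceberg or Flink).
final class CutoffSketch {

    // Cutoff = "now" on the given clock minus the retention period, in epoch millis.
    // Working on Instant avoids any dependence on the host's default time zone.
    static long cutoffMillis(Clock clock, long retentionDays) {
        return Instant.now(clock)
                .minus(Duration.ofDays(retentionDays))
                .toEpochMilli();
    }

    // Render an epoch-millis value as an ISO-8601 UTC timestamp for logging.
    static String readable(long epochMillis) {
        return Instant.ofEpochMilli(epochMillis).toString();
    }

    public static void main(String[] args) {
        // Assumption: a 2-day retention period, as in my dev environment.
        long cutoff = cutoffMillis(Clock.systemUTC(), 2);
        System.out.println(cutoff + " -> " + readable(cutoff));
    }
}
```

Read this way, the value I printed above, 1726313530137, corresponds to 2024-09-14T11:32:10.137Z.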
   
   As I mentioned earlier, soft delete is working without any issues. When I
   query the records by day, I only see 2 days of data; older records do not
   appear. The major issue is that the data is not getting cleaned up from S3.
   
   My **data/** folder is only around **600MB**, while **metadata/** is around
   **1TB**! I also get no errors when executing the above rewriteManifests()
   and expireSnapshots().
   
   Based on your comments above, I thought maybe I should run
   **deleteOrphanFiles**, so I added the dependency below as well:
   ```
           <dependency>
               <groupId>org.apache.iceberg</groupId>
               <artifactId>iceberg-spark-runtime-3.4_2.12</artifactId>
               <version>${iceberg.version}</version>
           </dependency>
   ```
   I then added the call below expireSnapshots, which is inside a try/catch block:
   ```
               icebergTable.expireSnapshots()
                       .expireOlderThan(cutoffDateMillis)
                       .commit();
               icebergTable.refresh();
               
               logger.info("executing deleteOrphanFiles " + System.currentTimeMillis());
               SparkActions.get().deleteOrphanFiles(icebergTable)
                       .olderThan(cutoffDateMillis)
                       .execute();
               logger.info("deleteOrphanFiles completed successfully");
   ```
   The service has now been stuck after the "executing deleteOrphanFiles" log
   line for the past 4 hours. I'm hoping it does something or at least throws
   an error.
   
   If you have any suggestions, please share them. I'm out of options and
   references at this point. Thank you!

