ForeverAngry commented on issue #2130:
URL: 
https://github.com/apache/iceberg-python/issues/2130#issuecomment-3006410326

   So @MrDerecho, are you saying that you are looking for a function that can 
remove a `DataFile` entry and then create a new snapshot with an updated 
`ManifestFile`?  
   
   Does this illustration capture the problem?
   
   ```mermaid
   graph TD
     subgraph Iceberg Table Metadata
       manifest1["ManifestFile"]
       snapshot1["Snapshot"]
       dataFile1["DataFile A"]
       dataFile2["DataFile B"]
       parquetFile["Parquet File (s3://bucket/path/to/data.parquet)"]
     end
   
     snapshot1 --> manifest1
     manifest1 --> dataFile1
     manifest1 --> dataFile2
     dataFile1 --> parquetFile
     dataFile2 --> parquetFile
   
     note1["Note: Both DataFile A and B point to the same Parquet file"]
     note1 --- parquetFile
   
   ```
   
   I think this could also include situations like this:
   
   ```mermaid
   graph TD
     subgraph Iceberg Table Metadata
       snapshot1["Snapshot"]
       manifest1["ManifestFile A"]
       manifest2["ManifestFile B"]
       dataFile1["DataFile A (in Manifest A)"]
       dataFile2["DataFile B (in Manifest B)"]
       parquetFile["Parquet File (s3://bucket/path/to/data.parquet)"]
     end
   
     snapshot1 --> manifest1
     snapshot1 --> manifest2
     manifest1 --> dataFile1
     manifest2 --> dataFile2
     dataFile1 --> parquetFile
     dataFile2 --> parquetFile
   
     note1["Note: Both Manifest Files refer to DataFiles that share the same physical Parquet file"]
     note1 --- parquetFile
   ```
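
To make the two cases in the diagrams concrete, here is a minimal sketch in plain Python. It does not use the pyiceberg API: manifests are modeled as a hypothetical dict mapping a manifest name to the list of file paths its `DataFile` entries point at, and `find_duplicate_data_files` and `drop_duplicate_entries` are illustrative names only. A real implementation would still need to commit the rewritten manifests as a new snapshot.

```python
from collections import defaultdict


def find_duplicate_data_files(manifests):
    """Return {file_path: [manifest names]} for paths referenced more than once.

    `manifests` is a hypothetical stand-in for table metadata:
    a dict of manifest name -> list of data file paths.
    """
    refs = defaultdict(list)
    for manifest_name, file_paths in manifests.items():
        for path in file_paths:
            refs[path].append(manifest_name)
    return {path: names for path, names in refs.items() if len(names) > 1}


def drop_duplicate_entries(manifests):
    """Return new manifests that keep only the first entry for each file path."""
    seen = set()
    deduped = {}
    for manifest_name, file_paths in manifests.items():
        kept = []
        for path in file_paths:
            if path not in seen:
                seen.add(path)
                kept.append(path)
        deduped[manifest_name] = kept
    return deduped


PARQUET = "s3://bucket/path/to/data.parquet"

# Case 1: two entries in the same manifest point at one Parquet file.
same_manifest = {"manifest-1": [PARQUET, PARQUET]}

# Case 2: entries in two different manifests point at one Parquet file.
across_manifests = {"manifest-a": [PARQUET], "manifest-b": [PARQUET]}

print(find_duplicate_data_files(same_manifest))
print(find_duplicate_data_files(across_manifests))
print(drop_duplicate_entries(across_manifests))
```

Keeping the first reference and dropping later ones is just one possible policy. In the cross-manifest case it leaves `manifest-b` empty, so a real function would probably also have to decide whether to drop empty manifests.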


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

