coolderli opened a new issue, #9249: URL: https://github.com/apache/iceberg/issues/9249
### Query engine

_No response_

### Question

Recently, I was researching solutions for managing unstructured files and discovered [Databricks volumes](https://docs.databricks.com/en/data-governance/unity-catalog/create-volumes.html). I was wondering whether something similar could be implemented on Iceberg. Here is my rough idea.

Using catalogs and volumes to manage unstructured files would enable better data governance, such as lifecycle management. I envision a volume as a logical container that holds actual files and achieves transaction isolation through snapshots. For example, the file `volume://catalog_name/database_name/table_name/mydb/my-volume/file1` could map to `s3://bucket_name/mydb/my-volume/file1`, and `volume://catalog_name/database_name/table_name/mydb/my-volume/file2` to `abfss://azure_account_name/container_name/mydb/my-volume/file2`.

Another consideration is that I want to read files rather than a table, since file access is supported by many deep learning frameworks such as TensorFlow, and the formats those frameworks consume are relatively hard to structure. Taking Spark as an example, I would prefer to use `spark.read().csv("volume://catalog_name/database_name/table_name/mydb/my-volume/")` rather than `spark.read().table("catalog_name.database_name.table_name")`. But I found that there is a lack of API support in engines like Spark, and I was wondering whether this is worth trying.
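To make the path-mapping idea concrete, here is a minimal sketch of what a volume resolution layer might look like. This is purely hypothetical: Iceberg has no volume API today, and every name below (`VolumeCatalog`, `InMemoryVolumeCatalog`, `resolve`, the assumed URI layout) is invented for illustration.

```java
import java.net.URI;
import java.util.Map;

/**
 * Hypothetical sketch only: maps logical volume URIs to physical storage
 * URIs, analogous to how an Iceberg catalog maps table identifiers to
 * metadata locations. None of these types exist in Iceberg today.
 */
interface VolumeCatalog {
  // e.g. volume://catalog_name/database_name/my-volume/file1
  //   -> s3://bucket_name/mydb/my-volume/file1
  URI resolve(URI logicalPath);
}

/** Trivial in-memory implementation for illustration only. */
class InMemoryVolumeCatalog implements VolumeCatalog {
  // key: "database.volume", value: physical root such as
  // "s3://bucket_name/mydb/my-volume"
  private final Map<String, String> volumeRoots;

  InMemoryVolumeCatalog(Map<String, String> volumeRoots) {
    this.volumeRoots = volumeRoots;
  }

  @Override
  public URI resolve(URI logicalPath) {
    // Assumed path layout: /<database>/<volume>/<relative/file/path>
    // (the URI authority carries the catalog name).
    String[] parts = logicalPath.getPath().substring(1).split("/", 3);
    String root = volumeRoots.get(parts[0] + "." + parts[1]);
    if (root == null) {
      throw new IllegalArgumentException("Unknown volume: " + logicalPath);
    }
    return URI.create(root + "/" + parts[2]);
  }
}
```

If a resolver along these lines were wired into the engines (for example, registered behind a `volume://` filesystem scheme), a call like `spark.read().csv("volume://...")` could resolve transparently to the physical location while the catalog retains governance over the underlying files.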