electrum commented on issue #6758: URL: https://github.com/apache/iceberg/issues/6758#issuecomment-1421540890
We have a hack in Trino to allow reading non-standard paths. That's the [`HadoopPaths`](https://github.com/trinodb/trino/blob/master/lib/trino-filesystem/src/main/java/io/trino/filesystem/hdfs/HadoopPaths.java) code referenced above, along with [this code in `TrinoS3FileSystem`](https://github.com/trinodb/trino/blob/e41c3a7534d20ed55791335c892c8bd96ae30dd2/plugin/trino-hive/src/main/java/io/trino/plugin/hive/s3/TrinoS3FileSystem.java#L987-L990).

All of our writing code currently goes through Hadoop, so paths in S3 will be normalized, but we have a project to [decouple Trino from Hadoop and Hive codebases](https://github.com/trinodb/trino/issues/15921), so we'll want our upcoming non-Hadoop [`TrinoFileSystem`](https://github.com/trinodb/trino/blob/master/lib/trino-filesystem/src/main/java/io/trino/filesystem/TrinoFileSystem.java) implementations to handle this correctly.

Handling `.` and `..` is tricky, though. If a user does this:

```sql
CREATE TABLE ... WITH (location = 's3://foo/bar/../baz')
```

What should the resulting object name in S3 be? What gets written to the Iceberg manifest? Where should the normalization happen? Do we need to normalize all "user input" locations?

While I like the idea of just using strings and not dealing with directories (which we copied from Iceberg `FileIO`), it certainly has annoyances and problems when interacting with POSIX-like systems.
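To make the ambiguity concrete, here is a minimal Python sketch of the two candidate interpretations of that location. It is an illustration only, not what Trino, Hadoop, or Iceberg actually does: the helper names `split_location`, `literal_key`, and `normalized_key` are hypothetical, and `posixpath.normpath` is used as a stand-in for "POSIX-like" normalization.

```python
import posixpath
from typing import Tuple


def split_location(location: str) -> Tuple[str, str]:
    """Split 's3://bucket/key' into (bucket, key) without altering the key."""
    prefix = "s3://"
    if not location.startswith(prefix):
        raise ValueError(f"not an s3:// location: {location!r}")
    # Everything after the first '/' is the key, kept byte-for-byte.
    bucket, _, key = location[len(prefix):].partition("/")
    return bucket, key


def literal_key(location: str) -> str:
    """Interpretation 1: the location is an opaque string, so '.', '..'
    and '//' are ordinary characters in the object key."""
    return split_location(location)[1]


def normalized_key(location: str) -> str:
    """Interpretation 2: the key is treated like a POSIX path, so dot
    segments and duplicate slashes are collapsed (assumed behavior,
    using posixpath.normpath as a stand-in)."""
    key = split_location(location)[1]
    # normpath("") returns ".", so leave an empty key untouched.
    return posixpath.normpath(key) if key else key


location = "s3://foo/bar/../baz"
print(literal_key(location))     # bar/../baz
print(normalized_key(location))  # baz
```

The two functions return different object names for the same user input, so a writer that normalizes and a reader that does not (or the reverse) would disagree about where the table lives. Whichever form ends up in the manifest has to be the one every engine uses to look up the object.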