[GitHub] [iceberg] electrum commented on issue #6758: S3FileIO Can Create Non-Posix Paths


electrum commented on issue #6758:
URL: https://github.com/apache/iceberg/issues/6758#issuecomment-1421540890

We have a hack in Trino to allow reading non-standard paths. That's the
[`HadoopPaths`](https://github.com/trinodb/trino/blob/master/lib/trino-filesystem/src/main/java/io/trino/filesystem/hdfs/HadoopPaths.java)
code referenced above, along with [this code in
`TrinoS3FileSystem`](https://github.com/trinodb/trino/blob/e41c3a7534d20ed55791335c892c8bd96ae30dd2/plugin/trino-hive/src/main/java/io/trino/plugin/hive/s3/TrinoS3FileSystem.java#L987-L990).

All of our writing code currently goes through Hadoop, so paths in S3 will
be normalized, but we have a project to [decouple Trino from the Hadoop and
Hive codebases](https://github.com/trinodb/trino/issues/15921), so we'll want our
upcoming non-Hadoop
[`TrinoFileSystem`](https://github.com/trinodb/trino/blob/master/lib/trino-filesystem/src/main/java/io/trino/filesystem/TrinoFileSystem.java)
implementations to handle this correctly.

Handling `.` and `..` is tricky, though. If a user does this:

```sql
CREATE TABLE ... WITH (location = 's3://foo/bar/../baz')
```

What should the resulting object name in S3 be? What gets written to the
Iceberg manifest? Where should the normalization happen? Do we need to
normalize all "user input" locations?
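
To make the ambiguity concrete, here is a minimal sketch of the two possible answers for that location. It assumes `java.net.URI.normalize()` as a stand-in for POSIX-style path normalization; the class and method names are hypothetical and are not Trino or Iceberg APIs.

```java
import java.net.URI;

// Hypothetical illustration only; not code from Trino or Iceberg.
class LocationSketch {
    // Treats the location as an opaque string: the object key is
    // everything after the bucket, with "." and ".." kept verbatim.
    static String literalKey(String location) {
        String path = URI.create(location).getRawPath();
        return path.startsWith("/") ? path.substring(1) : path;
    }

    // Treats the location as a POSIX-like path: "." and ".." segments
    // are collapsed before the key is derived.
    static String normalizedKey(String location) {
        String path = URI.create(location).normalize().getRawPath();
        return path.startsWith("/") ? path.substring(1) : path;
    }

    public static void main(String[] args) {
        String location = "s3://foo/bar/../baz";
        System.out.println(literalKey(location));    // prints "bar/../baz"
        System.out.println(normalizedKey(location)); // prints "baz"
    }
}
```

A writer that uses the literal key and a reader that normalizes (or the reverse) would end up looking for different objects, which is why it matters where the normalization happens and which form is recorded in the metadata.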

While I like the idea of just using strings and not dealing with directories
(which we copied from Iceberg `FileIO`), it certainly has annoyances and
problems when interacting with POSIX-like systems.

