[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6637: Spark: Spark SQL Extensions for create tag

2023-02-07 Thread via GitHub
amogh-jahagirdar commented on code in PR #6637: URL: https://github.com/apache/iceberg/pull/6637#discussion_r1098956529 ## spark/v3.3/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestReplaceTag.java: ## @@ -0,0 +1,180 @@ +/* + * Licensed to the Apache Softw

[GitHub] [iceberg] RussellSpitzer commented on issue #6758: S3FileIO Can Create Non-Posix Paths

2023-02-07 Thread via GitHub
RussellSpitzer commented on issue #6758: URL: https://github.com/apache/iceberg/issues/6758#issuecomment-1421124658 @findepi Is this an issue with Trino's file IO rules? Or does it always require all paths be Posix compliant? I know you have special S3 IO code as well -- This is an automa

[GitHub] [iceberg] Fokko commented on pull request #6721: Python: Remove the DNF conversion

2023-02-07 Thread via GitHub
Fokko commented on PR #6721: URL: https://github.com/apache/iceberg/pull/6721#issuecomment-1421134926 I think it might be useful at some point for using filtering in Dask: https://www.coiled.io/blog/parquet-file-column-pruning-predicate-pushdown The `filter` method does not (yet) acce

[GitHub] [iceberg] Fokko merged pull request #6721: Python: Remove the DNF conversion

2023-02-07 Thread via GitHub
Fokko merged PR #6721: URL: https://github.com/apache/iceberg/pull/6721 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apach

[GitHub] [iceberg] Fokko commented on pull request #6721: Python: Remove the DNF conversion

2023-02-07 Thread via GitHub
Fokko commented on PR #6721: URL: https://github.com/apache/iceberg/pull/6721#issuecomment-1421136110 Thanks for the review @rdblue -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific co

[GitHub] [iceberg] findepi commented on issue #6758: S3FileIO Can Create Non-Posix Paths

2023-02-07 Thread via GitHub
findepi commented on issue #6758: URL: https://github.com/apache/iceberg/issues/6758#issuecomment-1421151085 @RussellSpitzer thanks for the ping, but honestly I do not know. I guess by non-POSIX you mean handling of double-slashes. For that @electrum @ebyhr are experts and I think we diff

[GitHub] [iceberg] RussellSpitzer commented on issue #6758: S3FileIO Can Create Non-Posix Paths

2023-02-07 Thread via GitHub
RussellSpitzer commented on issue #6758: URL: https://github.com/apache/iceberg/issues/6758#issuecomment-1421155443 That's basically the only issue, I think you may be safe if you never use a Hadoop File System implementation but it can potentially make an Iceberg table with paths that anot

[GitHub] [iceberg] stevenzwu opened a new pull request, #6764: Flink: improve metrics (elapsedSecondsSinceLastSuccessfulCommit) and …

2023-02-07 Thread via GitHub
stevenzwu opened a new pull request, #6764: URL: https://github.com/apache/iceberg/pull/6764 …logging for IcebergFilesCommitter -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific commen

[GitHub] [iceberg] stevenzwu commented on a diff in pull request #6764: Flink: improve metrics (elapsedSecondsSinceLastSuccessfulCommit) and …

2023-02-07 Thread via GitHub
stevenzwu commented on code in PR #6764: URL: https://github.com/apache/iceberg/pull/6764#discussion_r1098987976 ## flink/v1.16/flink/src/main/java/org/apache/iceberg/flink/sink/IcebergFilesCommitter.java: ## @@ -230,6 +230,7 @@ public void notifyCheckpointComplete(long checkpoi

[GitHub] [iceberg] stevenzwu commented on a diff in pull request #6764: Flink: improve metrics (elapsedSecondsSinceLastSuccessfulCommit) and …

2023-02-07 Thread via GitHub
stevenzwu commented on code in PR #6764: URL: https://github.com/apache/iceberg/pull/6764#discussion_r1098992688 ## flink/v1.16/flink/src/main/java/org/apache/iceberg/flink/sink/IcebergFilesCommitterMetrics.java: ## @@ -37,6 +40,9 @@ class IcebergFilesCommitterMetrics {

[GitHub] [iceberg] jackye1995 commented on issue #6758: S3FileIO Can Create Non-Posix Paths

2023-02-07 Thread via GitHub
jackye1995 commented on issue #6758: URL: https://github.com/apache/iceberg/issues/6758#issuecomment-1421233502 If we add this enforcement and still want backwards compatibility, it seems like we will need to do that as a feature flag. And this cannot just be an engine level feature f

[GitHub] [iceberg] RussellSpitzer commented on issue #6758: S3FileIO Can Create Non-Posix Paths

2023-02-07 Thread via GitHub
RussellSpitzer commented on issue #6758: URL: https://github.com/apache/iceberg/issues/6758#issuecomment-1421239068 I think we can actually do (2) immediately without losing backwards compatibility. All readers can read paths written in the posix-compliant manner, so implementing 2 even wit

[GitHub] [iceberg] jackye1995 commented on issue #6758: S3FileIO Can Create Non-Posix Paths

2023-02-07 Thread via GitHub
jackye1995 commented on issue #6758: URL: https://github.com/apache/iceberg/issues/6758#issuecomment-1421247540 I see, I thought you were saying the file paths will still be stored in the original way like `s3://foo/bar//baz`, but will store data at `s3://foo/bar/baz`. Reader also sees `s3:

[GitHub] [iceberg] RussellSpitzer commented on issue #6758: S3FileIO Can Create Non-Posix Paths

2023-02-07 Thread via GitHub
RussellSpitzer commented on issue #6758: URL: https://github.com/apache/iceberg/issues/6758#issuecomment-1421251672 Yeah, my plan would be that, regardless of what string ends up at the FileIO, the FileIO is responsible for writing only a posix-compatible path. So just because I have a table loca

[GitHub] [iceberg] szehon-ho commented on a diff in pull request #6716: Spark 3.3: Implement Position Deletes Table

2023-02-07 Thread via GitHub
szehon-ho commented on code in PR #6716: URL: https://github.com/apache/iceberg/pull/6716#discussion_r1099052715 ## core/src/main/java/org/apache/iceberg/MetadataTable.java: ## @@ -0,0 +1,29 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contri

[GitHub] [iceberg] szehon-ho commented on a diff in pull request #6716: Spark 3.3: Implement Position Deletes Table

2023-02-07 Thread via GitHub
szehon-ho commented on code in PR #6716: URL: https://github.com/apache/iceberg/pull/6716#discussion_r1099053130 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/PositionDeleteRowReader.java: ## @@ -0,0 +1,114 @@ +/* + * Licensed to the Apache Software Foundatio

[GitHub] [iceberg] jackye1995 commented on issue #6758: S3FileIO Can Create Non-Posix Paths

2023-02-07 Thread via GitHub
jackye1995 commented on issue #6758: URL: https://github.com/apache/iceberg/issues/6758#issuecomment-1421274323 > This is a slightly unique situation because I'm not sure if there are any other FileIO's which allow access to the same underlying filesystem. I think this is probably a p

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #5029: Flink: Use Tag or Branch to scan data.

2023-02-07 Thread via GitHub
jackye1995 commented on code in PR #5029: URL: https://github.com/apache/iceberg/pull/5029#discussion_r1099074713 ## flink/v1.16/flink/src/main/java/org/apache/iceberg/flink/source/StreamingMonitorFunction.java: ## @@ -124,17 +126,27 @@ public void initializeState(FunctionInitia

[GitHub] [iceberg] singhpk234 commented on a diff in pull request #6752: Spark: DROP BRANCH SQL implementation

2023-02-07 Thread via GitHub
singhpk234 commented on code in PR #6752: URL: https://github.com/apache/iceberg/pull/6752#discussion_r1099076234 ## spark/v3.3/spark-extensions/src/main/antlr/org.apache.spark.sql.catalyst.parser.extensions/IcebergSqlExtensions.g4: ## @@ -74,6 +74,7 @@ statement | ALTER TA

[GitHub] [iceberg] RussellSpitzer commented on issue #6758: S3FileIO Can Create Non-Posix Paths

2023-02-07 Thread via GitHub
RussellSpitzer commented on issue #6758: URL: https://github.com/apache/iceberg/issues/6758#issuecomment-1421298665 I have not started, so you are free to take it if you like. I do think we need confirmation from the Trino folks that this is something they can also follow. I know they have

[GitHub] [iceberg] jackye1995 commented on issue #6758: S3FileIO Can Create Non-Posix Paths

2023-02-07 Thread via GitHub
jackye1995 commented on issue #6758: URL: https://github.com/apache/iceberg/issues/6758#issuecomment-1421313305 I think Trino is mostly fine, because all the file paths are converted to Hadoop paths before writing, so technically it already enforces posix style. https://github.com/tr

[GitHub] [iceberg] RussellSpitzer commented on issue #6758: S3FileIO Can Create Non-Posix Paths

2023-02-07 Thread via GitHub
RussellSpitzer commented on issue #6758: URL: https://github.com/apache/iceberg/issues/6758#issuecomment-1421314669 Oh then this is actually much needed for trino too, since by default they can't read a non-posix file either? -- This is an automated message from the Apache Git Service. To

[GitHub] [iceberg] jackye1995 commented on issue #6758: S3FileIO Can Create Non-Posix Paths

2023-02-07 Thread via GitHub
jackye1995 commented on issue #6758: URL: https://github.com/apache/iceberg/issues/6758#issuecomment-1421321586 Correct, that's why I said there is an internal patch in Athena to support non-posix files. -- This is an automated message from the Apache Git Service. To respond to the messag

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6742: support registerTable in GlueCatalog

2023-02-07 Thread via GitHub
jackye1995 commented on code in PR #6742: URL: https://github.com/apache/iceberg/pull/6742#discussion_r1099101322 ## aws/src/main/java/org/apache/iceberg/aws/glue/GlueCatalog.java: ## @@ -440,38 +442,50 @@ public void renameTable(TableIdentifier from, TableIdentifier to) { }

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6742: support registerTable in GlueCatalog

2023-02-07 Thread via GitHub
jackye1995 commented on code in PR #6742: URL: https://github.com/apache/iceberg/pull/6742#discussion_r1099102517 ## aws/src/main/java/org/apache/iceberg/aws/glue/GlueCatalog.java: ## @@ -437,6 +439,44 @@ public void renameTable(TableIdentifier from, TableIdentifier to) {

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6742: support registerTable in GlueCatalog

2023-02-07 Thread via GitHub
jackye1995 commented on code in PR #6742: URL: https://github.com/apache/iceberg/pull/6742#discussion_r1099103737 ## aws/src/main/java/org/apache/iceberg/aws/glue/GlueCatalog.java: ## @@ -437,6 +441,56 @@ public void renameTable(TableIdentifier from, TableIdentifier to) {

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6742: support registerTable in GlueCatalog

2023-02-07 Thread via GitHub
jackye1995 commented on code in PR #6742: URL: https://github.com/apache/iceberg/pull/6742#discussion_r1099104451 ## aws/src/main/java/org/apache/iceberg/aws/glue/GlueCatalog.java: ## @@ -437,6 +441,56 @@ public void renameTable(TableIdentifier from, TableIdentifier to) {

[GitHub] [iceberg] rubenvdg commented on pull request #6644: Python: Add support for static table

2023-02-07 Thread via GitHub
rubenvdg commented on PR #6644: URL: https://github.com/apache/iceberg/pull/6644#issuecomment-1421340074 Of course, maestro -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment

[GitHub] [iceberg] yyanyy commented on a diff in pull request #6449: Delta: Support Snapshot Delta Lake Table to Iceberg Table

2023-02-07 Thread via GitHub
yyanyy commented on code in PR #6449: URL: https://github.com/apache/iceberg/pull/6449#discussion_r1099128107 ## delta-lake/src/test/java/org/apache/iceberg/delta/TestDeltaLakeTypeToType.java: ## @@ -0,0 +1,143 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under on

[GitHub] [iceberg] theoxu31 commented on a diff in pull request #6742: support registerTable in GlueCatalog

2023-02-07 Thread via GitHub
theoxu31 commented on code in PR #6742: URL: https://github.com/apache/iceberg/pull/6742#discussion_r1099139757 ## aws/src/main/java/org/apache/iceberg/aws/glue/GlueCatalog.java: ## @@ -437,6 +439,44 @@ public void renameTable(TableIdentifier from, TableIdentifier to) { LO

[GitHub] [iceberg] theoxu31 commented on a diff in pull request #6742: support registerTable in GlueCatalog

2023-02-07 Thread via GitHub
theoxu31 commented on code in PR #6742: URL: https://github.com/apache/iceberg/pull/6742#discussion_r1099142183 ## aws/src/main/java/org/apache/iceberg/aws/glue/GlueCatalog.java: ## @@ -437,6 +441,56 @@ public void renameTable(TableIdentifier from, TableIdentifier to) { LO

[GitHub] [iceberg] JonasJ-ap commented on a diff in pull request #6449: Delta: Support Snapshot Delta Lake Table to Iceberg Table

2023-02-07 Thread via GitHub
JonasJ-ap commented on code in PR #6449: URL: https://github.com/apache/iceberg/pull/6449#discussion_r1099185940 ## delta-lake/src/test/java/org/apache/iceberg/delta/TestDeltaLakeTypeToType.java: ## @@ -0,0 +1,143 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

[GitHub] [iceberg] JonasJ-ap commented on a diff in pull request #6449: Delta: Support Snapshot Delta Lake Table to Iceberg Table

2023-02-07 Thread via GitHub
JonasJ-ap commented on code in PR #6449: URL: https://github.com/apache/iceberg/pull/6449#discussion_r1099188621 ## delta-lake/src/main/java/org/apache/iceberg/delta/BaseSnapshotDeltaLakeTableAction.java: ## @@ -0,0 +1,396 @@ +/* + * Licensed to the Apache Software Foundation (A

[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #6554: Parquet: Improve Test Coverage of RowGroupFilter Code with Nans #6518

2023-02-07 Thread via GitHub
RussellSpitzer commented on code in PR #6554: URL: https://github.com/apache/iceberg/pull/6554#discussion_r1099188730 ## data/src/test/java/org/apache/iceberg/data/TestMetricsRowGroupFilter.java: ## @@ -86,6 +65,27 @@ import org.junit.runner.RunWith; import org.junit.runners.P

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6742: AWS: support force register table in GlueCatalog

2023-02-07 Thread via GitHub
jackye1995 commented on code in PR #6742: URL: https://github.com/apache/iceberg/pull/6742#discussion_r1099301939 ## aws/src/main/java/org/apache/iceberg/aws/glue/GlueCatalog.java: ## @@ -437,6 +441,68 @@ public void renameTable(TableIdentifier from, TableIdentifier to) {

[GitHub] [iceberg] jackye1995 commented on pull request #6449: Delta: Support Snapshot Delta Lake Table to Iceberg Table

2023-02-07 Thread via GitHub
jackye1995 commented on PR #6449: URL: https://github.com/apache/iceberg/pull/6449#issuecomment-1421512814 Looks like all the comments are addressed. Synced offline with @danielcweeks and @nastra and I believe they don't have further comments either. Given the fact that this PR has be

[GitHub] [iceberg] jackye1995 merged pull request #6449: Delta: Support Snapshot Delta Lake Table to Iceberg Table

2023-02-07 Thread via GitHub
jackye1995 merged PR #6449: URL: https://github.com/apache/iceberg/pull/6449 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.

[GitHub] [iceberg] jackye1995 opened a new issue, #6766: Support adding Delta transaction to Iceberg

2023-02-07 Thread via GitHub
jackye1995 opened a new issue, #6766: URL: https://github.com/apache/iceberg/issues/6766 ### Feature Request / Improvement Similar to the `addFiles` feature for Hive to Iceberg, we can support `addTransactions` for Delta to Iceberg migration experience, so that people can do an initi

[GitHub] [iceberg] dependabot[bot] opened a new pull request, #6767: Build: Bump cryptography from 39.0.0 to 39.0.1 in /python

2023-02-07 Thread via GitHub
dependabot[bot] opened a new pull request, #6767: URL: https://github.com/apache/iceberg/pull/6767 Bumps [cryptography](https://github.com/pyca/cryptography) from 39.0.0 to 39.0.1. Changelog Sourced from https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst cryptography's

[GitHub] [iceberg] jackye1995 opened a new issue, #6768: Support Delta name mapping to Iceberg conversion

2023-02-07 Thread via GitHub
jackye1995 opened a new issue, #6768: URL: https://github.com/apache/iceberg/issues/6768 ### Feature Request / Improvement Support mapping Delta physical column name to Iceberg using Iceberg's name mapping feature ### Query engine None -- This is an automated message

[GitHub] [iceberg] jackye1995 opened a new issue, #6769: Support version travel by tag for Delta migrated table

2023-02-07 Thread via GitHub
jackye1995 opened a new issue, #6769: URL: https://github.com/apache/iceberg/issues/6769 ### Feature Request / Improvement We can add tags for each Delta lake table log when we add to Iceberg transaction, such that users can travel using the same time and logical version ID. For exam

[GitHub] [iceberg] electrum commented on issue #6758: S3FileIO Can Create Non-Posix Paths

2023-02-07 Thread via GitHub
electrum commented on issue #6758: URL: https://github.com/apache/iceberg/issues/6758#issuecomment-1421540890 We have a hack in Trino to allow reading non-standard paths. That's the [`HadoopPaths`](https://github.com/trinodb/trino/blob/master/lib/trino-filesystem/src/main/java/io/trino/files

[GitHub] [iceberg] rdblue commented on a diff in pull request #6485: API: New KMS Client Interface

2023-02-07 Thread via GitHub
rdblue commented on code in PR #6485: URL: https://github.com/apache/iceberg/pull/6485#discussion_r1099333500 ## api/src/main/java/org/apache/iceberg/encryption/KmsClient.java: ## @@ -22,7 +22,11 @@ import java.nio.ByteBuffer; import java.util.Map; -/** A minimum client inte

[GitHub] [iceberg] rdblue commented on a diff in pull request #6485: API: New KMS Client Interface

2023-02-07 Thread via GitHub
rdblue commented on code in PR #6485: URL: https://github.com/apache/iceberg/pull/6485#discussion_r1099334157 ## core/src/main/java/org/apache/iceberg/encryption/envelope/KmsClient.java: ## @@ -0,0 +1,110 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + *

[GitHub] [iceberg] RussellSpitzer commented on issue #6758: S3FileIO Can Create Non-Posix Paths

2023-02-07 Thread via GitHub
RussellSpitzer commented on issue #6758: URL: https://github.com/apache/iceberg/issues/6758#issuecomment-1421559622 My call would be that every new file we create with FileIO (in iceberg) hits this transformation ```scala import java.nio.file.Paths scala> def posixNormalize

[GitHub] [iceberg] jackye1995 commented on issue #6758: S3FileIO Can Create Non-Posix Paths

2023-02-07 Thread via GitHub
jackye1995 commented on issue #6758: URL: https://github.com/apache/iceberg/issues/6758#issuecomment-1421564259 > Handling . and .. is tricky, though. If a user does this: > > CREATE TABLE ... WITH (location = 's3://foo/bar/../baz') > > What should the resulting object name in

[GitHub] [iceberg] RussellSpitzer commented on issue #6758: S3FileIO Can Create Non-Posix Paths

2023-02-07 Thread via GitHub
RussellSpitzer commented on issue #6758: URL: https://github.com/apache/iceberg/issues/6758#issuecomment-1421567311 I still think we just set this up, Iceberg only makes posix paths for any new files it creates. Anything going through FileIO.create is normalized. This means everything in th
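
Editorial sketch of the idea discussed in this thread: normalizing a location string before a new file is created, so that `s3://foo/bar//baz` is written as `s3://foo/bar/baz`. This is only an illustration under stated assumptions, not the code proposed in the issue and not an Iceberg API. The class `PosixPathSketch` and the method `normalize` are hypothetical names. The sketch assumes the location is either `scheme://authority/path` or a plain path, and it drops any trailing slash. Note that S3 itself treats `bar//baz` and `bar/baz` as different keys, which is why the thread is still debating how `.` and `..` should be handled.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper for illustration only; not part of Iceberg.
final class PosixPathSketch {

  // Collapses empty segments ("//"), drops "." and resolves ".." in the path part.
  static String normalize(String location) {
    String prefix = "";
    String path = location;
    int schemeEnd = location.indexOf("://");
    if (schemeEnd >= 0) {
      // Keep "scheme://authority" untouched and normalize only what follows it.
      int pathStart = location.indexOf('/', schemeEnd + 3);
      if (pathStart < 0) {
        return location; // no path component, e.g. "s3://bucket"
      }
      prefix = location.substring(0, pathStart);
      path = location.substring(pathStart);
    }

    Deque<String> segments = new ArrayDeque<>();
    for (String segment : path.split("/")) {
      if (segment.isEmpty() || segment.equals(".")) {
        continue; // skip "//" and "."
      }
      if (segment.equals("..")) {
        segments.pollLast(); // step up one level; no-op when already at the root
        continue;
      }
      segments.addLast(segment);
    }

    String root = path.startsWith("/") ? "/" : "";
    return prefix + root + String.join("/", segments);
  }

  public static void main(String[] args) {
    System.out.println(normalize("s3://foo/bar//baz"));   // prints s3://foo/bar/baz
    System.out.println(normalize("s3://foo/bar/../baz")); // prints s3://foo/baz
  }
}
```

Whether `..` should be resolved this way for object stores, or rejected outright, is exactly the open question raised in the comments that follow.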

[GitHub] [iceberg] szehon-ho commented on issue #6257: Partitions metadata table shows old partitions

2023-02-07 Thread via GitHub
szehon-ho commented on issue #6257: URL: https://github.com/apache/iceberg/issues/6257#issuecomment-1421582477 Hi yes, sounds good to me. I think we are doing a doc per engine, and so right now I believe it's only there for Spark. Ref some earlier discussions on how to document metadata t

[GitHub] [iceberg] szehon-ho commented on a diff in pull request #6661: Core: Support delete file stats in partitions metadata table

2023-02-07 Thread via GitHub
szehon-ho commented on code in PR #6661: URL: https://github.com/apache/iceberg/pull/6661#discussion_r1099372207 ## core/src/main/java/org/apache/iceberg/PartitionsTable.java: ## @@ -47,7 +48,11 @@ public class PartitionsTable extends BaseMetadataTable { Types.Neste

[GitHub] [iceberg] stevenzwu commented on a diff in pull request #6764: Flink: improve metrics (elapsedSecondsSinceLastSuccessfulCommit) and …

2023-02-07 Thread via GitHub
stevenzwu commented on code in PR #6764: URL: https://github.com/apache/iceberg/pull/6764#discussion_r1099404793 ## flink/v1.16/flink/src/main/java/org/apache/iceberg/flink/sink/IcebergFilesCommitterMetrics.java: ## @@ -37,6 +40,9 @@ class IcebergFilesCommitterMetrics {

[GitHub] [iceberg] electrum commented on issue #6758: S3FileIO Can Create Non-Posix Paths

2023-02-07 Thread via GitHub
electrum commented on issue #6758: URL: https://github.com/apache/iceberg/issues/6758#issuecomment-1421663434 I just realized this has implications for the specification. If a manifest file contains a location containing `.` or `..` path segments, how are readers supposed to interpret that?

[GitHub] [iceberg] stevenzwu commented on a diff in pull request #6746: AWS: Load HttpClientBuilder dynamically to avoid runtime deps of both urlconnection and apache client

2023-02-07 Thread via GitHub
stevenzwu commented on code in PR #6746: URL: https://github.com/apache/iceberg/pull/6746#discussion_r109967 ## aws/src/main/java/org/apache/iceberg/aws/AwsProperties.java: ## @@ -1314,55 +1270,18 @@ private void configureEndpoint(T builder, String en } } - @Vis

[GitHub] [iceberg] github-actions[bot] commented on issue #5351: Docs:Double click [Docs] menu return 404 Not Found

2023-02-07 Thread via GitHub
github-actions[bot] commented on issue #5351: URL: https://github.com/apache/iceberg/issues/5351#issuecomment-1421680896 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale' -- This is an automated message from the Apache Gi

[GitHub] [iceberg] github-actions[bot] closed issue #5351: Docs:Double click [Docs] menu return 404 Not Found

2023-02-07 Thread via GitHub
github-actions[bot] closed issue #5351: Docs:Double click [Docs] menu return 404 Not Found URL: https://github.com/apache/iceberg/issues/5351 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the spec

[GitHub] [iceberg] JonasJ-ap commented on pull request #6746: AWS: Load HttpClientBuilder dynamically to avoid runtime deps of both urlconnection and apache client

2023-02-07 Thread via GitHub
JonasJ-ap commented on PR #6746: URL: https://github.com/apache/iceberg/pull/6746#issuecomment-1421709404 Thanks everyone for reviewing this! I will add a proof of running the latest update on EKS later today or tomorrow -- This is an automated message from the Apache Git Service. To resp

[GitHub] [iceberg] singhpk234 commented on a diff in pull request #6752: Spark: DROP BRANCH SQL implementation

2023-02-07 Thread via GitHub
singhpk234 commented on code in PR #6752: URL: https://github.com/apache/iceberg/pull/6752#discussion_r1099477153 ## spark/v3.3/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestBranchDDL.java: ## @@ -206,9 +212,54 @@ public void testCreateBranchUseCustomMax

[GitHub] [iceberg] jackye1995 opened a new issue, #6770: Add dedicated documentation page for table migrations

2023-02-07 Thread via GitHub
jackye1995 opened a new issue, #6770: URL: https://github.com/apache/iceberg/issues/6770 ### Feature Request / Improvement I was frequently asked how to migrate from Hive to Iceberg, and have to point to the Spark documentation for the procedures. I think it would be beneficial to cr

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6752: Spark: DROP BRANCH SQL implementation

2023-02-07 Thread via GitHub
amogh-jahagirdar commented on code in PR #6752: URL: https://github.com/apache/iceberg/pull/6752#discussion_r1099541561 ## spark/v3.3/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestBranchDDL.java: ## @@ -206,9 +212,54 @@ public void testCreateBranchUseCus

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6752: Spark: DROP BRANCH SQL implementation

2023-02-07 Thread via GitHub
amogh-jahagirdar commented on code in PR #6752: URL: https://github.com/apache/iceberg/pull/6752#discussion_r1099542104 ## spark/v3.3/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestBranchDDL.java: ## @@ -206,9 +212,54 @@ public void testCreateBranchUseCus

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6660: Flink: Support writes to branches in FlinkSink

2023-02-07 Thread via GitHub
amogh-jahagirdar commented on code in PR #6660: URL: https://github.com/apache/iceberg/pull/6660#discussion_r1099553716 ## flink/v1.16/flink/src/test/java/org/apache/iceberg/flink/sink/TestFlinkIcebergSinkBranch.java: ## @@ -0,0 +1,398 @@ +/* + * Licensed to the Apache Software

[GitHub] [iceberg] mas-chen commented on a diff in pull request #6764: Flink: improve metrics (elapsedSecondsSinceLastSuccessfulCommit) and …

2023-02-07 Thread via GitHub
mas-chen commented on code in PR #6764: URL: https://github.com/apache/iceberg/pull/6764#discussion_r1099564976 ## flink/v1.16/flink/src/main/java/org/apache/iceberg/flink/sink/IcebergFilesCommitterMetrics.java: ## @@ -54,12 +60,34 @@ void commitDuration(long commitDurationMs) {

[GitHub] [iceberg] stevenzwu commented on a diff in pull request #6764: Flink: improve metrics (elapsedSecondsSinceLastSuccessfulCommit) and …

2023-02-07 Thread via GitHub
stevenzwu commented on code in PR #6764: URL: https://github.com/apache/iceberg/pull/6764#discussion_r1099586805 ## flink/v1.16/flink/src/main/java/org/apache/iceberg/flink/sink/IcebergFilesCommitterMetrics.java: ## @@ -54,12 +60,34 @@ void commitDuration(long commitDurationMs)

[GitHub] [iceberg] RussellSpitzer commented on issue #6758: S3FileIO Can Create Non-Posix Paths

2023-02-07 Thread via GitHub
RussellSpitzer commented on issue #6758: URL: https://github.com/apache/iceberg/issues/6758#issuecomment-1421931089 @electrum that's part of my suggestion: no path entries can contain any unnormalized posix characters. No .., ., //, maybe also specify no symbolic links although I'm not sure
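
Editorial sketch of the rule suggested in this comment: a location is acceptable only if it contains no `..`, `.`, or empty (`//`) path segments. This is an illustration under assumptions, not spec text and not an Iceberg API. `PathValidationSketch` and `isNormalized` are hypothetical names, and treating a trailing slash as unnormalized is a choice made here for simplicity. Symbolic links cannot be detected from the string alone, so they are out of scope for this check.

```java
// Hypothetical validator for illustration only; not part of Iceberg.
final class PathValidationSketch {

  // Returns true when no segment is empty, "." or "..".
  static boolean isNormalized(String location) {
    int schemeEnd = location.indexOf("://");
    // Drop "scheme://" so its double slash is not counted as an empty segment.
    String rest = schemeEnd >= 0 ? location.substring(schemeEnd + 3) : location;
    if (rest.startsWith("/")) {
      rest = rest.substring(1); // allow one leading slash on plain absolute paths
    }
    // A limit of -1 keeps trailing empty strings, so a trailing slash is rejected too.
    for (String segment : rest.split("/", -1)) {
      if (segment.isEmpty() || segment.equals(".") || segment.equals("..")) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    System.out.println(isNormalized("s3://foo/bar/baz"));    // prints true
    System.out.println(isNormalized("s3://foo/bar//baz"));   // prints false
    System.out.println(isNormalized("s3://foo/bar/../baz")); // prints false
  }
}
```

A check like this could run on the write path or when a reader validates manifest entries; which of the two the project prefers is not settled in the thread.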

[GitHub] [iceberg] mas-chen commented on a diff in pull request #6764: Flink: improve metrics (elapsedSecondsSinceLastSuccessfulCommit) and …

2023-02-07 Thread via GitHub
mas-chen commented on code in PR #6764: URL: https://github.com/apache/iceberg/pull/6764#discussion_r1099674674 ## flink/v1.16/flink/src/main/java/org/apache/iceberg/flink/sink/IcebergFilesCommitterMetrics.java: ## @@ -54,12 +60,34 @@ void commitDuration(long commitDurationMs) {

[GitHub] [iceberg] pvary commented on a diff in pull request #6765: Doc: update Flink doc for sink metrics

2023-02-07 Thread via GitHub
pvary commented on code in PR #6765: URL: https://github.com/apache/iceberg/pull/6765#discussion_r1099676158 ## docs/flink-getting-started.md: ## @@ -747,6 +747,44 @@ FlinkSink.builderFor( .append(); ``` +### monitoring metrics + +The following Flink metrics are provided b

[GitHub] [iceberg] pvary commented on a diff in pull request #6765: Doc: update Flink doc for sink metrics

2023-02-07 Thread via GitHub
pvary commented on code in PR #6765: URL: https://github.com/apache/iceberg/pull/6765#discussion_r1099677164 ## docs/flink-getting-started.md: ## @@ -747,6 +747,44 @@ FlinkSink.builderFor( .append(); ``` +### monitoring metrics + +The following Flink metrics are provided b

[GitHub] [iceberg] pvary commented on pull request #6570: Hive: Use EnvironmentContext instead of Hive Locks to provide transactional commits after HIVE-26882

2023-02-07 Thread via GitHub
pvary commented on PR #6570: URL: https://github.com/apache/iceberg/pull/6570#issuecomment-1422053720 @szehon-ho: Gentle reminder, could you please take a look when you have some time? Thanks, Peter -- This is an automated message from the Apache Git Service. To respond to the message

[GitHub] [iceberg] krvikash commented on pull request #6499: AWS, Core, Hive: Fix `checkCommitStatus` when create table commit fails

2023-02-07 Thread via GitHub
krvikash commented on PR #6499: URL: https://github.com/apache/iceberg/pull/6499#issuecomment-1422081585 Rebased and resolved conflicts. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specif

[GitHub] [iceberg] Fokko merged pull request #6767: Build: Bump cryptography from 39.0.0 to 39.0.1 in /python

2023-02-07 Thread via GitHub
Fokko merged PR #6767: URL: https://github.com/apache/iceberg/pull/6767 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apach

[GitHub] [iceberg] mas-chen commented on a diff in pull request #6764: Flink: improve metrics (elapsedSecondsSinceLastSuccessfulCommit) and …

2023-02-07 Thread via GitHub
mas-chen commented on code in PR #6764: URL: https://github.com/apache/iceberg/pull/6764#discussion_r1099720211 ## flink/v1.16/flink/src/main/java/org/apache/iceberg/flink/sink/IcebergFilesCommitterMetrics.java: ## @@ -54,12 +60,34 @@ void commitDuration(long commitDurationMs) {

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #5029: Flink: Use Tag or Branch to scan data.

2023-02-07 Thread via GitHub
amogh-jahagirdar commented on code in PR #5029: URL: https://github.com/apache/iceberg/pull/5029#discussion_r1099722438 ## flink/v1.16/flink/src/main/java/org/apache/iceberg/flink/source/StreamingMonitorFunction.java: ## @@ -124,11 +126,33 @@ public void initializeState(Function

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #5984: Core, API: Support incremental scanning with branch

2023-02-07 Thread via GitHub
amogh-jahagirdar commented on code in PR #5984: URL: https://github.com/apache/iceberg/pull/5984#discussion_r1016157787 ## api/src/main/java/org/apache/iceberg/IncrementalScan.java: ## @@ -33,6 +34,18 @@ */ ThisT fromSnapshotInclusive(long fromSnapshotId); + /** + *

[GitHub] [iceberg-docs] nastra commented on pull request #199: [doc] Update the doris doc url for 1.1.0

2023-02-07 Thread via GitHub
nastra commented on PR #199: URL: https://github.com/apache/iceberg-docs/pull/199#issuecomment-1422168881 just saw that this targets 1.1.0 after approving. I don't think we currently retro-fix docs after a release has been made. -- This is an automated message from the Apache Git Service.

[GitHub] [iceberg-docs] Fokko merged pull request #199: [doc] Update the doris doc url for 1.1.0

2023-02-07 Thread via GitHub
Fokko merged PR #199: URL: https://github.com/apache/iceberg-docs/pull/199

[GitHub] [iceberg] youngxinler commented on pull request #6571: Data: java api add GenericTaskWriter and add write demo to Doc.

2023-02-07 Thread via GitHub
youngxinler commented on PR #6571: URL: https://github.com/apache/iceberg/pull/6571#issuecomment-1422170497 @jackye1995 @amogh-jahagirdar about this PR, please let me know if it can be merged into the master or if there is anything that needs to be done further? I would be happy to see yo

[GitHub] [iceberg] Fokko commented on issue #6505: Python: Infer Iceberg schema from the Parquet file

2023-02-08 Thread via GitHub
Fokko commented on issue #6505: URL: https://github.com/apache/iceberg/issues/6505#issuecomment-143727 @JonasJ-ap Anything I can help with? If you don't have time, maybe @amogh-jahagirdar is interested in picking this up. I'd love to get this in 0.4.0

[GitHub] [iceberg] ggershinsky commented on a diff in pull request #6450: Core: Envelope encryption with Avro key metadata

2023-02-08 Thread via GitHub
ggershinsky commented on code in PR #6450: URL: https://github.com/apache/iceberg/pull/6450#discussion_r1099898702 ## core/src/main/java/org/apache/iceberg/encryption/envelope/EnvelopeKeyMetadata.java: ## @@ -0,0 +1,122 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

[GitHub] [iceberg] ggershinsky commented on pull request #6485: API: New KMS Client Interface

2023-02-08 Thread via GitHub
ggershinsky commented on PR #6485: URL: https://github.com/apache/iceberg/pull/6485#issuecomment-1422356262 fixed in the last commit

[GitHub] [iceberg] gaborkaszab opened a new pull request, #6771: Docs: Document that partitions metadata table might show 'old' partitions

2023-02-08 Thread via GitHub
gaborkaszab opened a new pull request, #6771: URL: https://github.com/apache/iceberg/pull/6771 https://github.com/apache/iceberg/issues/6257

[GitHub] [iceberg] gaborkaszab commented on pull request #6771: Docs: Document that partitions metadata table might show 'old' partitions

2023-02-08 Thread via GitHub
gaborkaszab commented on PR #6771: URL: https://github.com/apache/iceberg/pull/6771#issuecomment-1422517138 cc @szehon-ho

[GitHub] [iceberg] ajantha-bhat commented on a diff in pull request #6771: Docs: Document that partitions metadata table might show 'old' partitions

2023-02-08 Thread via GitHub
ajantha-bhat commented on code in PR #6771: URL: https://github.com/apache/iceberg/pull/6771#discussion_r1100063008 ## docs/spark-queries.md: ## @@ -346,6 +346,9 @@ SELECT * FROM prod.db.table.partitions; Note: For unpartitioned tables, the partitions table will contain only t

[GitHub] [iceberg] ajantha-bhat commented on a diff in pull request #6771: Docs: Document that partitions metadata table might show 'old' partitions

2023-02-08 Thread via GitHub
ajantha-bhat commented on code in PR #6771: URL: https://github.com/apache/iceberg/pull/6771#discussion_r1100067028 ## docs/spark-queries.md: ## @@ -346,6 +346,9 @@ SELECT * FROM prod.db.table.partitions; Note: For unpartitioned tables, the partitions table will contain only t

[GitHub] [iceberg-docs] gaborkaszab commented on a diff in pull request #187: Update the how-to-release page with findings after being a release manager

2023-02-08 Thread via GitHub
gaborkaszab commented on code in PR #187: URL: https://github.com/apache/iceberg-docs/pull/187#discussion_r1100101504 ## landing-page/content/common/how-to-release.md: ## @@ -192,11 +212,15 @@ This release includes important changes that I should have summarized here, but Pl

[GitHub] [iceberg-docs] gaborkaszab commented on a diff in pull request #187: Update the how-to-release page with findings after being a release manager

2023-02-08 Thread via GitHub
gaborkaszab commented on code in PR #187: URL: https://github.com/apache/iceberg-docs/pull/187#discussion_r1100198419 ## landing-page/content/common/how-to-release.md: ## @@ -21,6 +21,18 @@ disableSidebar: true - limitations under the License. --> +## Introduction + +This

[GitHub] [iceberg] srilman commented on pull request #6745: Python: Use Version Ranges for Various Dependencies

2023-02-08 Thread via GitHub
srilman commented on PR #6745: URL: https://github.com/apache/iceberg/pull/6745#issuecomment-1422714130 Ready for review. @Fokko or someone else?

[GitHub] [iceberg] findepi commented on issue #6758: S3FileIO Can Create Non-Posix Paths

2023-02-08 Thread via GitHub
findepi commented on issue #6758: URL: https://github.com/apache/iceberg/issues/6758#issuecomment-1422756116 I am concerned about security implications of handling `..` in paths. At the same time I don't see any practical implications of that. If we are going to do any form of normalization, we sh

[GitHub] [iceberg] amogh-jahagirdar closed pull request #6660: Flink: Support writes to branches in FlinkSink

2023-02-08 Thread via GitHub
amogh-jahagirdar closed pull request #6660: Flink: Support writes to branches in FlinkSink URL: https://github.com/apache/iceberg/pull/6660

[GitHub] [iceberg] amogh-jahagirdar opened a new pull request, #6660: Flink: Support writes to branches in FlinkSink

2023-02-08 Thread via GitHub
amogh-jahagirdar opened a new pull request, #6660: URL: https://github.com/apache/iceberg/pull/6660 This change adds support to write to branches in Flink Sink via a FlinkWriteOption "branch"

[GitHub] [iceberg] stevenzwu commented on a diff in pull request #6764: Flink: improve metrics (elapsedSecondsSinceLastSuccessfulCommit) and …

2023-02-08 Thread via GitHub
stevenzwu commented on code in PR #6764: URL: https://github.com/apache/iceberg/pull/6764#discussion_r1100291622 ## flink/v1.16/flink/src/main/java/org/apache/iceberg/flink/sink/IcebergFilesCommitterMetrics.java: ## @@ -54,12 +60,38 @@ void commitDuration(long commitDurationMs)

[GitHub] [iceberg] JonasJ-ap commented on issue #6505: Python: Infer Iceberg schema from the Parquet file

2023-02-08 Thread via GitHub
JonasJ-ap commented on issue #6505: URL: https://github.com/apache/iceberg/issues/6505#issuecomment-1422781924 Sorry that I haven't had enough time to work this out. @amogh-jahagirdar please feel free to pick this up if you are interested.

[GitHub] [iceberg] jackye1995 opened a new pull request, #6772: Core: enforce writing POSIX compatible paths

2023-02-08 Thread via GitHub
jackye1995 opened a new pull request, #6772: URL: https://github.com/apache/iceberg/pull/6772 fixes #6758

[GitHub] [iceberg] jackye1995 commented on pull request #6772: Core: enforce writing POSIX compatible paths

2023-02-08 Thread via GitHub
jackye1995 commented on PR #6772: URL: https://github.com/apache/iceberg/pull/6772#issuecomment-1422829117 @RussellSpitzer @electrum @findepi @amogh-jahagirdar

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6772: Core: enforce writing POSIX compatible paths

2023-02-08 Thread via GitHub
jackye1995 commented on code in PR #6772: URL: https://github.com/apache/iceberg/pull/6772#discussion_r1100330049 ## core/src/test/java/org/apache/iceberg/TestLocationProvider.java: ## @@ -297,4 +297,17 @@ public void testObjectStorageWithinTableLocation() { Assert.assertEq

[GitHub] [iceberg] deniskuzZ opened a new pull request, #6773: Extended data files abstraction in the table manifest with the modification time

2023-02-08 Thread via GitHub
deniskuzZ opened a new pull request, #6773: URL: https://github.com/apache/iceberg/pull/6773 Iceberg currently does not track the last modification time of a file. That's required to properly set up the LLAP cache key triplet. SyntheticFileId fileId = new SyntheticFileId(path, ta

[GitHub] [iceberg] munendrasn commented on issue #6763: ACL when using DynamoDb based Catalog

2023-02-08 Thread via GitHub
munendrasn commented on issue #6763: URL: https://github.com/apache/iceberg/issues/6763#issuecomment-1422834606 @jackye1995 Pinging you as it relates to PR https://github.com/apache/iceberg/pull/2688. Please share your thoughts/opinion on swapping Values for Namespace entries

[GitHub] [iceberg] jackye1995 commented on issue #6758: S3FileIO Can Create Non-Posix Paths

2023-02-08 Thread via GitHub
jackye1995 commented on issue #6758: URL: https://github.com/apache/iceberg/issues/6758#issuecomment-1422835045 I put up a draft for discussion. There are different ways to achieve that; the idea in the PR is to normalize all the places that we write a location, which includes table locati

[GitHub] [iceberg] jackye1995 commented on issue #6763: ACL when using DynamoDb based Catalog

2023-02-08 Thread via GitHub
jackye1995 commented on issue #6763: URL: https://github.com/apache/iceberg/issues/6763#issuecomment-1422854166 The plan sounds good to me; the limitation on row-level security was overlooked during the initial implementation. But how do we ensure backwards compatibility? Feels like if we updat

[GitHub] [iceberg] gaborkaszab commented on a diff in pull request #6621: [HiveCatalog] Support Altering and Dropping Table Ownership

2023-02-08 Thread via GitHub
gaborkaszab commented on code in PR #6621: URL: https://github.com/apache/iceberg/pull/6621#discussion_r1100359365 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveTableOperations.java: ## @@ -494,6 +494,17 @@ private void setHmsTableParameters( // remove any pro
