[GitHub] [iceberg] nastra opened a new pull request, #6722: Pin openapi-spec-validator version

2023-02-01 Thread via GitHub
nastra opened a new pull request, #6722: URL: https://github.com/apache/iceberg/pull/6722 0.5.3 of openapi-spec-validator was recently released (https://pypi.org/project/openapi-spec-validator/#history) and it fails with the below error: ``` openapi-spec-validator open-api/rest-ca

[GitHub] [iceberg] Fokko commented on pull request #6722: Pin openapi-spec-validator version

2023-02-01 Thread via GitHub
Fokko commented on PR #6722: URL: https://github.com/apache/iceberg/pull/6722#issuecomment-1412278054 Seems that more people have this issue: https://github.com/p1c2u/openapi-spec-validator/issues/192 -- This is an automated message from the Apache Git Service. To respond to the message,

[GitHub] [iceberg] rdblue commented on a diff in pull request #6706: Refactor table metadata snapshot management

2023-02-01 Thread via GitHub
rdblue commented on code in PR #6706: URL: https://github.com/apache/iceberg/pull/6706#discussion_r1093402518 ## core/src/main/java/org/apache/iceberg/SnapshotOperations.java: ## @@ -0,0 +1,202 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more con

[GitHub] [iceberg] nastra commented on issue #6708: Quick start docker-compose demo doesn't work

2023-02-01 Thread via GitHub
nastra commented on issue #6708: URL: https://github.com/apache/iceberg/issues/6708#issuecomment-1412280147 This seems like a network issue. Is one of the components from the quickstart demo perhaps not reachable?

[GitHub] [iceberg] rdblue commented on a diff in pull request #6706: Refactor table metadata snapshot management

2023-02-01 Thread via GitHub
rdblue commented on code in PR #6706: URL: https://github.com/apache/iceberg/pull/6706#discussion_r1093405956 ## core/src/main/java/org/apache/iceberg/SnapshotOperations.java: ## @@ -0,0 +1,202 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more con

[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #6682: Bulk delete

2023-02-01 Thread via GitHub
RussellSpitzer commented on code in PR #6682: URL: https://github.com/apache/iceberg/pull/6682#discussion_r1093452257 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/actions/BaseSparkAction.java: ## @@ -85,6 +88,7 @@ private static final Logger LOG = LoggerFactory

[GitHub] [iceberg] ajantha-bhat commented on a diff in pull request #6712: Nessie: Support ApiV2 for Nessie client

2023-02-01 Thread via GitHub
ajantha-bhat commented on code in PR #6712: URL: https://github.com/apache/iceberg/pull/6712#discussion_r1093457847 ## nessie/src/test/java/org/apache/iceberg/nessie/TestNamespace.java: ## @@ -63,14 +65,56 @@ public void testListNamespaces() { tables = catalog.listTables(nu

[GitHub] [iceberg] namrathamyske commented on a diff in pull request #6651: Spark 3.3 write to branch snapshot

2023-02-01 Thread via GitHub
namrathamyske commented on code in PR #6651: URL: https://github.com/apache/iceberg/pull/6651#discussion_r1093506943 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/SparkTable.java: ## @@ -247,9 +247,6 @@ public ScanBuilder newScanBuilder(CaseInsensitiveStringM

[GitHub] [iceberg] pauetpupa commented on pull request #6571: Data: java api add GenericTaskWriter and add write demo to Doc.

2023-02-01 Thread via GitHub
pauetpupa commented on PR #6571: URL: https://github.com/apache/iceberg/pull/6571#issuecomment-1412417816 This demo really helped me, thank you so much!!! But at this point I have one question: How can I modify a little bit the code (I don't know if the `genericTaskWriter` or what)

[GitHub] [iceberg] aokolnychyi merged pull request #6722: Pin openapi-spec-validator version

2023-02-01 Thread via GitHub
aokolnychyi merged PR #6722: URL: https://github.com/apache/iceberg/pull/6722

[GitHub] [iceberg] szehon-ho closed issue #5370: Interpreting the upper/lower bounds column returned from querying the .files metadata

2023-02-01 Thread via GitHub
szehon-ho closed issue #5370: Interpreting the upper/lower bounds column returned from querying the .files metadata URL: https://github.com/apache/iceberg/issues/5370

[GitHub] [iceberg] szehon-ho commented on issue #5370: Interpreting the upper/lower bounds column returned from querying the .files metadata

2023-02-01 Thread via GitHub
szehon-ho commented on issue #5370: URL: https://github.com/apache/iceberg/issues/5370#issuecomment-1412428015 Marking as closed as https://github.com/apache/iceberg/pull/5376 is done, unless there are further asks here

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6655: Spark: Handle ResolvingFileIO while determining LocalityPreference

2023-02-01 Thread via GitHub
aokolnychyi commented on code in PR #6655: URL: https://github.com/apache/iceberg/pull/6655#discussion_r1093520619 ## core/src/main/java/org/apache/iceberg/io/ResolvingFileIO.java: ## @@ -164,7 +164,7 @@ private FileIO io(String location) { return io; } - private stat

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6655: Spark: Handle ResolvingFileIO while determining LocalityPreference

2023-02-01 Thread via GitHub
aokolnychyi commented on code in PR #6655: URL: https://github.com/apache/iceberg/pull/6655#discussion_r1093522365 ## core/src/main/java/org/apache/iceberg/hadoop/Util.java: ## @@ -84,10 +84,16 @@ public static String[] blockLocations(FileIO io, ScanTaskGroup taskGroup) {

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6655: Spark: Handle ResolvingFileIO while determining LocalityPreference

2023-02-01 Thread via GitHub
aokolnychyi commented on code in PR #6655: URL: https://github.com/apache/iceberg/pull/6655#discussion_r1093523455 ## core/src/main/java/org/apache/iceberg/hadoop/Util.java: ## @@ -84,10 +84,16 @@ public static String[] blockLocations(FileIO io, ScanTaskGroup taskGroup) {

[GitHub] [iceberg] dimas-b commented on a diff in pull request #6712: Nessie: Support ApiV2 for Nessie client

2023-02-01 Thread via GitHub
dimas-b commented on code in PR #6712: URL: https://github.com/apache/iceberg/pull/6712#discussion_r1093528350 ## nessie/src/test/java/org/apache/iceberg/nessie/TestNamespace.java: ## @@ -73,6 +77,53 @@ public void testListNamespaces() { Assertions.assertThat(namespaces).is

[GitHub] [iceberg] dimas-b commented on a diff in pull request #6712: Nessie: Support ApiV2 for Nessie client

2023-02-01 Thread via GitHub
dimas-b commented on code in PR #6712: URL: https://github.com/apache/iceberg/pull/6712#discussion_r1093525346 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieUtil.java: ## @@ -37,6 +37,8 @@ public final class NessieUtil { public static final String NESSIE_CONFIG_PREF

[GitHub] [iceberg] szehon-ho commented on pull request #6581: Spark 3.3: Add RemoveDanglingDeletes action

2023-02-01 Thread via GitHub
szehon-ho commented on PR #6581: URL: https://github.com/apache/iceberg/pull/6581#issuecomment-1412448278 Thanks all for reviews. After thinking, this will be good to have, but as the complete 100% position delete removal will be done via minor compaction of delete files (first part of the

[GitHub] [iceberg] amogh-jahagirdar opened a new pull request, #6723: Docs: Separate page for Branching and Tagging

2023-02-01 Thread via GitHub
amogh-jahagirdar opened a new pull request, #6723: URL: https://github.com/apache/iceberg/pull/6723 This change adds a separate page in Iceberg docs focused on branching and tagging. cc: @jackye1995 @rdblue @namrathamyske @nastra @singhpk234 @rajarshisarkar

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6723: Docs: Separate page for Branching and Tagging

2023-02-01 Thread via GitHub
amogh-jahagirdar commented on code in PR #6723: URL: https://github.com/apache/iceberg/pull/6723#discussion_r1093535851 ## docs/branching-and-tagging.md: ## @@ -0,0 +1,218 @@ +--- +title: "Branching and Tagging" +url: configuration +aliases: +- "tables/branching" +menu: +

[GitHub] [iceberg] szehon-ho commented on pull request #4812: Spark 3.2: Support reading position deletes

2023-02-01 Thread via GitHub
szehon-ho commented on PR #4812: URL: https://github.com/apache/iceberg/pull/4812#issuecomment-1412455363 Closing as it's now broken into smaller PRs

[GitHub] [iceberg] szehon-ho closed pull request #4812: Spark 3.2: Support reading position deletes

2023-02-01 Thread via GitHub
szehon-ho closed pull request #4812: Spark 3.2: Support reading position deletes URL: https://github.com/apache/iceberg/pull/4812

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6723: Docs: Separate page for Branching and Tagging

2023-02-01 Thread via GitHub
amogh-jahagirdar commented on code in PR #6723: URL: https://github.com/apache/iceberg/pull/6723#discussion_r1093538296 ## docs/branching-and-tagging.md: ## @@ -0,0 +1,218 @@ +--- +title: "Branching and Tagging" +url: configuration +aliases: +- "tables/branching" +menu: +

[GitHub] [iceberg] szehon-ho commented on a diff in pull request #6661: Core: Support delete file stats in partitions metadata table

2023-02-01 Thread via GitHub
szehon-ho commented on code in PR #6661: URL: https://github.com/apache/iceberg/pull/6661#discussion_r1093539407 ## core/src/main/java/org/apache/iceberg/PartitionsTable.java: ## @@ -47,7 +48,11 @@ public class PartitionsTable extends BaseMetadataTable { Types.Neste

[GitHub] [iceberg] szehon-ho commented on a diff in pull request #6661: Core: Support delete file stats in partitions metadata table

2023-02-01 Thread via GitHub
szehon-ho commented on code in PR #6661: URL: https://github.com/apache/iceberg/pull/6661#discussion_r1093540298 ## core/src/main/java/org/apache/iceberg/PartitionsTable.java: ## @@ -47,7 +48,11 @@ public class PartitionsTable extends BaseMetadataTable { Types.Neste

[GitHub] [iceberg] szehon-ho commented on a diff in pull request #6661: Core: Support delete file stats in partitions metadata table

2023-02-01 Thread via GitHub
szehon-ho commented on code in PR #6661: URL: https://github.com/apache/iceberg/pull/6661#discussion_r1093546826 ## core/src/main/java/org/apache/iceberg/PartitionsTable.java: ## @@ -236,5 +273,17 @@ void update(DataFile file) { this.fileCount += 1; this.specId = f

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6655: Spark: Handle ResolvingFileIO while determining LocalityPreference

2023-02-01 Thread via GitHub
jackye1995 commented on code in PR #6655: URL: https://github.com/apache/iceberg/pull/6655#discussion_r1093586756 ## core/src/main/java/org/apache/iceberg/io/ResolvingFileIO.java: ## @@ -164,7 +164,7 @@ private FileIO io(String location) { return io; } - private stati

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6655: Spark: Handle ResolvingFileIO while determining LocalityPreference

2023-02-01 Thread via GitHub
jackye1995 commented on code in PR #6655: URL: https://github.com/apache/iceberg/pull/6655#discussion_r1093588435 ## core/src/main/java/org/apache/iceberg/hadoop/Util.java: ## @@ -84,10 +84,16 @@ public static String[] blockLocations(FileIO io, ScanTaskGroup taskGroup) { r

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6169: AWS,Core: Add S3 REST Signer client + REST Spec

2023-02-01 Thread via GitHub
jackye1995 commented on code in PR #6169: URL: https://github.com/apache/iceberg/pull/6169#discussion_r1093622510 ## aws/src/main/java/org/apache/iceberg/aws/AwsProperties.java: ## @@ -1119,6 +1139,54 @@ public void applyS3ServiceConfigurations(T builder) .bui

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6655: Spark: Handle ResolvingFileIO while determining LocalityPreference

2023-02-01 Thread via GitHub
aokolnychyi commented on code in PR #6655: URL: https://github.com/apache/iceberg/pull/6655#discussion_r1093711041 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/SparkReadConf.java: ## @@ -67,7 +67,7 @@ public boolean caseSensitive() { } public boolean locali

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6655: Spark: Handle ResolvingFileIO while determining LocalityPreference

2023-02-01 Thread via GitHub
aokolnychyi commented on code in PR #6655: URL: https://github.com/apache/iceberg/pull/6655#discussion_r1093713311 ## core/src/main/java/org/apache/iceberg/io/ResolvingFileIO.java: ## @@ -164,7 +164,7 @@ private FileIO io(String location) { return io; } - private stat

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6655: Spark: Handle ResolvingFileIO while determining LocalityPreference

2023-02-01 Thread via GitHub
aokolnychyi commented on code in PR #6655: URL: https://github.com/apache/iceberg/pull/6655#discussion_r1093714980 ## core/src/main/java/org/apache/iceberg/hadoop/Util.java: ## @@ -84,10 +84,16 @@ public static String[] blockLocations(FileIO io, ScanTaskGroup taskGroup) {

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6655: Spark: Handle ResolvingFileIO while determining LocalityPreference

2023-02-01 Thread via GitHub
aokolnychyi commented on code in PR #6655: URL: https://github.com/apache/iceberg/pull/6655#discussion_r1093715696 ## core/src/main/java/org/apache/iceberg/io/ResolvingFileIO.java: ## @@ -164,7 +164,7 @@ private FileIO io(String location) { return io; } - private stat

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6655: Spark: Handle ResolvingFileIO while determining LocalityPreference

2023-02-01 Thread via GitHub
aokolnychyi commented on code in PR #6655: URL: https://github.com/apache/iceberg/pull/6655#discussion_r1093717093 ## core/src/main/java/org/apache/iceberg/hadoop/Util.java: ## @@ -84,10 +84,16 @@ public static String[] blockLocations(FileIO io, ScanTaskGroup taskGroup) {

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6723: Docs: Separate page for Branching and Tagging

2023-02-01 Thread via GitHub
jackye1995 commented on code in PR #6723: URL: https://github.com/apache/iceberg/pull/6723#discussion_r1093762783 ## docs/branching-and-tagging.md: ## @@ -0,0 +1,218 @@ +--- +title: "Branching and Tagging" +url: configuration +aliases: +- "tables/branching" +menu: +main:

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6723: Docs: Separate page for Branching and Tagging

2023-02-01 Thread via GitHub
jackye1995 commented on code in PR #6723: URL: https://github.com/apache/iceberg/pull/6723#discussion_r1093763275 ## docs/branching-and-tagging.md: ## @@ -0,0 +1,218 @@ +--- +title: "Branching and Tagging" +url: configuration +aliases: +- "tables/branching" +menu: +main:

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6723: Docs: Separate page for Branching and Tagging

2023-02-01 Thread via GitHub
amogh-jahagirdar commented on code in PR #6723: URL: https://github.com/apache/iceberg/pull/6723#discussion_r1093777227 ## docs/branching-and-tagging.md: ## @@ -0,0 +1,218 @@ +--- +title: "Branching and Tagging" +url: configuration +aliases: +- "tables/branching" +menu: +

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6723: Docs: Separate page for Branching and Tagging

2023-02-01 Thread via GitHub
jackye1995 commented on code in PR #6723: URL: https://github.com/apache/iceberg/pull/6723#discussion_r1093781054 ## docs/branching-and-tagging.md: ## @@ -0,0 +1,218 @@ +--- +title: "Branching and Tagging" +url: configuration +aliases: +- "tables/branching" +menu: +main:

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6651: Spark 3.3 write to branch snapshot

2023-02-01 Thread via GitHub
amogh-jahagirdar commented on code in PR #6651: URL: https://github.com/apache/iceberg/pull/6651#discussion_r1093804317 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/SparkTable.java: ## @@ -247,9 +247,6 @@ public ScanBuilder newScanBuilder(CaseInsensitiveStri

[GitHub] [iceberg] stevenzwu commented on issue #6715: AWS: WebIdentityTokenFileCredentialsProvider httpclient issue with EKS service account

2023-02-01 Thread via GitHub
stevenzwu commented on issue #6715: URL: https://github.com/apache/iceberg/issues/6715#issuecomment-1412839741 > I thought when we introduced the config to switch between url and Apache client That config switch code in `AwsProperties` introduced the problem of requiring both url an

[GitHub] [iceberg] nastra commented on a diff in pull request #6169: AWS,Core: Add S3 REST Signer client + REST Spec

2023-02-01 Thread via GitHub
nastra commented on code in PR #6169: URL: https://github.com/apache/iceberg/pull/6169#discussion_r1093810144 ## aws/src/main/java/org/apache/iceberg/aws/AwsProperties.java: ## @@ -1119,6 +1139,54 @@ public void applyS3ServiceConfigurations(T builder) .build()

[GitHub] [iceberg] rdblue commented on a diff in pull request #6405: API: Add Aggregate expression evaluation

2023-02-01 Thread via GitHub
rdblue commented on code in PR #6405: URL: https://github.com/apache/iceberg/pull/6405#discussion_r1093833996 ## api/src/main/java/org/apache/iceberg/expressions/CountNonNull.java: ## @@ -0,0 +1,53 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more

[GitHub] [iceberg] nastra closed pull request #6169: AWS,Core: Add S3 REST Signer client + REST Spec

2023-02-01 Thread via GitHub
nastra closed pull request #6169: AWS,Core: Add S3 REST Signer client + REST Spec URL: https://github.com/apache/iceberg/pull/6169

[GitHub] [iceberg] rdblue commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-02-01 Thread via GitHub
rdblue commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1093834615 ## api/src/main/java/org/apache/iceberg/expressions/CountNonNull.java: ## @@ -0,0 +1,67 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more

[GitHub] [iceberg] rdblue commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-02-01 Thread via GitHub
rdblue commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1093835167 ## api/src/main/java/org/apache/iceberg/expressions/CountNonNull.java: ## @@ -0,0 +1,67 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more

[GitHub] [iceberg] jackye1995 commented on issue #6715: AWS: WebIdentityTokenFileCredentialsProvider httpclient issue with EKS service account

2023-02-01 Thread via GitHub
jackye1995 commented on issue #6715: URL: https://github.com/apache/iceberg/issues/6715#issuecomment-1412889584 > what do you think of the proposal of using reflection to avoid the runtime requirements of both jars Yes that's also the direction I am thinking about.

[GitHub] [iceberg] stevenzwu closed issue #6718: Iceberg doesn't read credentials from WebIdentityTokenCredentialsProvider when using with AWS

2023-02-01 Thread via GitHub
stevenzwu closed issue #6718: Iceberg doesn't read credentials from WebIdentityTokenCredentialsProvider when using with AWS URL: https://github.com/apache/iceberg/issues/6718

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6682: Bulk delete

2023-02-01 Thread via GitHub
amogh-jahagirdar commented on code in PR #6682: URL: https://github.com/apache/iceberg/pull/6682#discussion_r1093848968 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/actions/BaseSparkAction.java: ## @@ -85,6 +88,7 @@ private static final Logger LOG = LoggerFacto

[GitHub] [iceberg-docs] scottteal opened a new pull request, #198: Adding 3 new blogs

2023-02-01 Thread via GitHub
scottteal opened a new pull request, #198: URL: https://github.com/apache/iceberg-docs/pull/198 Adding 3 new blogs about Iceberg from Medium: - Getting Started with Apache Iceberg: Creating Iceberg tables out of Parquet on S3 with EMR, performing time travel and schema evolution -

[GitHub] [iceberg] JonasJ-ap commented on issue #6715: AWS: WebIdentityTokenFileCredentialsProvider httpclient issue with EKS service account

2023-02-01 Thread via GitHub
JonasJ-ap commented on issue #6715: URL: https://github.com/apache/iceberg/issues/6715#issuecomment-1412931250 I am interested in picking this up and implementing the proposal. Could you please assign this to me?

[GitHub] [iceberg] github-actions[bot] commented on issue #5333: Docs: Add doc of the upsert option

2023-02-01 Thread via GitHub
github-actions[bot] commented on issue #5333: URL: https://github.com/apache/iceberg/issues/5333#issuecomment-1412937999 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

[GitHub] [iceberg] github-actions[bot] closed issue #5333: Docs: Add doc of the upsert option

2023-02-01 Thread via GitHub
github-actions[bot] closed issue #5333: Docs: Add doc of the upsert option URL: https://github.com/apache/iceberg/issues/5333

[GitHub] [iceberg] github-actions[bot] closed issue #5231: Metadata file always gets created under /user/hive/ dir

2023-02-01 Thread via GitHub
github-actions[bot] closed issue #5231: Metadata file always gets created under /user/hive/ dir URL: https://github.com/apache/iceberg/issues/5231

[GitHub] [iceberg] github-actions[bot] commented on issue #5231: Metadata file always gets created under /user/hive/ dir

2023-02-01 Thread via GitHub
github-actions[bot] commented on issue #5231: URL: https://github.com/apache/iceberg/issues/5231#issuecomment-1412938081 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6638: Spark: REPLACE BRANCH SQL implementation

2023-02-01 Thread via GitHub
amogh-jahagirdar commented on code in PR #6638: URL: https://github.com/apache/iceberg/pull/6638#discussion_r1093872484 ## spark/v3.3/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestReplaceBranch.java: ## @@ -0,0 +1,222 @@ +/* + * Licensed to the Apache So

[GitHub] [iceberg-docs] Fokko commented on a diff in pull request #198: Adding 3 new blogs

2023-02-01 Thread via GitHub
Fokko commented on code in PR #198: URL: https://github.com/apache/iceberg-docs/pull/198#discussion_r1093876602 ## landing-page/content/common/blogs.md: ## @@ -24,6 +24,21 @@ disableSidebar: true Here is a list of company blogs that talk about Iceberg. The blogs are ordered

[GitHub] [iceberg] dramaticlly commented on a diff in pull request #6714: Python: Filter on Datafile metrics

2023-02-01 Thread via GitHub
dramaticlly commented on code in PR #6714: URL: https://github.com/apache/iceberg/pull/6714#discussion_r1093848467 ## python/pyiceberg/expressions/literals.py: ## @@ -328,6 +328,9 @@ def __le__(self, other: Any) -> bool: def __ge__(self, other: Any) -> bool: return

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #5029: Flink: Use Tag or Branch to scan data.

2023-02-01 Thread via GitHub
amogh-jahagirdar commented on code in PR #5029: URL: https://github.com/apache/iceberg/pull/5029#discussion_r1093893014 ## flink/v1.16/flink/src/main/java/org/apache/iceberg/flink/source/StreamingMonitorFunction.java: ## @@ -124,11 +126,33 @@ public void initializeState(Function

[GitHub] [iceberg] namrathamyske commented on pull request #6717: spark 3.3 read by snapshot ref schema

2023-02-01 Thread via GitHub
namrathamyske commented on PR #6717: URL: https://github.com/apache/iceberg/pull/6717#issuecomment-1412985751 Thanks @jackieo168 for coming up with this!

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #5029: Flink: Use Tag or Branch to scan data.

2023-02-01 Thread via GitHub
amogh-jahagirdar commented on code in PR #5029: URL: https://github.com/apache/iceberg/pull/5029#discussion_r1093893207 ## flink/v1.16/flink/src/main/java/org/apache/iceberg/flink/source/StreamingMonitorFunction.java: ## @@ -124,11 +126,33 @@ public void initializeState(Function

[GitHub] [iceberg] dramaticlly commented on pull request #5412: Spark: Support Bulk deletion in expire-snapshots if fileIO allows

2023-02-01 Thread via GitHub
dramaticlly commented on PR #5412: URL: https://github.com/apache/iceberg/pull/5412#issuecomment-1412990170 Closed in favor of #6682, which covers more comprehensive Spark actions and also implements bulk deletion for HadoopFileIO

[GitHub] [iceberg] dramaticlly closed pull request #5412: Spark: Support Bulk deletion in expire-snapshots if fileIO allows

2023-02-01 Thread via GitHub
dramaticlly closed pull request #5412: Spark: Support Bulk deletion in expire-snapshots if fileIO allows URL: https://github.com/apache/iceberg/pull/5412

[GitHub] [iceberg] dramaticlly commented on a diff in pull request #6682: Bulk delete

2023-02-01 Thread via GitHub
dramaticlly commented on code in PR #6682: URL: https://github.com/apache/iceberg/pull/6682#discussion_r1093902105 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/actions/BaseSparkAction.java: ## @@ -253,6 +257,39 @@ protected DeleteSummary deleteFiles( return su

[GitHub] [iceberg] Fokko commented on a diff in pull request #6714: Python: Filter on Datafile metrics

2023-02-01 Thread via GitHub
Fokko commented on code in PR #6714: URL: https://github.com/apache/iceberg/pull/6714#discussion_r1093905881 ## python/pyiceberg/expressions/literals.py: ## @@ -328,6 +328,9 @@ def __le__(self, other: Any) -> bool: def __ge__(self, other: Any) -> bool: return self.

[GitHub] [iceberg] Fokko commented on a diff in pull request #6714: Python: Filter on Datafile metrics

2023-02-01 Thread via GitHub
Fokko commented on code in PR #6714: URL: https://github.com/apache/iceberg/pull/6714#discussion_r1093907811 ## python/tests/expressions/test_visitors.py: ## @@ -1472,3 +1478,211 @@ def test_dnf_to_dask(table_schema_simple: Schema) -> None: ), ) assert expres

[GitHub] [iceberg] Fokko commented on a diff in pull request #6714: Python: Filter on Datafile metrics

2023-02-01 Thread via GitHub
Fokko commented on code in PR #6714: URL: https://github.com/apache/iceberg/pull/6714#discussion_r1093908388 ## python/pyiceberg/expressions/visitors.py: ## @@ -986,3 +989,246 @@ def expression_to_plain_format( # In the form of expr1 ∨ expr2 ∨ ... ∨ exprN visitor = Exp

[GitHub] [iceberg] manuzhang commented on pull request #6581: Spark 3.3: Add RemoveDanglingDeletes action

2023-02-01 Thread via GitHub
manuzhang commented on PR #6581: URL: https://github.com/apache/iceberg/pull/6581#issuecomment-1413013311 @szehon-ho yes, integrating removing delete files per partition with partial commits.

[GitHub] [iceberg] szehon-ho commented on a diff in pull request #6716: Spark 3.3: Implement Position Deletes Table

2023-02-01 Thread via GitHub
szehon-ho commented on code in PR #6716: URL: https://github.com/apache/iceberg/pull/6716#discussion_r1092774992 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/PositionDeleteRowReader.java: ## @@ -0,0 +1,116 @@ +/* + * Licensed to the Apache Software Foundatio

[GitHub] [iceberg] szehon-ho commented on a diff in pull request #6716: Spark 3.3: Implement Position Deletes Table

2023-02-01 Thread via GitHub
szehon-ho commented on code in PR #6716: URL: https://github.com/apache/iceberg/pull/6716#discussion_r1093922634 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/PositionDeleteRowReader.java: ## @@ -0,0 +1,114 @@ +/* + * Licensed to the Apache Software Foundatio

[GitHub] [iceberg] zhangjiuyang1993 commented on issue #6708: Quick start docker-compose demo doesn't work

2023-02-01 Thread via GitHub
zhangjiuyang1993 commented on issue #6708: URL: https://github.com/apache/iceberg/issues/6708#issuecomment-1413050964 No, the quickstart demo is sometimes not reachable. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

[GitHub] [iceberg] wypoon commented on issue #6693: [DOC] Incorrect mention of engines tab

2023-02-01 Thread via GitHub
wypoon commented on issue #6693: URL: https://github.com/apache/iceberg/issues/6693#issuecomment-1413075164 I suggest changing "You can also view documentations of using Iceberg with other compute engine under the Engines tab." to "You can also view documentation of using Iceberg with other

[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #6682: Bulk delete

2023-02-01 Thread via GitHub
RussellSpitzer commented on code in PR #6682: URL: https://github.com/apache/iceberg/pull/6682#discussion_r1093958174 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/actions/BaseSparkAction.java: ## @@ -253,6 +257,39 @@ protected DeleteSummary deleteFiles( return

[GitHub] [iceberg] wypoon opened a new issue, #6724: [DOC] Reorder pages under Spark in the nav bar

2023-02-01 Thread via GitHub
wypoon opened a new issue, #6724: URL: https://github.com/apache/iceberg/issues/6724 ### Feature Request / Improvement In https://iceberg.apache.org/docs/latest/, in the left nav bar, under Spark, we have - DDL - Getting Started - Procedures - Queries - Structured Strea

[GitHub] [iceberg] youngxinler commented on pull request #6571: Data: java api add GenericTaskWriter and add write demo to Doc.

2023-02-01 Thread via GitHub
youngxinler commented on PR #6571: URL: https://github.com/apache/iceberg/pull/6571#issuecomment-1413084268 @pauetpupa Glad this helped you. I don't know much about SortOrder; I think GenericTaskWriter does not support write order; it uses RollingFileWriter to write data directly. -- This is

[GitHub] [iceberg] youngxinler commented on pull request #6571: Data: java api add GenericTaskWriter and add write demo to Doc.

2023-02-01 Thread via GitHub
youngxinler commented on PR #6571: URL: https://github.com/apache/iceberg/pull/6571#issuecomment-1413086378 @jackye1995 If you have time, could I trouble you to review this PR (Data: java api add GenericTaskWriter)? Thanks. -- This is an automated message from the Apache Git Service. To

[GitHub] [iceberg] ajantha-bhat commented on a diff in pull request #6712: Nessie: Support ApiV2 for Nessie client

2023-02-01 Thread via GitHub
ajantha-bhat commented on code in PR #6712: URL: https://github.com/apache/iceberg/pull/6712#discussion_r1094018996 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieUtil.java: ## @@ -37,6 +37,8 @@ public final class NessieUtil { public static final String NESSIE_CONFIG

[GitHub] [iceberg] ajantha-bhat commented on a diff in pull request #6712: Nessie: Support ApiV2 for Nessie client

2023-02-01 Thread via GitHub
ajantha-bhat commented on code in PR #6712: URL: https://github.com/apache/iceberg/pull/6712#discussion_r1094019269 ## nessie/src/test/java/org/apache/iceberg/nessie/TestNamespace.java: ## @@ -73,6 +77,53 @@ public void testListNamespaces() { Assertions.assertThat(namespace

[GitHub] [iceberg] dimas-b commented on a diff in pull request #6712: Nessie: Support ApiV2 for Nessie client

2023-02-01 Thread via GitHub
dimas-b commented on code in PR #6712: URL: https://github.com/apache/iceberg/pull/6712#discussion_r1094020382 ## nessie/src/test/java/org/apache/iceberg/nessie/TestNamespace.java: ## @@ -73,6 +77,53 @@ public void testListNamespaces() { Assertions.assertThat(namespaces).is

[GitHub] [iceberg] dimas-b commented on a diff in pull request #6712: Nessie: Support ApiV2 for Nessie client

2023-02-01 Thread via GitHub
dimas-b commented on code in PR #6712: URL: https://github.com/apache/iceberg/pull/6712#discussion_r1094033556 ## nessie/src/test/java/org/apache/iceberg/nessie/TestNamespace.java: ## @@ -73,6 +77,51 @@ public void testListNamespaces() { Assertions.assertThat(namespaces).is

[GitHub] [iceberg] stevenzwu commented on issue #6715: AWS: WebIdentityTokenFileCredentialsProvider httpclient issue with EKS service account

2023-02-01 Thread via GitHub
stevenzwu commented on issue #6715: URL: https://github.com/apache/iceberg/issues/6715#issuecomment-1413165879 @JonasJ-ap assigned the issue to you. you should be able to reproduce it with latest code and run Spark/Flink with EKS service account. -- This is an automated message from the A

[GitHub] [iceberg] pvary commented on pull request #6698: Core, Hive: Support pluggable ClientPool

2023-02-01 Thread via GitHub
pvary commented on PR #6698: URL: https://github.com/apache/iceberg/pull/6698#issuecomment-1413198457 For the reference, the previous discussions: - Hive: Add UGI to the key in CachedClientPool #6175 - Hive: More distinctive cached client pool key to avoid conflict #5378 -- This is a

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6682: Bulk delete

2023-02-01 Thread via GitHub
aokolnychyi commented on code in PR #6682: URL: https://github.com/apache/iceberg/pull/6682#discussion_r1094075265 ## api/src/main/java/org/apache/iceberg/actions/DeleteOrphanFiles.java: ## @@ -67,7 +67,11 @@ public interface DeleteOrphanFiles extends Action deleteFunc = null;

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6682: Bulk delete

2023-02-01 Thread via GitHub
aokolnychyi commented on code in PR #6682: URL: https://github.com/apache/iceberg/pull/6682#discussion_r1094094981 ## api/src/main/java/org/apache/iceberg/actions/DeleteOrphanFiles.java: ## @@ -67,7 +67,11 @@ public interface DeleteOrphanFiles extends Action

[GitHub] [iceberg] aokolnychyi commented on pull request #6682: Bulk delete

2023-02-01 Thread via GitHub
aokolnychyi commented on PR #6682: URL: https://github.com/apache/iceberg/pull/6682#issuecomment-1413222159 I agree with the overall direction but I'd try to support the existing API to avoid massive deprecation and simplify the implementation. It will be hard to test all possible scenarios

[GitHub] [iceberg] szehon-ho commented on a diff in pull request #6716: Spark 3.3: Implement Position Deletes Table

2023-02-01 Thread via GitHub
szehon-ho commented on code in PR #6716: URL: https://github.com/apache/iceberg/pull/6716#discussion_r1094106234 ## parquet/src/main/java/org/apache/iceberg/parquet/ParquetMetricsRowGroupFilter.java: ## @@ -50,15 +51,22 @@ public class ParquetMetricsRowGroupFilter { private

[GitHub] [iceberg] szehon-ho commented on a diff in pull request #6716: Spark 3.3: Implement Position Deletes Table

2023-02-01 Thread via GitHub
szehon-ho commented on code in PR #6716: URL: https://github.com/apache/iceberg/pull/6716#discussion_r1094108930 ## orc/src/main/java/org/apache/iceberg/orc/OrcIterable.java: ## @@ -84,15 +91,18 @@ public CloseableIterator iterator() { addCloseable(orcFileReader); Ty

[GitHub] [iceberg] kingwind94 commented on issue #5043: Flink import debezium cdc record(delete type) to iceberg(0.13.2+) got IndexOutOfBoundsException

2023-02-01 Thread via GitHub
kingwind94 commented on issue #5043: URL: https://github.com/apache/iceberg/issues/5043#issuecomment-1413262948 I got the same problem. Is there a fix now? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go t

[GitHub] [iceberg] namrathamyske commented on a diff in pull request #6651: Spark 3.3 write to branch snapshot

2023-02-01 Thread via GitHub
namrathamyske commented on code in PR #6651: URL: https://github.com/apache/iceberg/pull/6651#discussion_r1094141737 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/SparkTable.java: ## @@ -247,9 +247,6 @@ public ScanBuilder newScanBuilder(CaseInsensitiveStringM

[GitHub] [iceberg] 0xffmeta opened a new issue, #6725: How to detect if the partition's data is ready to consume

2023-02-02 Thread via GitHub
0xffmeta opened a new issue, #6725: URL: https://github.com/apache/iceberg/issues/6725 ### Query engine _No response_ ### Question For the downstream of the iceberg table, how to detect if the daily or hourly partition is ready to consume? We used to leverage `_SUCCE

[GitHub] [iceberg] kingwind94 commented on issue #5043: Flink import debezium cdc record(delete type) to iceberg(0.13.2+) got IndexOutOfBoundsException

2023-02-02 Thread via GitHub
kingwind94 commented on issue #5043: URL: https://github.com/apache/iceberg/issues/5043#issuecomment-1413315464 I fixed it! You must have -D (delete rowkind) rowdata in your cdc datastream. Check your BaseDeltaTaskWriter, fix your write(RowData row) method as below ` case DELET

[GitHub] [iceberg] mjf-89 commented on issue #5977: How to write to a bucket-partitioned table using PySpark?

2023-02-02 Thread via GitHub
mjf-89 commented on issue #5977: URL: https://github.com/apache/iceberg/issues/5977#issuecomment-1413328965 Hi, I don't have a deep understanding of PySpark internals but I think that you can write to a partitioned iceberg table with the following approach: ``` # registering iceberg
