Re: [I] EMR 6.10.0 Cannot migrate a table from a non-Iceberg Spark Session Catalog. Found spark_catalog [iceberg]

2023-10-27 Thread via GitHub
tomtongue commented on issue #7317: URL: https://github.com/apache/iceberg/issues/7317#issuecomment-1782418606 Sorry for jumping in. I personally investigated the migrate query issue for GlueCatalog, so let me share my investigation result. ## Result Currently, it’s NOT possible to

Re: [PR] Hive: Refactor TestHiveCatalog tests to use the core CatalogTests [iceberg]

2023-10-27 Thread via GitHub
ajantha-bhat commented on code in PR #8918: URL: https://github.com/apache/iceberg/pull/8918#discussion_r137451 ## hive-metastore/src/test/java/org/apache/iceberg/hive/TestHiveCatalog.java: ## @@ -106,12 +152,12 @@ private Schema getTestSchema() { public void testCreateTa

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2023-10-27 Thread via GitHub
ajantha-bhat commented on code in PR #8907: URL: https://github.com/apache/iceberg/pull/8907#discussion_r1374227697 ## hive-metastore/src/test/java/org/apache/iceberg/hive/TestHiveViewCatalog.java: ## @@ -0,0 +1,197 @@ +/* + * Licensed to the Apache Software Foundation (ASF) und

[I] Slow RewriteManifests due to Validation of Manifest Entries [iceberg]

2023-10-27 Thread via GitHub
mirageyjd opened a new issue, #8932: URL: https://github.com/apache/iceberg/issues/8932 ### Apache Iceberg version 0.13.1 ### Query engine Spark ### Please describe the bug 🐞 We ran `BaseRewriteManifestsSparkAction` action on a large table with 7k+ manifest

Re: [PR] Spec: add nanosecond timestamp types [iceberg]

2023-10-27 Thread via GitHub
findepi commented on PR #8683: URL: https://github.com/apache/iceberg/pull/8683#issuecomment-1782599430 > Now that we have separate types, we can use the type to carry that information, so that you can promote a `long` to `timestamp_ms` or `timestamp` and we know how to interpret the value.

Re: [I] Iceberg streaming streaming-skip-overwrite-snapshots SparkMicroBatchStream only skips over one file per trigger [iceberg]

2023-10-27 Thread via GitHub
cccs-jc commented on issue #8902: URL: https://github.com/apache/iceberg/issues/8902#issuecomment-1782660563 @singhpk234 do you know why `existingFilesCount` is added to the count? It seems like it should only add `addedFilesCount`. https://github.com/apache/iceberg

Re: [I] Fix typo in `_primitive_to_phyisical` [iceberg-python]

2023-10-27 Thread via GitHub
whisk commented on issue #107: URL: https://github.com/apache/iceberg-python/issues/107#issuecomment-1782706542 Hi @Fokko, please consider PR #108 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

Re: [I] Slow RewriteManifests due to Validation of Manifest Entries [iceberg]

2023-10-27 Thread via GitHub
RussellSpitzer commented on issue #8932: URL: https://github.com/apache/iceberg/issues/8932#issuecomment-1782759994 Do you have a flame graph or some evidence of this? My gut would say that would be a trivial check

Re: [PR] feat: support ser/deser of value [iceberg-rust]

2023-10-27 Thread via GitHub
ZENOTME commented on code in PR #82: URL: https://github.com/apache/iceberg-rust/pull/82#discussion_r1374447735 ## crates/iceberg/src/avro/schema.rs: ## @@ -30,6 +32,11 @@ use itertools::{Either, Itertools}; use serde_json::{Number, Value}; const FILED_ID_PROP: &str = "field

Re: [I] Flink: Add support for Flink 1.18 [iceberg]

2023-10-27 Thread via GitHub
YesOrNo828 commented on issue #8930: URL: https://github.com/apache/iceberg/issues/8930#issuecomment-1782883251 > then we will have a transitive dependency on Flink 1.18. We can exclude the dependency, but how can we make sure that everything works as expected without running the tests.

Re: [PR] Iceberg 1.3.0 jc streaming [iceberg]

2023-10-27 Thread via GitHub
cccs-jc closed pull request #8934: Iceberg 1.3.0 jc streaming URL: https://github.com/apache/iceberg/pull/8934

Re: [PR] Iceberg 1.3.0 jc streaming [iceberg]

2023-10-27 Thread via GitHub
cccs-jc commented on PR #8934: URL: https://github.com/apache/iceberg/pull/8934#issuecomment-1782901768 not ready

[PR] patch: Parquet Column Names with "Special Characters" fix [iceberg-python]

2023-10-27 Thread via GitHub
MarquisC opened a new pull request, #109: URL: https://github.com/apache/iceberg-python/pull/109 We're using PyIceberg to read Iceberg tables stored in S3 as parquet. We have column names in the form of `id:foo` `diagnostic:bar` using `:` as a sort of delimiter to help us do some programati

Re: [PR] patch: Parquet Column Names with "Special Characters" fix [iceberg-python]

2023-10-27 Thread via GitHub
mchamberlain-mdsol commented on PR #109: URL: https://github.com/apache/iceberg-python/pull/109#issuecomment-1783097367 Exception Example: https://github.com/apache/iceberg-python/assets/110425760/74a7f812-333b-45ee-861c-cf581a99f3d3

Re: [PR] Spark: Avoid extra copies of manifests while optimizing V2 tables [iceberg]

2023-10-27 Thread via GitHub
singhpk234 commented on code in PR #8928: URL: https://github.com/apache/iceberg/pull/8928#discussion_r1374737812 ## core/src/test/java/org/apache/iceberg/TestRewriteManifests.java: ## @@ -443,6 +444,14 @@ public void testBasicManifestReplacement() throws IOException { Lis

Re: [PR] Fix Migrate procedure renaming issue for custom catalog [iceberg]

2023-10-27 Thread via GitHub
singhpk234 commented on code in PR #8931: URL: https://github.com/apache/iceberg/pull/8931#discussion_r1374769692 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/MigrateTableSparkAction.java: ## @@ -108,6 +109,23 @@ public MigrateTableSparkAction backupTableNa

Re: [I] Partitioning by Year/Month/Day [iceberg]

2023-10-27 Thread via GitHub
l20DfX35JnKBfRn commented on issue #4129: URL: https://github.com/apache/iceberg/issues/4129#issuecomment-1783179835 If anyone is confused about this error on AWS Athena, unlike with hive-style file partitioning in S3 like `{table-s3-location}/year={}/month={}/day={}/` The equivalent

Re: [PR] Spec: add nanosecond timestamp types [iceberg]

2023-10-27 Thread via GitHub
jacobmarble commented on code in PR #8683: URL: https://github.com/apache/iceberg/pull/8683#discussion_r1374794900 ## format/spec.md: ## @@ -187,10 +189,11 @@ A **`map`** is a collection of key-value pairs with a key type and a value type. Notes: 1. Decimal scale is fixed a

Re: [PR] Spec: add nanosecond timestamp types [iceberg]

2023-10-27 Thread via GitHub
jacobmarble commented on PR #8683: URL: https://github.com/apache/iceberg/pull/8683#issuecomment-1783184731 > > Now that we have separate types, we can use the type to carry that information, so that you can promote a `long` to `timestamp_ms` or `timestamp` and we know how to interpret the

Re: [PR] Spark: Avoid extra copies of manifests while optimizing V2 tables [iceberg]

2023-10-27 Thread via GitHub
aokolnychyi commented on code in PR #8928: URL: https://github.com/apache/iceberg/pull/8928#discussion_r1374826073 ## core/src/test/java/org/apache/iceberg/TestRewriteManifests.java: ## @@ -443,6 +444,14 @@ public void testBasicManifestReplacement() throws IOException { Li

Re: [I] Fix typo in `_primitive_to_phyisical` [iceberg-python]

2023-10-27 Thread via GitHub
Fokko commented on issue #107: URL: https://github.com/apache/iceberg-python/issues/107#issuecomment-1783395139 Nice, thanks @whisk 👍

Re: [PR] Fixed typos [iceberg-python]

2023-10-27 Thread via GitHub
Fokko merged PR #108: URL: https://github.com/apache/iceberg-python/pull/108

Re: [PR] patch: Parquet Column Names with "Special Characters" fix [iceberg-python]

2023-10-27 Thread via GitHub
Fokko commented on PR #109: URL: https://github.com/apache/iceberg-python/pull/109#issuecomment-1783408073 Thanks for raising this @MarquisC. This looks like https://github.com/apache/iceberg-python/pull/83/, can you check if that also resolves your problem? Otherwise, I think it will be a

Re: [PR] added contributing.md file [iceberg-python]

2023-10-27 Thread via GitHub
Fokko commented on PR #102: URL: https://github.com/apache/iceberg-python/pull/102#issuecomment-1783409103 @onemriganka What do you think of pointing to https://py.iceberg.apache.org/contributing/? Otherwise we have to maintain this at two places

Re: [PR] Spec: add nanosecond timestamp types [iceberg]

2023-10-27 Thread via GitHub
findepi commented on PR #8683: URL: https://github.com/apache/iceberg/pull/8683#issuecomment-1783460277 @jacobmarble indeed, thanks! Do we expect any type promotions to be allowed for the new types being added here?

Re: [PR] Spec: add nanosecond timestamp types [iceberg]

2023-10-27 Thread via GitHub
Fokko commented on code in PR #8683: URL: https://github.com/apache/iceberg/pull/8683#discussion_r1375028979 ## format/spec.md: ## @@ -187,10 +189,11 @@ A **`map`** is a collection of key-value pairs with a key type and a value type. Notes: 1. Decimal scale is fixed and can

Re: [PR] Core: Ignore split offsets array when split offset is past file length [iceberg]

2023-10-27 Thread via GitHub
amogh-jahagirdar commented on PR #8925: URL: https://github.com/apache/iceberg/pull/8925#issuecomment-1783496227 > @amogh-jahagirdar, maybe this time we should create a test to validate FileScanTask.split when there are bad split offsets? I think that would have caught this in the last PR.

Re: [PR] Core: Ignore split offsets array when split offset is past file length [iceberg]

2023-10-27 Thread via GitHub
nastra commented on code in PR #8925: URL: https://github.com/apache/iceberg/pull/8925#discussion_r1375034300 ## core/src/test/java/org/apache/iceberg/TestSplitPlanning.java: ## @@ -216,6 +217,34 @@ public void testSplitPlanningWithOffsets() { "We should get one task pe

Re: [PR] Core: Ignore split offsets array when split offset is past file length [iceberg]

2023-10-27 Thread via GitHub
nastra commented on code in PR #8925: URL: https://github.com/apache/iceberg/pull/8925#discussion_r1375034993 ## core/src/test/java/org/apache/iceberg/TestSplitPlanning.java: ## @@ -216,6 +217,34 @@ public void testSplitPlanningWithOffsets() { "We should get one task pe

Re: [PR] Add latest version to menu [iceberg-docs]

2023-10-27 Thread via GitHub
Fokko merged PR #291: URL: https://github.com/apache/iceberg-docs/pull/291

Re: [PR] Add latest version to menu [iceberg-docs]

2023-10-27 Thread via GitHub
Fokko commented on PR #291: URL: https://github.com/apache/iceberg-docs/pull/291#issuecomment-1783503359 Thanks @nastra for fixing this 👍

Re: [PR] Core: Ignore split offsets array when split offset is past file length [iceberg]

2023-10-27 Thread via GitHub
nastra commented on code in PR #8925: URL: https://github.com/apache/iceberg/pull/8925#discussion_r1375037491 ## core/src/test/java/org/apache/iceberg/TestSplitPlanning.java: ## @@ -216,6 +217,34 @@ public void testSplitPlanningWithOffsets() { "We should get one task pe

Re: [PR] Spec: Clarify missing fields when writing [iceberg]

2023-10-27 Thread via GitHub
Fokko commented on code in PR #8672: URL: https://github.com/apache/iceberg/pull/8672#discussion_r1375043993 ## format/spec.md: ## @@ -128,13 +128,13 @@ Tables do not require rename, except for tables that use atomic rename to implem Writer requirements -Some tables i

Re: [I] Implementation does not write `schema-id` into Manifest Avro headers [iceberg]

2023-10-27 Thread via GitHub
Fokko commented on issue #8745: URL: https://github.com/apache/iceberg/issues/8745#issuecomment-1783512462 @JFinis I think this was in there so the schema didn't need to be deserialized. > Suggestion: Remove mentioning of field completely from the spec. It's redundant and the implem

[PR] Spark 3.5: Don't cache or reuse manifest entries while rewriting metadata by default [iceberg]

2023-10-27 Thread via GitHub
aokolnychyi opened a new pull request, #8935: URL: https://github.com/apache/iceberg/pull/8935 The action for rewriting manifests caches the manifest entry DF or does an extra shuffle in order to skip reading the actual manifest files twice. We did this assuming it would increase the perfor

[PR] Spark 3.5: Use DataFile constants in SparkDataFile [iceberg]

2023-10-27 Thread via GitHub
aokolnychyi opened a new pull request, #8936: URL: https://github.com/apache/iceberg/pull/8936 This PR makes `SparkDataFile` use constants in `DataFile` instead of hard-coded values.

Re: [PR] Kafka Connect: Initial project setup and event data structures [iceberg]

2023-10-27 Thread via GitHub
nastra commented on code in PR #8701: URL: https://github.com/apache/iceberg/pull/8701#discussion_r1375049907 ## kafka-connect/kafka-connect-events/src/main/java/org/apache/iceberg/connect/events/CommitCompletePayload.java: ## @@ -0,0 +1,107 @@ +/* + * Licensed to the Apache Sof

Re: [PR] Kafka Connect: Initial project setup and event data structures [iceberg]

2023-10-27 Thread via GitHub
nastra commented on code in PR #8701: URL: https://github.com/apache/iceberg/pull/8701#discussion_r1375050867 ## kafka-connect/kafka-connect-events/src/main/java/org/apache/iceberg/connect/events/CommitCompletePayload.java: ## @@ -0,0 +1,107 @@ +/* + * Licensed to the Apache Sof

Re: [PR] Kafka Connect: Initial project setup and event data structures [iceberg]

2023-10-27 Thread via GitHub
nastra commented on code in PR #8701: URL: https://github.com/apache/iceberg/pull/8701#discussion_r1375053314 ## kafka-connect/kafka-connect-events/src/main/java/org/apache/iceberg/connect/events/Event.java: ## @@ -0,0 +1,169 @@ +/* + * Licensed to the Apache Software Foundation

Re: [PR] Kafka Connect: Initial project setup and event data structures [iceberg]

2023-10-27 Thread via GitHub
bryanck commented on code in PR #8701: URL: https://github.com/apache/iceberg/pull/8701#discussion_r1375054000 ## kafka-connect/kafka-connect-events/src/main/java/org/apache/iceberg/connect/events/Event.java: ## @@ -0,0 +1,169 @@ +/* + * Licensed to the Apache Software Foundatio

Re: [PR] Kafka Connect: Initial project setup and event data structures [iceberg]

2023-10-27 Thread via GitHub
nastra commented on code in PR #8701: URL: https://github.com/apache/iceberg/pull/8701#discussion_r1375054505 ## kafka-connect/kafka-connect-events/src/main/java/org/apache/iceberg/connect/events/TableName.java: ## @@ -0,0 +1,108 @@ +/* + * Licensed to the Apache Software Founda

Re: [PR] Core: Enable column statistics filtering after planning [iceberg]

2023-10-27 Thread via GitHub
stevenzwu commented on code in PR #8803: URL: https://github.com/apache/iceberg/pull/8803#discussion_r1373440573 ## core/src/main/java/org/apache/iceberg/BaseFile.java: ## @@ -504,6 +508,27 @@ private static Map toReadableByteBufferMap(Map Map filterColumnsStats( + Map map

Re: [PR] Build: Replace Thread.Sleep() usage with org.Awaitility from Tests. [iceberg]

2023-10-27 Thread via GitHub
nastra commented on code in PR #8804: URL: https://github.com/apache/iceberg/pull/8804#discussion_r1375068537 ## aws/src/integration/java/org/apache/iceberg/aws/lakeformation/LakeFormationTestBase.java: ## @@ -357,8 +360,20 @@ String getRandomTableName() { return LF_TEST_TA

Re: [PR] Build: Replace Thread.Sleep() usage with org.Awaitility from Tests. [iceberg]

2023-10-27 Thread via GitHub
nastra commented on code in PR #8804: URL: https://github.com/apache/iceberg/pull/8804#discussion_r1375070758 ## aws/src/integration/java/org/apache/iceberg/aws/lakeformation/LakeFormationTestBase.java: ## @@ -417,7 +432,20 @@ private static void registerResource(String s3Locati

Re: [PR] Build: Replace Thread.Sleep() usage with org.Awaitility from Tests. [iceberg]

2023-10-27 Thread via GitHub
nastra commented on code in PR #8804: URL: https://github.com/apache/iceberg/pull/8804#discussion_r1375070935 ## aws/src/integration/java/org/apache/iceberg/aws/lakeformation/TestLakeFormationAwsClientFactory.java: ## @@ -128,8 +131,18 @@ public void testLakeFormationEnabledGlue

Re: [PR] Kafka Connect: Initial project setup and event data structures [iceberg]

2023-10-27 Thread via GitHub
bryanck commented on code in PR #8701: URL: https://github.com/apache/iceberg/pull/8701#discussion_r1375071931 ## kafka-connect/kafka-connect-events/src/main/java/org/apache/iceberg/connect/events/CommitCompletePayload.java: ## @@ -0,0 +1,107 @@ +/* + * Licensed to the Apache So

Re: [PR] Build: Replace Thread.Sleep() usage with org.Awaitility from Tests. [iceberg]

2023-10-27 Thread via GitHub
nastra commented on code in PR #8804: URL: https://github.com/apache/iceberg/pull/8804#discussion_r1375072145 ## core/src/test/java/org/apache/iceberg/hadoop/TestHadoopCommits.java: ## @@ -435,13 +437,11 @@ public void testConcurrentFastAppends(@TempDir File dir) throws Excepti

Re: [PR] Kafka Connect: Initial project setup and event data structures [iceberg]

2023-10-27 Thread via GitHub
bryanck commented on code in PR #8701: URL: https://github.com/apache/iceberg/pull/8701#discussion_r1375073000 ## kafka-connect/kafka-connect-events/src/main/java/org/apache/iceberg/connect/events/CommitCompletePayload.java: ## @@ -0,0 +1,107 @@ +/* + * Licensed to the Apache So

Re: [PR] Hive: Refactor TestHiveCatalog tests to use the core CatalogTests [iceberg]

2023-10-27 Thread via GitHub
nastra commented on code in PR #8918: URL: https://github.com/apache/iceberg/pull/8918#discussion_r1375074020 ## hive-metastore/src/test/java/org/apache/iceberg/hive/HiveMetastoreSetup.java: ## @@ -0,0 +1,71 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one +

Re: [PR] Hive: Refactor TestHiveCatalog tests to use the core CatalogTests [iceberg]

2023-10-27 Thread via GitHub
nastra commented on code in PR #8918: URL: https://github.com/apache/iceberg/pull/8918#discussion_r1375074725 ## hive-metastore/src/test/java/org/apache/iceberg/hive/TestHiveCatalog.java: ## @@ -96,6 +105,43 @@ public class TestHiveCatalog extends HiveMetastoreTest { @TempD

Re: [PR] Hive: Refactor TestHiveCatalog tests to use the core CatalogTests [iceberg]

2023-10-27 Thread via GitHub
nastra commented on code in PR #8918: URL: https://github.com/apache/iceberg/pull/8918#discussion_r1375075096 ## hive-metastore/src/test/java/org/apache/iceberg/hive/TestHiveCatalog.java: ## @@ -349,26 +396,31 @@ public void testCreateTableCustomSortOrder() throws Exception {

Re: [PR] Hive: Refactor TestHiveCatalog tests to use the core CatalogTests [iceberg]

2023-10-27 Thread via GitHub
nastra commented on code in PR #8918: URL: https://github.com/apache/iceberg/pull/8918#discussion_r1375077284 ## hive-metastore/src/test/java/org/apache/iceberg/hive/TestHiveCatalog.java: ## @@ -850,51 +857,46 @@ private void removeNamespaceOwnershipAndVerify( createNamespa

Re: [PR] Kafka Connect: Initial project setup and event data structures [iceberg]

2023-10-27 Thread via GitHub
bryanck commented on code in PR #8701: URL: https://github.com/apache/iceberg/pull/8701#discussion_r1375077725 ## kafka-connect/kafka-connect-events/src/main/java/org/apache/iceberg/connect/events/TableName.java: ## @@ -0,0 +1,108 @@ +/* + * Licensed to the Apache Software Found

Re: [PR] Kafka Connect: Initial project setup and event data structures [iceberg]

2023-10-27 Thread via GitHub
nastra commented on code in PR #8701: URL: https://github.com/apache/iceberg/pull/8701#discussion_r1375084999 ## kafka-connect/kafka-connect-events/src/main/java/org/apache/iceberg/connect/events/TableName.java: ## @@ -0,0 +1,108 @@ +/* + * Licensed to the Apache Software Founda

Re: [PR] Nessie: Support views for NessieCatalog [iceberg]

2023-10-27 Thread via GitHub
nastra commented on code in PR #8909: URL: https://github.com/apache/iceberg/pull/8909#discussion_r1375085697 ## nessie/src/main/java/org/apache/iceberg/nessie/UpdateableReference.java: ## @@ -62,7 +62,7 @@ public Reference getReference() { public void checkMutable() {

Re: [PR] Nessie: Support views for NessieCatalog [iceberg]

2023-10-27 Thread via GitHub
nastra commented on code in PR #8909: URL: https://github.com/apache/iceberg/pull/8909#discussion_r1375086307 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieViewOperations.java: ## @@ -0,0 +1,153 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + *

Re: [PR] Nessie: Support views for NessieCatalog [iceberg]

2023-10-27 Thread via GitHub
nastra commented on code in PR #8909: URL: https://github.com/apache/iceberg/pull/8909#discussion_r1375087365 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieViewOperations.java: ## @@ -0,0 +1,153 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + *

Re: [PR] Kafka Connect: Initial project setup and event data structures [iceberg]

2023-10-27 Thread via GitHub
bryanck commented on code in PR #8701: URL: https://github.com/apache/iceberg/pull/8701#discussion_r1375087754 ## kafka-connect/kafka-connect-events/src/main/java/org/apache/iceberg/connect/events/TableName.java: ## @@ -0,0 +1,108 @@ +/* + * Licensed to the Apache Software Found

Re: [PR] Nessie: Support views for NessieCatalog [iceberg]

2023-10-27 Thread via GitHub
nastra commented on code in PR #8909: URL: https://github.com/apache/iceberg/pull/8909#discussion_r1375088374 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieViewOperations.java: ## @@ -0,0 +1,153 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + *

Re: [PR] Nessie: Support views for NessieCatalog [iceberg]

2023-10-27 Thread via GitHub
nastra commented on code in PR #8909: URL: https://github.com/apache/iceberg/pull/8909#discussion_r1375089472 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieViewOperations.java: ## @@ -0,0 +1,153 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + *

Re: [PR] Nessie: Support views for NessieCatalog [iceberg]

2023-10-27 Thread via GitHub
nastra commented on code in PR #8909: URL: https://github.com/apache/iceberg/pull/8909#discussion_r1375090864 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieCatalog.java: ## @@ -347,4 +348,54 @@ private TableIdentifier identifierWithoutTableReference( protected Map p

Re: [PR] Nessie: Support views for NessieCatalog [iceberg]

2023-10-27 Thread via GitHub
nastra commented on code in PR #8909: URL: https://github.com/apache/iceberg/pull/8909#discussion_r1375094428 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieCatalog.java: ## @@ -347,4 +348,54 @@ private TableIdentifier identifierWithoutTableReference( protected Map p

Re: [PR] Nessie: Support views for NessieCatalog [iceberg]

2023-10-27 Thread via GitHub
nastra commented on code in PR #8909: URL: https://github.com/apache/iceberg/pull/8909#discussion_r1375094889 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieIcebergClient.java: ## @@ -136,15 +143,23 @@ private UpdateableReference loadReference(String requestedRef, Stri

Re: [PR] Nessie: Support views for NessieCatalog [iceberg]

2023-10-27 Thread via GitHub
nastra commented on code in PR #8909: URL: https://github.com/apache/iceberg/pull/8909#discussion_r1375097325 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieIcebergClient.java: ## @@ -355,21 +384,34 @@ public void renameTable(TableIdentifier from, TableIdentifier to) {

Re: [PR] Nessie: Support views for NessieCatalog [iceberg]

2023-10-27 Thread via GitHub
nastra commented on code in PR #8909: URL: https://github.com/apache/iceberg/pull/8909#discussion_r1375097118 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieIcebergClient.java: ## @@ -355,21 +384,34 @@ public void renameTable(TableIdentifier from, TableIdentifier to) {

Re: [PR] Nessie: Support views for NessieCatalog [iceberg]

2023-10-27 Thread via GitHub
nastra commented on code in PR #8909: URL: https://github.com/apache/iceberg/pull/8909#discussion_r1375097962 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieIcebergClient.java: ## @@ -378,27 +420,63 @@ public void renameTable(TableIdentifier from, TableIdentifier to) {

Re: [PR] Nessie: Support views for NessieCatalog [iceberg]

2023-10-27 Thread via GitHub
nastra commented on code in PR #8909: URL: https://github.com/apache/iceberg/pull/8909#discussion_r1375098386 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieIcebergClient.java: ## @@ -540,4 +630,72 @@ public void close() { api.close(); } } + + public voi

Re: [PR] Nessie: Support views for NessieCatalog [iceberg]

2023-10-27 Thread via GitHub
nastra commented on code in PR #8909: URL: https://github.com/apache/iceberg/pull/8909#discussion_r1375098623 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieTableOperations.java: ## @@ -135,71 +135,26 @@ protected void doCommit(TableMetadata base, TableMetadata metadat

Re: [PR] Kafka Connect: Initial project setup and event data structures [iceberg]

2023-10-27 Thread via GitHub
danielcweeks commented on code in PR #8701: URL: https://github.com/apache/iceberg/pull/8701#discussion_r1375098695 ## kafka-connect/kafka-connect-events/src/main/java/org/apache/iceberg/connect/events/CommitReadyPayload.java: ## @@ -0,0 +1,104 @@ +/* + * Licensed to the Apache

Re: [PR] Kafka Connect: Initial project setup and event data structures [iceberg]

2023-10-27 Thread via GitHub
bryanck commented on code in PR #8701: URL: https://github.com/apache/iceberg/pull/8701#discussion_r1375099924 ## kafka-connect/kafka-connect-events/src/main/java/org/apache/iceberg/connect/events/CommitReadyPayload.java: ## @@ -0,0 +1,104 @@ +/* + * Licensed to the Apache Softw

Re: [PR] Kafka Connect: Initial project setup and event data structures [iceberg]

2023-10-27 Thread via GitHub
danielcweeks commented on code in PR #8701: URL: https://github.com/apache/iceberg/pull/8701#discussion_r1375100608 ## kafka-connect/kafka-connect-events/src/main/java/org/apache/iceberg/connect/events/TableName.java: ## @@ -0,0 +1,108 @@ +/* + * Licensed to the Apache Software

Re: [PR] Spark 3.5: Honor Spark conf spark.sql.files.maxPartitionBytes in read split [iceberg]

2023-10-27 Thread via GitHub
jzhuge commented on PR #8922: URL: https://github.com/apache/iceberg/pull/8922#issuecomment-1783584994 The PR is ready for review. If approved, we will follow up with doc update and backports to 3.4, 3.3, etc.

Re: [PR] Kafka Connect: Initial project setup and event data structures [iceberg]

2023-10-27 Thread via GitHub
bryanck commented on code in PR #8701: URL: https://github.com/apache/iceberg/pull/8701#discussion_r1375104748 ## kafka-connect/kafka-connect-events/src/main/java/org/apache/iceberg/connect/events/TableName.java: ## @@ -0,0 +1,108 @@ +/* + * Licensed to the Apache Software Found

Re: [PR] Kafka Connect: Initial project setup and event data structures [iceberg]

2023-10-27 Thread via GitHub
danielcweeks commented on code in PR #8701: URL: https://github.com/apache/iceberg/pull/8701#discussion_r1375105601 ## kafka-connect/kafka-connect-events/src/main/java/org/apache/iceberg/connect/events/CommitCompletePayload.java: ## @@ -0,0 +1,107 @@ +/* + * Licensed to the Apac

Re: [PR] Kafka Connect: Initial project setup and event data structures [iceberg]

2023-10-27 Thread via GitHub
bryanck commented on code in PR #8701: URL: https://github.com/apache/iceberg/pull/8701#discussion_r1375106832 ## kafka-connect/kafka-connect-events/src/main/java/org/apache/iceberg/connect/events/CommitCompletePayload.java: ## @@ -0,0 +1,107 @@ +/* + * Licensed to the Apache So

Re: [PR] Kafka Connect: Initial project setup and event data structures [iceberg]

2023-10-27 Thread via GitHub
danielcweeks commented on code in PR #8701: URL: https://github.com/apache/iceberg/pull/8701#discussion_r1375106869 ## kafka-connect/kafka-connect-events/src/main/java/org/apache/iceberg/connect/events/CommitCompletePayload.java: ## @@ -0,0 +1,107 @@ +/* + * Licensed to the Apac

Re: [PR] Kafka Connect: Initial project setup and event data structures [iceberg]

2023-10-27 Thread via GitHub
danielcweeks commented on code in PR #8701: URL: https://github.com/apache/iceberg/pull/8701#discussion_r1375108167 ## kafka-connect/kafka-connect-events/src/main/java/org/apache/iceberg/connect/events/TableName.java: ## @@ -0,0 +1,108 @@ +/* + * Licensed to the Apache Software

Re: [PR] Kafka Connect: Initial project setup and event data structures [iceberg]

2023-10-27 Thread via GitHub
danielcweeks commented on code in PR #8701: URL: https://github.com/apache/iceberg/pull/8701#discussion_r1375108603 ## kafka-connect/kafka-connect-events/src/main/java/org/apache/iceberg/connect/events/CommitReadyPayload.java: ## @@ -0,0 +1,104 @@ +/* + * Licensed to the Apache

Re: [PR] Kafka Connect: Initial project setup and event data structures [iceberg]

2023-10-27 Thread via GitHub
danielcweeks commented on PR #8701: URL: https://github.com/apache/iceberg/pull/8701#issuecomment-1783596341 I'm a +1 on moving forward with this. I think there might still be an open question about Iceberg/Avro Schema definitions, but I'm fine with either resolution.

Re: [PR] Kafka Connect: Initial project setup and event data structures [iceberg]

2023-10-27 Thread via GitHub
rdblue commented on code in PR #8701: URL: https://github.com/apache/iceberg/pull/8701#discussion_r1375115365 ## kafka-connect/kafka-connect-events/src/main/java/org/apache/iceberg/connect/events/CommitCompletePayload.java: ## @@ -0,0 +1,107 @@ +/* + * Licensed to the Apache Sof

Re: [PR] Core: Ignore split offsets array when split offset is past file length [iceberg]

2023-10-27 Thread via GitHub
amogh-jahagirdar commented on code in PR #8925: URL: https://github.com/apache/iceberg/pull/8925#discussion_r1375117854 ## core/src/test/java/org/apache/iceberg/TestSplitPlanning.java: ## @@ -216,6 +217,34 @@ public void testSplitPlanningWithOffsets() { "We should get o

Re: [PR] Core: Ignore split offsets array when split offset is past file length [iceberg]

2023-10-27 Thread via GitHub
amogh-jahagirdar commented on code in PR #8925: URL: https://github.com/apache/iceberg/pull/8925#discussion_r1375118142 ## core/src/test/java/org/apache/iceberg/TestSplitPlanning.java: ## @@ -216,6 +217,34 @@ public void testSplitPlanningWithOffsets() { "We should get o

Re: [PR] Spec: add nanosecond timestamp types [iceberg]

2023-10-27 Thread via GitHub
jacobmarble commented on code in PR #8683: URL: https://github.com/apache/iceberg/pull/8683#discussion_r1375124724 ## format/spec.md: ## @@ -187,10 +189,11 @@ A **`map`** is a collection of key-value pairs with a key type and a value type. Notes: 1. Decimal scale is fixed a

Re: [I] HadoopCatalog can't list toplevel tables [iceberg]

2023-10-27 Thread via GitHub
github-actions[bot] commented on issue #7130: URL: https://github.com/apache/iceberg/issues/7130#issuecomment-1783628604 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale' -- This is an automated message from the Apache Gi

Re: [I] HadoopCatalog can't list toplevel tables [iceberg]

2023-10-27 Thread via GitHub
github-actions[bot] closed issue #7130: HadoopCatalog can't list toplevel tables URL: https://github.com/apache/iceberg/issues/7130 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific commen

Re: [I] Unable to use GlueCatalog in flink environments without hadoop [iceberg]

2023-10-27 Thread via GitHub
rajcoolguy commented on issue #3044: URL: https://github.com/apache/iceberg/issues/3044#issuecomment-1783634807 @mgmarino - how did you make this work in KDA? Lacking documentation, I am either landing in a linkage error due to Hadoop jars, though I have shaded and relocated the Hadoop classes.

Re: [PR] patch: Parquet Column Names with "Special Characters" fix [iceberg-python]

2023-10-27 Thread via GitHub
MarquisC commented on PR #109: URL: https://github.com/apache/iceberg-python/pull/109#issuecomment-1783661783 @Fokko thanks for the guidance and reference, I'll get things working with CI in a bit. Would I be able to have this assigned to me? -- This is an automated message from the Apach

Re: [PR] Core: Use avro compression properties from table properties when writing manifests and manifest lists [iceberg]

2023-10-27 Thread via GitHub
wypoon commented on code in PR #6799: URL: https://github.com/apache/iceberg/pull/6799#discussion_r1375148065 ## core/src/main/java/org/apache/iceberg/ManifestFiles.java: ## @@ -157,11 +157,34 @@ public static ManifestWriter write(PartitionSpec spec, OutputFile outp */ p

Re: [PR] added contributing.md file [iceberg-python]

2023-10-27 Thread via GitHub
onemriganka commented on PR #102: URL: https://github.com/apache/iceberg-python/pull/102#issuecomment-1783691176 > @onemriganka What do you think of pointing to https://py.iceberg.apache.org/contributing/? Otherwise we have to maintain this at two places Yes, https://py.iceberg.apach

Re: [PR] Core: Enable column statistics filtering after planning [iceberg]

2023-10-27 Thread via GitHub
stevenzwu commented on code in PR #8803: URL: https://github.com/apache/iceberg/pull/8803#discussion_r1375163433 ## core/src/main/java/org/apache/iceberg/BaseFile.java: ## @@ -504,6 +508,27 @@ private static Map toReadableByteBufferMap(Map Map filterColumnsStats( + Map map

Re: [PR] Spec: add nanosecond timestamp types [iceberg]

2023-10-27 Thread via GitHub
jacobmarble commented on PR #8683: URL: https://github.com/apache/iceberg/pull/8683#issuecomment-1783700784 > @jacobmarble indeed, thanks! do we expect any type promotions being allowed around the new types being added here? @findepi when we discussed in the last community meeting, th

Re: [I] Flink: Add support for Flink 1.18 [iceberg]

2023-10-27 Thread via GitHub
YesOrNo828 commented on issue #8930: URL: https://github.com/apache/iceberg/issues/8930#issuecomment-1783712015 The list of supported Flink versions for each connector: https://flink.apache.org/downloads/#apache-flink-connectors -- This is an automated message from the Apache Git Servi

Re: [I] Spark write abort result in table miss metadata location file [iceberg]

2023-10-27 Thread via GitHub
dyno commented on issue #8927: URL: https://github.com/apache/iceberg/issues/8927#issuecomment-1783720979 Probably the error message just means the data file and manifest files, but not the metadata location file? I am sure the metadata location file was gone, and from the S3 access log the deletio

Re: [I] Flink: Add support for Flink 1.18 [iceberg]

2023-10-27 Thread via GitHub
pvary commented on issue #8930: URL: https://github.com/apache/iceberg/issues/8930#issuecomment-1783722052 > > then we will have a transitive dependency on Flink 1.18. We can exclude the dependency, but how can we make sure that everything works as expected without running the tests. >

Re: [I] Spark write abort result in table miss metadata location file [iceberg]

2023-10-27 Thread via GitHub
dyno commented on issue #8927: URL: https://github.com/apache/iceberg/issues/8927#issuecomment-1783724655 Or the main problem is that Iceberg should not update the metadata location in the Hive metastore before the write has actually completed. The symptom is that the write failed but the metadata location