Re: [PR] API, Core: implement types timestamp_ns and timestamptz_ns [iceberg]

2023-11-03 Thread via GitHub
Fokko commented on code in PR #8971: URL: https://github.com/apache/iceberg/pull/8971#discussion_r1381237282 ## api/src/main/java/org/apache/iceberg/transforms/Hours.java: ## @@ -57,15 +58,16 @@ public boolean satisfiesOrderOf(Transform other) { } if (other instanceo

Re: [PR] API, Core: implement types timestamp_ns and timestamptz_ns [iceberg]

2023-11-03 Thread via GitHub
Fokko commented on code in PR #8971: URL: https://github.com/apache/iceberg/pull/8971#discussion_r1381238071 ## api/src/main/java/org/apache/iceberg/transforms/Months.java: ## @@ -55,14 +57,13 @@ public boolean satisfiesOrderOf(Transform other) { } if (other instance

Re: [PR] API, Core: implement types timestamp_ns and timestamptz_ns [iceberg]

2023-11-03 Thread via GitHub
Fokko commented on code in PR #8971: URL: https://github.com/apache/iceberg/pull/8971#discussion_r1381238654 ## api/src/main/java/org/apache/iceberg/transforms/PartitionSpecVisitor.java: ## @@ -121,17 +121,13 @@ static R visit(Schema schema, PartitionField field, PartitionSpec

Re: [PR] API, Core: implement types timestamp_ns and timestamptz_ns [iceberg]

2023-11-03 Thread via GitHub
Fokko commented on code in PR #8971: URL: https://github.com/apache/iceberg/pull/8971#discussion_r1381242855 ## api/src/main/java/org/apache/iceberg/transforms/Transforms.java: ## @@ -129,10 +131,14 @@ public static Transform year(Type type) { case DATE: return

Re: [PR] API, Core: implement types timestamp_ns and timestamptz_ns [iceberg]

2023-11-03 Thread via GitHub
Fokko commented on code in PR #8971: URL: https://github.com/apache/iceberg/pull/8971#discussion_r1381247301 ## api/src/main/java/org/apache/iceberg/types/Types.java: ## @@ -205,27 +208,56 @@ public String toString() { } public static class TimestampType extends Primitiv

Re: [I] Support adding an additional `opType` column when creating a table [iceberg]

2023-11-03 Thread via GitHub
nastra commented on issue #8973: URL: https://github.com/apache/iceberg/issues/8973#issuecomment-1792008595 You might want to take a look at https://iceberg.apache.org/docs/latest/spark-procedures/#change-data-capture as that might provide what you're looking for -- This is an automated

[I] Data duplicate after the partition is modified [iceberg]

2023-11-03 Thread via GitHub
jiamin13579 opened a new issue, #8979: URL: https://github.com/apache/iceberg/issues/8979 ### Query engine Spark ### Question When we talk about writing data to Iceberg, it is common practice to first issue an eqDelete record and then write a data record. However, after

Re: [PR] Nessie: reimplement namespace operations [iceberg]

2023-11-03 Thread via GitHub
adutra commented on PR #8857: URL: https://github.com/apache/iceberg/pull/8857#issuecomment-1792245820 @nastra can I get a review from you please? :-) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

[PR] Core: Iceberg streaming streaming-skip-overwrite-snapshots SparkMicroBatchStream only skips over one file per trigger [iceberg]

2023-11-03 Thread via GitHub
cccs-jc opened a new pull request, #8980: URL: https://github.com/apache/iceberg/pull/8980 Closes #8902 @singhpk234 I have fixed the issue https://github.com/apache/iceberg/issues/8902. Could you have a look at it? -- This is an automated message from the Apache Git Service. To re

Re: [I] Data duplicate after the partition is modified [iceberg]

2023-11-03 Thread via GitHub
nastra commented on issue #8979: URL: https://github.com/apache/iceberg/issues/8979#issuecomment-1792399213 @jiamin13579 you might want to take a look at https://iceberg.apache.org/docs/latest/evolution/#partition-evolution. > Partition evolution is a metadata operation and does not e

Re: [PR] Refactor Arrow schema conversion [iceberg-python]

2023-11-03 Thread via GitHub
Fokko commented on code in PR #117: URL: https://github.com/apache/iceberg-python/pull/117#discussion_r1381820029 ## tests/io/test_pyarrow.py: ## @@ -708,15 +709,17 @@ def _write_table_to_file(filepath: str, schema: pa.Schema, table: pa.Table) -> s @pytest.fixture def file_

Re: [PR] Refactor Arrow schema conversion [iceberg-python]

2023-11-03 Thread via GitHub
Fokko commented on code in PR #117: URL: https://github.com/apache/iceberg-python/pull/117#discussion_r1381820634 ## pyiceberg/io/pyarrow.py: ## @@ -435,13 +435,18 @@ def delete(self, location: Union[str, InputFile, OutputFile]) -> None: raise # pragma: no cover -

Re: [PR] Refactor Arrow schema conversion [iceberg-python]

2023-11-03 Thread via GitHub
Fokko commented on PR #117: URL: https://github.com/apache/iceberg-python/pull/117#issuecomment-1792690494 Thanks @bitsondatadev and @amogh-jahagirdar for the review 🥳 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use t

Re: [PR] Refactor Arrow schema conversion [iceberg-python]

2023-11-03 Thread via GitHub
Fokko merged PR #117: URL: https://github.com/apache/iceberg-python/pull/117 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.

Re: [I] Catalog fails to load table using the table's identifier [iceberg-python]

2023-11-03 Thread via GitHub
pdames commented on issue #123: URL: https://github.com/apache/iceberg-python/issues/123#issuecomment-1792695382 Thanks for the input @danielcweeks and @Fokko. I'll raise a PR to apply the fix recommended by @danielcweeks for review. -- This is an automated message from the Apache Git Ser

Re: [PR] Build: Bump pyarrow from 13.0.0 to 14.0.0 [iceberg-python]

2023-11-03 Thread via GitHub
Fokko commented on PR #121: URL: https://github.com/apache/iceberg-python/pull/121#issuecomment-1792693270 @dependabot rebase -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[PR] Build: Bump pyarrow from 13.0.0 to 14.0.0 [iceberg-python]

2023-11-03 Thread via GitHub
Fokko opened a new pull request, #126: URL: https://github.com/apache/iceberg-python/pull/126 (no comment) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e

Re: [PR] Spark 3.5: Fix Migrate procedure renaming issue for custom catalog [iceberg]

2023-11-03 Thread via GitHub
singhpk234 commented on code in PR #8931: URL: https://github.com/apache/iceberg/pull/8931#discussion_r1381933633 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/MigrateTableSparkAction.java: ## @@ -108,6 +109,23 @@ public MigrateTableSparkAction backupTableNa

[PR] Clarify which columns can be used for equality delete files. [iceberg]

2023-11-03 Thread via GitHub
emkornfield opened a new pull request, #8981: URL: https://github.com/apache/iceberg/pull/8981 based on mailing list discussion: https://lists.apache.org/thread/7w4wyxsnv97trglpcobysy99qf9h2shn -- This is an automated message from the Apache Git Service. To respond to the message, please

Re: [I] Support MOR CDC view [iceberg]

2023-11-03 Thread via GitHub
puchengy commented on issue #8975: URL: https://github.com/apache/iceberg/issues/8975#issuecomment-1792865511 @aokolnychyi @flyrain Hello, do you know how much effort this will take to implement? Thanks! -- This is an automated message from the Apache Git Service. To respond to the messag

Re: [PR] Core: Iceberg streaming streaming-skip-overwrite-snapshots SparkMicroBatchStream only skips over one file per trigger [iceberg]

2023-11-03 Thread via GitHub
singhpk234 commented on code in PR #8980: URL: https://github.com/apache/iceberg/pull/8980#discussion_r1382019127 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/source/SparkMicroBatchStream.java: ## @@ -392,8 +405,15 @@ public Offset latestOffset(Offset startOffset,

[PR] Clarify time travel implementation in Iceberg [iceberg]

2023-11-03 Thread via GitHub
emkornfield opened a new pull request, #8982: URL: https://github.com/apache/iceberg/pull/8982 Based on mailing list discussion: https://lists.apache.org/thread/o6gyky3t3b94grfr8t1nvkyxrgwz4737 -- This is an automated message from the Apache Git Service. To respond to the message, please

Re: [PR] GCP: Add Iceberg Catalog for GCP BigLake Metastore [iceberg]

2023-11-03 Thread via GitHub
devorbit commented on PR #7412: URL: https://github.com/apache/iceberg/pull/7412#issuecomment-1792967038 Hi @dchristle @coufon When are we expecting it to be available with standard iceberg-spark-runtime? We have some use cases for it to be used with the Iceberg Kafka Connector. I am

Re: [PR] Spark 3.5: Use rolling manifest writers when optimizing metadata [iceberg]

2023-11-03 Thread via GitHub
aokolnychyi commented on code in PR #8972: URL: https://github.com/apache/iceberg/pull/8972#discussion_r1382149201 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/actions/TestRewriteManifestsAction.java: ## @@ -406,12 +406,13 @@ public void testRewriteLargeManifestsPa

Re: [PR] Spark 3.5: Use rolling manifest writers when optimizing metadata [iceberg]

2023-11-03 Thread via GitHub
aokolnychyi commented on PR #8972: URL: https://github.com/apache/iceberg/pull/8972#issuecomment-1793027078 @RussellSpitzer @singhpk234, could you take another look? I fixed the test. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to G

Re: [PR] API, Core: implement types timestamp_ns and timestamptz_ns [iceberg]

2023-11-03 Thread via GitHub
jacobmarble commented on code in PR #8971: URL: https://github.com/apache/iceberg/pull/8971#discussion_r1382165939 ## api/src/main/java/org/apache/iceberg/transforms/Days.java: ## @@ -55,14 +56,14 @@ public boolean satisfiesOrderOf(Transform other) { } if (other inst

Re: [PR] API, Core: implement types timestamp_ns and timestamptz_ns [iceberg]

2023-11-03 Thread via GitHub
jacobmarble commented on code in PR #8971: URL: https://github.com/apache/iceberg/pull/8971#discussion_r1382166328 ## api/src/main/java/org/apache/iceberg/transforms/Months.java: ## @@ -55,14 +57,13 @@ public boolean satisfiesOrderOf(Transform other) { } if (other in

Re: [PR] API, Core: implement types timestamp_ns and timestamptz_ns [iceberg]

2023-11-03 Thread via GitHub
jacobmarble commented on code in PR #8971: URL: https://github.com/apache/iceberg/pull/8971#discussion_r1382167692 ## api/src/main/java/org/apache/iceberg/transforms/Transforms.java: ## @@ -129,10 +131,14 @@ public static Transform year(Type type) { case DATE: r

Re: [PR] API, Core: implement types timestamp_ns and timestamptz_ns [iceberg]

2023-11-03 Thread via GitHub
jacobmarble commented on code in PR #8971: URL: https://github.com/apache/iceberg/pull/8971#discussion_r1382170673 ## api/src/main/java/org/apache/iceberg/types/Types.java: ## @@ -205,27 +208,56 @@ public String toString() { } public static class TimestampType extends Pr

[PR] Replace black by Ruff Formatter [iceberg-python]

2023-11-03 Thread via GitHub
hussein-awala opened a new pull request, #127: URL: https://github.com/apache/iceberg-python/pull/127 This PR replaces black with Ruff Formatter, which is 30x faster (https://astral.sh/blog/the-ruff-formatter). It also removes the `--skip-string-normalization` config to replace single quote

[PR] Build: Bump mkdocs-material-extensions from 1.2 to 1.3 [iceberg-python]

2023-11-03 Thread via GitHub
dependabot[bot] opened a new pull request, #128: URL: https://github.com/apache/iceberg-python/pull/128 Bumps [mkdocs-material-extensions](https://github.com/facelessuser/mkdocs-material-extensions) from 1.2 to 1.3. Release notes Sourced from https://github.com/facelessuser/mkdocs

[PR] Build: Bump ray from 2.7.1 to 2.8.0 [iceberg-python]

2023-11-03 Thread via GitHub
dependabot[bot] opened a new pull request, #129: URL: https://github.com/apache/iceberg-python/pull/129 Bumps [ray](https://github.com/ray-project/ray) from 2.7.1 to 2.8.0. Release notes Sourced from ray's releases (https://github.com/ray-project/ray/releases). Ray-2.8.0 R

[PR] Build: Bump mypy-boto3-glue from 1.28.63 to 1.28.77 [iceberg-python]

2023-11-03 Thread via GitHub
dependabot[bot] opened a new pull request, #130: URL: https://github.com/apache/iceberg-python/pull/130 Bumps [mypy-boto3-glue](https://github.com/youtype/mypy_boto3_builder) from 1.28.63 to 1.28.77. Commits See full diff in https://github.com/youtype/mypy_boto3_builder/commits

Re: [PR] Spark 3.5: Use rolling manifest writers when optimizing metadata [iceberg]

2023-11-03 Thread via GitHub
aokolnychyi merged PR #8972: URL: https://github.com/apache/iceberg/pull/8972 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg

Re: [PR] Spark 3.5: Use rolling manifest writers when optimizing metadata [iceberg]

2023-11-03 Thread via GitHub
aokolnychyi commented on PR #8972: URL: https://github.com/apache/iceberg/pull/8972#issuecomment-1793306911 Thanks, @RussellSpitzer @singhpk234! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to th

Re: [PR] Spark: Fix Fast forward procedure output for non-main branches [iceberg]

2023-11-03 Thread via GitHub
aokolnychyi commented on code in PR #8854: URL: https://github.com/apache/iceberg/pull/8854#discussion_r1382322125 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/procedures/FastForwardBranchProcedure.java: ## @@ -77,9 +77,9 @@ public InternalRow[] call(InternalRow ar

Re: [PR] Spark: Fix Fast forward procedure output for non-main branches [iceberg]

2023-11-03 Thread via GitHub
aokolnychyi commented on code in PR #8854: URL: https://github.com/apache/iceberg/pull/8854#discussion_r1382322425 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/procedures/FastForwardBranchProcedure.java: ## @@ -77,9 +77,9 @@ public InternalRow[] call(InternalRow ar

Re: [I] Spark:CALL [rewrite_manifests] error Manifest is missing [iceberg]

2023-11-03 Thread via GitHub
okayhooni commented on issue #4161: URL: https://github.com/apache/iceberg/issues/4161#issuecomment-1793318058 @aokolnychyi I got the same issue on calling `rewrite_data_files()` to the table with S3 bucket storage and Glue catalog backend. I tested this procedure on the EMR `v

Re: [PR] Core: Use ParallelIterable in Deletes::toPositionIndex (6387) [iceberg]

2023-11-03 Thread via GitHub
aokolnychyi commented on PR #8805: URL: https://github.com/apache/iceberg/pull/8805#issuecomment-1793318524 I actually have some concerns about this change, which I outlined in [this](https://docs.google.com/document/d/1M4L6o-qnGRwGhbhkW8BnravoTwvCrJV8VvzVQDRJO5I) doc. I have an alternat

Re: [PR] Core: Enable column statistics filtering after planning [iceberg]

2023-11-03 Thread via GitHub
aokolnychyi commented on code in PR #8803: URL: https://github.com/apache/iceberg/pull/8803#discussion_r1382325507 ## api/src/main/java/org/apache/iceberg/Scan.java: ## @@ -77,6 +78,21 @@ public interface Scan> { */ ThisT includeColumnStats(); + /** + * Create a new

Re: [PR] Core: Enable column statistics filtering after planning [iceberg]

2023-11-03 Thread via GitHub
aokolnychyi commented on code in PR #8803: URL: https://github.com/apache/iceberg/pull/8803#discussion_r1382325753 ## api/src/main/java/org/apache/iceberg/Scan.java: ## @@ -77,6 +78,21 @@ public interface Scan> { */ ThisT includeColumnStats(); + /** + * Create a new

Re: [PR] Core: Enable column statistics filtering after planning [iceberg]

2023-11-03 Thread via GitHub
aokolnychyi commented on code in PR #8803: URL: https://github.com/apache/iceberg/pull/8803#discussion_r1382326565 ## api/src/main/java/org/apache/iceberg/ContentFile.java: ## @@ -165,6 +166,21 @@ default Long fileSequenceNumber() { */ F copyWithoutStats(); + /** + *

Re: [I] Delete Orphan Files makes metadata inconsistent and table unusable [iceberg]

2023-11-03 Thread via GitHub
okayhooni commented on issue #4194: URL: https://github.com/apache/iceberg/issues/4194#issuecomment-1793326155 @aokolnychyi I got the issue only related to metadata files, on `rewrite_manifests()` procedure call, as I said on the other issue. https://github.com/apache/iceberg/

Re: [I] "Manifest is missing" ValidationException when there have Concurrent applications to rewrite manifests [iceberg]

2023-11-03 Thread via GitHub
okayhooni commented on issue #3466: URL: https://github.com/apache/iceberg/issues/3466#issuecomment-1793330079 I got the same issue.. https://github.com/apache/iceberg/issues/4161#issuecomment-1793318058 -- This is an automated message from the Apache Git Service. To respond to the

Re: [PR] Core: Enable column statistics filtering after planning [iceberg]

2023-11-03 Thread via GitHub
aokolnychyi commented on code in PR #8803: URL: https://github.com/apache/iceberg/pull/8803#discussion_r1382331393 ## core/src/main/java/org/apache/iceberg/util/ContentFileUtil.java: ## @@ -0,0 +1,46 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or mo

Re: [PR] Core: Enable column statistics filtering after planning [iceberg]

2023-11-03 Thread via GitHub
aokolnychyi commented on code in PR #8803: URL: https://github.com/apache/iceberg/pull/8803#discussion_r1382331518 ## core/src/main/java/org/apache/iceberg/util/ContentFileUtil.java: ## @@ -0,0 +1,46 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or mo

Re: [PR] Build: Bump pyarrow from 13.0.0 to 14.0.0 [iceberg-python]

2023-11-03 Thread via GitHub
Fokko merged PR #126: URL: https://github.com/apache/iceberg-python/pull/126 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.

Re: [PR] Build: Bump pyarrow from 13.0.0 to 14.0.0 [iceberg-python]

2023-11-03 Thread via GitHub
Fokko commented on PR #126: URL: https://github.com/apache/iceberg-python/pull/126#issuecomment-1793355675 Thanks for the review @nastra -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the speci

Re: [PR] Build: Bump pyarrow from 13.0.0 to 14.0.0 [iceberg-python]

2023-11-03 Thread via GitHub
dependabot[bot] commented on PR #121: URL: https://github.com/apache/iceberg-python/pull/121#issuecomment-1793355738 Looks like pyarrow is up-to-date now, so this is no longer needed. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to G

Re: [PR] Build: Bump pyarrow from 13.0.0 to 14.0.0 [iceberg-python]

2023-11-03 Thread via GitHub
dependabot[bot] closed pull request #121: Build: Bump pyarrow from 13.0.0 to 14.0.0 URL: https://github.com/apache/iceberg-python/pull/121 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specifi

Re: [PR] Build: Bump mkdocs-material-extensions from 1.2 to 1.3 [iceberg-python]

2023-11-03 Thread via GitHub
Fokko merged PR #128: URL: https://github.com/apache/iceberg-python/pull/128 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.