[PR] Build: Bump fastavro from 1.8.3 to 1.8.4 in /python [iceberg]

2023-10-07 Thread via GitHub
dependabot[bot] opened a new pull request, #8742: URL: https://github.com/apache/iceberg/pull/8742 Bumps [fastavro](https://github.com/fastavro/fastavro) from 1.8.3 to 1.8.4. Changelog Sourced from [fastavro's changelog](https://github.com/fastavro/fastavro/blob/master/ChangeLog).

[PR] Build: Bump coverage from 7.3.1 to 7.3.2 in /python [iceberg]

2023-10-07 Thread via GitHub
dependabot[bot] opened a new pull request, #8741: URL: https://github.com/apache/iceberg/pull/8741 Bumps [coverage](https://github.com/nedbat/coveragepy) from 7.3.1 to 7.3.2. Changelog Sourced from [coverage's changelog](https://github.com/nedbat/coveragepy/blob/master/CHANGES.rst).

[PR] Build: Bump pydantic from 2.3.0 to 2.4.2 in /python [iceberg]

2023-10-07 Thread via GitHub
dependabot[bot] opened a new pull request, #8740: URL: https://github.com/apache/iceberg/pull/8740 Bumps [pydantic](https://github.com/pydantic/pydantic) from 2.3.0 to 2.4.2. Release notes Sourced from [pydantic's releases](https://github.com/pydantic/pydantic/releases). v2.4

[PR] Build: Bump mkdocstrings-python from 1.7.0 to 1.7.2 in /python [iceberg]

2023-10-07 Thread via GitHub
dependabot[bot] opened a new pull request, #8739: URL: https://github.com/apache/iceberg/pull/8739 Bumps [mkdocstrings-python](https://github.com/mkdocstrings/python) from 1.7.0 to 1.7.2. Release notes Sourced from https://github.com/mkdocstrings/python/releases mkdocstrings-pyth

[PR] Build: Bump cython from 3.0.2 to 3.0.3 in /python [iceberg]

2023-10-07 Thread via GitHub
dependabot[bot] opened a new pull request, #8738: URL: https://github.com/apache/iceberg/pull/8738 Bumps [cython](https://github.com/cython/cython) from 3.0.2 to 3.0.3. Changelog Sourced from [cython's changelog](https://github.com/cython/cython/blob/master/CHANGES.rst). 3.0.

Re: [PR] Build: Bump slf4j from 1.7.36 to 2.0.7 [iceberg]

2023-10-07 Thread via GitHub
dependabot[bot] closed pull request #8241: Build: Bump slf4j from 1.7.36 to 2.0.7 URL: https://github.com/apache/iceberg/pull/8241 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific commen

Re: [PR] Build: Bump slf4j from 1.7.36 to 2.0.7 [iceberg]

2023-10-07 Thread via GitHub
dependabot[bot] commented on PR #8241: URL: https://github.com/apache/iceberg/pull/8241#issuecomment-1751921108 Superseded by #8737.

[PR] Build: Bump slf4j from 1.7.36 to 2.0.9 [iceberg]

2023-10-07 Thread via GitHub
dependabot[bot] opened a new pull request, #8737: URL: https://github.com/apache/iceberg/pull/8737 Bumps `slf4j` from 1.7.36 to 2.0.9. Updates `org.slf4j:slf4j-api` from 1.7.36 to 2.0.9 Updates `org.slf4j:slf4j-simple` from 1.7.36 to 2.0.9 Dependabot will resolve any conf

[PR] Build: Bump org.immutables:value from 2.9.2 to 2.10.0 [iceberg]

2023-10-07 Thread via GitHub
dependabot[bot] opened a new pull request, #8736: URL: https://github.com/apache/iceberg/pull/8736 Bumps [org.immutables:value](https://github.com/immutables/immutables) from 2.9.2 to 2.10.0. Release notes Sourced from https://github.com/immutables/immutables/releases org.immutab

[PR] Build: Bump com.google.cloud:libraries-bom from 26.18.0 to 26.24.0 [iceberg]

2023-10-07 Thread via GitHub
dependabot[bot] opened a new pull request, #8735: URL: https://github.com/apache/iceberg/pull/8735 Bumps [com.google.cloud:libraries-bom](https://github.com/googleapis/java-cloud-bom) from 26.18.0 to 26.24.0. Release notes Sourced from https://github.com/googleapis/java-cloud-bom/

[PR] Build: Bump org.springframework:spring-web from 5.3.9 to 6.0.12 [iceberg]

2023-10-07 Thread via GitHub
dependabot[bot] opened a new pull request, #8734: URL: https://github.com/apache/iceberg/pull/8734 Bumps [org.springframework:spring-web](https://github.com/spring-projects/spring-framework) from 5.3.9 to 6.0.12. Release notes Sourced from https://github.com/spring-projects/spring

[PR] Build: Bump pypa/cibuildwheel from 2.16.0 to 2.16.2 [iceberg]

2023-10-07 Thread via GitHub
dependabot[bot] opened a new pull request, #8733: URL: https://github.com/apache/iceberg/pull/8733 Bumps [pypa/cibuildwheel](https://github.com/pypa/cibuildwheel) from 2.16.0 to 2.16.2. Release notes Sourced from https://github.com/pypa/cibuildwheel/releases pypa/cibuildwheel's

[PR] Flink: Support batch modifications via copy-on-write [iceberg]

2023-10-07 Thread via GitHub
linyanghao opened a new pull request, #8732: URL: https://github.com/apache/iceberg/pull/8732 Resolves: https://github.com/apache/iceberg/issues/7311 https://github.com/apache/iceberg/issues/8718

Re: [I] Manifest List Writer Design [iceberg-rust]

2023-10-07 Thread via GitHub
liurenjie1024 commented on issue #72: URL: https://github.com/apache/iceberg-rust/issues/72#issuecomment-175172 > Should we support appending entries like the Java and Python implementations instead of a single write? The `ManifestList` is a simple wrapper of `Vec`, so I think pro

Re: [PR] feat: add builder to TableMetadata interface [iceberg-rust]

2023-10-07 Thread via GitHub
liurenjie1024 commented on PR #62: URL: https://github.com/apache/iceberg-rust/pull/62#issuecomment-1751887344 > I don't know yet if we should remove the builder as it seems to provide some internal struct builder and default value at least. We will see with what is left at the end under h

Re: [I] Large Iceberg Parquet file writes are (sometimes?) truncated [iceberg]

2023-10-07 Thread via GitHub
holdenk commented on issue #8620: URL: https://github.com/apache/iceberg/issues/8620#issuecomment-1751883211 Oh hmmm I wonder if it's related to some magic we do with staging to HDFS sometime, I'll investigate that angle too if I can't repro with pure S3.

Re: [I] Large Iceberg Parquet file writes are (sometimes?) truncated [iceberg]

2023-10-07 Thread via GitHub
RussellSpitzer commented on issue #8620: URL: https://github.com/apache/iceberg/issues/8620#issuecomment-1751880145 Thanks for the update, the only thing I've seen that's similar is some parquet files written where single column pages are corrupted. Very rare though and only on HDFS

Re: [PR] Phase 1 - New Docs Deployment [iceberg]

2023-10-07 Thread via GitHub
bitsondatadev commented on code in PR #8659: URL: https://github.com/apache/iceberg/pull/8659#discussion_r1349601992 ## docs-new/.github/bin/deploy_docs.sh: ## @@ -0,0 +1,39 @@ +#!/bin/bash + +while [[ "$#" -gt 0 ]]; do +case $1 in +-v|--version) ICEBERG_VERSION="$2"

Re: [I] identifier-field-ids not supported for Float or double [iceberg]

2023-10-07 Thread via GitHub
github-actions[bot] commented on issue #7302: URL: https://github.com/apache/iceberg/issues/7302#issuecomment-1751871998 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] Support writing flink changelogs into a table without equality fields [iceberg]

2023-10-07 Thread via GitHub
github-actions[bot] commented on issue #7314: URL: https://github.com/apache/iceberg/issues/7314#issuecomment-1751871995 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] Upsert support for keyless Apache Flink tables [iceberg]

2023-10-07 Thread via GitHub
Ge commented on issue #8719: URL: https://github.com/apache/iceberg/issues/8719#issuecomment-1751842819 Turning off the upsert mode also leads to incorrect results: ``` Flink SQL> SELECT * FROM word_count LIMIT 10; +++--

Re: [PR] Prevent dropping last column. [iceberg]

2023-10-07 Thread via GitHub
RussellSpitzer commented on PR #8523: URL: https://github.com/apache/iceberg/pull/8523#issuecomment-1751826745 > > What’s the problem with dropping all the columns? > > > > Why would someone need a table without columns? Isn't it better to drop the whole table? I don't

Re: [PR] Prevent dropping last column. [iceberg]

2023-10-07 Thread via GitHub
rafoid commented on PR #8523: URL: https://github.com/apache/iceberg/pull/8523#issuecomment-1751822535 > What’s the problem with dropping all the columns? Why would someone need a table without columns? Isn't it better to drop the whole table?

Re: [I] Large Iceberg Parquet file writes are (sometimes?) truncated [iceberg]

2023-10-07 Thread via GitHub
holdenk commented on issue #8620: URL: https://github.com/apache/iceberg/issues/8620#issuecomment-1751821147 So initial investigation with local writes only didn't find anything so my suspicion is it's in the S3 layer but I haven't had time to try and repro since.

Re: [PR] Prevent dropping last column. [iceberg]

2023-10-07 Thread via GitHub
RussellSpitzer commented on PR #8523: URL: https://github.com/apache/iceberg/pull/8523#issuecomment-1751820667 What’s the problem with dropping all the columns?

Re: [I] Large Iceberg Parquet file writes are (sometimes?) truncated [iceberg]

2023-10-07 Thread via GitHub
RussellSpitzer commented on issue #8620: URL: https://github.com/apache/iceberg/issues/8620#issuecomment-1751820550 Ping, just to see if you’ve seen this again. I don’t have any local users with quite that large a target file size (I think 2GB is the largest) but I would love to know if you

Re: [I] Spark: Document MergeSchema, AcceptAnySchema and Schema Evolution Code [iceberg]

2023-10-07 Thread via GitHub
RussellSpitzer closed issue #8005: Spark: Document MergeSchema, AcceptAnySchema and Schema Evolution Code URL: https://github.com/apache/iceberg/issues/8005

Re: [PR] feat: add builder to TableMetadata interface [iceberg-rust]

2023-10-07 Thread via GitHub
y0psolo commented on code in PR #62: URL: https://github.com/apache/iceberg-rust/pull/62#discussion_r1349568117 ## crates/iceberg/src/spec/table_metadata.rs: ## @@ -93,22 +112,291 @@ pub struct TableMetadata { /// previous metadata file location should be added to the list.

Re: [PR] feat: add builder to TableMetadata interface [iceberg-rust]

2023-10-07 Thread via GitHub
y0psolo commented on code in PR #62: URL: https://github.com/apache/iceberg-rust/pull/62#discussion_r1349567865 ## crates/iceberg/src/spec/table_metadata.rs: ## @@ -93,22 +112,291 @@ pub struct TableMetadata { /// previous metadata file location should be added to the list.

Re: [PR] feat: add builder to TableMetadata interface [iceberg-rust]

2023-10-07 Thread via GitHub
y0psolo commented on PR #62: URL: https://github.com/apache/iceberg-rust/pull/62#issuecomment-1751805399 > Hi, @y0psolo Thanks for the effort! But I'm a little worried about the api here, since it's error prone, e.g. it's quite easy to construct an invalid table metadata. My suggestion woul

[PR] Add section on Github releases [iceberg-docs]

2023-10-07 Thread via GitHub
Fokko opened a new pull request, #280: URL: https://github.com/apache/iceberg-docs/pull/280 (no comment)

Re: [PR] Spec: Clarify spec_id field in Data File [iceberg]

2023-10-07 Thread via GitHub
Fokko commented on code in PR #8730: URL: https://github.com/apache/iceberg/pull/8730#discussion_r1349555314 ## format/spec.md: ## @@ -443,13 +443,13 @@ The schema of a manifest file is a struct called `manifest_entry` with the follo | _optional_ | _optional_ | **`132 split_o

Re: [I] v1 table data file spec id is None [iceberg-python]

2023-10-07 Thread via GitHub
puchengy commented on issue #46: URL: https://github.com/apache/iceberg-python/issues/46#issuecomment-1751741939 @Fokko And based on https://github.com/apache/iceberg/pull/8730 it seems that we would like to inherit the spec id from the manifest file as well? https://github.com/apache/iceb

Re: [I] v1 table data file spec id is None [iceberg-python]

2023-10-07 Thread via GitHub
puchengy commented on issue #46: URL: https://github.com/apache/iceberg-python/issues/46#issuecomment-1751740812 @Fokko Hi, I thought we already have that https://github.com/apache/iceberg/blob/pyiceberg-0.4.0rc2/python/pyiceberg/manifest.py#L162 or is this not what you meant?

Re: [I] Manifest List Writer Design [iceberg-rust]

2023-10-07 Thread via GitHub
barronw commented on issue #72: URL: https://github.com/apache/iceberg-rust/issues/72#issuecomment-1751731013 Thanks for the feedback folks! How do we feel about something like this? ```rust struct ManifestListWriter { output_file: OutputFile, format_version: FormatVer

Re: [PR] feat(tables): add basic table implementation [iceberg-go]

2023-10-07 Thread via GitHub
coded9 commented on code in PR #11: URL: https://github.com/apache/iceberg-go/pull/11#discussion_r1349524574 ## table/metadata.go: ## @@ -0,0 +1,401 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file

Re: [PR] Convert the Logical to Physical map to a visitor [iceberg-python]

2023-10-07 Thread via GitHub
Fokko commented on PR #43: URL: https://github.com/apache/iceberg-python/pull/43#issuecomment-1751717169 Thanks @rdblue for the quick review! 🙌

Re: [PR] Convert the Logical to Physical map to a visitor [iceberg-python]

2023-10-07 Thread via GitHub
Fokko merged PR #43: URL: https://github.com/apache/iceberg-python/pull/43

Re: [I] configure file size in iceberg table? [iceberg]

2023-10-07 Thread via GitHub
RussellSpitzer closed issue #7797: configure file size in iceberg table? URL: https://github.com/apache/iceberg/issues/7797

Re: [I] write.target-file-size-bytes isn't respected when writing data [iceberg]

2023-10-07 Thread via GitHub
RussellSpitzer commented on issue #8729: URL: https://github.com/apache/iceberg/issues/8729#issuecomment-1751714232 https://iceberg.apache.org/docs/latest/spark-writes/#writing-distribution-modes

Re: [I] write.target-file-size-bytes isn't respected when writing data [iceberg]

2023-10-07 Thread via GitHub
RussellSpitzer closed issue #8729: write.target-file-size-bytes isn't respected when writing data URL: https://github.com/apache/iceberg/issues/8729

Re: [PR] Dell : Migrate Files using TestRule to Junit5. [iceberg]

2023-10-07 Thread via GitHub
ashutosh-roy commented on PR #8707: URL: https://github.com/apache/iceberg/pull/8707#issuecomment-1751713802 Hi @nastra, Please provide your feedback on the PR for this issue.

Re: [I] write.target-file-size-bytes isn't respected when writing data [iceberg]

2023-10-07 Thread via GitHub
RussellSpitzer commented on issue #8729: URL: https://github.com/apache/iceberg/issues/8729#issuecomment-1751713732 Everything Amogh said is correct: the write target file size is the maximum a writer will produce, not the minimum. The amount of data written to a file is dependent on the amount of data
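
The point in the comment above, that the target file size acts as an upper bound rather than a minimum, can be illustrated with a small sketch. This is a deliberately simplified model and not Iceberg's actual writer logic: the function name `rolled_file_sizes` and the idea that a writer rolls over exactly at the target are assumptions made for illustration only.

```python
from typing import List


def rolled_file_sizes(task_bytes: int, target_bytes: int) -> List[int]:
    """Hypothetical model: split one task's data into files capped at target_bytes.

    Assumes the writer rolls to a new file exactly when the target is reached;
    a real writer checks sizes only approximately.
    """
    if target_bytes <= 0:
        raise ValueError("target_bytes must be positive")
    if task_bytes < 0:
        raise ValueError("task_bytes must not be negative")
    full_files, remainder = divmod(task_bytes, target_bytes)
    sizes = [target_bytes] * full_files
    if remainder:
        # The last file holds whatever is left, so it can be far below the target.
        sizes.append(remainder)
    return sizes


# A task holding less data than the target yields one small file.
small_task = rolled_file_sizes(task_bytes=10, target_bytes=512)
# A task holding more data yields full files plus a smaller trailing file.
large_task = rolled_file_sizes(task_bytes=1100, target_bytes=512)
```

Under this model, a task that receives little data still writes a single undersized file, which is consistent with the observation that the file size depends on how much data each writer receives.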

Re: [I] Migrate Files using TestRule in dell package to Junit5 [iceberg]

2023-10-07 Thread via GitHub
ashutosh-roy commented on issue #7888: URL: https://github.com/apache/iceberg/issues/7888#issuecomment-1751713675 Hi @nastra, Please provide your feedback on the PR for this issue.

Re: [PR] feat(tables): add basic table implementation [iceberg-go]

2023-10-07 Thread via GitHub
coded9 commented on code in PR #11: URL: https://github.com/apache/iceberg-go/pull/11#discussion_r1349521151 ## table/metadata.go: ## @@ -0,0 +1,401 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file

Re: [PR] Fix the TableIdentifier [iceberg-python]

2023-10-07 Thread via GitHub
Fokko merged PR #44: URL: https://github.com/apache/iceberg-python/pull/44

Re: [I] Support Hudi `DeltaStreamer` compatible feature [iceberg]

2023-10-07 Thread via GitHub
LittleWat commented on issue #8724: URL: https://github.com/apache/iceberg/issues/8724#issuecomment-1751663468 Yes, Flink is great but still we need to write some code for ingestion, right..? [Hudi Deltastreamer](https://hudi.apache.org/docs/0.13.1/hoodie_deltastreamer/#deltastreamer) is a

Re: [PR] Spark: support replace equality deletes to position deletes [iceberg]

2023-10-07 Thread via GitHub
chenwyi2 commented on PR #2216: URL: https://github.com/apache/iceberg/pull/2216#issuecomment-1751660847 Why was this MR closed?

Re: [PR] Fix the TableIdentifier [iceberg-python]

2023-10-07 Thread via GitHub
Fokko commented on code in PR #44: URL: https://github.com/apache/iceberg-python/pull/44#discussion_r1349476395 ## tests/test_integration.py: ## @@ -104,25 +104,25 @@ def table(catalog: Catalog) -> Table: @pytest.mark.integration def test_table_properties(table: Table) -> No

Re: [I] v1 table data file spec id is None [iceberg-python]

2023-10-07 Thread via GitHub
Fokko commented on issue #46: URL: https://github.com/apache/iceberg-python/issues/46#issuecomment-1751628211 Hey @puchengy thanks for raising this! I was unsure about this because `141: spec-id` is not mentioned in the spec, but it looks like we can add it: https://github.com/apache/