Re: [PR] core: use testcontainers-minio [iceberg]

2024-10-22 Thread via GitHub
nastra commented on code in PR #11349: URL: https://github.com/apache/iceberg/pull/11349#discussion_r1810042673 ## aws/src/test/java/org/apache/iceberg/aws/s3/TestMinioUtil.java: ## @@ -0,0 +1,78 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more c

Re: [PR] More accurate estimate on parquet row groups size [iceberg]

2024-10-22 Thread via GitHub
nastra commented on code in PR #11258: URL: https://github.com/apache/iceberg/pull/11258#discussion_r1810073114 ## parquet/src/test/java/org/apache/iceberg/parquet/TestParquet.java: ## @@ -219,6 +222,52 @@ public void testTwoLevelList() throws IOException { assertThat(recor

Re: [PR] More accurate estimate on parquet row groups size [iceberg]

2024-10-22 Thread via GitHub
nastra commented on code in PR #11258: URL: https://github.com/apache/iceberg/pull/11258#discussion_r1810082246 ## parquet/src/test/java/org/apache/iceberg/parquet/TestParquet.java: ## @@ -219,6 +222,52 @@ public void testTwoLevelList() throws IOException { assertThat(recor

Re: [PR] [KafkaConnect] Fix RecordConverter for UUID and Fixed Types [iceberg]

2024-10-22 Thread via GitHub
bryanck commented on code in PR #11346: URL: https://github.com/apache/iceberg/pull/11346#discussion_r1811331398 ## kafka-connect/kafka-connect/src/main/java/org/apache/iceberg/connect/data/RecordConverter.java: ## @@ -126,10 +127,14 @@ private Object convertValue( case S

Re: [PR] [KafkaConnect] Fix RecordConverter for UUID and Fixed Types [iceberg]

2024-10-22 Thread via GitHub
bryanck commented on code in PR #11346: URL: https://github.com/apache/iceberg/pull/11346#discussion_r1811332848 ## kafka-connect/kafka-connect/src/main/java/org/apache/iceberg/connect/data/RecordConverter.java: ## @@ -126,10 +127,14 @@ private Object convertValue( case S

Re: [PR] [KafkaConnect] Fix RecordConverter for UUID and Fixed Types [iceberg]

2024-10-22 Thread via GitHub
singhpk234 commented on code in PR #11346: URL: https://github.com/apache/iceberg/pull/11346#discussion_r1811341710 ## kafka-connect/kafka-connect/src/main/java/org/apache/iceberg/connect/data/RecordConverter.java: ## @@ -126,10 +127,14 @@ private Object convertValue( cas

Re: [PR] API: Add Variant data type [iceberg]

2024-10-22 Thread via GitHub
aihuaxu commented on code in PR #11324: URL: https://github.com/apache/iceberg/pull/11324#discussion_r1811347080 ## api/src/main/java/org/apache/iceberg/VariantLike.java: ## @@ -0,0 +1,70 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contribut

Re: [PR] API: Add Variant data type [iceberg]

2024-10-22 Thread via GitHub
aihuaxu commented on code in PR #11324: URL: https://github.com/apache/iceberg/pull/11324#discussion_r1811345388 ## api/src/main/java/org/apache/iceberg/VariantLike.java: ## @@ -0,0 +1,70 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contribut

Re: [PR] [KafkaConnect] Fix RecordConverter for UUID and Fixed Types [iceberg]

2024-10-22 Thread via GitHub
singhpk234 commented on code in PR #11346: URL: https://github.com/apache/iceberg/pull/11346#discussion_r1811346546 ## kafka-connect/kafka-connect/src/test/java/org/apache/iceberg/connect/data/RecordConverterTest.java: ## @@ -84,11 +93,18 @@ import org.apache.kafka.connect.stor

Re: [PR] Core: Add portable Roaring bitmap for row positions [iceberg]

2024-10-22 Thread via GitHub
amogh-jahagirdar commented on code in PR #11372: URL: https://github.com/apache/iceberg/pull/11372#discussion_r1811329952 ## core/src/main/java/org/apache/iceberg/deletes/RoaringPositionBitmap.java: ## @@ -0,0 +1,309 @@ +/* + * Licensed to the Apache Software Foundation (ASF) un

Re: [PR] More accurate estimate on parquet row groups size [iceberg]

2024-10-22 Thread via GitHub
jinyangli34 commented on code in PR #11258: URL: https://github.com/apache/iceberg/pull/11258#discussion_r1811493992 ## parquet/src/test/java/org/apache/iceberg/parquet/TestParquet.java: ## @@ -219,6 +222,52 @@ public void testTwoLevelList() throws IOException { assertThat(

Re: [PR] More accurate estimate on parquet row groups size [iceberg]

2024-10-22 Thread via GitHub
jinyangli34 commented on code in PR #11258: URL: https://github.com/apache/iceberg/pull/11258#discussion_r1811499959 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/actions/TestRewriteDataFilesAction.java: ## @@ -557,8 +557,8 @@ public void testBinPackCombineMixedFile

[I] Iceberg cannot connect to Minio S3 with AWS bundle. [iceberg]

2024-10-22 Thread via GitHub
ctagard opened a new issue, #11376: URL: https://github.com/apache/iceberg/issues/11376 ### Query engine Spark ### Question I was following the following tutorial [here](https://github.com/databricks/docker-spark-iceberg) which uses the same docker-compose as the Quicks

Re: [PR] More accurate estimate on parquet row groups size [iceberg]

2024-10-22 Thread via GitHub
jinyangli34 commented on code in PR #11258: URL: https://github.com/apache/iceberg/pull/11258#discussion_r1811494439 ## parquet/src/test/java/org/apache/iceberg/parquet/TestParquet.java: ## @@ -219,6 +222,52 @@ public void testTwoLevelList() throws IOException { assertThat(

Re: [PR] Core, Spark: Remove dangling deletes as part of RewriteDataFilesAction [iceberg]

2024-10-22 Thread via GitHub
szehon-ho commented on PR #9724: URL: https://github.com/apache/iceberg/pull/9724#issuecomment-2430434532 Merged, thanks @dramaticlly ! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specifi

Re: [PR] Core, Spark: Remove dangling deletes as part of RewriteDataFilesAction [iceberg]

2024-10-22 Thread via GitHub
szehon-ho merged PR #9724: URL: https://github.com/apache/iceberg/pull/9724 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.a

Re: [PR] feat(catalog/glue): add support for glue catalog namespace operations [iceberg-go]

2024-10-22 Thread via GitHub
natusioe commented on code in PR #173: URL: https://github.com/apache/iceberg-go/pull/173#discussion_r1811534102 ## catalog/glue.go: ## @@ -180,8 +267,155 @@ func (c *GlueCatalog) ListNamespaces(ctx context.Context, parent table.Identifie return icebergNamespaces, nil

Re: [PR] [KafkaConnect] Fix RecordConverter for UUID and Fixed Types [iceberg]

2024-10-22 Thread via GitHub
RussellSpitzer commented on code in PR #11346: URL: https://github.com/apache/iceberg/pull/11346#discussion_r1811295504 ## kafka-connect/kafka-connect/src/main/java/org/apache/iceberg/connect/data/RecordConverter.java: ## @@ -126,10 +127,14 @@ private Object convertValue(

Re: [PR] Snapshot `summary` map must have `operation` key [iceberg]

2024-10-22 Thread via GitHub
kevinjqliu commented on code in PR #11354: URL: https://github.com/apache/iceberg/pull/11354#discussion_r1811248912 ## core/src/main/java/org/apache/iceberg/SnapshotParser.java: ## @@ -140,6 +142,8 @@ static Snapshot fromJson(JsonNode node) { } } summary =

Re: [PR] feat: Add 'Create Namespace' command to CLI [iceberg-go]

2024-10-22 Thread via GitHub
zeroshade commented on code in PR #179: URL: https://github.com/apache/iceberg-go/pull/179#discussion_r1811251550 ## cmd/iceberg/main.go: ## @@ -70,6 +71,7 @@ type Config struct { Uuid bool `docopt:"uuid"` Location bool `docopt:"location"` Props bo

Re: [I] [Feature Request] Speed up InspectTable.files() [iceberg-python]

2024-10-22 Thread via GitHub
corleyma commented on issue #1229: URL: https://github.com/apache/iceberg-python/issues/1229#issuecomment-2430215685 >IMO it makes sense to wait until it gets implemented or contribute there, rather than writing our own general-purpose Avro-To-Arrow code. We don't need general-purpos

Re: [PR] Support changelog scan for table with delete files [iceberg]

2024-10-22 Thread via GitHub
aokolnychyi commented on PR #10935: URL: https://github.com/apache/iceberg/pull/10935#issuecomment-2430250182 Coming back to some questions above. > Do we want to resolve equality deletes and map them into data files? Or should we add a new task and output the content of equality dele

Re: [PR] [KafkaConnect] Fix RecordConverter for UUID and Fixed Types [iceberg]

2024-10-22 Thread via GitHub
bryanck commented on code in PR #11346: URL: https://github.com/apache/iceberg/pull/11346#discussion_r1811313113 ## kafka-connect/kafka-connect/src/test/java/org/apache/iceberg/connect/data/RecordConverterTest.java: ## @@ -84,11 +93,18 @@ import org.apache.kafka.connect.storage

Re: [PR] [KafkaConnect] Fix RecordConverter for UUID and Fixed Types [iceberg]

2024-10-22 Thread via GitHub
RussellSpitzer commented on code in PR #11346: URL: https://github.com/apache/iceberg/pull/11346#discussion_r1811314119 ## kafka-connect/kafka-connect/src/main/java/org/apache/iceberg/connect/data/RecordConverter.java: ## @@ -126,10 +127,14 @@ private Object convertValue(

Re: [PR] API: Add Variant data type [iceberg]

2024-10-22 Thread via GitHub
aihuaxu commented on code in PR #11324: URL: https://github.com/apache/iceberg/pull/11324#discussion_r1811485354 ## api/src/test/java/org/apache/iceberg/TestHelpers.java: ## @@ -402,6 +406,98 @@ public int hashCode() { } } + /** A VariantLike implementation for testin

Re: [PR] API: Add Variant data type [iceberg]

2024-10-22 Thread via GitHub
aihuaxu commented on code in PR #11324: URL: https://github.com/apache/iceberg/pull/11324#discussion_r1811485935 ## api/src/main/java/org/apache/iceberg/VariantLike.java: ## @@ -0,0 +1,36 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contribut

Re: [PR] More accurate estimate on parquet row groups size [iceberg]

2024-10-22 Thread via GitHub
jinyangli34 commented on PR #11258: URL: https://github.com/apache/iceberg/pull/11258#issuecomment-2430386830 > > This makes it difficult to estimate the current row group size, and result in creating much smaller row-group than `write.parquet.row-group-size-bytes` config > > @jinyan

Re: [PR] feat: more builders and writing manifests [iceberg-go]

2024-10-22 Thread via GitHub
zeroshade commented on code in PR #177: URL: https://github.com/apache/iceberg-go/pull/177#discussion_r1811229871 ## manifest.go: ## @@ -567,6 +569,96 @@ func ReadManifestList(in io.Reader) ([]ManifestFile, error) { return out, dec.Error() } +// WriteManifestListV2 w

Re: [I] Add view support to the Rest Catalog [iceberg-python]

2024-10-22 Thread via GitHub
corleyma commented on issue #818: URL: https://github.com/apache/iceberg-python/issues/818#issuecomment-2430157206 @shiv-io It should still be possible to do `load_view` _without_ supporting any scanning functionality yet, and like @sungwy says, that is likely a necessary precursor for othe

Re: [I] [Feature Request] Speed up InspectTable.files() [iceberg-python]

2024-10-22 Thread via GitHub
DieHertz commented on issue #1229: URL: https://github.com/apache/iceberg-python/issues/1229#issuecomment-2430363061 > If the work is done in Cython avro decoder -> pyarrow recordbatches using PyArrow Cython API, then that also leaves room to release the GIL for meaningful threaded concurr

Re: [PR] API, Core: Add scan planning apis to REST Catalog [iceberg]

2024-10-22 Thread via GitHub
rahil-c commented on PR #11180: URL: https://github.com/apache/iceberg/pull/11180#issuecomment-2430491796 @amogh-jahagirdar @rdblue @danielcweeks @nastra Pushed the following commit `Utilize ParallelIterable, and use table prop instead of rest planning`, which addressed some of the plann

Re: [PR] Flink Support for TIMESTAMP_NANOS [iceberg]

2024-10-22 Thread via GitHub
rodmeneses commented on PR #11348: URL: https://github.com/apache/iceberg/pull/11348#issuecomment-2430505790 > Do the schema conversions handle the nano ts correctly? Do we get a `Timestamp(9)` when we convert to Flink schema, or a nano timestamp when we convert from the Iceberg schema?

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-22 Thread via GitHub
rdblue commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1811571673 ## format/spec.md: ## @@ -841,19 +855,45 @@ Notes: ## Delete Formats -This section details how to encode row-level deletes in Iceberg delete files. Row-level del

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-22 Thread via GitHub
rdblue commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1811580892 ## format/spec.md: ## @@ -454,35 +457,40 @@ The schema of a manifest file is a struct called `manifest_entry` with the follo `data_file` is a struct with the follo

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-22 Thread via GitHub
rdblue commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1811580485 ## format/spec.md: ## @@ -454,35 +457,40 @@ The schema of a manifest file is a struct called `manifest_entry` with the follo `data_file` is a struct with the follo

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-22 Thread via GitHub
rdblue commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1811581135 ## format/spec.md: ## @@ -51,6 +51,7 @@ Version 3 of the Iceberg spec extends data types and existing metadata structure * New data types: nanosecond timestamp(tz)

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-22 Thread via GitHub
rdblue commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1811581226 ## format/spec.md: ## @@ -454,35 +457,40 @@ The schema of a manifest file is a struct called `manifest_entry` with the follo `data_file` is a struct with the follo

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-22 Thread via GitHub
rdblue commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1811581790 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct values,

Re: [PR] Core, Spark: Remove dangling deletes as part of RewriteDataFilesAction [iceberg]

2024-10-22 Thread via GitHub
dramaticlly commented on PR #9724: URL: https://github.com/apache/iceberg/pull/9724#issuecomment-2430496861 Thanks everyone for the review and input, special thanks to @aokolnychyi for optimized algorithm and @szehon-ho as original author and detailed review! -- This is an automated messa

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-22 Thread via GitHub
rdblue commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1811561158 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct values,

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-22 Thread via GitHub
rdblue commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1811564750 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct values,

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-22 Thread via GitHub
rdblue commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1811564153 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct values,

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-22 Thread via GitHub
rdblue commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1811569312 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct values,

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-22 Thread via GitHub
rdblue commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1811567559 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct values,

Re: [PR] Spec: Adds Row Lineage [iceberg]

2024-10-22 Thread via GitHub
rdblue commented on PR #11130: URL: https://github.com/apache/iceberg/pull/11130#issuecomment-2430515800 +1 to merge this since the vote has passed. We can do minor cleanup as we go right? -- This is an automated message from the Apache Git Service. To respond to the message, please log o

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-22 Thread via GitHub
rdblue commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1811583507 ## format/spec.md: ## @@ -841,19 +855,45 @@ Notes: ## Delete Formats -This section details how to encode row-level deletes in Iceberg delete files. Row-level del

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-22 Thread via GitHub
rdblue commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1811582684 ## format/spec.md: ## @@ -51,6 +51,7 @@ Version 3 of the Iceberg spec extends data types and existing metadata structure * New data types: nanosecond timestamp(tz)

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-22 Thread via GitHub
rdblue commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1811584252 ## format/spec.md: ## @@ -841,19 +855,45 @@ Notes: ## Delete Formats -This section details how to encode row-level deletes in Iceberg delete files. Row-level del

Re: [I] Is there any way to define Iceberg catalog and share it between DataStream API and Table/SQL API? [iceberg]

2024-10-22 Thread via GitHub
github-actions[bot] commented on issue #9954: URL: https://github.com/apache/iceberg/issues/9954#issuecomment-2430545073 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] Track uncompressed data size for column metrics [iceberg]

2024-10-22 Thread via GitHub
github-actions[bot] commented on issue #9966: URL: https://github.com/apache/iceberg/issues/9966#issuecomment-2430545148 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [PR] core: Filter on live entries when reading the manifest [iceberg]

2024-10-22 Thread via GitHub
github-actions[bot] commented on PR #9996: URL: https://github.com/apache/iceberg/pull/9996#issuecomment-2430545302 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [I] Spark can not delete table metadata and data when drop table [iceberg]

2024-10-22 Thread via GitHub
github-actions[bot] commented on issue #9990: URL: https://github.com/apache/iceberg/issues/9990#issuecomment-2430545215 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] An error occurred when the iceberg parquet file was loaded in the hive external table [iceberg]

2024-10-22 Thread via GitHub
github-actions[bot] commented on issue #10005: URL: https://github.com/apache/iceberg/issues/10005#issuecomment-2430545354 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occur

Re: [PR] Docs: Fix inconsistency in branching and tagging scenario [iceberg]

2024-10-22 Thread via GitHub
github-actions[bot] commented on PR #9968: URL: https://github.com/apache/iceberg/pull/9968#issuecomment-2430545185 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [I] Improvements for manifest file caching [iceberg]

2024-10-22 Thread via GitHub
github-actions[bot] commented on issue #9991: URL: https://github.com/apache/iceberg/issues/9991#issuecomment-2430545239 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [PR] core,api: Refactor code with `hasLiveEntries` [iceberg]

2024-10-22 Thread via GitHub
github-actions[bot] commented on PR #9993: URL: https://github.com/apache/iceberg/pull/9993#issuecomment-2430545274 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] AWS: Glue table operations hang when aws authentication parameters are illegal [iceberg]

2024-10-22 Thread via GitHub
github-actions[bot] commented on PR #10002: URL: https://github.com/apache/iceberg/pull/10002#issuecomment-2430545327 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [I] `system.add_files` utility does not support updated Partition Spec [iceberg]

2024-10-22 Thread via GitHub
github-actions[bot] commented on issue #10008: URL: https://github.com/apache/iceberg/issues/10008#issuecomment-2430545372 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occur

Re: [PR] File size in bytes tracking with deleted files in expire snapshots [iceberg]

2024-10-22 Thread via GitHub
github-actions[bot] commented on PR #10036: URL: https://github.com/apache/iceberg/pull/10036#issuecomment-2430545454 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [I] OR condition does not leverage all parquet metadata (metrics, dictionary, bloom filter) causing inefficient queries [iceberg]

2024-10-22 Thread via GitHub
github-actions[bot] commented on issue #10029: URL: https://github.com/apache/iceberg/issues/10029#issuecomment-2430545428 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occur

Re: [PR] Core: Uncached files not be materialized [iceberg]

2024-10-22 Thread via GitHub
github-actions[bot] commented on PR #10041: URL: https://github.com/apache/iceberg/pull/10041#issuecomment-2430545471 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [PR] [Draft] Fixing #9923 updating partitioned table with more than 1k columns fails [iceberg]

2024-10-22 Thread via GitHub
github-actions[bot] commented on PR #9957: URL: https://github.com/apache/iceberg/pull/9957#issuecomment-2430545092 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [I] Weird behavior struct fields in Spark entries metadata table [iceberg]

2024-10-22 Thread via GitHub
github-actions[bot] commented on issue #10044: URL: https://github.com/apache/iceberg/issues/10044#issuecomment-2430545502 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occur

Re: [I] Metadata file is not getting created when Iceberg table is created using Hive with catalog as GlueCatalog [iceberg]

2024-10-22 Thread via GitHub
github-actions[bot] commented on issue #10025: URL: https://github.com/apache/iceberg/issues/10025#issuecomment-2430545401 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occur

Re: [PR] AWS: Support S3 directory bucket listing [iceberg]

2024-10-22 Thread via GitHub
jackye1995 commented on code in PR #11021: URL: https://github.com/apache/iceberg/pull/11021#discussion_r1810991250 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIOProperties.java: ## @@ -428,6 +428,12 @@ public class S3FileIOProperties implements Serializable { publ

Re: [PR] AWS: Support S3 directory bucket listing [iceberg]

2024-10-22 Thread via GitHub
jackye1995 commented on code in PR #11021: URL: https://github.com/apache/iceberg/pull/11021#discussion_r1810992523 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIOProperties.java: ## @@ -428,6 +428,12 @@ public class S3FileIOProperties implements Serializable { publ

Re: [PR] AWS: Support S3 directory bucket listing [iceberg]

2024-10-22 Thread via GitHub
jackye1995 commented on code in PR #11021: URL: https://github.com/apache/iceberg/pull/11021#discussion_r1810974270 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIO.java: ## @@ -298,8 +301,15 @@ private List deleteBatch(String bucket, Collection keysToDelete) @Overrid

Re: [PR] Spark 3.5: Update Spark to use planned Avro reads [iceberg]

2024-10-22 Thread via GitHub
rdblue merged PR #11299: URL: https://github.com/apache/iceberg/pull/11299 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.ap

Re: [PR] Spark 3.5: Update Spark to use planned Avro reads [iceberg]

2024-10-22 Thread via GitHub
rdblue commented on code in PR #11299: URL: https://github.com/apache/iceberg/pull/11299#discussion_r1811023288 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/data/SparkPlannedAvroReader.java: ## @@ -0,0 +1,190 @@ +/* + * Licensed to the Apache Software Foundation (A

Re: [PR] AWS: Support S3 directory bucket listing [iceberg]

2024-10-22 Thread via GitHub
stubz151 commented on code in PR #11021: URL: https://github.com/apache/iceberg/pull/11021#discussion_r1811021800 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIO.java: ## @@ -443,6 +453,16 @@ public boolean recoverFile(String path) { return recoverVersion.map(versio

Re: [PR] IO Implementation using Go CDK [iceberg-go]

2024-10-22 Thread via GitHub
zeroshade commented on code in PR #176: URL: https://github.com/apache/iceberg-go/pull/176#discussion_r1811073199 ## io/gcs.go: ## @@ -0,0 +1,63 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +//

Re: [PR] Impl rest catalog + table updates & requirements [iceberg-go]

2024-10-22 Thread via GitHub
zeroshade commented on PR #146: URL: https://github.com/apache/iceberg-go/pull/146#issuecomment-2429784559 @jwtryg any updates? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific commen

Re: [I] Add view support to the Rest Catalog [iceberg-python]

2024-10-22 Thread via GitHub
shiv-io commented on issue #818: URL: https://github.com/apache/iceberg-python/issues/818#issuecomment-2429788539 I'm fairly new to the Iceberg ecosystem -- thanks for the insightful discussion, looks like I have some reading to do before I can weigh in. `load_view` aside though, I'd

Re: [PR] AWS: Support S3 directory bucket listing [iceberg]

2024-10-22 Thread via GitHub
jackye1995 commented on code in PR #11021: URL: https://github.com/apache/iceberg/pull/11021#discussion_r1810979748 ## aws/src/integration/java/org/apache/iceberg/aws/AwsIntegTestUtil.java: ## @@ -127,6 +129,21 @@ public static void cleanS3Bucket(S3Client s3, String bucketName,

Re: [PR] AWS: Support S3 directory bucket listing [iceberg]

2024-10-22 Thread via GitHub
jackye1995 commented on code in PR #11021: URL: https://github.com/apache/iceberg/pull/11021#discussion_r1810981500 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIO.java: ## @@ -443,6 +453,16 @@ public boolean recoverFile(String path) { return recoverVersion.map(vers

Re: [PR] REST: AuthManager API [iceberg]

2024-10-22 Thread via GitHub
adutra commented on code in PR #10753: URL: https://github.com/apache/iceberg/pull/10753#discussion_r1809236938 ## core/src/main/java/org/apache/iceberg/rest/RESTSessionCatalog.java: ## @@ -273,35 +227,11 @@ public void initialize(String name, Map unresolved) { this.endp

Re: [PR] AWS: Support S3 directory bucket listing [iceberg]

2024-10-22 Thread via GitHub
jackye1995 commented on code in PR #11021: URL: https://github.com/apache/iceberg/pull/11021#discussion_r1810981999 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIO.java: ## @@ -94,6 +94,9 @@ public class S3FileIO implements CredentialSupplier, DelegateFileIO, SupportsRe

Re: [PR] AWS: Support S3 directory bucket listing [iceberg]

2024-10-22 Thread via GitHub
jackye1995 commented on code in PR #11021: URL: https://github.com/apache/iceberg/pull/11021#discussion_r1811002437 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIO.java: ## @@ -443,6 +453,16 @@ public boolean recoverFile(String path) { return recoverVersion.map(vers

Re: [PR] Spec: Adds Row Lineage [iceberg]

2024-10-22 Thread via GitHub
RussellSpitzer commented on PR #11130: URL: https://github.com/apache/iceberg/pull/11130#issuecomment-2429761860 @rdblue @nastra @sumedhsakdeo @flyrain @stevenzwu @wgtmac @aokolnychyi @ashvina @amogh-jahagirdar Ping everyone, we've had the vote on the Mailing l

Re: [PR] API: Add Variant data type [iceberg]

2024-10-22 Thread via GitHub
RussellSpitzer commented on code in PR #11324: URL: https://github.com/apache/iceberg/pull/11324#discussion_r188395 ## api/src/main/java/org/apache/iceberg/VariantLike.java: ## @@ -0,0 +1,70 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more co

Re: [PR] AWS: Support S3 directory bucket listing [iceberg]

2024-10-22 Thread via GitHub
jackye1995 commented on code in PR #11021: URL: https://github.com/apache/iceberg/pull/11021#discussion_r1810952792 ## aws/src/integration/java/org/apache/iceberg/aws/AwsIntegTestUtil.java: ## @@ -127,6 +129,21 @@ public static void cleanS3Bucket(S3Client s3, String bucketName,

Re: [PR] AWS: Support S3 directory bucket listing [iceberg]

2024-10-22 Thread via GitHub
stubz151 commented on code in PR #11021: URL: https://github.com/apache/iceberg/pull/11021#discussion_r1811161067 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIO.java: ## @@ -443,6 +453,16 @@ public boolean recoverFile(String path) { return recoverVersion.map(versio

Re: [PR] API: Add Variant data type [iceberg]

2024-10-22 Thread via GitHub
RussellSpitzer commented on code in PR #11324: URL: https://github.com/apache/iceberg/pull/11324#discussion_r1811179041 ## api/src/main/java/org/apache/iceberg/VariantLike.java: ## @@ -0,0 +1,70 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more co

Re: [PR] API: Add Variant data type [iceberg]

2024-10-22 Thread via GitHub
RussellSpitzer commented on code in PR #11324: URL: https://github.com/apache/iceberg/pull/11324#discussion_r1811183149 ## api/src/test/java/org/apache/iceberg/TestHelpers.java: ## @@ -402,6 +406,98 @@ public int hashCode() { } } + /** A VariantLike implementation for

Re: [PR] AWS: Support S3 directory bucket listing [iceberg]

2024-10-22 Thread via GitHub
stubz151 commented on code in PR #11021: URL: https://github.com/apache/iceberg/pull/11021#discussion_r1810971888 ## aws/src/integration/java/org/apache/iceberg/aws/AwsIntegTestUtil.java: ## @@ -127,6 +129,21 @@ public static void cleanS3Bucket(S3Client s3, String bucketName, S

Re: [PR] API: Add Variant data type [iceberg]

2024-10-22 Thread via GitHub
aihuaxu commented on code in PR #11324: URL: https://github.com/apache/iceberg/pull/11324#discussion_r1810974129 ## api/src/main/java/org/apache/iceberg/VariantLike.java: ## @@ -0,0 +1,36 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contribut

Re: [PR] AWS: Support S3 directory bucket listing [iceberg]

2024-10-22 Thread via GitHub
jackye1995 commented on code in PR #11021: URL: https://github.com/apache/iceberg/pull/11021#discussion_r1810974270 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIO.java: ## @@ -298,8 +301,15 @@ private List deleteBatch(String bucket, Collection keysToDelete) @Overrid

Re: [PR] AWS: Support S3 directory bucket listing [iceberg]

2024-10-22 Thread via GitHub
jackye1995 commented on code in PR #11021: URL: https://github.com/apache/iceberg/pull/11021#discussion_r1810998054 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIOProperties.java: ## @@ -498,6 +506,8 @@ public S3FileIOProperties() { this.s3RetryNumRetries = S3_RETRY

Re: [PR] API: Add Variant data type [iceberg]

2024-10-22 Thread via GitHub
aihuaxu commented on code in PR #11324: URL: https://github.com/apache/iceberg/pull/11324#discussion_r1810974129 ## api/src/main/java/org/apache/iceberg/VariantLike.java: ## @@ -0,0 +1,36 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contribut

Re: [PR] Spark 3.5: Update Spark to use planned Avro reads [iceberg]

2024-10-22 Thread via GitHub
aokolnychyi commented on code in PR #11299: URL: https://github.com/apache/iceberg/pull/11299#discussion_r1811003876 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/data/SparkPlannedAvroReader.java: ## @@ -0,0 +1,190 @@ +/* + * Licensed to the Apache Software Foundati

Re: [PR] AWS: Support S3 directory bucket listing [iceberg]

2024-10-22 Thread via GitHub
jackye1995 commented on code in PR #11021: URL: https://github.com/apache/iceberg/pull/11021#discussion_r1810998409 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIOProperties.java: ## @@ -837,6 +852,16 @@ public long s3RetryTotalWaitMs() { return (long) s3RetryNumRet

Re: [PR] AWS: Support S3 directory bucket listing [iceberg]

2024-10-22 Thread via GitHub
jackye1995 commented on code in PR #11021: URL: https://github.com/apache/iceberg/pull/11021#discussion_r1811146654 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIO.java: ## @@ -443,6 +453,16 @@ public boolean recoverFile(String path) { return recoverVersion.map(vers

Re: [PR] Spec: Adds Row Lineage [iceberg]

2024-10-22 Thread via GitHub
RussellSpitzer commented on code in PR #11130: URL: https://github.com/apache/iceberg/pull/11130#discussion_r1811038409 ## format/spec.md: ## @@ -298,16 +298,101 @@ Iceberg tables must not use field ids greater than 2147483447 (`Integer.MAX_VALU The set of metadata columns i

Re: [PR] feat(table/scanner): Implement Arrow type promotion and conversion [iceberg-go]

2024-10-22 Thread via GitHub
zeroshade commented on PR #174: URL: https://github.com/apache/iceberg-go/pull/174#issuecomment-2430012150 @Fokko @nastra rebased and updated, ready for review now! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

Re: [PR] Support partitioning spec during data file rewrites in Spark. [iceberg]

2024-10-22 Thread via GitHub
danielcweeks commented on code in PR #11368: URL: https://github.com/apache/iceberg/pull/11368#discussion_r1811223257 ## api/src/main/java/org/apache/iceberg/UpdatePartitionSpec.java: ## @@ -133,4 +133,16 @@ default UpdatePartitionSpec addNonDefaultSpec() { throw new Unsupp

Re: [PR] REST: AuthManager API [iceberg]

2024-10-22 Thread via GitHub
adutra commented on code in PR #10753: URL: https://github.com/apache/iceberg/pull/10753#discussion_r1808965097 ## core/src/main/java/org/apache/iceberg/rest/auth/OAuth2Manager.java: ## @@ -0,0 +1,220 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or m

Re: [PR] Core: Add portable Roaring bitmap for row positions [iceberg]

2024-10-22 Thread via GitHub
nastra commented on code in PR #11372: URL: https://github.com/apache/iceberg/pull/11372#discussion_r1810468550 ## core/src/main/java/org/apache/iceberg/deletes/RoaringPositionBitmap.java: ## @@ -0,0 +1,317 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one +

Re: [PR] Core: Add portable Roaring bitmap for row positions [iceberg]

2024-10-22 Thread via GitHub
nastra commented on code in PR #11372: URL: https://github.com/apache/iceberg/pull/11372#discussion_r1810460827 ## core/src/main/java/org/apache/iceberg/deletes/RoaringPositionBitmap.java: ## @@ -0,0 +1,309 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one +

Re: [PR] Snapshot `summary` map must have `operation` key [iceberg]

2024-10-22 Thread via GitHub
nastra commented on code in PR #11354: URL: https://github.com/apache/iceberg/pull/11354#discussion_r1810499592 ## core/src/main/java/org/apache/iceberg/SnapshotParser.java: ## @@ -140,6 +142,8 @@ static Snapshot fromJson(JsonNode node) { } } summary = bui

Re: [PR] [docs] Fix broken links to config reference [iceberg]

2024-10-22 Thread via GitHub
rmoff closed pull request #9939: [docs] Fix broken links to config reference URL: https://github.com/apache/iceberg/pull/9939 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To
