[PR] Avro: Add internal writer [iceberg]

2025-01-06 Thread via GitHub
ajantha-bhat opened a new pull request, #11919: URL: https://github.com/apache/iceberg/pull/11919 (no comment) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscrib

Re: [PR] Avro: Add internal writer [iceberg]

2025-01-06 Thread via GitHub
ajantha-bhat commented on code in PR #11919: URL: https://github.com/apache/iceberg/pull/11919#discussion_r1904018273 ## core/src/main/java/org/apache/iceberg/avro/InternalReader.java: ## @@ -205,7 +205,6 @@ public ValueReader primitive(Pair partner, Schema primitive) {

[I] How do I know that the bloom filter configuration is successful? [iceberg]

2025-01-06 Thread via GitHub
madeirak opened a new issue, #11918: URL: https://github.com/apache/iceberg/issues/11918 ### Query engine Spark=3.3.1 Iceberg=1.4.3 ### Question After setting new table properties 'write.parquet.bloom-filter-enabled.column.xxx' = 'true', 'write.parquet.bloom-filt

Re: [PR] Backport #11557 to FLink1.19 and 1.18 [iceberg]

2025-01-06 Thread via GitHub
huyuanfeng2018 commented on PR #11834: URL: https://github.com/apache/iceberg/pull/11834#issuecomment-2572635561 > @huyuanfeng2018: If you create a backport ticket please always highlight if there were any changes compared to the original code, or the PR was a clean backport. This greatly h

Re: [I] How do I know that the bloom filter configuration is successful? [iceberg]

2025-01-06 Thread via GitHub
madeirak commented on issue #11918: URL: https://github.com/apache/iceberg/issues/11918#issuecomment-2572732344 > How about using [parquet-cli](https://github.com/apache/parquet-java/tree/master/parquet-cli)? `footer` option provides bloom filter's offset and length. Also, we can use `bloom

Re: [PR] Change dot notation in add column documentation to tuple [iceberg-python]

2025-01-06 Thread via GitHub
jeppe-dos commented on PR #1433: URL: https://github.com/apache/iceberg-python/pull/1433#issuecomment-2572756873 I assume so. I'll test and update accordingly.

Re: [PR] Impl rest catalog + table updates & requirements [iceberg-go]

2025-01-06 Thread via GitHub
jwtryg commented on PR #146: URL: https://github.com/apache/iceberg-go/pull/146#issuecomment-2572532941 > @jwtryg Is there anything else outstanding on this or is this ready for review again? @zeroshade I will look through all of your previous comments today to make sure it is ready

Re: [PR] Backport #11557 to FLink1.19 and 1.18 [iceberg]

2025-01-06 Thread via GitHub
pvary commented on PR #11834: URL: https://github.com/apache/iceberg/pull/11834#issuecomment-2572549766 @huyuanfeng2018: If you create a backport ticket please always highlight if there were any changes compared to the original code, or the PR was a clean backport. This greatly helps the

Re: [PR] Avro: Add internal writer [iceberg]

2025-01-06 Thread via GitHub
ajantha-bhat commented on code in PR #11919: URL: https://github.com/apache/iceberg/pull/11919#discussion_r1904025164 ## core/src/test/java/org/apache/iceberg/avro/TestInternalWriter.java: ## @@ -0,0 +1,133 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one +

Re: [I] Variant Data Type Support [iceberg]

2025-01-06 Thread via GitHub
stym06 commented on issue #10392: URL: https://github.com/apache/iceberg/issues/10392#issuecomment-2572893261 @aihuaxu Will this also work for protobuf encoded columns? I have a dataset with event_bytes | event_name ___ 101010100 | e1 101010100 | e2

[I] pyiceberg hanging on multiprocessing [iceberg-python]

2025-01-06 Thread via GitHub
frankliee opened a new issue, #1488: URL: https://github.com/apache/iceberg-python/issues/1488 ### Apache Iceberg version None ### Please describe the bug 🐞 the bad code : load table in the sub process ```python from multiprocessing import Process worker_num = 2

[PR] feat(datafusion): Expose DataFusion statistics on an IcebergTableScan [iceberg-rust]

2025-01-06 Thread via GitHub
gruuya opened a new pull request, #880: URL: https://github.com/apache/iceberg-rust/pull/880 Closes #869. Provide detailed statistics via DataFusion's `ExecutionPlan::statistics` for more efficient join planning. The statistics are accumulated from the snapshot's manifests, and
