Re: [PR] Snapshot `summary` map must have `operation` key [iceberg]

2024-10-23 Thread via GitHub
nastra commented on code in PR #11354: URL: https://github.com/apache/iceberg/pull/11354#discussion_r1812056517 ## core/src/test/java/org/apache/iceberg/TestSnapshotJson.java: ## @@ -35,6 +40,58 @@ public class TestSnapshotJson { public TableOperations ops = new LocalTableO

Re: [PR] ci: Fix CI for bindings python [iceberg-rust]

2024-10-23 Thread via GitHub
Xuanwo merged PR #678: URL: https://github.com/apache/iceberg-rust/pull/678 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.a

Re: [PR] feat(table/scanner): Implement Arrow type promotion and conversion [iceberg-go]

2024-10-23 Thread via GitHub
nastra merged PR #174: URL: https://github.com/apache/iceberg-go/pull/174

Re: [PR] Snapshot `summary` map must have `operation` key [iceberg]

2024-10-23 Thread via GitHub
nastra commented on code in PR #11354: URL: https://github.com/apache/iceberg/pull/11354#discussion_r1812057643 ## core/src/test/java/org/apache/iceberg/TestSnapshotJson.java: ## @@ -35,6 +40,58 @@ public class TestSnapshotJson { public TableOperations ops = new LocalTableO

Re: [PR] Snapshot `summary` map must have `operation` key [iceberg]

2024-10-23 Thread via GitHub
nastra commented on code in PR #11354: URL: https://github.com/apache/iceberg/pull/11354#discussion_r1812068112 ## core/src/test/java/org/apache/iceberg/TestSnapshotJson.java: ## @@ -35,6 +40,58 @@ public class TestSnapshotJson { public TableOperations ops = new LocalTableO

Re: [PR] More accurate estimate on parquet row groups size [iceberg]

2024-10-23 Thread via GitHub
nastra commented on code in PR #11258: URL: https://github.com/apache/iceberg/pull/11258#discussion_r1812072515 ## parquet/src/test/java/org/apache/iceberg/parquet/TestParquet.java: ## @@ -219,6 +222,52 @@ public void testTwoLevelList() throws IOException { assertThat(recor

Re: [PR] feat(catalog/glue): add support for glue catalog namespace operations [iceberg-go]

2024-10-23 Thread via GitHub
nastra commented on code in PR #173: URL: https://github.com/apache/iceberg-go/pull/173#discussion_r1812088531 ## catalog/glue.go: ## @@ -122,38 +165,104 @@ func (c *GlueCatalog) LoadTable(ctx context.Context, identifier table.Identifier return icebergTable, nil } -f

Re: [PR] feat: Implement Decimal from/to bytes represents [iceberg-rust]

2024-10-23 Thread via GitHub
Xuanwo commented on code in PR #665: URL: https://github.com/apache/iceberg-rust/pull/665#discussion_r1812068075 ## crates/iceberg/src/spec/values.rs: ## @@ -449,7 +456,30 @@ impl Datum { PrimitiveLiteral::String(val) => ByteBuf::from(val.as_bytes()), P

Re: [PR] build(deps): bump github.com/apache/arrow-go/v18 from 18.0.0-20240924011512-14844aea3205 to 18.0.0-rc0 [iceberg-go]

2024-10-23 Thread via GitHub
dependabot[bot] commented on PR #180: URL: https://github.com/apache/iceberg-go/pull/180#issuecomment-2431172832 Looks like github.com/apache/arrow-go/v18 is up-to-date now, so this is no longer needed.

Re: [PR] feat: allow empty projection in table scan [iceberg-rust]

2024-10-23 Thread via GitHub
sundy-li commented on code in PR #677: URL: https://github.com/apache/iceberg-rust/pull/677#discussion_r1812226077 ## crates/iceberg/src/arrow/record_batch_transformer.rs: ## @@ -154,10 +142,15 @@ impl RecordBatchTransformer { Some(BatchTransform::Modify {

Re: [PR] Build: Enable errorprone PatternMatchingInstanceof [iceberg]

2024-10-23 Thread via GitHub
nastra commented on PR #11374: URL: https://github.com/apache/iceberg/pull/11374#issuecomment-2431399539 @ebyhr can you please elaborate why we'd want to error out on this one?

Re: [PR] Spec: Support geo type [iceberg]

2024-10-23 Thread via GitHub
szehon-ho commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1812299087 ## format/spec.md: ## @@ -1102,6 +1105,7 @@ Hash results are not dependent on decimal scale, which is part of the type, not 4. UUIDs are encoded using big endian.

Re: [PR] More accurate estimate on parquet row groups size [iceberg]

2024-10-23 Thread via GitHub
nastra commented on code in PR #11258: URL: https://github.com/apache/iceberg/pull/11258#discussion_r1812448383 ## parquet/src/main/java/org/apache/iceberg/parquet/ParquetWriter.java: ## @@ -66,6 +66,9 @@ class ParquetWriter implements FileAppender, Closeable { private boolea

Re: [PR] feat: allow empty projection in table scan [iceberg-rust]

2024-10-23 Thread via GitHub
Xuanwo commented on code in PR #677: URL: https://github.com/apache/iceberg-rust/pull/677#discussion_r1812233893 ## crates/iceberg/src/scan.rs: ## @@ -394,6 +411,11 @@ impl TableScan { return Ok(file_scan_task_rx.boxed()); } +/// Returns an [`ManifestList`] +

Re: [I] How to create an iceberg table under a custom catalog name like iceberg instead of hive, using HiveCatalog [iceberg]

2024-10-23 Thread via GitHub
crazyzhou commented on issue #10786: URL: https://github.com/apache/iceberg/issues/10786#issuecomment-2431332058 Hope this helps. https://github.com/apache/iceberg/pull/7441#discussion_r1179924760

Re: [PR] feat: allow empty projection in table scan [iceberg-rust]

2024-10-23 Thread via GitHub
sundy-li commented on code in PR #677: URL: https://github.com/apache/iceberg-rust/pull/677#discussion_r1812219171 ## crates/iceberg/src/scan.rs: ## @@ -394,6 +411,11 @@ impl TableScan { return Ok(file_scan_task_rx.boxed()); } +/// Returns an [`ManifestList`]

Re: [PR] feat: allow empty projection in table scan [iceberg-rust]

2024-10-23 Thread via GitHub
sundy-li commented on PR #677: URL: https://github.com/apache/iceberg-rust/pull/677#issuecomment-2431341044 > This looks good to me, thanks! I'm assuming that the reason why someone would want to perform a scan with an empty projection is to get a row count? As I described in the issu

Re: [PR] feat: Safer PartitionSpec & SchemalessPartitionSpec [iceberg-rust]

2024-10-23 Thread via GitHub
liurenjie1024 commented on code in PR #645: URL: https://github.com/apache/iceberg-rust/pull/645#discussion_r1812191734 ## crates/iceberg/src/spec/partition.rs: ## @@ -54,22 +54,51 @@ impl PartitionField { } } -/// Partition spec that defines how to produce a tuple of p

Re: [PR] feat: allow empty projection in table scan [iceberg-rust]

2024-10-23 Thread via GitHub
liurenjie1024 commented on code in PR #677: URL: https://github.com/apache/iceberg-rust/pull/677#discussion_r1812387339 ## crates/iceberg/src/scan.rs: ## @@ -394,6 +411,11 @@ impl TableScan { return Ok(file_scan_task_rx.boxed()); } +/// Returns an [`ManifestL

Re: [PR] Snapshot `summary` map must have `operation` key [iceberg]

2024-10-23 Thread via GitHub
nastra commented on code in PR #11354: URL: https://github.com/apache/iceberg/pull/11354#discussion_r1812034910 ## core/src/main/java/org/apache/iceberg/SnapshotParser.java: ## @@ -75,6 +76,7 @@ static void toJson(Snapshot snapshot, JsonGenerator generator) throws IOExceptio

Re: [PR] Snapshot `summary` map must have `operation` key [iceberg]

2024-10-23 Thread via GitHub
nastra commented on code in PR #11354: URL: https://github.com/apache/iceberg/pull/11354#discussion_r1812035950 ## core/src/main/java/org/apache/iceberg/SnapshotParser.java: ## @@ -62,6 +62,7 @@ static void toJson(Snapshot snapshot, JsonGenerator generator) throws IOExceptio

Re: [PR] feat: allow empty projection in table scan [iceberg-rust]

2024-10-23 Thread via GitHub
liurenjie1024 commented on code in PR #677: URL: https://github.com/apache/iceberg-rust/pull/677#discussion_r1812028265 ## crates/iceberg/src/scan.rs: ## @@ -394,6 +411,11 @@ impl TableScan { return Ok(file_scan_task_rx.boxed()); } +/// Returns an [`ManifestL

Re: [PR] Snapshot `summary` map must have `operation` key [iceberg]

2024-10-23 Thread via GitHub
nastra commented on code in PR #11354: URL: https://github.com/apache/iceberg/pull/11354#discussion_r1812044323 ## core/src/main/java/org/apache/iceberg/SnapshotParser.java: ## @@ -140,6 +142,8 @@ static Snapshot fromJson(JsonNode node) { } } summary = bui

Re: [PR] feat: Safer PartitionSpec & SchemalessPartitionSpec [iceberg-rust]

2024-10-23 Thread via GitHub
c-thiel commented on PR #645: URL: https://github.com/apache/iceberg-rust/pull/645#issuecomment-2431923524 Hey @liurenjie1024 , thanks for giving it a shot. Regarding 1) they are not identical (see also my argument from the start. FieldId is optional for unbound spec, but it is not for "p

Re: [PR] [KafkaConnect] Fix RecordConverter for UUID and Fixed Types [iceberg]

2024-10-23 Thread via GitHub
jbonofre commented on PR #11346: URL: https://github.com/apache/iceberg/pull/11346#issuecomment-2432596526 This change looks good, I'm just wondering about the test. I would have kept the original test and created a new one dedicated to the writer.

Re: [PR] [KafkaConnect] Fix RecordConverter for UUID and Fixed Types [iceberg]

2024-10-23 Thread via GitHub
jbonofre commented on code in PR #11346: URL: https://github.com/apache/iceberg/pull/11346#discussion_r1813021695 ## kafka-connect/kafka-connect/src/test/java/org/apache/iceberg/connect/data/RecordConverterTest.java: ## @@ -84,11 +93,18 @@ import org.apache.kafka.connect.storag

Re: [PR] Flink Support for TIMESTAMP_NANOS [iceberg]

2024-10-23 Thread via GitHub
pvary commented on code in PR #11348: URL: https://github.com/apache/iceberg/pull/11348#discussion_r1812930067 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/data/FlinkParquetReaders.java: ## @@ -411,6 +416,43 @@ public DecimalData read(DecimalData ignored) { }

Re: [PR] feat(catalog/glue): add support for glue catalog namespace operations [iceberg-go]

2024-10-23 Thread via GitHub
natusioe commented on code in PR #173: URL: https://github.com/apache/iceberg-go/pull/173#discussion_r1813036157 ## catalog/glue.go: ## @@ -122,38 +165,104 @@ func (c *GlueCatalog) LoadTable(ctx context.Context, identifier table.Identifier return icebergTable, nil }

Re: [PR] Spark: Randomize view/function names in testing [iceberg]

2024-10-23 Thread via GitHub
nastra merged PR #11381: URL: https://github.com/apache/iceberg/pull/11381

Re: [PR] AWS: Support S3 directory bucket listing [iceberg]

2024-10-23 Thread via GitHub
jackye1995 commented on code in PR #11021: URL: https://github.com/apache/iceberg/pull/11021#discussion_r1813050472 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3URI.java: ## @@ -115,4 +118,25 @@ public String scheme() { public String toString() { return location;

Re: [PR] Remove iceberg-pig [iceberg]

2024-10-23 Thread via GitHub
manuzhang commented on PR #11380: URL: https://github.com/apache/iceberg/pull/11380#issuecomment-2432645637 When I try removing Hive 2 in https://github.com/apache/iceberg/pull/10996, I find both Hive and Pig are referenced in https://github.com/apache/iceberg/blob/main/mr/src/main/java/or

Re: [PR] AWS: Support S3 directory bucket listing [iceberg]

2024-10-23 Thread via GitHub
jackye1995 commented on code in PR #11021: URL: https://github.com/apache/iceberg/pull/11021#discussion_r1813053013 ## aws/src/test/java/org/apache/iceberg/aws/s3/TestS3FileIO.java: ## @@ -101,6 +116,9 @@ public class TestS3FileIO { private final int batchDeletionSize = 5;

Re: [PR] AWS: Support S3 directory bucket listing [iceberg]

2024-10-23 Thread via GitHub
jackye1995 commented on code in PR #11021: URL: https://github.com/apache/iceberg/pull/11021#discussion_r1813071789 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3URI.java: ## @@ -115,4 +118,25 @@ public String scheme() { public String toString() { return location;

Re: [PR] AWS: Support S3 directory bucket listing [iceberg]

2024-10-23 Thread via GitHub
jackye1995 commented on code in PR #11021: URL: https://github.com/apache/iceberg/pull/11021#discussion_r1813073025 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3URI.java: ## @@ -115,4 +118,25 @@ public String scheme() { public String toString() { return location;

Re: [PR] AWS: Use testcontainers-minio instead of S3Mock [iceberg]

2024-10-23 Thread via GitHub
sullis commented on code in PR #11349: URL: https://github.com/apache/iceberg/pull/11349#discussion_r1813146387 ## aws/src/test/java/org/apache/iceberg/aws/s3/MinioUtil.java: ## @@ -0,0 +1,64 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contr

Re: [PR] AWS: Use testcontainers-minio instead of S3Mock [iceberg]

2024-10-23 Thread via GitHub
sullis commented on code in PR #11349: URL: https://github.com/apache/iceberg/pull/11349#discussion_r1813154671 ## aws/src/test/java/org/apache/iceberg/aws/s3/MinioUtil.java: ## @@ -0,0 +1,64 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contr

Re: [PR] AWS: Use testcontainers-minio instead of S3Mock [iceberg]

2024-10-23 Thread via GitHub
nastra commented on code in PR #11349: URL: https://github.com/apache/iceberg/pull/11349#discussion_r1813150009 ## aws/src/test/java/org/apache/iceberg/aws/s3/MinioUtil.java: ## @@ -0,0 +1,64 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contr

Re: [PR] Aliyun: Remove spring-boot dependency [iceberg]

2024-10-23 Thread via GitHub
jbonofre commented on PR #11291: URL: https://github.com/apache/iceberg/pull/11291#issuecomment-2432562122 I'm checking why one check failed.

Re: [PR] AWS: Support S3 directory bucket listing [iceberg]

2024-10-23 Thread via GitHub
jackye1995 commented on code in PR #11021: URL: https://github.com/apache/iceberg/pull/11021#discussion_r1813117694 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3URI.java: ## @@ -115,4 +118,25 @@ public String scheme() { public String toString() { return location;

Re: [PR] Data: Add partition stats writer and reader [iceberg]

2024-10-23 Thread via GitHub
ajantha-bhat commented on PR #11216: URL: https://github.com/apache/iceberg/pull/11216#issuecomment-2432771734 @RussellSpitzer: It would be good to have this in 1.7.0. I have been waiting a month for a review.

Re: [PR] AWS: Support S3 directory bucket listing [iceberg]

2024-10-23 Thread via GitHub
jackye1995 commented on code in PR #11021: URL: https://github.com/apache/iceberg/pull/11021#discussion_r1813140179 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIOProperties.java: ## @@ -428,6 +428,21 @@ public class S3FileIOProperties implements Serializable { publ

Re: [PR] Move snapshot history expire table properties to constants [iceberg-python]

2024-10-23 Thread via GitHub
sungwy merged PR #1217: URL: https://github.com/apache/iceberg-python/pull/1217

Re: [PR] Deprecate iceberg-pig [iceberg]

2024-10-23 Thread via GitHub
nastra commented on code in PR #11379: URL: https://github.com/apache/iceberg/pull/11379#discussion_r1812850887 ## pig/src/main/java/org/apache/iceberg/pig/IcebergPigInputFormat.java: ## @@ -68,6 +68,7 @@ public class IcebergPigInputFormat extends InputFormat { private List

Re: [PR] Flink Support for TIMESTAMP_NANOS [iceberg]

2024-10-23 Thread via GitHub
pvary commented on code in PR #11348: URL: https://github.com/apache/iceberg/pull/11348#discussion_r1812936226 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/data/FlinkParquetReaders.java: ## @@ -411,6 +416,43 @@ public DecimalData read(DecimalData ignored) { }

Re: [PR] Flink Support for TIMESTAMP_NANOS [iceberg]

2024-10-23 Thread via GitHub
pvary commented on code in PR #11348: URL: https://github.com/apache/iceberg/pull/11348#discussion_r1812930067 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/data/FlinkParquetReaders.java: ## @@ -411,6 +416,43 @@ public DecimalData read(DecimalData ignored) { }

Re: [PR] AWS: Support S3 directory bucket listing [iceberg]

2024-10-23 Thread via GitHub
jackye1995 commented on PR #11021: URL: https://github.com/apache/iceberg/pull/11021#issuecomment-2432810694 This mostly looks good to me now, just a few very minor nit comments. And I think we should update the `aws.md` about using directory buckets. But that can also be a separate PR, up to yo

Re: [PR] Spark 3.4: Randomize view/function names in testing [iceberg]

2024-10-23 Thread via GitHub
amogh-jahagirdar merged PR #11382: URL: https://github.com/apache/iceberg/pull/11382

[PR] Deprecate iceberg-pig [iceberg]

2024-10-23 Thread via GitHub
jbonofre opened a new pull request, #11379: URL: https://github.com/apache/iceberg/pull/11379 As discussed on the dev mailing list (https://lists.apache.org/thread/mw0v24nj9h7b0wyrbw7gn0ldd4m3c8kw), this PR logs a warning to inform the users about deprecation.

Re: [PR] Flink Support for TIMESTAMP_NANOS [iceberg]

2024-10-23 Thread via GitHub
pvary commented on code in PR #11348: URL: https://github.com/apache/iceberg/pull/11348#discussion_r1812916340 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/TypeToFlinkType.java: ## @@ -113,6 +113,15 @@ public LogicalType primitive(Type.PrimitiveType primitive) {

[PR] Spark: Randomize view/function names in testing [iceberg]

2024-10-23 Thread via GitHub
nastra opened a new pull request, #11381: URL: https://github.com/apache/iceberg/pull/11381 (no comment)

Re: [PR] Support partitioning spec during data file rewrites in Spark. [iceberg]

2024-10-23 Thread via GitHub
rdsarvar commented on code in PR #11368: URL: https://github.com/apache/iceberg/pull/11368#discussion_r1812745126 ## api/src/main/java/org/apache/iceberg/UpdatePartitionSpec.java: ## @@ -133,4 +133,16 @@ default UpdatePartitionSpec addNonDefaultSpec() { throw new Unsupporte

Re: [PR] Core: Add portable Roaring bitmap for row positions [iceberg]

2024-10-23 Thread via GitHub
aokolnychyi commented on code in PR #11372: URL: https://github.com/apache/iceberg/pull/11372#discussion_r1812903771 ## core/src/main/java/org/apache/iceberg/deletes/RoaringPositionBitmap.java: ## @@ -0,0 +1,309 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [PR] Core: Add portable Roaring bitmap for row positions [iceberg]

2024-10-23 Thread via GitHub
aokolnychyi commented on code in PR #11372: URL: https://github.com/apache/iceberg/pull/11372#discussion_r1812903771 ## core/src/main/java/org/apache/iceberg/deletes/RoaringPositionBitmap.java: ## @@ -0,0 +1,309 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [PR] Flink Support for TIMESTAMP_NANOS [iceberg]

2024-10-23 Thread via GitHub
pvary commented on code in PR #11348: URL: https://github.com/apache/iceberg/pull/11348#discussion_r1812947524 ## flink/v1.20/flink/src/test/java/org/apache/iceberg/flink/source/reader/TestColumnStatsWatermarkExtractorNano.java: ## @@ -0,0 +1,95 @@ +/* + * Licensed to the Apache

Re: [PR] Flink Support for TIMESTAMP_NANOS [iceberg]

2024-10-23 Thread via GitHub
pvary commented on PR #11348: URL: https://github.com/apache/iceberg/pull/11348#issuecomment-2432497304 There are several tests which are using `DataGenerator` to generate test data - do we want to add nanos to them?

Re: [PR] Core: Add portable Roaring bitmap for row positions [iceberg]

2024-10-23 Thread via GitHub
aokolnychyi commented on code in PR #11372: URL: https://github.com/apache/iceberg/pull/11372#discussion_r1812892153 ## core/src/main/java/org/apache/iceberg/deletes/RoaringPositionBitmap.java: ## @@ -0,0 +1,309 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [I] Serialization of the org.apache.iceberg.io.WriteResult class. [iceberg]

2024-10-23 Thread via GitHub
pvary commented on issue #10710: URL: https://github.com/apache/iceberg/issues/10710#issuecomment-2432586094 In Flink, it is possible to create a new type, like: ``` class WriteResultType extends TypeInformation ``` This can implement the `createSerializer` method, like: ```

Re: [PR] AWS: Support S3 directory bucket listing [iceberg]

2024-10-23 Thread via GitHub
jackye1995 commented on code in PR #11021: URL: https://github.com/apache/iceberg/pull/11021#discussion_r1813052151 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3URI.java: ## @@ -37,6 +37,9 @@ class S3URI { private static final String QUERY_DELIM = "\\?"; private static

Re: [PR] Core: Add portable Roaring bitmap for row positions [iceberg]

2024-10-23 Thread via GitHub
aokolnychyi commented on code in PR #11372: URL: https://github.com/apache/iceberg/pull/11372#discussion_r1813314555 ## core/src/main/java/org/apache/iceberg/deletes/RoaringPositionBitmap.java: ## @@ -0,0 +1,317 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [PR] Core: Add portable Roaring bitmap for row positions [iceberg]

2024-10-23 Thread via GitHub
aokolnychyi commented on code in PR #11372: URL: https://github.com/apache/iceberg/pull/11372#discussion_r1813322036 ## core/src/test/java/org/apache/iceberg/deletes/TestRoaringPositionBitmap.java: ## @@ -0,0 +1,323 @@ +/* + * Licensed to the Apache Software Foundation (ASF) und

Re: [PR] Core: Add portable Roaring bitmap for row positions [iceberg]

2024-10-23 Thread via GitHub
aokolnychyi commented on code in PR #11372: URL: https://github.com/apache/iceberg/pull/11372#discussion_r1813325394 ## core/src/test/java/org/apache/iceberg/deletes/TestRoaringPositionBitmap.java: ## @@ -0,0 +1,516 @@ +/* + * Licensed to the Apache Software Foundation (ASF) und

Re: [PR] Core: Add portable Roaring bitmap for row positions [iceberg]

2024-10-23 Thread via GitHub
aokolnychyi commented on code in PR #11372: URL: https://github.com/apache/iceberg/pull/11372#discussion_r1813327452 ## core/src/main/java/org/apache/iceberg/deletes/RoaringPositionBitmap.java: ## @@ -0,0 +1,309 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

[PR] Remove iceberg-pig [iceberg]

2024-10-23 Thread via GitHub
jbonofre opened a new pull request, #11380: URL: https://github.com/apache/iceberg/pull/11380 Following #11379 this PR completely removes iceberg-pig (for Iceberg 1.8.0).

Re: [PR] build(deps): bump github.com/apache/arrow-go/v18 from 18.0.0-20240924011512-14844aea3205 to 18.0.0-rc0 [iceberg-go]

2024-10-23 Thread via GitHub
dependabot[bot] closed pull request #180: build(deps): bump github.com/apache/arrow-go/v18 from 18.0.0-20240924011512-14844aea3205 to 18.0.0-rc0 URL: https://github.com/apache/iceberg-go/pull/180

Re: [PR] Aliyun: Remove spring-boot dependency [iceberg]

2024-10-23 Thread via GitHub
jbonofre commented on PR #11291: URL: https://github.com/apache/iceberg/pull/11291#issuecomment-2432883972 @findepi @manuzhang I fixed the spotless. Sorry for the inconvenience.

Re: [PR] ci: Fix CI for bindings python [iceberg-rust]

2024-10-23 Thread via GitHub
kevinjqliu commented on PR #678: URL: https://github.com/apache/iceberg-rust/pull/678#issuecomment-2433021197 FYI, CI failing on main https://github.com/apache/iceberg-rust/commits/main/ Is it related to this PR? error message, ``` error: use of deprecated method `open

Re: [PR] AWS: Support S3 directory bucket listing [iceberg]

2024-10-23 Thread via GitHub
jackye1995 commented on code in PR #11021: URL: https://github.com/apache/iceberg/pull/11021#discussion_r1813521623 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIO.java: ## @@ -297,7 +297,14 @@ private List deleteBatch(String bucket, Collection keysToDelete) @Overr

Re: [PR] Core: Optimize MergingSnapshotProducer to use referenced manifests to determine if manifest needs to be rewritten [iceberg]

2024-10-23 Thread via GitHub
amogh-jahagirdar commented on code in PR #11131: URL: https://github.com/apache/iceberg/pull/11131#discussion_r1813980941 ## core/src/main/java/org/apache/iceberg/ManifestFilterManager.java: ## @@ -323,11 +345,15 @@ private ManifestFile filterManifest(Schema tableSchema, Manife

Re: [PR] Flink Support for TIMESTAMP_NANOS [iceberg]

2024-10-23 Thread via GitHub
rodmeneses commented on code in PR #11348: URL: https://github.com/apache/iceberg/pull/11348#discussion_r1813237820 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/TypeToFlinkType.java: ## @@ -113,6 +113,15 @@ public LogicalType primitive(Type.PrimitiveType primitive

Re: [PR] Core: Optimize MergingSnapshotProducer to use referenced manifests to determine if manifest needs to be rewritten [iceberg]

2024-10-23 Thread via GitHub
rdblue commented on code in PR #11131: URL: https://github.com/apache/iceberg/pull/11131#discussion_r1813944812 ## core/src/main/java/org/apache/iceberg/ManifestFilterManager.java: ## @@ -323,11 +345,15 @@ private ManifestFile filterManifest(Schema tableSchema, ManifestFile man

Re: [PR] Core: Optimize MergingSnapshotProducer to use referenced manifests to determine if manifest needs to be rewritten [iceberg]

2024-10-23 Thread via GitHub
amogh-jahagirdar commented on code in PR #11131: URL: https://github.com/apache/iceberg/pull/11131#discussion_r1813980941 ## core/src/main/java/org/apache/iceberg/ManifestFilterManager.java: ## @@ -323,11 +345,15 @@ private ManifestFile filterManifest(Schema tableSchema, Manife

Re: [PR] Core: Optimize MergingSnapshotProducer to use referenced manifests to determine if manifest needs to be rewritten [iceberg]

2024-10-23 Thread via GitHub
amogh-jahagirdar commented on code in PR #11131: URL: https://github.com/apache/iceberg/pull/11131#discussion_r1813980941 ## core/src/main/java/org/apache/iceberg/ManifestFilterManager.java: ## @@ -323,11 +345,15 @@ private ManifestFile filterManifest(Schema tableSchema, Manife

Re: [PR] Core: Optimize MergingSnapshotProducer to use referenced manifests to determine if manifest needs to be rewritten [iceberg]

2024-10-23 Thread via GitHub
amogh-jahagirdar commented on code in PR #11131: URL: https://github.com/apache/iceberg/pull/11131#discussion_r1813980941 ## core/src/main/java/org/apache/iceberg/ManifestFilterManager.java: ## @@ -323,11 +345,15 @@ private ManifestFile filterManifest(Schema tableSchema, Manife

Re: [PR] Core: Optimize MergingSnapshotProducer to use referenced manifests to determine if manifest needs to be rewritten [iceberg]

2024-10-23 Thread via GitHub
rdblue commented on code in PR #11131: URL: https://github.com/apache/iceberg/pull/11131#discussion_r1813954048 ## core/src/main/java/org/apache/iceberg/ManifestFilterManager.java: ## @@ -323,11 +345,15 @@ private ManifestFile filterManifest(Schema tableSchema, ManifestFile man

Re: [PR] Core: Optimize MergingSnapshotProducer to use referenced manifests to determine if manifest needs to be rewritten [iceberg]

2024-10-23 Thread via GitHub
rdblue commented on code in PR #11131: URL: https://github.com/apache/iceberg/pull/11131#discussion_r1813956145 ## core/src/main/java/org/apache/iceberg/ManifestFilterManager.java: ## @@ -341,43 +367,44 @@ private ManifestFile filterManifest(Schema tableSchema, ManifestFile man

Re: [PR] Core: Optimize MergingSnapshotProducer to use referenced manifests to determine if manifest needs to be rewritten [iceberg]

2024-10-23 Thread via GitHub
rdblue commented on code in PR #11131: URL: https://github.com/apache/iceberg/pull/11131#discussion_r1813931853 ## core/src/main/java/org/apache/iceberg/ManifestFilterManager.java: ## @@ -323,11 +345,15 @@ private ManifestFile filterManifest(Schema tableSchema, ManifestFile man

Re: [PR] Core: Optimize MergingSnapshotProducer to use referenced manifests to determine if manifest needs to be rewritten [iceberg]

2024-10-23 Thread via GitHub
rdblue commented on code in PR #11131: URL: https://github.com/apache/iceberg/pull/11131#discussion_r1813944812 ## core/src/main/java/org/apache/iceberg/ManifestFilterManager.java: ## @@ -323,11 +345,15 @@ private ManifestFile filterManifest(Schema tableSchema, ManifestFile man

Re: [PR] feat: more builders and writing manifests [iceberg-go]

2024-10-23 Thread via GitHub
dwilson1988 commented on code in PR #177: URL: https://github.com/apache/iceberg-go/pull/177#discussion_r1813443609 ## manifest.go: ## @@ -876,7 +1030,140 @@ func (m *manifestEntryV2) FileSequenceNum() *int64 { return m.FileSeqNum } -func (m *manifestEntryV2) DataFile

Re: [PR] Flink: Maintenance - TableManager + ExpireSnapshots [iceberg]

2024-10-23 Thread via GitHub
stevenzwu commented on code in PR #11144: URL: https://github.com/apache/iceberg/pull/11144#discussion_r1813326096 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/api/MaintenanceTaskBuilder.java: ## @@ -0,0 +1,220 @@ +/* + * Licensed to the Apache Softwar

Re: [PR] [Views] Update view spec with table identifier requirements [iceberg]

2024-10-23 Thread via GitHub
wmoustafa commented on PR #11365: URL: https://github.com/apache/iceberg/pull/11365#issuecomment-2433322970 @rdblue @danielcweeks @stevenzwu @RussellSpitzer @bennychow Would be great to take a look.

Re: [PR] AWS: Support S3 directory bucket listing [iceberg]

2024-10-23 Thread via GitHub
jackye1995 commented on code in PR #11021: URL: https://github.com/apache/iceberg/pull/11021#discussion_r1813045425 ## aws/src/integration/java/org/apache/iceberg/aws/AwsIntegTestUtil.java: ## @@ -127,6 +129,47 @@ public static void cleanS3Bucket(S3Client s3, String bucketName,

Re: [PR] feat: more builders and writing manifests [iceberg-go]

2024-10-23 Thread via GitHub
dwilson1988 commented on code in PR #177: URL: https://github.com/apache/iceberg-go/pull/177#discussion_r1813467513 ## manifest.go: ## @@ -567,6 +569,96 @@ func ReadManifestList(in io.Reader) ([]ManifestFile, error) { return out, dec.Error() } +// WriteManifestListV2

Re: [PR] feat: more builders and writing manifests [iceberg-go]

2024-10-23 Thread via GitHub
dwilson1988 commented on code in PR #177: URL: https://github.com/apache/iceberg-go/pull/177#discussion_r1813528711 ## manifest.go: ## @@ -831,14 +946,53 @@ func (m *manifestEntryV1) FileSequenceNum() *int64 { return m.FileSeqNum } -func (m *manifestEntryV1) DataFile(

Re: [I] Update Table Error: UPDATE TABLE is not supported temporarily. [iceberg]

2024-10-23 Thread via GitHub
soumilshah1995 commented on issue #9960: URL: https://github.com/apache/iceberg/issues/9960#issuecomment-2433444077 +1

Re: [PR] Build: Enable errorprone PatternMatchingInstanceof [iceberg]

2024-10-23 Thread via GitHub
ebyhr commented on PR #11374: URL: https://github.com/apache/iceberg/pull/11374#issuecomment-2433488950 @nastra This is mainly for enforcing styles. I will close if we don't use errorprone for such purposes.

Re: [PR] Core: Add portable Roaring bitmap for row positions [iceberg]

2024-10-23 Thread via GitHub
aokolnychyi commented on code in PR #11372: URL: https://github.com/apache/iceberg/pull/11372#discussion_r1813326978 ## core/src/main/java/org/apache/iceberg/deletes/RoaringPositionBitmap.java: ## @@ -0,0 +1,309 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [PR] AWS: Use testcontainers-minio instead of S3Mock [iceberg]

2024-10-23 Thread via GitHub
sullis commented on code in PR #11349: URL: https://github.com/apache/iceberg/pull/11349#discussion_r1813136835 ## aws/src/test/java/org/apache/iceberg/aws/s3/MinioUtil.java: ## @@ -0,0 +1,64 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contr

Re: [PR] Spec: Adds Row Lineage [iceberg]

2024-10-23 Thread via GitHub
rdblue commented on code in PR #11130: URL: https://github.com/apache/iceberg/pull/11130#discussion_r1813670115 ## format/spec.md: ## @@ -298,16 +298,101 @@ Iceberg tables must not use field ids greater than 2147483447 (`Integer.MAX_VALU The set of metadata columns is: -|

Re: [I] Row Lineage for V3 [iceberg]

2024-10-23 Thread via GitHub
rdblue closed issue #11129: Row Lineage for V3 URL: https://github.com/apache/iceberg/issues/11129

Re: [PR] feat: more builders and writing manifests [iceberg-go]

2024-10-23 Thread via GitHub
zeroshade commented on code in PR #177: URL: https://github.com/apache/iceberg-go/pull/177#discussion_r1813481928 ## manifest.go: ## @@ -831,14 +946,53 @@ func (m *manifestEntryV1) FileSequenceNum() *int64 { return m.FileSeqNum } -func (m *manifestEntryV1) DataFile()

[I] Iceberg Extensions [iceberg-go]

2024-10-23 Thread via GitHub
dwilson1988 opened a new issue, #183: URL: https://github.com/apache/iceberg-go/issues/183 ### Feature Request / Improvement There are various table formats that extend Iceberg by allowing additional metadata to be added to various components, for example, [Havasu](https://githu

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-23 Thread via GitHub
rdblue commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1813814216 ## format/spec.md: ## @@ -454,35 +457,40 @@ The schema of a manifest file is a struct called `manifest_entry` with the follo `data_file` is a struct with the follo

Re: [PR] ci: Fix CI for bindings python [iceberg-rust]

2024-10-23 Thread via GitHub
Xuanwo commented on PR #678: URL: https://github.com/apache/iceberg-rust/pull/678#issuecomment-2433240259 > FYI, CI failing on main > > https://github.com/apache/iceberg-rust/commits/main/ > > > > Is it related to this PR? > > > > error message, >

Re: [PR] feat: more builders and writing manifests [iceberg-go]

2024-10-23 Thread via GitHub
dwilson1988 commented on code in PR #177: URL: https://github.com/apache/iceberg-go/pull/177#discussion_r1813442346 ## manifest.go: ## @@ -567,6 +569,96 @@ func ReadManifestList(in io.Reader) ([]ManifestFile, error) { return out, dec.Error() } +// WriteManifestListV2

Re: [PR] feat: more builders and writing manifests [iceberg-go]

2024-10-23 Thread via GitHub
dwilson1988 commented on code in PR #177: URL: https://github.com/apache/iceberg-go/pull/177#discussion_r1813446168 ## manifest.go: ## @@ -876,7 +1030,140 @@ func (m *manifestEntryV2) FileSequenceNum() *int64 { return m.FileSeqNum } -func (m *manifestEntryV2) DataFile

Re: [PR] feat: more builders and writing manifests [iceberg-go]

2024-10-23 Thread via GitHub
dwilson1988 commented on code in PR #177: URL: https://github.com/apache/iceberg-go/pull/177#discussion_r1813445471 ## manifest.go: ## @@ -876,7 +1030,140 @@ func (m *manifestEntryV2) FileSequenceNum() *int64 { return m.FileSeqNum } -func (m *manifestEntryV2) DataFile

Re: [PR] Reset Spark Conf for each test in TestCompressionSettings [iceberg]

2024-10-23 Thread via GitHub
RussellSpitzer commented on PR #11333: URL: https://github.com/apache/iceberg/pull/11333#issuecomment-2433510526 Thank you @huaxingao! It's always great to fix those hidden bad tests!

Re: [PR] Spec: Adds Row Lineage [iceberg]

2024-10-23 Thread via GitHub
rdblue commented on code in PR #11130: URL: https://github.com/apache/iceberg/pull/11130#discussion_r1813684620 ## format/spec.md: ## @@ -298,16 +298,101 @@ Iceberg tables must not use field ids greater than 2147483447 (`Integer.MAX_VALU The set of metadata columns is: -|

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-23 Thread via GitHub
rdblue commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1813656198 ## format/spec.md: ## @@ -841,19 +855,45 @@ Notes: ## Delete Formats -This section details how to encode row-level deletes in Iceberg delete files. Row-level del

Re: [PR] Spec: Adds Row Lineage [iceberg]

2024-10-23 Thread via GitHub
rdblue commented on code in PR #11130: URL: https://github.com/apache/iceberg/pull/11130#discussion_r1813660233 ## format/spec.md: ## @@ -298,16 +298,101 @@ Iceberg tables must not use field ids greater than 2147483447 (`Integer.MAX_VALU The set of metadata columns is: -|
