Fokko merged PR #6033:
URL: https://github.com/apache/iceberg/pull/6033
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apach
Fokko commented on code in PR #6010:
URL: https://github.com/apache/iceberg/pull/6010#discussion_r1003109634
##
python/pyiceberg/io/pyarrow.py:
##
@@ -66,10 +74,14 @@ class PyArrowFile(InputFile, OutputFile):
>>> # output_file.create().write(b'foobytes')
"""
-
gaborkaszab opened a new pull request, #6035:
URL: https://github.com/apache/iceberg/pull/6035
This patch seems to be present in 1.0.0 but missing in master.
ajantha-bhat commented on PR #6035:
URL: https://github.com/apache/iceberg/pull/6035#issuecomment-1288971855
Already merged yesterday?
https://github.com/apache/iceberg/pull/5916
gaborkaszab commented on PR #6035:
URL: https://github.com/apache/iceberg/pull/6035#issuecomment-1288977162
Thanks for letting me know! I need a rebase then :) Closing this
gaborkaszab closed pull request #6035: Core: Increase inferred column metrics
limit to 100
URL: https://github.com/apache/iceberg/pull/6035
jackye1995 commented on code in PR #6034:
URL: https://github.com/apache/iceberg/pull/6034#discussion_r1003415094
##
python/tests/catalog/test_glue.py:
##
@@ -0,0 +1,252 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.
huaxingao commented on PR #5961:
URL: https://github.com/apache/iceberg/pull/5961#issuecomment-1289199629
@rdblue Thank you very much for your review! I have addressed the comments.
Could you please take one more look when you have time? Thanks!
gaborkaszab commented on PR #6036:
URL: https://github.com/apache/iceberg/pull/6036#issuecomment-1289206280
@samredai @nastra
ismailsimsek commented on issue #5997:
URL: https://github.com/apache/iceberg/issues/5997#issuecomment-1289283783
@vshel any reason you are not [using Athena to do
compaction](https://docs.aws.amazon.com/athena/latest/ug/querying-iceberg-data-optimization.html)?
nastra opened a new pull request, #6037:
URL: https://github.com/apache/iceberg/pull/6037
The motivation for moving `ScanReport` to `iceberg-core` is that we don't actually need it in `iceberg-api`: `MetricsReporter` only requires `MetricsReport` to be in `iceberg-api`
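The module split described in that PR can be sketched as follows (in Python, purely for illustration; the actual Iceberg interfaces are Java, and their exact signatures may differ from this sketch):

```python
from abc import ABC, abstractmethod

class MetricsReport(ABC):
    """Marker abstraction; analogous to MetricsReport in iceberg-api."""
    pass

class MetricsReporter(ABC):
    """Analogous to MetricsReporter in iceberg-api: it only needs to
    know about the MetricsReport abstraction, not concrete reports."""
    @abstractmethod
    def report(self, report: MetricsReport) -> None: ...

class ScanReport(MetricsReport):
    """Concrete report type; analogous to ScanReport living in iceberg-core."""
    def __init__(self, table_name: str):
        self.table_name = table_name

class PrintingReporter(MetricsReporter):
    """Hypothetical reporter implementation used only for this sketch."""
    def report(self, report: MetricsReport) -> None:
        print(type(report).__name__)
```

Because the API layer only depends on the two abstractions, concrete report types can move to core without changing the API surface.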
nastra commented on code in PR #5968:
URL: https://github.com/apache/iceberg/pull/5968#discussion_r1003528420
##
core/src/main/java/org/apache/iceberg/rest/requests/CreateNamespaceRequest.java:
##
@@ -19,80 +19,24 @@
package org.apache.iceberg.rest.requests;
import java.util
rdblue commented on PR #6037:
URL: https://github.com/apache/iceberg/pull/6037#issuecomment-1289304312
Looks good to me. I like not having so many nested interfaces!
wypoon commented on PR #6026:
URL: https://github.com/apache/iceberg/pull/6026#issuecomment-1289315807
@flyrain for the test part, I followed your suggestion and added a test in
`TestSparkReaderDeletes` instead (removing the earlier one).
wypoon commented on code in PR #6026:
URL: https://github.com/apache/iceberg/pull/6026#discussion_r1003542296
##
spark/v3.3/spark/src/test/java/org/apache/iceberg/spark/source/TestSparkReaderDeletes.java:
##
@@ -508,6 +530,75 @@ public void testIsDeletedColumnWithoutDeleteFile()
wypoon commented on code in PR #6026:
URL: https://github.com/apache/iceberg/pull/6026#discussion_r1003550199
##
spark/v3.3/build.gradle:
##
@@ -140,6 +140,9 @@
project(":iceberg-spark:iceberg-spark-extensions-${sparkMajorVersion}_${scalaVer
exclude group: 'org.roaringbi
wypoon commented on code in PR #6026:
URL: https://github.com/apache/iceberg/pull/6026#discussion_r1003550728
##
spark/v3.3/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestParquetMergeOnRead.java:
##
@@ -0,0 +1,196 @@
+/*
+ * Licensed to the Apache Softwar
Fokko opened a new pull request, #6038:
URL: https://github.com/apache/iceberg/pull/6038
GitHub Pages breaks every time we do a force push because it removes the
`CNAME` file. This will fix it.
ahshahid opened a new issue, #6039:
URL: https://github.com/apache/iceberg/issues/6039
Spark has a Partition Pruning rule which, under the right conditions, can fetch
all the join keys from one side of the join and pass them as an IN-clause
filter to the other table.
For example, if the query is select
jzhuge commented on code in PR #4925:
URL: https://github.com/apache/iceberg/pull/4925#discussion_r1003609858
##
api/src/main/java/org/apache/iceberg/view/ViewBuilder.java:
##
@@ -0,0 +1,133 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contri
jzhuge commented on code in PR #4925:
URL: https://github.com/apache/iceberg/pull/4925#discussion_r1003616023
##
api/src/main/java/org/apache/iceberg/view/ViewBuilder.java:
##
@@ -0,0 +1,133 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contri
jzhuge commented on code in PR #4925:
URL: https://github.com/apache/iceberg/pull/4925#discussion_r1003621379
##
api/src/main/java/org/apache/iceberg/view/ViewBuilder.java:
##
@@ -0,0 +1,133 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contri
jzhuge commented on PR #4925:
URL: https://github.com/apache/iceberg/pull/4925#issuecomment-1289434209
@wmoustafa Thanks for the valuable questions and feedback. I echo your point
that it is confusing to store representations in different view versions.
I'd suggest this API contract:
JiJiTang commented on PR #5539:
URL: https://github.com/apache/iceberg/pull/5539#issuecomment-1289454522
cc @flyrain for review
rdblue merged PR #6037:
URL: https://github.com/apache/iceberg/pull/6037
JonasJ-ap opened a new pull request, #6040:
URL: https://github.com/apache/iceberg/pull/6040
- Add `AwsKmsClient`, which implements the `KmsClient` interface.
- Add unit tests
- Add integration tests
wypoon opened a new pull request, #6041:
URL: https://github.com/apache/iceberg/pull/6041
This is a port of https://github.com/apache/iceberg/pull/6026 to spark/v3.2.
wypoon commented on PR #6041:
URL: https://github.com/apache/iceberg/pull/6041#issuecomment-1289706498
@flyrain @chenjunjiedada this is a direct port of
https://github.com/apache/iceberg/pull/6026 to spark/v3.2.
szehon-ho commented on code in PR #6025:
URL: https://github.com/apache/iceberg/pull/6025#discussion_r1003844542
##
docs/spark-procedures.md:
##
@@ -421,12 +421,18 @@ Existing data files are added to the Iceberg table's
metadata and can be read us
To leave the original table
szehon-ho opened a new issue, #6042:
URL: https://github.com/apache/iceberg/issues/6042
### Feature Request / Improvement
@ajantha-bhat brought up that the Partitions table fields 'file_count' and
'record_count' do not reflect delete files, and was interested in fixing it.
One pos
flyrain merged PR #6026:
URL: https://github.com/apache/iceberg/pull/6026
flyrain commented on PR #6026:
URL: https://github.com/apache/iceberg/pull/6026#issuecomment-1289765350
Merged. Thanks @wypoon. Thanks @chenjunjiedada for the review.
rzhang10 commented on code in PR #4110:
URL: https://github.com/apache/iceberg/pull/4110#discussion_r1003864020
##
spark/v2.4/build.gradle:
##
@@ -121,6 +121,7 @@ project(':iceberg-spark:iceberg-spark-runtime') {
exclude group: 'org.xerial.snappy'
exclude group: 'j
github-actions[bot] commented on issue #4628:
URL: https://github.com/apache/iceberg/issues/4628#issuecomment-1289809920
This issue has been automatically marked as stale because it has been open
for 180 days with no activity. It will be closed in the next 14 days if no
further activity occurs.
github-actions[bot] commented on issue #4549:
URL: https://github.com/apache/iceberg/issues/4549#issuecomment-1289809974
This issue has been closed because it has not received any activity in the
last 14 days since being marked as 'stale'
github-actions[bot] closed issue #4549: HIVE_METASTORE_ERROR: Table storage
descriptor is missing SerDe info - when query a view using an Iceberg table on
Athena
URL: https://github.com/apache/iceberg/issues/4549
github-actions[bot] commented on issue #4542:
URL: https://github.com/apache/iceberg/issues/4542#issuecomment-1289809998
This issue has been closed because it has not received any activity in the
last 14 days since being marked as 'stale'
github-actions[bot] closed issue #4542: Schema Evolution exception: too many
data columns
URL: https://github.com/apache/iceberg/issues/4542
JonasJ-ap commented on code in PR #5994:
URL: https://github.com/apache/iceberg/pull/5994#discussion_r1003915535
##
docs/aws.md:
##
@@ -435,48 +437,23 @@ This is turned off by default.
### S3 Tags
Custom
[tags](https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-ta
hililiwei opened a new pull request, #6043:
URL: https://github.com/apache/iceberg/pull/6043
# Proposal: Partial Updates
## Motivation
Take feature engineering as an example: there are thousands or even tens of
thousands of columns in the table, but the task will update
ajantha-bhat commented on issue #6042:
URL: https://github.com/apache/iceberg/issues/6042#issuecomment-1289876866
@szehon-ho :
Say for `partition-a` I have `record_count`=6 and `file_count`=2 (3 records
in each file).
Now, I do a position delete which marks 3 records in file1 as deleted
szehon-ho commented on issue #6042:
URL: https://github.com/apache/iceberg/issues/6042#issuecomment-1289920910
Yeah, I think we can't do any arithmetic, otherwise it becomes a matter of
applying the delete file, which shouldn't be done in a metadata table. This
should be coming just from file
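The discrepancy discussed in issue #6042 can be sketched numerically (a hypothetical illustration using the figures from the comment above; the real Partitions metadata table is implemented in Java):

```python
# Sketch: the Partitions metadata table derives its counts from data-file
# metadata alone, without applying delete files.
data_files = [{"records": 3}, {"records": 3}]   # 2 data files, 3 rows each
position_deletes = [{"deleted_records": 3}]     # 3 rows of file1 marked deleted

# Counts as reported from data-file metadata only:
record_count = sum(f["records"] for f in data_files)  # 6, not 3: deletes unapplied
file_count = len(data_files)                          # 2: delete files not counted

# Resolving the "live" row count would require applying the delete file,
# which the thread argues a metadata table should not do.
print(record_count, file_count)  # prints: 6 2
```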
flyrain merged PR #6041:
URL: https://github.com/apache/iceberg/pull/6041
flyrain commented on PR #6041:
URL: https://github.com/apache/iceberg/pull/6041#issuecomment-1289925072
Thanks @wypoon for the PR. Thanks @chenjunjiedada for the review.
hililiwei commented on code in PR #5984:
URL: https://github.com/apache/iceberg/pull/5984#discussion_r1003963515
##
api/src/main/java/org/apache/iceberg/IncrementalScan.java:
##
@@ -21,6 +21,23 @@
/** API for configuring an incremental scan. */
public interface IncrementalScan
lvyanquan commented on PR #5561:
URL: https://github.com/apache/iceberg/pull/5561#issuecomment-1289931212
I have the same need as this PR, and we may need to add notes explaining the
parameter `flink.max-continuous-empty-commits`.
rbalamohan opened a new issue, #6044:
URL: https://github.com/apache/iceberg/issues/6044
### Apache Iceberg version
0.14.0
### Query engine
Spark
### Please describe the bug 🐞
Column projection/pruning is not happening in Iceberg tables for inner
queries.
chenwyi2 commented on issue #4137:
URL: https://github.com/apache/iceberg/issues/4137#issuecomment-1289942511
Has this problem been solved? I also hit this problem: the Iceberg commit
succeeded, but Flink failed to flush the snapshot state to the state backend.
Then, when I restarted the task, it can
ajantha-bhat commented on issue #6042:
URL: https://github.com/apache/iceberg/issues/6042#issuecomment-1289959557
> Yeah, I think we can't do any resolution of deletes, otherwise it becomes a
matter of applying the delete file, which shouldn't be done in a metadata
table. This should be coming
haizhou-zhao opened a new pull request, #6045:
URL: https://github.com/apache/iceberg/pull/6045
Build on top of this PR: https://github.com/apache/iceberg/pull/5763
This is to add group ownership support to iceberg-hive-metastore
wypoon opened a new pull request, #6046:
URL: https://github.com/apache/iceberg/pull/6046
This is a port of https://github.com/apache/iceberg/pull/6026 to spark/v3.1.
This is not a direct port, as the 3.1 code base lags the 3.3 and 3.2 code
base, but it is fairly straightforward.
pvary commented on PR #6043:
URL: https://github.com/apache/iceberg/pull/6043#issuecomment-1290016237
When we were developing updates for Hive tables, the first version of the
ACID implementation was to store only the updated data, which is very similar
to the partial updates suggested here
wypoon commented on code in PR #6046:
URL: https://github.com/apache/iceberg/pull/6046#discussion_r1004028589
##
spark/v3.1/spark/src/test/java/org/apache/iceberg/spark/source/TestSparkReaderDeletes.java:
##
@@ -117,16 +131,26 @@ public static void stopMetastoreAndSpark() throws
hililiwei commented on code in PR #5967:
URL: https://github.com/apache/iceberg/pull/5967#discussion_r1004060008
##
docs/flink-getting-started.md:
##
@@ -683,7 +683,47 @@ env.execute("Test Iceberg DataStream");
OVERWRITE and UPSERT can't be set together. In UPSERT mode, if the
hililiwei commented on code in PR #5967:
URL: https://github.com/apache/iceberg/pull/5967#discussion_r1004060861
##
docs/flink-getting-started.md:
##
@@ -683,7 +683,47 @@ env.execute("Test Iceberg DataStream");
OVERWRITE and UPSERT can't be set together. In UPSERT mode, if the