manuzhang commented on PR #9584:
URL: https://github.com/apache/iceberg/pull/9584#issuecomment-1925616713
Rebased on #9605
liurenjie1024 commented on issue #184:
URL: https://github.com/apache/iceberg-rust/issues/184#issuecomment-1925613604
I agree that this may be a little wasteful, but the cost may be small: the
heavy ones are held by `Arc`. We can do this optimization when necessary.
liurenjie1024 commented on code in PR #186:
URL: https://github.com/apache/iceberg-rust/pull/186#discussion_r1477195324
##
crates/iceberg/src/catalog/mod.rs:
##
@@ -25,16 +25,16 @@ use crate::spec::{
};
use crate::table::Table;
use crate::{Error, ErrorKind, Result};
-use asyn
liurenjie1024 commented on PR #185:
URL: https://github.com/apache/iceberg-rust/pull/185#issuecomment-1925611867
cc @ZENOTME @Xuanwo @Fokko PTAL
dependabot[bot] opened a new pull request, #9639:
URL: https://github.com/apache/iceberg/pull/9639
Bumps
[datamodel-code-generator](https://github.com/koxudaxi/datamodel-code-generator)
from 0.25.2 to 0.25.3.
Release notes
Sourced from https://github.com/koxudaxi/datamodel-code-ge
dependabot[bot] closed pull request #9566: Build: Bump mkdocs-material from
9.5.3 to 9.5.5
URL: https://github.com/apache/iceberg/pull/9566
dependabot[bot] opened a new pull request, #9638:
URL: https://github.com/apache/iceberg/pull/9638
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from
9.5.3 to 9.5.7.
Release notes
Sourced from https://github.com/squidfunk/mkdocs-material/releases
dependabot[bot] commented on PR #9566:
URL: https://github.com/apache/iceberg/pull/9566#issuecomment-1925580877
Superseded by #9638.
manuzhang commented on PR #9611:
URL: https://github.com/apache/iceberg/pull/9611#issuecomment-1925577015
@amogh-jahagirdar This is based on our discussion in
[#9400](https://github.com/apache/iceberg/pull/9400#discussion_r1442236190),
but I'd like to go one step further. Throwing exception
dependabot[bot] opened a new pull request, #9637:
URL: https://github.com/apache/iceberg/pull/9637
Bumps
[com.palantir.baseline:gradle-baseline-java](https://github.com/palantir/gradle-baseline)
from 4.42.0 to 5.37.0.
Release notes
Sourced from https://github.com/palantir/gradle-b
dependabot[bot] commented on PR #9540:
URL: https://github.com/apache/iceberg/pull/9540#issuecomment-1925575022
Superseded by #9634.
dependabot[bot] opened a new pull request, #9635:
URL: https://github.com/apache/iceberg/pull/9635
Bumps
[com.google.cloud:libraries-bom](https://github.com/googleapis/java-cloud-bom)
from 26.28.0 to 26.31.0.
Release notes
Sourced from https://github.com/googleapis/java-cloud-bom/
dependabot[bot] commented on PR #9567:
URL: https://github.com/apache/iceberg/pull/9567#issuecomment-1925575127
Superseded by #9637.
dependabot[bot] opened a new pull request, #9633:
URL: https://github.com/apache/iceberg/pull/9633
Bumps software.amazon.awssdk:bom from 2.23.12 to 2.23.17.
[
from 0.6.0 to 3.1.0.
Release notes
Sourced from https://github.com/delta-io/delta/releases
dependabot[bot] commented on PR #9534:
URL: https://github.com/apache/iceberg/pull/9534#issuecomment-1925575046
Superseded by #9635.
dependabot[bot] closed pull request #9534: Build: Bump
com.google.cloud:libraries-bom from 26.28.0 to 26.30.0
URL: https://github.com/apache/iceberg/pull/9534
dependabot[bot] closed pull request #9540: Build: Bump org.xerial:sqlite-jdbc
from 3.44.0.0 to 3.45.0.0
URL: https://github.com/apache/iceberg/pull/9540
dependabot[bot] opened a new pull request, #9634:
URL: https://github.com/apache/iceberg/pull/9634
Bumps [org.xerial:sqlite-jdbc](https://github.com/xerial/sqlite-jdbc) from
3.44.0.0 to 3.45.1.0.
Release notes
Sourced from https://github.com/xerial/sqlite-jdbc/releases
dependabot[bot] opened a new pull request, #9632:
URL: https://github.com/apache/iceberg/pull/9632
Bumps `jetty` from 9.4.53.v20231009 to 11.0.20.
Updates `org.eclipse.jetty:jetty-server` from 9.4.53.v20231009 to 11.0.20
Updates `org.eclipse.jetty:jetty-servlet` from 9.4.53.v2023100
dependabot[bot] opened a new pull request, #9631:
URL: https://github.com/apache/iceberg/pull/9631
Bumps [io.delta:delta-spark_2.12](https://github.com/delta-io/delta) from
3.0.0 to 3.1.0.
Release notes
Sourced from https://github.com/delta-io/delta/releases
manuzhang commented on PR #9585:
URL: https://github.com/apache/iceberg/pull/9585#issuecomment-1925574651
@rdblue Thanks for review. I've updated the PR accordingly.
I have a question about what counts as recommended usage and what doesn't. How is
that conveyed? For example, I don't find much inf
advancedxy commented on code in PR #9629:
URL: https://github.com/apache/iceberg/pull/9629#discussion_r1477165092
##
core/src/main/java/org/apache/iceberg/PartitionData.java:
##
@@ -171,6 +169,10 @@ public PartitionData copy() {
return new PartitionData(this);
}
+ pub
link3280 commented on issue #9071:
URL: https://github.com/apache/iceberg/issues/9071#issuecomment-1925554243
We hit the same problem here with Iceberg 1.3.0.
The bug affects not only data files but also metadata.json and .avro files.
Files that are created twice can be corrupted (1-2% c
wgtmac commented on PR #9630:
URL: https://github.com/apache/iceberg/pull/9630#issuecomment-1925552879
Thanks @amogh-jahagirdar for the quick review!
github-actions[bot] commented on issue #835:
URL: https://github.com/apache/iceberg/issues/835#issuecomment-1925495813
This issue has been automatically marked as stale because it has been open
for 180 days with no activity. It will be closed in the next 14 days if no further
activity occurs. T
github-actions[bot] commented on issue #274:
URL: https://github.com/apache/iceberg/issues/274#issuecomment-1925495705
This issue has been closed because it has not received any activity in the
last 14 days since being marked as 'stale'
github-actions[bot] closed issue #274: Error while using bucket partitions
URL: https://github.com/apache/iceberg/issues/274
github-actions[bot] commented on issue #834:
URL: https://github.com/apache/iceberg/issues/834#issuecomment-1925495805
This issue has been automatically marked as stale because it has been open
for 180 days with no activity. It will be closed in the next 14 days if no further
activity occurs. T
github-actions[bot] commented on issue #822:
URL: https://github.com/apache/iceberg/issues/822#issuecomment-1925495799
This issue has been automatically marked as stale because it has been open
for 180 days with no activity. It will be closed in the next 14 days if no further
activity occurs. T
github-actions[bot] commented on issue #798:
URL: https://github.com/apache/iceberg/issues/798#issuecomment-1925495790
This issue has been automatically marked as stale because it has been open
for 180 days with no activity. It will be closed in the next 14 days if no further
activity occurs. T
amogh-jahagirdar commented on code in PR #9620:
URL: https://github.com/apache/iceberg/pull/9620#discussion_r1477127468
##
core/src/main/java/org/apache/iceberg/rest/RESTViewOperations.java:
##
@@ -59,6 +60,8 @@ public void commit(ViewMetadata base, ViewMetadata metadata) {
syun64 commented on code in PR #358:
URL: https://github.com/apache/iceberg-python/pull/358#discussion_r1477128027
##
pyiceberg/catalog/rest.py:
##
@@ -450,6 +450,10 @@ def create_table(
iceberg_schema = self._convert_schema_if_needed(schema)
iceberg_schema = a
syun64 opened a new issue, #362:
URL: https://github.com/apache/iceberg-python/issues/362
### Feature Request / Improvement
In the Java code base, the catalog configuration includes catalog-level
table-default and table-override properties:
Catalog Property Key | Description
-- | --
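For context, the Java catalogs expose these keys with the `table-default.` and `table-override.` prefixes. Below is a rough Python sketch of how such catalog-level properties could be merged with user-supplied table properties; the helper and its names are illustrative, not PyIceberg code.

```python
# Illustrative helper only (not PyIceberg's implementation); the property prefixes
# mirror the Java catalog property constants.
TABLE_DEFAULT_PREFIX = "table-default."
TABLE_OVERRIDE_PREFIX = "table-override."

def merge_table_properties(catalog_props: dict, user_props: dict) -> dict:
    defaults = {
        key[len(TABLE_DEFAULT_PREFIX):]: value
        for key, value in catalog_props.items()
        if key.startswith(TABLE_DEFAULT_PREFIX)
    }
    overrides = {
        key[len(TABLE_OVERRIDE_PREFIX):]: value
        for key, value in catalog_props.items()
        if key.startswith(TABLE_OVERRIDE_PREFIX)
    }
    # Defaults yield to explicit user-supplied properties; overrides win over both.
    return {**defaults, **user_props, **overrides}

catalog_props = {
    "table-default.write.parquet.compression-codec": "zstd",
    "table-override.format-version": "2",
}
print(merge_table_properties(catalog_props, {"format-version": "1"}))
# -> {'write.parquet.compression-codec': 'zstd', 'format-version': '2'}
```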
syun64 commented on code in PR #358:
URL: https://github.com/apache/iceberg-python/pull/358#discussion_r1477128027
##
pyiceberg/catalog/rest.py:
##
@@ -450,6 +450,10 @@ def create_table(
iceberg_schema = self._convert_schema_if_needed(schema)
iceberg_schema = a
syun64 commented on code in PR #358:
URL: https://github.com/apache/iceberg-python/pull/358#discussion_r1477128678
##
pyiceberg/catalog/rest.py:
##
@@ -450,6 +450,10 @@ def create_table(
iceberg_schema = self._convert_schema_if_needed(schema)
iceberg_schema = a
syun64 commented on code in PR #358:
URL: https://github.com/apache/iceberg-python/pull/358#discussion_r1477127850
##
pyiceberg/io/pyarrow.py:
##
@@ -1720,13 +1720,22 @@ def write_file(table: Table, tasks:
Iterator[WriteTask]) -> Iterator[DataFile]:
except StopIteration:
syun64 commented on code in PR #358:
URL: https://github.com/apache/iceberg-python/pull/358#discussion_r1477127850
##
pyiceberg/io/pyarrow.py:
##
@@ -1720,13 +1720,22 @@ def write_file(table: Table, tasks:
Iterator[WriteTask]) -> Iterator[DataFile]:
except StopIteration:
syun64 commented on code in PR #358:
URL: https://github.com/apache/iceberg-python/pull/358#discussion_r1477128027
##
pyiceberg/catalog/rest.py:
##
@@ -450,6 +450,10 @@ def create_table(
iceberg_schema = self._convert_schema_if_needed(schema)
iceberg_schema = a
syun64 commented on code in PR #358:
URL: https://github.com/apache/iceberg-python/pull/358#discussion_r1477127850
##
pyiceberg/io/pyarrow.py:
##
@@ -1720,13 +1720,22 @@ def write_file(table: Table, tasks:
Iterator[WriteTask]) -> Iterator[DataFile]:
except StopIteration:
rdblue commented on code in PR #9585:
URL: https://github.com/apache/iceberg/pull/9585#discussion_r1477126737
##
docs/java-api-quickstart.md:
##
@@ -38,37 +38,42 @@ The Hive catalog connects to a Hive metastore to keep track
of Iceberg tables.
You can initialize a Hive catalog
rdblue commented on code in PR #9585:
URL: https://github.com/apache/iceberg/pull/9585#discussion_r1477126762
##
docs/java-api-quickstart.md:
##
@@ -38,37 +38,42 @@ The Hive catalog connects to a Hive metastore to keep track
of Iceberg tables.
You can initialize a Hive catalog
rdblue commented on code in PR #9585:
URL: https://github.com/apache/iceberg/pull/9585#discussion_r1477126396
##
docs/java-api-quickstart.md:
##
@@ -38,37 +38,42 @@ The Hive catalog connects to a Hive metastore to keep track
of Iceberg tables.
You can initialize a Hive catalog
rdblue commented on code in PR #9585:
URL: https://github.com/apache/iceberg/pull/9585#discussion_r1477126293
##
docs/java-api-quickstart.md:
##
@@ -38,37 +38,42 @@ The Hive catalog connects to a Hive metastore to keep track
of Iceberg tables.
You can initialize a Hive catalog
rdblue commented on code in PR #9585:
URL: https://github.com/apache/iceberg/pull/9585#discussion_r1477126245
##
docs/java-api-quickstart.md:
##
@@ -38,37 +38,42 @@ The Hive catalog connects to a Hive metastore to keep track
of Iceberg tables.
You can initialize a Hive catalog
rdblue commented on code in PR #9620:
URL: https://github.com/apache/iceberg/pull/9620#discussion_r1477126112
##
core/src/main/java/org/apache/iceberg/view/ViewProperties.java:
##
@@ -26,6 +26,8 @@ public class ViewProperties {
public static final String METADATA_COMPRESSION
rdblue commented on code in PR #9620:
URL: https://github.com/apache/iceberg/pull/9620#discussion_r1477125566
##
core/src/main/java/org/apache/iceberg/rest/RESTViewOperations.java:
##
@@ -59,6 +60,8 @@ public void commit(ViewMetadata base, ViewMetadata metadata) {
// this i
rdblue commented on code in PR #9323:
URL: https://github.com/apache/iceberg/pull/9323#discussion_r1477125444
##
core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java:
##
@@ -221,34 +223,52 @@ protected boolean addsDeleteFiles() {
/** Add a data file to the new s
rdblue commented on code in PR #9323:
URL: https://github.com/apache/iceberg/pull/9323#discussion_r1477125444
##
core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java:
##
@@ -221,34 +223,52 @@ protected boolean addsDeleteFiles() {
/** Add a data file to the new s
rdblue commented on PR #9629:
URL: https://github.com/apache/iceberg/pull/9629#issuecomment-1925450899
Thanks for the fix, @aokolnychyi! I think it's important to get this into
1.5 so I merged this. The method name should be okay.
rdblue commented on PR #9466:
URL: https://github.com/apache/iceberg/pull/9466#issuecomment-1925450414
Thanks, @bryanck! This is looking great and I'm excited to get the next
steps in.
Also thanks to @fqaiser94 for reviewing!
rdblue merged PR #9629:
URL: https://github.com/apache/iceberg/pull/9629
rdblue merged PR #9466:
URL: https://github.com/apache/iceberg/pull/9466
amogh-jahagirdar commented on PR #358:
URL: https://github.com/apache/iceberg-python/pull/358#issuecomment-1925449043
> For the case (no compression specified) the tests currently pass
locally but they shouldn't as we never set zstd as the default
The default parquet compression is Z
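For readers following the thread, here is a minimal sketch of the behavior under discussion, not the PR's actual code: resolve the codec from the table's `write.parquet.compression-codec` property, fall back to a default such as zstd, and pass it to pyarrow's Parquet writer. The helper name is an assumption.

```python
# Sketch only: the helper name and the zstd fallback are illustrative assumptions.
import pyarrow as pa
import pyarrow.parquet as pq

def open_parquet_writer(path: str, schema: pa.Schema, table_properties: dict) -> pq.ParquetWriter:
    codec = table_properties.get("write.parquet.compression-codec", "zstd")
    # pyarrow's ParquetWriter takes the codec via its `compression` argument.
    return pq.ParquetWriter(path, schema, compression=codec)

schema = pa.schema([("id", pa.int64())])
with open_parquet_writer("/tmp/example.parquet", schema,
                         {"write.parquet.compression-codec": "snappy"}) as writer:
    writer.write_table(pa.table({"id": [1, 2, 3]}, schema=schema))
```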
syun64 commented on code in PR #358:
URL: https://github.com/apache/iceberg-python/pull/358#discussion_r1477123601
##
pyiceberg/io/pyarrow.py:
##
@@ -1720,13 +1720,22 @@ def write_file(table: Table, tasks:
Iterator[WriteTask]) -> Iterator[DataFile]:
except StopIteration:
danielcweeks merged PR #9628:
URL: https://github.com/apache/iceberg/pull/9628
jonashaag commented on PR #358:
URL: https://github.com/apache/iceberg-python/pull/358#issuecomment-1925434691
Can you start CI @syun64?
jonashaag commented on code in PR #358:
URL: https://github.com/apache/iceberg-python/pull/358#discussion_r1477115866
##
pyiceberg/io/pyarrow.py:
##
@@ -1720,13 +1720,22 @@ def write_file(table: Table, tasks:
Iterator[WriteTask]) -> Iterator[DataFile]:
except StopIteration
jonashaag commented on code in PR #358:
URL: https://github.com/apache/iceberg-python/pull/358#discussion_r1477115723
##
pyiceberg/io/pyarrow.py:
##
@@ -1720,13 +1720,22 @@ def write_file(table: Table, tasks:
Iterator[WriteTask]) -> Iterator[DataFile]:
except StopIteration
syun64 commented on PR #358:
URL: https://github.com/apache/iceberg-python/pull/358#issuecomment-1925432345
@jonashaag thank you for raising the issue and putting this PR together so
quickly! We are very excited to group this fix in with the impending 0.6.0
release. I've left some comments
syun64 commented on code in PR #358:
URL: https://github.com/apache/iceberg-python/pull/358#discussion_r1477114478
##
tests/integration/test_writes.py:
##
@@ -489,6 +492,50 @@ def test_data_files(spark: SparkSession, session_catalog:
Catalog, arrow_table_w
assert [row.dele
syun64 commented on code in PR #358:
URL: https://github.com/apache/iceberg-python/pull/358#discussion_r1477114250
##
pyiceberg/io/pyarrow.py:
##
@@ -1720,13 +1720,22 @@ def write_file(table: Table, tasks:
Iterator[WriteTask]) -> Iterator[DataFile]:
except StopIteration:
kevinjqliu commented on issue #326:
URL: https://github.com/apache/iceberg-python/issues/326#issuecomment-1925431063
Added example to "getting started" in #361
Didn't use `tempfile` there since I think it's useful to be able to see what
data and metadata files are generated by Iceber
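A minimal local setup in that spirit (the paths and namespace here are placeholders, not necessarily the snippet in #361): a SQLite-backed `SqlCatalog` with an on-disk warehouse, so the generated data and metadata files remain visible after the script exits.

```python
# Hypothetical getting-started sketch; the real example lives in PR #361 and may differ.
import os
from pyiceberg.catalog.sql import SqlCatalog

warehouse_path = "./warehouse"  # a plain directory instead of a tempfile.TemporaryDirectory
os.makedirs(warehouse_path, exist_ok=True)

catalog = SqlCatalog(
    "default",
    uri=f"sqlite:///{warehouse_path}/pyiceberg_catalog.db",
    warehouse=f"file://{warehouse_path}",
)
catalog.create_namespace("default")
# Tables created through this catalog write their Parquet data and metadata files
# under ./warehouse, where they can be inspected directly.
```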
kevinjqliu opened a new pull request, #361:
URL: https://github.com/apache/iceberg-python/pull/361
Related to #326 [Add example of using PyIceberg with minimal external
dependencies]
syun64 commented on code in PR #358:
URL: https://github.com/apache/iceberg-python/pull/358#discussion_r1477113970
##
tests/integration/test_writes.py:
##
@@ -489,6 +492,50 @@ def test_data_files(spark: SparkSession, session_catalog:
Catalog, arrow_table_w
assert [row.dele
kevinjqliu opened a new pull request, #360:
URL: https://github.com/apache/iceberg-python/pull/360
`create_engine`'s `echo` is useful for debugging purposes.
Otherwise, it exposes a lot of internal SQLite information when it's not
needed.
[Screenshot: 2024-02-03 at 11:01:53]
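For illustration, a small sketch of the flag being discussed (the wrapper function and its `debug` parameter are assumptions, not the PR's code): SQLAlchemy's `echo` option logs every emitted SQL statement, which is useful while debugging but noisy otherwise.

```python
# Sketch of the flag in question; the wrapper function is illustrative.
from sqlalchemy import create_engine

def make_engine(uri: str, debug: bool = False):
    # echo=True makes SQLAlchemy log every SQL statement it emits; keep it off by default.
    return create_engine(uri, echo=debug)

engine = make_engine("sqlite:///:memory:")                    # quiet by default
debug_engine = make_engine("sqlite:///:memory:", debug=True)  # verbose, for debugging
```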
syun64 commented on code in PR #358:
URL: https://github.com/apache/iceberg-python/pull/358#discussion_r1477113742
##
pyiceberg/io/pyarrow.py:
##
@@ -1720,13 +1720,22 @@ def write_file(table: Table, tasks:
Iterator[WriteTask]) -> Iterator[DataFile]:
except StopIteration:
jonashaag opened a new pull request, #358:
URL: https://github.com/apache/iceberg-python/pull/358
I had to change the `metadata_collector` code due to
https://github.com/dask/dask/issues/7977.
For the case (no compression specified) the tests currently pass
locally but they shouldn't, as we never set zstd as the default.
jonashaag commented on issue #345:
URL: https://github.com/apache/iceberg-python/issues/345#issuecomment-1925385537
Update: Hm, now it doesn't seem to be the case anymore. Not sure what
happened there...
amogh-jahagirdar merged PR #9630:
URL: https://github.com/apache/iceberg/pull/9630
jonashaag commented on issue #345:
URL: https://github.com/apache/iceberg-python/issues/345#issuecomment-1925368819
I think the REST catalog ignores the `write.parquet.compression-codec`
option. No matter what options I set for the catalog, it always responds here
https://github.com/apache/
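To make the distinction being debugged concrete, here is a hedged sketch contrasting a catalog-level setting with an explicit table property; the endpoint URI, credentials, and table name are placeholders.

```python
# Placeholder URI and table name; shown only to contrast the two levels at which
# the property can be set.
from pyiceberg.catalog import load_catalog
from pyiceberg.schema import Schema
from pyiceberg.types import LongType, NestedField

catalog = load_catalog(
    "rest",
    **{
        "uri": "http://localhost:8181",
        # Set here, the codec is a catalog client setting; the REST service may not
        # apply it to newly created tables.
        "write.parquet.compression-codec": "snappy",
    },
)

schema = Schema(NestedField(field_id=1, name="id", field_type=LongType(), required=True))

# Set as a table property, it becomes part of the table metadata explicitly.
table = catalog.create_table(
    "default.events",
    schema=schema,
    properties={"write.parquet.compression-codec": "snappy"},
)
```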
odysa opened a new pull request, #186:
URL: https://github.com/apache/iceberg-rust/pull/186
Closes #139.
Use `trait_variant::make` to support `async fn` in public traits. It creates two
traits: `LocalCatalog` for single-threaded use and `Catalog` (with `Send`) for
multithreaded runtimes.
wgtmac opened a new pull request, #9630:
URL: https://github.com/apache/iceberg/pull/9630
I believe the BagePageReader class got its name due to a typo. Fortunately,
it is not a public class, so we have the chance to rename it to BasePageReader.
manuzhang commented on PR #9584:
URL: https://github.com/apache/iceberg/pull/9584#issuecomment-1925339993
@amogh-jahagirdar In my case of Flink upserting an Iceberg table, I had rolled
back the table state to the previous snapshot and asked the upstream user to replay
from the corresponding checkpoi
zinking commented on code in PR #6581:
URL: https://github.com/apache/iceberg/pull/6581#discussion_r1477056633
##
spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/actions/RemoveDanglingDeletesSparkAction.java:
##
@@ -0,0 +1,227 @@
+/*
+ * Licensed to the Apache Software F
zinking commented on code in PR #6581:
URL: https://github.com/apache/iceberg/pull/6581#discussion_r1477056187
##
spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/actions/RemoveDanglingDeletesSparkAction.java:
##
@@ -0,0 +1,227 @@
+/*
+ * Licensed to the Apache Software F
BsoBird commented on PR #9546:
URL: https://github.com/apache/iceberg/pull/9546#issuecomment-1925291024
@RussellSpitzer
Hello, sir. I've optimised the HadoopCatalog implementation and I now
believe that its execution behaviour is basically spec-compliant. We don't need
the CommitStateUn
rahij opened a new issue, #357:
URL: https://github.com/apache/iceberg-python/issues/357
### Feature Request / Improvement
I am trying to understand how the new Arrow write API can work with
distributed writes, similar to Spark. I have a use case where, from different
machines, I would
rahij commented on PR #356:
URL: https://github.com/apache/iceberg-python/pull/356#issuecomment-1925250020
@Fokko would you be the right person to review this?
rahij opened a new pull request, #356:
URL: https://github.com/apache/iceberg-python/pull/356
When the table does not exist, SQLAlchemy raises a different exception for
Postgres than for SQLite. Hence, when using Postgres, it is not even possible
to create the catalog, as the excep
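As a rough sketch of the general pattern (the PR's actual exception handling may differ): SQLite reports a missing table as `OperationalError`, while Postgres raises `ProgrammingError`, so code that catches only one of them breaks on the other backend.

```python
# General pattern only; the PR's actual exception handling may differ.
from sqlalchemy import create_engine, text
from sqlalchemy.exc import OperationalError, ProgrammingError

def table_exists(engine, table_name: str) -> bool:
    try:
        with engine.connect() as conn:
            conn.execute(text(f"SELECT 1 FROM {table_name} LIMIT 1"))
        return True
    except (OperationalError, ProgrammingError):
        # "no such table" (SQLite) or "relation ... does not exist" (Postgres)
        return False

engine = create_engine("sqlite:///:memory:")
print(table_exists(engine, "iceberg_tables"))  # False on a fresh database
```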
zeodtr commented on issue #177:
URL: https://github.com/apache/iceberg-rust/issues/177#issuecomment-1925228247
@odysa Oh sorry, I totally misunderstood the code. Upon seeing the
`expect()` message `"current_schema_id not found in schemas"`, I thought it was
the case of `'current_schema_id ==