huaxingao commented on code in PR #11390:
URL: https://github.com/apache/iceberg/pull/11390#discussion_r1817603690
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/BaseBatchReader.java:
@@ -81,14 +84,15 @@ private CloseableIterable<ColumnarBatch> newParquetIterable(
huaxingao commented on code in PR #11390:
URL: https://github.com/apache/iceberg/pull/11390#discussion_r1817447914
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/BaseBatchReader.java:
@@ -81,14 +84,15 @@ private CloseableIterable<ColumnarBatch> newParquetIterable(
viirya commented on code in PR #11390:
URL: https://github.com/apache/iceberg/pull/11390#discussion_r1817444036
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/BaseBatchReader.java:
@@ -81,14 +84,15 @@ private CloseableIterable<ColumnarBatch> newParquetIterable(
Spa
huaxingao commented on code in PR #11390:
URL: https://github.com/apache/iceberg/pull/11390#discussion_r1817436902
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/BaseBatchReader.java:
@@ -125,4 +129,28 @@ private CloseableIterable<ColumnarBatch> newOrcIterable(
.
viirya commented on code in PR #11390:
URL: https://github.com/apache/iceberg/pull/11390#discussion_r1817429106
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/BaseBatchReader.java:
@@ -125,4 +129,28 @@ private CloseableIterable<ColumnarBatch> newOrcIterable(
.wit
huaxingao commented on PR #11390:
URL: https://github.com/apache/iceberg/pull/11390#issuecomment-2438965356
@pvary Thank you for your suggestion! You're correct that adding such a test
would help prevent future changes from inadvertently affecting this behavior
without notice. Currently, Sp
dramaticlly commented on PR #11390:
URL: https://github.com/apache/iceberg/pull/11390#issuecomment-2438914981
> @huaxingao it's a good find, I'm just wondering: where do we add _pos to the
schema? Can we just not do it there? Just curious if it's possible.
I think it might be from here
h
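For anyone following along, the general pattern by which `_pos` gets joined into the read schema looks roughly like this (a sketch with made-up schemas, not the exact code referenced above):

```java
import org.apache.iceberg.MetadataColumns;
import org.apache.iceberg.Schema;
import org.apache.iceberg.types.TypeUtil;
import org.apache.iceberg.types.Types;

public class RequiredSchemaSketch {
  public static void main(String[] args) {
    // Hypothetical requested projection: only `id`.
    Schema requested = new Schema(
        Types.NestedField.required(1, "id", Types.IntegerType.get()));

    // When position deletes have to be applied, the row-position metadata
    // column is joined into the schema that the file format reader is asked
    // to produce.
    Schema required = TypeUtil.join(requested, new Schema(MetadataColumns.ROW_POSITION));
    System.out.println(required.asStruct());
  }
}
```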
huaxingao commented on PR #11390:
URL: https://github.com/apache/iceberg/pull/11390#issuecomment-2438938087
@szehon-ho I think we still need the `_pos` in the `requiredSchema` to build
[`posAccessor`](https://github.com/apache/iceberg/blob/main/data/src/main/java/org/apache/iceberg/data/Dele
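Roughly, the reason `_pos` has to stay in the required schema is that the accessor for the row position is built from that schema. A minimal sketch (hypothetical schema, not the linked code):

```java
import org.apache.iceberg.Accessor;
import org.apache.iceberg.MetadataColumns;
import org.apache.iceberg.Schema;
import org.apache.iceberg.StructLike;
import org.apache.iceberg.types.Types;

public class PosAccessorSketch {
  public static void main(String[] args) {
    // Hypothetical required schema: the projected column plus the _pos metadata column.
    Schema requiredSchema = new Schema(
        Types.NestedField.required(1, "id", Types.IntegerType.get()),
        MetadataColumns.ROW_POSITION);

    // The delete logic looks up each row's position through an accessor built
    // from this schema; if _pos were projected away, the lookup would return
    // null and position deletes could not be applied.
    Accessor<StructLike> posAccessor =
        requiredSchema.accessorForField(MetadataColumns.ROW_POSITION.fieldId());
    System.out.println("pos accessor: " + posAccessor);
  }
}
```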
szehon-ho commented on PR #11390:
URL: https://github.com/apache/iceberg/pull/11390#issuecomment-2438907910
@huaxingao it's a good find, I'm just wondering: where do we add _pos to the
schema? Can we just not do it there? Just curious if it's possible.
pvary commented on PR #11390:
URL: https://github.com/apache/iceberg/pull/11390#issuecomment-2437760336
@huaxingao: I'm not an expert in the Spark codebase, but I think having a test
that fails before the change and succeeds after it would be nice. Otherwise we
risk future PRs chan
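A rough sketch of the kind of regression test being suggested (table name, setup, and assertions are hypothetical; a real test would also need to assert on what the vectorized reader actually materializes):

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.junit.jupiter.api.Assertions;
import org.junit.jupiter.api.Test;

public class TestProjectionWithDeletesSketch {
  @Test
  public void testProjectionOnTableWithDeletes() {
    SparkSession spark = SparkSession.builder()
        .master("local[1]")
        .appName("projection-with-deletes-test")
        .getOrCreate();

    // Assumes a test table demo.db.test (id INT, data STRING) that already has
    // position delete files; the setup is omitted here.
    Dataset<Row> projected = spark.sql("SELECT id FROM demo.db.test");

    // User-visible sanity check. Catching the regression itself would also
    // require asserting on the columns the batch reader requests internally,
    // which is what the change in BaseBatchReader affects.
    Assertions.assertArrayEquals(new String[] {"id"}, projected.schema().fieldNames());
  }
}
```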
huaxingao commented on PR #11390:
URL: https://github.com/apache/iceberg/pull/11390#issuecomment-2436567828
also cc @flyrain
huaxingao commented on PR #11390:
URL: https://github.com/apache/iceberg/pull/11390#issuecomment-2436559370
cc @szehon-ho @pvary @viirya
huaxingao opened a new pull request, #11390:
URL: https://github.com/apache/iceberg/pull/11390
In Spark batch reading, Iceberg reads additional columns when there are
delete files. For instance, if we have a table
`test (int id, string data)` and a query `SELECT id FROM test`, the
reques
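For context, a minimal sketch of the scenario described above, assuming a Spark session with an Iceberg catalog named `demo` and a table `demo.db.test (id INT, data STRING)` that already has delete files:

```java
// Sketch only: assumes an Iceberg catalog named `demo` and a table
// demo.db.test (id INT, data STRING) that has accumulated delete files
// (e.g. from merge-on-read DELETE/UPDATE).
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class ProjectionWithDeletesExample {
  public static void main(String[] args) {
    SparkSession spark = SparkSession.builder()
        .appName("iceberg-projection-with-deletes")
        .getOrCreate();

    // Only `id` is projected, but because delete files have to be applied,
    // the batch reader's required schema can include extra columns such as _pos.
    Dataset<Row> projected = spark.sql("SELECT id FROM demo.db.test");
    projected.show();
  }
}
```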