Re: [PR] Exclude reading _pos column if it's not in the scan list [iceberg]

2024-11-08 Thread via GitHub
szehon-ho merged PR #11390: URL: https://github.com/apache/iceberg/pull/11390 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg

Re: [PR] Exclude reading _pos column if it's not in the scan list [iceberg]

2024-11-08 Thread via GitHub
huaxingao commented on PR #11390: URL: https://github.com/apache/iceberg/pull/11390#issuecomment-2465917313 Thanks a lot! @flyrain @szehon-ho @viirya @dramaticlly @pvary

Re: [PR] Exclude reading _pos column if it's not in the scan list [iceberg]

2024-11-08 Thread via GitHub
szehon-ho commented on PR #11390: URL: https://github.com/apache/iceberg/pull/11390#issuecomment-2465915805 Merged, thanks @huaxingao , and also @flyrain for review, and @pvary @dramaticlly @viirya for other reviews

Re: [PR] Exclude reading _pos column if it's not in the scan list [iceberg]

2024-11-08 Thread via GitHub
flyrain commented on code in PR #11390: URL: https://github.com/apache/iceberg/pull/11390#discussion_r1830093248 ## .palantir/revapi.yml: ## @@ -1145,6 +1145,15 @@ acceptedBreaks: new: "method org.apache.iceberg.BaseMetastoreOperations.CommitStatus org.apache.iceberg.Bas

Re: [PR] Exclude reading _pos column if it's not in the scan list [iceberg]

2024-11-08 Thread via GitHub
sfc-gh-ygu commented on code in PR #11390: URL: https://github.com/apache/iceberg/pull/11390#discussion_r1834857267 ## data/src/main/java/org/apache/iceberg/data/DeleteFilter.java: ## @@ -251,13 +253,14 @@ private static Schema fileProjection( Schema tableSchema, S

Re: [PR] Exclude reading _pos column if it's not in the scan list [iceberg]

2024-11-02 Thread via GitHub
flyrain commented on code in PR #11390: URL: https://github.com/apache/iceberg/pull/11390#discussion_r1826877729 ## parquet/src/main/java/org/apache/iceberg/parquet/ReadConf.java: ## @@ -181,8 +184,8 @@ boolean[] shouldSkip() { return shouldSkip; } - private Map gener

Re: [PR] Exclude reading _pos column if it's not in the scan list [iceberg]

2024-11-02 Thread via GitHub
huaxingao commented on PR #11390: URL: https://github.com/apache/iceberg/pull/11390#issuecomment-2453099693 Thanks a lot @flyrain and @szehon-ho for your review! I've thought this over: I feel the [original change](https://github.com/apache/iceberg/pull/11390/commits/9b4e56d7dd9a6571655fd96

Re: [PR] Exclude reading _pos column if it's not in the scan list [iceberg]

2024-11-02 Thread via GitHub
flyrain commented on code in PR #11390: URL: https://github.com/apache/iceberg/pull/11390#discussion_r1826603841 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/BatchDataReader.java: ## @@ -93,10 +93,10 @@ protected CloseableIterator open(FileScanTask task) {

Re: [PR] Exclude reading _pos column if it's not in the scan list [iceberg]

2024-11-02 Thread via GitHub
flyrain commented on code in PR #11390: URL: https://github.com/apache/iceberg/pull/11390#discussion_r1826606339 ## data/src/main/java/org/apache/iceberg/data/DeleteFilter.java: ## @@ -93,7 +94,8 @@ protected DeleteFilter( this.posDeletes = posDeleteBuilder.build();

Re: [PR] Exclude reading _pos column if it's not in the scan list [iceberg]

2024-11-02 Thread via GitHub
flyrain commented on code in PR #11390: URL: https://github.com/apache/iceberg/pull/11390#discussion_r1826604505 ## data/src/main/java/org/apache/iceberg/data/DeleteFilter.java: ## @@ -69,7 +69,8 @@ protected DeleteFilter( List deletes, Schema tableSchema, S

Re: [PR] Exclude reading _pos column if it's not in the scan list [iceberg]

2024-10-31 Thread via GitHub
szehon-ho commented on PR #11390: URL: https://github.com/apache/iceberg/pull/11390#issuecomment-2451380014 Hm then can we just add _pos to requiredSchema (with a comment)? Probably cleaner with a flag to ReadConf but not sure if it's feasible. fyi @aokolnychyi

Re: [PR] Exclude reading _pos column if it's not in the scan list [iceberg]

2024-10-31 Thread via GitHub
huaxingao commented on PR #11390: URL: https://github.com/apache/iceberg/pull/11390#issuecomment-2451241576 @szehon-ho Thanks for the comment. We actually also use the [requiredSchema](https://github.com/apache/iceberg/blob/fda2b3a5706fd580b0371e8a7c4b31d536eac0a3/spark/v3.5/spark/src

Re: [PR] Exclude reading _pos column if it's not in the scan list [iceberg]

2024-10-31 Thread via GitHub
szehon-ho commented on PR #11390: URL: https://github.com/apache/iceberg/pull/11390#issuecomment-2451120638 Sorry I still wanted to see if it can be done earlier, what do you think https://github.com/apache/iceberg/blob/main/spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/Batc