Re: [I] DatasourceV2 does not prune columns after V2ScanRelationPushDown [iceberg]

2025-03-27 Thread via GitHub
IgorBerman commented on issue #9268: URL: https://github.com/apache/iceberg/issues/9268#issuecomment-2759241563 Thanks @Akeron-Zhu for the update! This improvement will be valuable for the community imo. PS: Our problem is more general due to highly nested schemas which Spark not han

Re: [I] DatasourceV2 does not prune columns after V2ScanRelationPushDown [iceberg]

2025-03-26 Thread via GitHub
Akeron-Zhu commented on issue #9268: URL: https://github.com/apache/iceberg/issues/9268#issuecomment-2756506856 Hi @IgorBerman @akshayakp97 @rdblue, I also encountered this problem last year. It happens because Spark 3 DSV2 only prunes columns at V2ScanRelationPushDown, but the later Rewri

Re: [I] DatasourceV2 does not prune columns after V2ScanRelationPushDown [iceberg]

2025-03-23 Thread via GitHub
github-actions[bot] commented on issue #9268: URL: https://github.com/apache/iceberg/issues/9268#issuecomment-2746574876 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] DatasourceV2 does not prune columns after V2ScanRelationPushDown [iceberg]

2024-09-23 Thread via GitHub
IgorBerman commented on issue #9268: URL: https://github.com/apache/iceberg/issues/9268#issuecomment-2370296642 Hi @rdblue and @aokolnychyi, do you have any new ideas regarding this issue? More generally, could you provide pointers on whether Iceberg implements column pruning for highly nested sch

Re: [I] DatasourceV2 does not prune columns after V2ScanRelationPushDown [iceberg]

2023-12-12 Thread via GitHub
rdblue commented on issue #9268: URL: https://github.com/apache/iceberg/issues/9268#issuecomment-1852568441 @aokolnychyi are you aware of this issue? It looks like some additional pruning may be done after pushdown happens? -- This is an automated message from the Apache Git Service. To r

Re: [I] DatasourceV2 does not prune columns after V2ScanRelationPushDown [iceberg]

2023-12-11 Thread via GitHub
akshayakp97 commented on issue #9268: URL: https://github.com/apache/iceberg/issues/9268#issuecomment-1850822846 In general, if a `Project` is added after the `V2ScanRelationPushDown` rule has executed, how do the columns get pruned? Or do we not expect any new `Project`s?
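
To make the question above concrete, here is a minimal toy model in Python. It is not Spark's or Iceberg's code: `Scan`, `Project`, and `push_down` are hypothetical stand-ins, and the only assumption borrowed from the thread is that the pushdown step prunes columns once, when the scan is built. The column names `cs_warehouse_sk` and `cs_order_number` come from the plan quoted in the thread; the other two are illustrative.

```python
from dataclasses import dataclass, replace
from typing import Tuple, Union


@dataclass(frozen=True)
class Scan:
    """Toy scan node: the columns it will read, and whether it is already built."""
    columns: Tuple[str, ...]
    built: bool = False


@dataclass(frozen=True)
class Project:
    """Toy projection sitting directly on top of a scan."""
    columns: Tuple[str, ...]
    child: Scan


Plan = Union[Scan, Project]


def push_down(plan: Plan) -> Plan:
    """Hypothetical one-shot pushdown: prune the scan only if it is not built yet."""
    scan = plan.child if isinstance(plan, Project) else plan
    if scan.built:
        # The scan was already built, so this rule no longer changes anything.
        return plan
    if isinstance(plan, Project):
        # Keep only the scan columns that the projection asks for.
        kept = tuple(c for c in scan.columns if c in plan.columns)
        return Project(plan.columns, Scan(kept, built=True))
    # No projection above the scan: build it with every column.
    return replace(scan, built=True)


# Illustrative column set; only the two WANTED names appear in the thread.
ALL = ("cs_ship_date_sk", "cs_warehouse_sk", "cs_order_number", "cs_ext_ship_cost")
WANTED = ("cs_warehouse_sk", "cs_order_number")

# Case 1: the Project exists before pushdown runs, so the scan is pruned.
early = push_down(Project(WANTED, Scan(ALL)))

# Case 2: pushdown runs first, a Project is added afterwards, pushdown runs again.
late = push_down(Project(WANTED, push_down(Scan(ALL))))

print(early.child.columns)
print(late.child.columns)
```

In this sketch the second case keeps reading all four columns, because the projection arrives after the scan has been built. Whether Spark behaves this way for a given query depends on the actual rule ordering, which is what the thread is trying to establish.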

Re: [I] DatasourceV2 does not prune columns after V2ScanRelationPushDown [iceberg]

2023-12-11 Thread via GitHub
akshayakp97 commented on issue #9268: URL: https://github.com/apache/iceberg/issues/9268#issuecomment-1850797730 After `ColumnPruning` adds the new `Project [cs_warehouse_sk#54840, cs_order_number#54843L]`, when `V2ScanRelationPushDown` rule triggers, it doesn't match the [`ScanOperation`]

Re: [I] DatasourceV2 does not prune columns after V2ScanRelationPushDown [iceberg]

2023-12-11 Thread via GitHub
akshayakp97 commented on issue #9268: URL: https://github.com/apache/iceberg/issues/9268#issuecomment-1850779601 Thanks for your response. I am looking at TPCDS q16 physical plan for Iceberg on EMR. Link to q16 - https://github.com/apache/spark/blob/a78d6ce376edf2a8836e01f47b

Re: [I] DatasourceV2 does not prune columns after V2ScanRelationPushDown [iceberg]

2023-12-11 Thread via GitHub
rdblue commented on issue #9268: URL: https://github.com/apache/iceberg/issues/9268#issuecomment-1850720036 I don't think I'm following the logic here. Is there a case where you're not seeing columns being properly pruned?

[I] DatasourceV2 does not prune columns after V2ScanRelationPushDown [iceberg]

2023-12-10 Thread via GitHub
akshayakp97 opened a new issue, #9268: URL: https://github.com/apache/iceberg/issues/9268 ### Query engine Query Engine: Spark 3.5.0 Apache Iceberg: 1.4.2 ### Question Hi, my understanding is that the Spark optimizer can add a new `Project` operator even after V2 Re