Re: [PR] feat: Push down hashes to probe side in HashJoinExec [datafusion]

2025-10-03 Thread via GitHub
LiaCastaneda commented on code in PR #17529: URL: https://github.com/apache/datafusion/pull/17529#discussion_r2401473105 ## datafusion/physical-plan/src/joins/hash_join/information_passing.rs: ## @@ -0,0 +1,612 @@ +// Licensed to the Apache Software Foundation (ASF) under one +/

Re: [PR] Improve `InListExpr` plan display [datafusion]

2025-10-03 Thread via GitHub
pepijnve commented on code in PR #17884: URL: https://github.com/apache/datafusion/pull/17884#discussion_r2401508429 ## datafusion/physical-expr/src/expressions/in_list.rs: ## @@ -1453,31 +1464,31 @@ mod tests { let sql_string = fmt_sql(expr.as_ref()).to_string();

Re: [I] `SortMergeJoinExec` fails to allocate memory but should spill instead [datafusion-comet]

2025-10-03 Thread via GitHub
comphead commented on issue #2452: URL: https://github.com/apache/datafusion-comet/issues/2452#issuecomment-3358647566 ``` ExternalSorterMerge[9]#1169(can spill: false) consumed 5.0 MB, peak 10.0 MB, ExternalSorterMerge[9]#1171(can spill: false) consumed 866.9 KB, peak 10.0 MB,

Re: [PR] optimizer: allow projection pushdown through aliased recursive CTE references [datafusion]

2025-10-03 Thread via GitHub
kosiew commented on code in PR #17875: URL: https://github.com/apache/datafusion/pull/17875#discussion_r2401562160 ## datafusion/core/tests/sql/explain_analyze.rs: ## @@ -727,6 +727,98 @@ async fn parquet_explain_analyze() { assert_contains!(&formatted, "row_groups_pruned_s

Re: [I] Performance of `distinct on (columns)` [datafusion]

2025-10-03 Thread via GitHub
alamb commented on issue #16620: URL: https://github.com/apache/datafusion/issues/16620#issuecomment-3365481120 @debajyoti-truefoundry -- could you possibly provide a self-contained reproducer (e.g. the data you used for the queries above, or some synthetic version that has the same proper

Re: [PR] feat: optimize and unparse grouping [datafusion]

2025-10-03 Thread via GitHub
chenkovsky commented on PR #16161: URL: https://github.com/apache/datafusion/pull/16161#issuecomment-3367555224 > Hi, I ran my code using this branch and unfortunately it did not solve my issue (https://github.com/apache/datafusion/issues/16590). Hi, could you please push your code, th

Re: [I] Java.lang.NoSuchMethodError thrown by Comet Shim [datafusion-comet]

2025-10-03 Thread via GitHub
parthchandra commented on issue #2520: URL: https://github.com/apache/datafusion-comet/issues/2520#issuecomment-3366404371 Also, FWIW, that specific method has been historically problematic because it is one of the rare places where Comet uses a private Spark method. Because it is private,

Re: [PR] Push Down Filter Subexpressions in Nested Loop Joins as Projections [datafusion]

2025-10-03 Thread via GitHub
alamb commented on PR #17906: URL: https://github.com/apache/datafusion/pull/17906#issuecomment-3367082486 🤖 `./gh_compare_branch.sh` [Benchmark Script](https://github.com/alamb/datafusion-benchmarking/blob/main/gh_compare_branch.sh) Running Linux aal-dev 6.14.0-1016-gcp #17~24.04.1-Ubun

Re: [PR] chore: Add memory pool trace logging [datafusion-comet]

2025-10-03 Thread via GitHub
andygrove commented on PR #2521: URL: https://github.com/apache/datafusion-comet/pull/2521#issuecomment-3367237253 Still experimenting... https://github.com/user-attachments/assets/951feeba-e2d9-4c5d-b8db-5fac93f1cbc2 -- This is an automated message from the Apache Git Se

[PR] resolve projection against `ListingTable` table_schema incl. partitio… [datafusion]

2025-10-03 Thread via GitHub
mach-kernel opened a new pull request, #17911: URL: https://github.com/apache/datafusion/pull/17911 ## Which issue does this PR close? #15718, though I can file a new bug report. Should be easy to reproduce on a round-tripped plan against a Hive-partitioned table. ## Rationale for this c

Re: [PR] Added support for SQLite triggers [datafusion-sqlparser-rs]

2025-10-03 Thread via GitHub
LucaCappelletti94 commented on PR #2037: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/2037#issuecomment-3364727585 Is there anything else needed to be done for this PR? @iffyio

Re: [I] PROPOSAL Hash Join Spilling Proposal [datafusion]

2025-10-03 Thread via GitHub
Dandandan commented on issue #17267: URL: https://github.com/apache/datafusion/issues/17267#issuecomment-3364471092 > From my tests SMJ requires more memory than HJ with a large number of partitions (like 32), so switching from HJ to SMJ makes memory situation worse not better in that case.

[PR] chore: Update artifacts to 0.10.0 [datafusion-comet]

2025-10-03 Thread via GitHub
comphead opened a new pull request, #2500: URL: https://github.com/apache/datafusion-comet/pull/2500 ## Which issue does this PR close? Closes #. ## Rationale for this change ## What changes are included in this PR? ## How are these changes

[I] Hello, can I contribute some keywords(CATALOGS, SWITCH) [datafusion-sqlparser-rs]

2025-10-03 Thread via GitHub
Smith-Cruise opened a new issue, #2055: URL: https://github.com/apache/datafusion-sqlparser-rs/issues/2055 I'm implementing my own database now; my SQL grammar is similar to `starrocks`, `doris`, or `oceanbase`. I want to add `CATALOGS` and `SWITCH` keywords. Can I contribute code directly?

Re: [I] Datafusion-cli installation problem [datafusion]

2025-10-03 Thread via GitHub
albertoRamon commented on issue #17895: URL: https://github.com/apache/datafusion/issues/17895#issuecomment-3365402091 I think that this issue is coming from the Rust installation (I had the same issue). As part of the Rust install, it automatically tries to install "Visual Build Tools" and if y

[I] Converting `TableConstraint` struct enum variants into separated structs [datafusion-sqlparser-rs]

2025-10-03 Thread via GitHub
LucaCappelletti94 opened a new issue, #2053: URL: https://github.com/apache/datafusion-sqlparser-rs/issues/2053 Hi, I need to implement traits describing for instance `CheckConstraint` and `UniqueConstraint` and so on for various crates, including sqlparser. These objects are c

[I] Quadratic runtime in MinMaxBytesAccumulator [datafusion]

2025-10-03 Thread via GitHub
ctsk opened a new issue, #17897: URL: https://github.com/apache/datafusion/issues/17897 ### Describe the bug MinMaxBytesAccumulator's update_batch function has runtime that is quadratic in the number of groups accumulated: On each update_batch call, the implementation allocates a new ve

[PR] Moved constraint variant outside of `TableConstraint` enum [datafusion-sqlparser-rs]

2025-10-03 Thread via GitHub
LucaCappelletti94 opened a new pull request, #2054: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/2054 As per title, this PR moved all of the struct variants out of the `TableConstraint` enum. This is done to allow for implementing traits in dependent crates which only apply t

Re: [PR] [WIP] Upgrade to arrow/parquet 57.0.0 [datafusion]

2025-10-03 Thread via GitHub
alamb commented on PR #17888: URL: https://github.com/apache/datafusion/pull/17888#issuecomment-3367208762 Ok, the tests are now looking good enough to test with the new thrift decoder

Re: [PR] [Branch-50] Backport: Support Decimal32/64 types (#17501) [datafusion]

2025-10-03 Thread via GitHub
alamb commented on code in PR #17907: URL: https://github.com/apache/datafusion/pull/17907#discussion_r2403374748 ## datafusion/common/src/scalar/mod.rs: ## @@ -231,6 +233,10 @@ pub enum ScalarValue { Float32(Option<f32>), /// 64bit float Float64(Option<f64>), +/// 32bi

Re: [PR] minor: Add trace logging to MemoryReservation [datafusion]

2025-10-03 Thread via GitHub
andygrove commented on PR #17902: URL: https://github.com/apache/datafusion/pull/17902#issuecomment-3366089143 Actually, I have a better idea ... closing this for now

Re: [PR] Consolidate `apply_schema_adapter_tests` [datafusion]

2025-10-03 Thread via GitHub
adriangb merged PR #17905: URL: https://github.com/apache/datafusion/pull/17905

[PR] test: verify handling of groupings in Substrait [datafusion]

2025-10-03 Thread via GitHub
vbarua opened a new pull request, #17909: URL: https://github.com/apache/datafusion/pull/17909 ## Which issue does this PR close? - Closes https://github.com/apache/datafusion/issues/16590. ## Rationale for this change ## What changes are included in this

[I] Add docstring for `Window::try_new_with_schema` [datafusion]

2025-10-03 Thread via GitHub
Jefffrey opened a new issue, #17912: URL: https://github.com/apache/datafusion/issues/17912 ### Is your feature request related to a problem or challenge? https://github.com/apache/datafusion/blob/daeb6597a0c7344735460bb2dce13879fd89d7bd/datafusion/expr/src/logical_plan/plan.rs#L2546-

Re: [PR] feat: Parquet Modular Encryption with Spark KMS for native readers [datafusion-comet]

2025-10-03 Thread via GitHub
parthchandra commented on code in PR #2447: URL: https://github.com/apache/datafusion-comet/pull/2447#discussion_r2403518726 ## common/src/main/java/org/apache/comet/parquet/CometFileKeyUnwrapper.java: ## @@ -0,0 +1,145 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [PR] Feat: [datafusion-spark] Migrate avg from comet to datafusion-spark and add tests. [datafusion]

2025-10-03 Thread via GitHub
Jefffrey commented on code in PR #17871: URL: https://github.com/apache/datafusion/pull/17871#discussion_r2403660919 ## datafusion/spark/src/function/aggregate/avg.rs: ## @@ -0,0 +1,351 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor lic

Re: [PR] Unify Table representations [datafusion-python]

2025-10-03 Thread via GitHub
kosiew commented on PR #1256: URL: https://github.com/apache/datafusion-python/pull/1256#issuecomment-3364588624 @timsaucer, here's an example that shows the NotImplementedError examples/table_capsule_failure.py ```python """Demonstrate how missing __datafusion_table_

Re: [PR] Relax constraint that file sort order must only reference individual columns [datafusion]

2025-10-03 Thread via GitHub
pepijnve commented on PR #17419: URL: https://github.com/apache/datafusion/pull/17419#issuecomment-3364630280 > `samply record -- target/profiling/deps/sql_planner-1adcb045f71bd635 --bench physical_plan_clickbench_q43` I'll try samply next time. Was that on your macOS machine or on

[I] Limit is not pushed down SortPreservingMergeExec [datafusion]

2025-10-03 Thread via GitHub
Dandandan opened a new issue, #17894: URL: https://github.com/apache/datafusion/issues/17894 ### Describe the bug Currently, the limit can be pushed down: ``` --GlobalLimitExec BoundedWindowAggExec: wdw= --SortPreservingMergeExec: [c1@2 ASC NULLS LAST, c2@3 ASC

Re: [PR] infra: macos-13 is deprecated [datafusion-ballista]

2025-10-03 Thread via GitHub
milenkovicm commented on PR #1324: URL: https://github.com/apache/datafusion-ballista/pull/1324#issuecomment-3364708273 thanks @kevinjqliu

Re: [I] Parser error with GROUP BY with multiple filters on DataFusion 45 [datafusion]

2025-10-03 Thread via GitHub
Jefffrey commented on issue #14633: URL: https://github.com/apache/datafusion/issues/14633#issuecomment-3364438632 Filter clause for aggregations is now supported in generic dialect, see #17807 (and its references #16516 and #15719) e.g. on main: ```sh datafusion-cli$ cargo

[PR] chore(deps): bump taiki-e/install-action from 2.62.16 to 2.62.17 [datafusion]

2025-10-03 Thread via GitHub
dependabot[bot] opened a new pull request, #17896: URL: https://github.com/apache/datafusion/pull/17896 Bumps [taiki-e/install-action](https://github.com/taiki-e/install-action) from 2.62.16 to 2.62.17. Release notes Sourced from https://github.com/taiki-e/install-action/releases

[I] Datafusion-cli installation problem [datafusion]

2025-10-03 Thread via GitHub
HeWhoHeWho opened a new issue, #17895: URL: https://github.com/apache/datafusion/issues/17895 Hi, I have been trying to get `datafusion-cli` installed via `cargo install datafusion-cli` but kept getting the error: `CMake Error at CMakeLists.txt:13 (project): Failed to run MSBuild c

Re: [PR] Add JoinType preservation helpers and `dynamic_filter_side`; enable dynamic filter pushdown in HashJoinExec [datafusion]

2025-10-03 Thread via GitHub
kosiew commented on PR #17518: URL: https://github.com/apache/datafusion/pull/17518#issuecomment-3364631364 @crystalxyz, > I left some comments about the design here, but if you have other priorities, let me know and I'm happy to help on this as well! By all means. I would love

[I] Implement `GroupsAccumulator` for `first_value` aggregate (speed up `first_value` and `DISTINCT ON` queries) [datafusion]

2025-10-03 Thread via GitHub
alamb opened a new issue, #17899: URL: https://github.com/apache/datafusion/issues/17899 ### Is your feature request related to a problem or challenge? As reported in https://github.com/apache/datafusion/issues/16620 by @debajyoti-truefoundry, evaluating `DISTINCT ON` results in a quer

Re: [I] Performance of `distinct on (columns)` [datafusion]

2025-10-03 Thread via GitHub
alamb commented on issue #16620: URL: https://github.com/apache/datafusion/issues/16620#issuecomment-3365500499 Filed - https://github.com/apache/datafusion/issues/17899

Re: [I] Datafusion-cli installation problem [datafusion]

2025-10-03 Thread via GitHub
HeWhoHeWho commented on issue #17895: URL: https://github.com/apache/datafusion/issues/17895#issuecomment-3365518176 Thanks for the tip! Let me try that out in a bit. Just curious, did you also add `path/to/bin/CMake.exe` to Path? Because I did that, otherwise it will trigger error `m

[PR] chore(deps): bump taiki-e/install-action from 2.61.8 to 2.62.17 [datafusion-sandbox]

2025-10-03 Thread via GitHub
dependabot[bot] opened a new pull request, #24: URL: https://github.com/apache/datafusion-sandbox/pull/24 Bumps [taiki-e/install-action](https://github.com/taiki-e/install-action) from 2.61.8 to 2.62.17. Release notes Sourced from https://github.com/taiki-e/install-action/releases

Re: [PR] chore(deps): bump taiki-e/install-action from 2.61.8 to 2.62.16 [datafusion-sandbox]

2025-10-03 Thread via GitHub
dependabot[bot] closed pull request #23: chore(deps): bump taiki-e/install-action from 2.61.8 to 2.62.16 URL: https://github.com/apache/datafusion-sandbox/pull/23

Re: [PR] chore(deps): bump taiki-e/install-action from 2.61.8 to 2.62.16 [datafusion-sandbox]

2025-10-03 Thread via GitHub
dependabot[bot] commented on PR #23: URL: https://github.com/apache/datafusion-sandbox/pull/23#issuecomment-3365346334 Superseded by #24.

[I] Java.lang.NoSuchMethodError thrown by Comet Shim [datafusion-comet]

2025-10-03 Thread via GitHub
lfdversluis opened a new issue, #2520: URL: https://github.com/apache/datafusion-comet/issues/2520 ### Describe the bug Following up from #2504, I compiled comet in CentOS 7 to build it against glibc 2.17. Running a simple query, tasks are getting lost due to some metrics function no

Re: [I] Support `reverse` function with `ArrayType` input [datafusion-comet]

2025-10-03 Thread via GitHub
cfmcgrady commented on issue #2478: URL: https://github.com/apache/datafusion-comet/issues/2478#issuecomment-3345240223 > FYI: https://datafusion.apache.org/user-guide/sql/scalar_functions.html#array-reverse @wForget Thanks for pointing this out. I’ll submit a new PR mapping the Spa

Re: [PR] chore(deps): bump taiki-e/install-action from 2.61.8 to 2.62.17 [datafusion-sandbox]

2025-10-03 Thread via GitHub
dependabot[bot] commented on PR #24: URL: https://github.com/apache/datafusion-sandbox/pull/24#issuecomment-3365346309 ### Labels The following labels could not be found: `auto-dependencies`. Please create it before Dependabot can add it to a pull request. Please fix the a

Re: [I] Slow aggregrate query with `array_agg`, Polars is 4 times faster for equal query [datafusion]

2025-10-03 Thread via GitHub
duongcongtoai commented on issue #17446: URL: https://github.com/apache/datafusion/issues/17446#issuecomment-3366427064 ``` Benchmark 1: uv run polar.py sample-1m.parquet Time (mean ± σ): 258.6 ms ± 32.2 ms[User: 514.8 ms, System: 218.1 ms] Range (min … max): 238.5 ms

Re: [I] Quadratic runtime in MinMaxBytesAccumulator [datafusion]

2025-10-03 Thread via GitHub
alamb commented on issue #17897: URL: https://github.com/apache/datafusion/issues/17897#issuecomment-3366527369 That makes sense. One thing we could do is to reuse the allocation (put a `mut buffer: Vec` field on). That does still result in non-trivial memory and overhead howeve

Re: [PR] Consolidate `apply_schema_adapter_tests` [datafusion]

2025-10-03 Thread via GitHub
adriangb commented on PR #17905: URL: https://github.com/apache/datafusion/pull/17905#issuecomment-3367030105 Thank you @alamb!

Re: [I] `SortMergeJoinExec` fails to allocate memory but should spill instead [datafusion-comet]

2025-10-03 Thread via GitHub
andygrove commented on issue #2452: URL: https://github.com/apache/datafusion-comet/issues/2452#issuecomment-3367039203 The last chart was incorrect. https://github.com/user-attachments/assets/fc942115-8d17-4b5f-8533-507501450d46 Here is where we run into issues: `

Re: [PR] [Branch-50] Backport: Support Decimal32/64 types (#17501) [datafusion]

2025-10-03 Thread via GitHub
alamb commented on PR #17907: URL: https://github.com/apache/datafusion/pull/17907#issuecomment-3367045874 Why does this PR have substantially more lines than the original PR? This PR https://github.com/user-attachments/assets/f4adf867-1ad4-4b64-a563-2b7a765b5d18 Origina

Re: [I] Datafusion-cli installation problem [datafusion]

2025-10-03 Thread via GitHub
HeWhoHeWho commented on issue #17895: URL: https://github.com/apache/datafusion/issues/17895#issuecomment-3365688893 Double checked my MS Build Tools 2022, all relevant tools have been installed with the path manually added just now - still, build failed. This is the full _long_ log o

Re: [I] PROPOSAL Hash Join Spilling Proposal [datafusion]

2025-10-03 Thread via GitHub
comphead commented on issue #17267: URL: https://github.com/apache/datafusion/issues/17267#issuecomment-3366176816 SMJ may require more memory and additional expenses on sorting, so the HJ would be faster in most cases. But SMJ is more robust on limited memory if implemented correctly.

Re: [PR] perf: Faster `string_agg()` aggregate function (1000x speed for no DISTINCT and ORDER case) [datafusion]

2025-10-03 Thread via GitHub
alamb merged PR #17837: URL: https://github.com/apache/datafusion/pull/17837

Re: [I] Slow aggregrate query with `array_agg`, Polars is 4 times faster for equal query [datafusion]

2025-10-03 Thread via GitHub
duongcongtoai commented on issue #17446: URL: https://github.com/apache/datafusion/issues/17446#issuecomment-3366781389 looks related: https://github.com/apache/datafusion/issues/17445

Re: [PR] chore: Add memory pool trace logging [datafusion-comet]

2025-10-03 Thread via GitHub
comphead commented on code in PR #2521: URL: https://github.com/apache/datafusion-comet/pull/2521#discussion_r2402894515 ## spark/src/main/scala/org/apache/comet/CometExecIterator.scala: ## @@ -87,9 +87,9 @@ class CometExecIterator( CometSparkSessionExtensions.getCometMem

Re: [PR] [branch-50] Backport: Fix docs.rs build: Replace auto_doc_cfg with doc_cfg [datafusion]

2025-10-03 Thread via GitHub
alamb commented on PR #17890: URL: https://github.com/apache/datafusion/pull/17890#issuecomment-3366826040 🚀 -- thank you @AdamGS

Re: [I] Implement `GroupsAccumulator` for `first_value` aggregate (speed up `first_value` and `DISTINCT ON` queries) [datafusion]

2025-10-03 Thread via GitHub
buraksenn commented on issue #17899: URL: https://github.com/apache/datafusion/issues/17899#issuecomment-3366854137 take

Re: [PR] macos-13 is deprecated [datafusion-python]

2025-10-03 Thread via GitHub
timsaucer commented on code in PR #1259: URL: https://github.com/apache/datafusion-python/pull/1259#discussion_r2401989522 ## .github/workflows/build.yml: ## @@ -127,7 +127,7 @@ jobs: build-macos-x86_64: needs: [generate-license] name: Mac x86_64 -runs-on: maco

Re: [PR] Case evaluation improvements [datafusion]

2025-10-03 Thread via GitHub
alamb commented on PR #17898: URL: https://github.com/apache/datafusion/pull/17898#issuecomment-3365752397 🤖: Benchmark completed Details ``` group case_improvements main - -

Re: [I] Limit is not pushed down SortPreservingMergeExec [datafusion]

2025-10-03 Thread via GitHub
Dandandan closed issue #17894: Limit is not pushed down SortPreservingMergeExec URL: https://github.com/apache/datafusion/issues/17894

Re: [PR] chore(deps): bump taiki-e/install-action from 2.62.16 to 2.62.17 [datafusion]

2025-10-03 Thread via GitHub
alamb merged PR #17896: URL: https://github.com/apache/datafusion/pull/17896

Re: [PR] [branch-50] Backport: Fix docs.rs build: Replace auto_doc_cfg with doc_cfg [datafusion]

2025-10-03 Thread via GitHub
alamb commented on PR #17890: URL: https://github.com/apache/datafusion/pull/17890#issuecomment-3366258560 Updated to get changes from https://github.com/apache/datafusion/pull/17892

Re: [PR] feat: Add `backtrace` feature to simplify enabling native backtraces in `CometNativeException` [datafusion-comet]

2025-10-03 Thread via GitHub
comphead commented on PR #2515: URL: https://github.com/apache/datafusion-comet/pull/2515#issuecomment-3366308072 > > Thanks @andygrove > > It's been a while since `backtraces` were introduced in DF and I was thinking of replacing Comet errors with DF's? So they would have backtrace capabilities

Re: [PR] [Branch-50] Backport: Support Decimal32/64 types (#17501) [datafusion]

2025-10-03 Thread via GitHub
AdamGS commented on PR #17907: URL: https://github.com/apache/datafusion/pull/17907#issuecomment-3367099381 Ok, now it seems better. It's not a perfect match because some of the changes were based upon https://github.com/apache/datafusion/pull/17459, which wasn't backported.

Re: [PR] Push Down Filter Subexpressions in Nested Loop Joins as Projections [datafusion]

2025-10-03 Thread via GitHub
alamb commented on PR #17906: URL: https://github.com/apache/datafusion/pull/17906#issuecomment-3367176309 🤖 `./gh_compare_branch.sh` [Benchmark Script](https://github.com/alamb/datafusion-benchmarking/blob/main/gh_compare_branch.sh) Running Linux aal-dev 6.14.0-1016-gcp #17~24.04.1-Ubun

Re: [PR] Push Down Filter Subexpressions in Nested Loop Joins as Projections [datafusion]

2025-10-03 Thread via GitHub
alamb commented on PR #17906: URL: https://github.com/apache/datafusion/pull/17906#issuecomment-3367176235 🤖: Benchmark completed Details ``` Comparing HEAD and feature_nl-join-projection-push-down Benchmark tpch_mem_sf1.json

Re: [PR] chore: Add memory pool trace logging [datafusion-comet]

2025-10-03 Thread via GitHub
andygrove commented on PR #2521: URL: https://github.com/apache/datafusion-comet/pull/2521#issuecomment-3367189413 moving to draft while I work on the Python scripts

Re: [PR] fix: optimizer `common_sub_expression_eliminate` fails in a window function [datafusion]

2025-10-03 Thread via GitHub
dqkqd commented on PR #17852: URL: https://github.com/apache/datafusion/pull/17852#issuecomment-3367791885 Thanks @Jefffrey. I added the tests to `window.slt`; however, this test passed even without the change. I verified with `datafusion-cli` and it failed. I'm not sure what's

Re: [I] Datafusion-cli installation problem [datafusion]

2025-10-03 Thread via GitHub
HeWhoHeWho commented on issue #17895: URL: https://github.com/apache/datafusion/issues/17895#issuecomment-3367901824 Very unfortunate that NASM is prohibited from installing in my Windows work machine. Is there any workaround I can navigate through the installation process without t

[PR] Opt array agg [datafusion]

2025-10-03 Thread via GitHub
duongcongtoai opened a new pull request, #17915: URL: https://github.com/apache/datafusion/pull/17915 ## Which issue does this PR close? - Closes #. ## Rationale for this change ## What changes are included in this PR? ## Are these changes t

Re: [PR] Added support for SQLite triggers [datafusion-sqlparser-rs]

2025-10-03 Thread via GitHub
iffyio commented on PR #2037: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/2037#issuecomment-3367962986 @LucaCappelletti94 could you take a look at [this comment](https://github.com/apache/datafusion-sqlparser-rs/pull/2037#discussion_r2362178757)?

Re: [PR] Added support for SQLite triggers [datafusion-sqlparser-rs]

2025-10-03 Thread via GitHub
LucaCappelletti94 commented on PR #2037: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/2037#issuecomment-3367965425 Hi @iffyio, I replied to it here: https://github.com/apache/datafusion-sqlparser-rs/pull/2037#discussion_r2379300179

Re: [PR] chore(deps): bump substrait from 0.58.0 to 0.60.1 [datafusion]

2025-10-03 Thread via GitHub
alamb commented on PR #17841: URL: https://github.com/apache/datafusion/pull/17841#issuecomment-3366268009 Needs update to new prost that is coming in arrow 57. Let's close this for now. See https://github.com/apache/datafusion/pull/17888

Re: [PR] fix: optimizer `common_sub_expression_eliminate` fails in a window function [datafusion]

2025-10-03 Thread via GitHub
Jefffrey commented on PR #17852: URL: https://github.com/apache/datafusion/pull/17852#issuecomment-3367794464 > Thanks @Jefffrey. > > I added the tests to `window.slt`; however, this test passed even without the change. I verified with `datafusion-cli` and it failed. I'm not sure what's

Re: [PR] chore: Add memory pool trace logging [datafusion-comet]

2025-10-03 Thread via GitHub
parthchandra commented on code in PR #2521: URL: https://github.com/apache/datafusion-comet/pull/2521#discussion_r2402994029 ## native/core/src/execution/memory_pools/logging_pool.rs: ## @@ -0,0 +1,85 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more

Re: [PR] feat: Support distributed plan in `EXPLAIN` command [datafusion-ballista]

2025-10-03 Thread via GitHub
danielhumanmod commented on PR #1309: URL: https://github.com/apache/datafusion-ballista/pull/1309#issuecomment-3367946791 > Sorry for the late reply @danielhumanmod. I'm not quite sure; I guess all the metrics are collected at the scheduler side, so the scheduler should have it all once job finis

Re: [PR] Feat: [datafusion-spark] Migrate avg from comet to datafusion-spark and add tests. [datafusion]

2025-10-03 Thread via GitHub
Jefffrey commented on code in PR #17871: URL: https://github.com/apache/datafusion/pull/17871#discussion_r2403661517 ## datafusion/spark/src/function/aggregate/avg.rs: ## @@ -0,0 +1,337 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor lic

Re: [PR] fix: optimizer `common_sub_expression_eliminate` fails in a window function [datafusion]

2025-10-03 Thread via GitHub
Jefffrey commented on PR #17852: URL: https://github.com/apache/datafusion/pull/17852#issuecomment-3367817219 Raised #17914 btw just for tracking (don't know if this is happening in other SLT files, but I assume so)

Re: [PR] Moved constraint variant outside of `TableConstraint` enum [datafusion-sqlparser-rs]

2025-10-03 Thread via GitHub
iffyio commented on code in PR #2054: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/2054#discussion_r2403806157 ## src/ast/table_constraints/check_constraint.rs: ## @@ -0,0 +1,67 @@ +// Licensed to the Apache Software Foundation (ASF) under one Review Comment:

Re: [PR] feat: convert_array_to_scalar_vec respects null elements [datafusion]

2025-10-03 Thread via GitHub
vegarsti commented on PR #17891: URL: https://github.com/apache/datafusion/pull/17891#issuecomment-3367964880 > Thanks @vegarsti I would love to see if there are any performance degradations, you can find benches in the project. > > Maybe we can have a separate test for this issue?

Re: [PR] Add support for ClickHouse CSE. [datafusion-sqlparser-rs]

2025-10-03 Thread via GitHub
iffyio commented on code in PR #2024: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/2024#discussion_r2403803942 ## src/dialect/mod.rs: ## @@ -596,6 +596,20 @@ pub trait Dialect: Debug + Any { false } +/// Returns true if the dialect supports Co

Re: [PR] Moved constraint variant outside of `TableConstraint` enum [datafusion-sqlparser-rs]

2025-10-03 Thread via GitHub
LucaCappelletti94 commented on code in PR #2054: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/2054#discussion_r2403807532 ## src/ast/table_constraints/check_constraint.rs: ## @@ -0,0 +1,67 @@ +// Licensed to the Apache Software Foundation (ASF) under one Review