Re: [PR] feat(spark): implement `StringView` for `SparkConcat` [datafusion]

2026-01-25 Thread via GitHub
aryan-212 commented on PR #19984: URL: https://github.com/apache/datafusion/pull/19984#issuecomment-3796405117 Added them @Weijun-H, thanks for reviewing 🙇 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

Re: [PR] chore(deps): bump taiki-e/install-action from 2.66.7 to 2.67.9 [datafusion]

2026-01-25 Thread via GitHub
Jefffrey merged PR #19987: URL: https://github.com/apache/datafusion/pull/19987

Re: [PR] chore(deps): bump quote from 1.0.43 to 1.0.44 [datafusion]

2026-01-25 Thread via GitHub
Jefffrey merged PR #19992: URL: https://github.com/apache/datafusion/pull/19992

Re: [PR] chore(deps): bump nix from 0.30.1 to 0.31.1 [datafusion]

2026-01-25 Thread via GitHub
Jefffrey merged PR #19991: URL: https://github.com/apache/datafusion/pull/19991

Re: [PR] chore(deps): bump sysinfo from 0.37.2 to 0.38.0 [datafusion]

2026-01-25 Thread via GitHub
Jefffrey merged PR #19990: URL: https://github.com/apache/datafusion/pull/19990

[PR] feat: optimize CASE WHEN for divide-by-zero protection pattern [datafusion]

2026-01-25 Thread via GitHub
CuteChuanChuan opened a new pull request, #19994: URL: https://github.com/apache/datafusion/pull/19994 ## Which issue does this PR close? - Closes #11570. ## Rationale for this change The `CaseExpr` implementation is expensive. A common usage pattern (particularly in TPC

Re: [I] `ReturnFieldArgs.scalar_arguments` type doesn't match with `arg_fields` [datafusion]

2026-01-25 Thread via GitHub
Trikooo commented on issue #19982: URL: https://github.com/apache/datafusion/issues/19982#issuecomment-3796429610 take

Re: [PR] feat(spark): implement `StringView` for `SparkConcat` [datafusion]

2026-01-25 Thread via GitHub
Jefffrey commented on code in PR #19984: URL: https://github.com/apache/datafusion/pull/19984#discussion_r2725387672 ## datafusion/spark/src/function/string/concat.rs: ## @@ -89,10 +93,21 @@ impl ScalarUDFImpl for SparkConcat { ) } fn return_field_from_args(&s

Re: [PR] fix: add parentheses to nested binary expression Display [datafusion]

2026-01-25 Thread via GitHub
AndreaBozzo commented on code in PR #19916: URL: https://github.com/apache/datafusion/pull/19916#discussion_r2725391890 ## datafusion/substrait/tests/cases/roundtrip_logical_plan.rs: ## @@ -632,12 +632,12 @@ async fn roundtrip_inlist_5() -> Result<()> { plan, @r#"

Re: [PR] [RFC] Add lambda support and array_transform udf [datafusion]

2026-01-25 Thread via GitHub
keen85 commented on PR #18921: URL: https://github.com/apache/datafusion/pull/18921#issuecomment-3796510353 @gstvg any updates on this one? 😇

Re: [I] Add documentation/examples on how to use substrait plans with Ballista [datafusion-ballista]

2026-01-25 Thread via GitHub
milenkovicm closed issue #1368: Add documentation/examples on how to use substrait plans with Ballista URL: https://github.com/apache/datafusion-ballista/issues/1368

Re: [PR] feat: Creating SubstraitSchedulerClient and standalone Substrait examples [datafusion-ballista]

2026-01-25 Thread via GitHub
milenkovicm merged PR #1376: URL: https://github.com/apache/datafusion-ballista/pull/1376

Re: [PR] feat: Extract `execution_graph` extract to a trait [datafusion-ballista]

2026-01-25 Thread via GitHub
milenkovicm commented on PR #1361: URL: https://github.com/apache/datafusion-ballista/pull/1361#issuecomment-3796582082 @sqlbenchmark run tpch -s 10 -i 3

Re: [PR] Support JSON arrays reader/parse for datafusion [datafusion]

2026-01-25 Thread via GitHub
zhuqi-lucas commented on PR #19924: URL: https://github.com/apache/datafusion/pull/19924#issuecomment-3796594181 Thank you @alamb and @martin-g for the review. I have redesigned the PR: it now has a readable name, should use less memory, and should also perform better.

Re: [PR] feat: Add support for round-robin partitioning in native shuffle [datafusion-comet]

2026-01-25 Thread via GitHub
andygrove merged PR #3076: URL: https://github.com/apache/datafusion-comet/pull/3076

Re: [I] Add round-robin partitioning support to native shuffle [datafusion-comet]

2026-01-25 Thread via GitHub
andygrove closed issue #3067: Add round-robin partitioning support to native shuffle URL: https://github.com/apache/datafusion-comet/issues/3067

Re: [PR] feat: Add support for round-robin partitioning in native shuffle [datafusion-comet]

2026-01-25 Thread via GitHub
andygrove commented on PR #3076: URL: https://github.com/apache/datafusion-comet/pull/3076#issuecomment-3796600084 Thanks for the reviews @comphead and @wForget

Re: [PR] feat: optimise copying in `left` for Utf8 and LargeUtf8 [datafusion]

2026-01-25 Thread via GitHub
theirix commented on code in PR #19980: URL: https://github.com/apache/datafusion/pull/19980#discussion_r2725486396 ## datafusion/functions/src/unicode/left.rs: ## @@ -121,61 +125,175 @@ impl ScalarUDFImpl for LeftFunc { /// Returns first n characters in the string, or when n

Re: [PR] feat: optimise copying in `left` for Utf8 and LargeUtf8 [datafusion]

2026-01-25 Thread via GitHub
theirix commented on code in PR #19980: URL: https://github.com/apache/datafusion/pull/19980#discussion_r2725485749 ## datafusion/functions/src/unicode/left.rs: ## @@ -121,61 +125,175 @@ impl ScalarUDFImpl for LeftFunc { /// Returns first n characters in the string, or when n

[PR] fix: compile issue after unsuccessful merge [datafusion-ballista]

2026-01-25 Thread via GitHub
milenkovicm opened a new pull request, #1402: URL: https://github.com/apache/datafusion-ballista/pull/1402 # Which issue does this PR close? Closes #. # Rationale for this change There is a compilation issue after merging #1376 # What changes are included in this PR

Re: [PR] feat: Extract `execution_graph` extract to a trait [datafusion-ballista]

2026-01-25 Thread via GitHub
sqlbenchmark commented on PR #1361: URL: https://github.com/apache/datafusion-ballista/pull/1361#issuecomment-3796620993 ## Ballista TPC-H Benchmark Results **PR:** #1361 - execution_graph extract to a trait **PR Commit:** `56b4aee` **Base Commit:** `91a712c` (main) **Scale F

Re: [PR] feat: optimise copying in `left` for Utf8 and LargeUtf8 [datafusion]

2026-01-25 Thread via GitHub
theirix commented on code in PR #19980: URL: https://github.com/apache/datafusion/pull/19980#discussion_r2725488603 ## datafusion/functions/src/unicode/left.rs: ## @@ -121,61 +125,175 @@ impl ScalarUDFImpl for LeftFunc { /// Returns first n characters in the string, or when n

Re: [I] [Feature] Support Spark expression: string_to_map [datafusion-comet]

2026-01-25 Thread via GitHub
unknowntpo commented on issue #3168: URL: https://github.com/apache/datafusion-comet/issues/3168#issuecomment-3796632552 May I take this issue?

Re: [PR] feat(spark): implement `StringView` for `SparkConcat` [datafusion]

2026-01-25 Thread via GitHub
aryan-212 commented on code in PR #19984: URL: https://github.com/apache/datafusion/pull/19984#discussion_r2725501017 ## datafusion/spark/src/function/string/concat.rs: ## @@ -53,9 +53,13 @@ impl Default for SparkConcat { impl SparkConcat { pub fn new() -> Self { +

Re: [PR] feat: implement protobuf converter trait to allow control over serialization and deserialization processes [datafusion]

2026-01-25 Thread via GitHub
timsaucer commented on code in PR #19437: URL: https://github.com/apache/datafusion/pull/19437#discussion_r2725504109 ## docs/source/library-user-guide/upgrading.md: ## @@ -154,6 +154,48 @@ The builder pattern is more efficient as it computes properties once during `bui Note

Re: [PR] feat: optimise copying in `left` for Utf8 and LargeUtf8 [datafusion]

2026-01-25 Thread via GitHub
alamb-ghbot commented on PR #19980: URL: https://github.com/apache/datafusion/pull/19980#issuecomment-3796180367 🤖: Benchmark completed Details ``` Comparing HEAD and left-speedup-bytes Benchmark clickbench_extended.json -

[PR] feat: implement `StringView` for `SparkConcat` [datafusion]

2026-01-25 Thread via GitHub
aryan-212 opened a new pull request, #19984: URL: https://github.com/apache/datafusion/pull/19984 ## Which issue does this PR close? - This PR is part of the [Utf8View support](https://github.com/apache/datafusion/issues/10918) epic. It adds `Utf8View` support in the Spark-co

[PR] example of stateless execution [datafusion]

2026-01-25 Thread via GitHub
askalt opened a new pull request, #19985: URL: https://github.com/apache/datafusion/pull/19985 (no comment)

Re: [PR] example of stateless execution [datafusion]

2026-01-25 Thread via GitHub
askalt commented on code in PR #19985: URL: https://github.com/apache/datafusion/pull/19985#discussion_r2725148726 ## datafusion/physical-plan/src/projection.rs: ## @@ -313,19 +333,30 @@ impl ExecutionPlan for ProjectionExec { partition: usize, context: Arc,

Re: [PR] example of stateless execution for ProjectionExec [datafusion]

2026-01-25 Thread via GitHub
askalt closed pull request #19986: example of stateless execution for ProjectionExec URL: https://github.com/apache/datafusion/pull/19986

Re: [PR] example of stateless execution for `ProjectionExec` [datafusion]

2026-01-25 Thread via GitHub
askalt closed pull request #19985: example of stateless execution for `ProjectionExec` URL: https://github.com/apache/datafusion/pull/19985

[PR] example of stateless execution for ProjectionExec [datafusion]

2026-01-25 Thread via GitHub
askalt opened a new pull request, #19986: URL: https://github.com/apache/datafusion/pull/19986 (no comment)

Re: [I] Stateless execution plans for plan caching [datafusion]

2026-01-25 Thread via GitHub
askalt commented on issue #19351: URL: https://github.com/apache/datafusion/issues/19351#issuecomment-3796239651 > [@alamb](https://github.com/alamb) As I understand it, [this](https://github.com/askalt/datafusion/commit/07ebbd4d321e4753cf566c40192b124654ff7455) implementation doesn't seem

Re: [I] Parallelize `list_files_for_scan` [datafusion]

2026-01-25 Thread via GitHub
Tushar7012 commented on issue #19971: URL: https://github.com/apache/datafusion/issues/19971#issuecomment-3796243862 Thanks for the assignment and the pointer to PR #19969! I've reviewed your `infer_schema` parallelization approach and understand the pattern now - using tokio spawnin

Re: [PR] perf: Optimize ArrowBytesViewMap with direct view access [datafusion]

2026-01-25 Thread via GitHub
Tushar7012 commented on PR #19975: URL: https://github.com/apache/datafusion/pull/19975#issuecomment-3796247698 Thanks for reviewing the benchmarks! Great to hear the speedup is reproducible. Regarding the builder overhead optimization for other cases - would you like me to explore t

Re: [PR] Change GitHub actions dependabot schedule to weekly [datafusion]

2026-01-25 Thread via GitHub
Weijun-H merged PR #19981: URL: https://github.com/apache/datafusion/pull/19981

Re: [PR] chore(deps): bump taiki-e/install-action from 2.66.7 to 2.67.4 [datafusion]

2026-01-25 Thread via GitHub
dependabot[bot] closed pull request #19952: chore(deps): bump taiki-e/install-action from 2.66.7 to 2.67.4 URL: https://github.com/apache/datafusion/pull/19952

Re: [PR] chore(deps): bump taiki-e/install-action from 2.66.7 to 2.67.4 [datafusion]

2026-01-25 Thread via GitHub
dependabot[bot] commented on PR #19952: URL: https://github.com/apache/datafusion/pull/19952#issuecomment-3796261723 Superseded by #19987.

[PR] chore(deps): bump taiki-e/install-action from 2.66.7 to 2.67.9 [datafusion]

2026-01-25 Thread via GitHub
dependabot[bot] opened a new pull request, #19987: URL: https://github.com/apache/datafusion/pull/19987 Bumps [taiki-e/install-action](https://github.com/taiki-e/install-action) from 2.66.7 to 2.67.9. Release notes, sourced from https://github.com/taiki-e/install-action/releases

[PR] chore(deps): bump setuptools from 80.9.0 to 80.10.1 in /docs [datafusion]

2026-01-25 Thread via GitHub
dependabot[bot] opened a new pull request, #19988: URL: https://github.com/apache/datafusion/pull/19988 Bumps [setuptools](https://github.com/pypa/setuptools) from 80.9.0 to 80.10.1. Changelog, sourced from https://github.com/pypa/setuptools/blob/main/NEWS.rst: setuptools's change

Re: [PR] Change GitHub actions dependabot schedule to weekly [datafusion]

2026-01-25 Thread via GitHub
Weijun-H commented on PR #19981: URL: https://github.com/apache/datafusion/pull/19981#issuecomment-3796261363 This is a very reasonable idea, thanks @Jefffrey and @Dandandan

[PR] chore(deps): bump nix from 0.30.1 to 0.31.1 [datafusion]

2026-01-25 Thread via GitHub
dependabot[bot] opened a new pull request, #19991: URL: https://github.com/apache/datafusion/pull/19991 Bumps [nix](https://github.com/nix-rust/nix) from 0.30.1 to 0.31.1. Changelog, sourced from https://github.com/nix-rust/nix/blob/master/CHANGELOG.md: nix's changelog. [0.3

[PR] chore(deps): bump pbjson-types from 0.8.0 to 0.9.0 in the proto group [datafusion]

2026-01-25 Thread via GitHub
dependabot[bot] opened a new pull request, #19989: URL: https://github.com/apache/datafusion/pull/19989 Bumps the proto group with 1 update: [pbjson-types](https://github.com/influxdata/pbjson). Updates `pbjson-types` from 0.8.0 to 0.9.0 Commits See full diff in https://

[PR] chore(deps): bump sysinfo from 0.37.2 to 0.38.0 [datafusion]

2026-01-25 Thread via GitHub
dependabot[bot] opened a new pull request, #19990: URL: https://github.com/apache/datafusion/pull/19990 Bumps [sysinfo](https://github.com/GuillaumeGomez/sysinfo) from 0.37.2 to 0.38.0. Changelog, sourced from https://github.com/GuillaumeGomez/sysinfo/blob/main/CHANGELOG.md: sysinf

[PR] chore(deps): bump quote from 1.0.43 to 1.0.44 [datafusion]

2026-01-25 Thread via GitHub
dependabot[bot] opened a new pull request, #19992: URL: https://github.com/apache/datafusion/pull/19992 Bumps [quote](https://github.com/dtolnay/quote) from 1.0.43 to 1.0.44. Release notes, sourced from https://github.com/dtolnay/quote/releases: quote's releases. 1.0.44

[PR] chore(deps): bump uuid from 1.19.0 to 1.20.0 [datafusion]

2026-01-25 Thread via GitHub
dependabot[bot] opened a new pull request, #19993: URL: https://github.com/apache/datafusion/pull/19993 Bumps [uuid](https://github.com/uuid-rs/uuid) from 1.19.0 to 1.20.0. Release notes, sourced from https://github.com/uuid-rs/uuid/releases: uuid's releases. v1.20.0 What'

Re: [PR] chore(deps): bump setuptools from 80.9.0 to 80.10.1 in /docs [datafusion]

2026-01-25 Thread via GitHub
Jefffrey merged PR #19988: URL: https://github.com/apache/datafusion/pull/19988

Re: [PR] Feat: create map function [datafusion-comet]

2026-01-25 Thread via GitHub
kazantsev-maksim closed pull request #3223: Feat: create map function URL: https://github.com/apache/datafusion-comet/pull/3223

Re: [PR] PostgreSQL: Add support for `*` (descendant) option in TRUNCATE [datafusion-sqlparser-rs]

2026-01-25 Thread via GitHub
iffyio merged PR #2181: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/2181

Re: [I] [Bug] hour/minute/second expressions incorrectly apply timezone conversion to TimestampNTZ inputs [datafusion-comet]

2026-01-25 Thread via GitHub
andygrove commented on issue #3180: URL: https://github.com/apache/datafusion-comet/issues/3180#issuecomment-3796655530 @vigneshsiva11 We would eventually like to fully support `TimestampNTZ`. You may be interested in reviewing https://github.com/apache/datafusion-comet/pull/3253

Re: [I] [Feature] Support Spark expression: string_to_map [datafusion-comet]

2026-01-25 Thread via GitHub
andygrove commented on issue #3168: URL: https://github.com/apache/datafusion-comet/issues/3168#issuecomment-3796653098 > May I take this issue? Yes, I assigned the issue to you. Thanks!

Re: [PR] Fix hour/minute/second handling for TimestampNTZ [datafusion-comet]

2026-01-25 Thread via GitHub
codecov-commenter commented on PR #3265: URL: https://github.com/apache/datafusion-comet/pull/3265#issuecomment-3796683424 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/3265?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca

Re: [PR] feat(spark): implement `StringView` for `SparkConcat` [datafusion]

2026-01-25 Thread via GitHub
aryan-212 commented on code in PR #19984: URL: https://github.com/apache/datafusion/pull/19984#discussion_r2725506023 ## datafusion/spark/src/function/string/concat.rs: ## @@ -89,10 +93,21 @@ impl ScalarUDFImpl for SparkConcat { ) } fn return_field_from_args(&

Re: [PR] feat: optimise copying in `left` for Utf8 and LargeUtf8 [datafusion]

2026-01-25 Thread via GitHub
theirix commented on PR #19980: URL: https://github.com/apache/datafusion/pull/19980#issuecomment-379092 Run benchmarks

Re: [PR] feat: optimise copying in `left` for Utf8 and LargeUtf8 [datafusion]

2026-01-25 Thread via GitHub
alamb-ghbot commented on PR #19980: URL: https://github.com/apache/datafusion/pull/19980#issuecomment-379172 🤖 Hi @theirix, thanks for the request (https://github.com/apache/datafusion/pull/19980#issuecomment-379092). [`scrape_comments.py`](https://github.com/alamb/datafusion-benchm

Re: [I] Parallelize `list_files_for_scan` [datafusion]

2026-01-25 Thread via GitHub
alamb commented on issue #19971: URL: https://github.com/apache/datafusion/issues/19971#issuecomment-3796671713 FYI I think @BlakeOrth has worked on this before

Re: [PR] feat: optimise copying in `left` for Utf8 and LargeUtf8 [datafusion]

2026-01-25 Thread via GitHub
theirix commented on PR #19980: URL: https://github.com/apache/datafusion/pull/19980#issuecomment-3796678559 Thank you for the review! > Could you help me understand which changes here make it O(1)? It's for memory complexity. We avoid an extra copy of the string into `chars_b
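
The point about avoiding an extra copy can be illustrated with a minimal sketch. This is not the PR's code: the helper name `left` and its signature are assumptions made for illustration, and it works on a single `&str` rather than on Arrow arrays. It assumes the usual semantics of `left` (first `n` characters, or all but the last `|n|` characters when `n` is negative) and returns a borrowed slice, so no intermediate character buffer is allocated.

```rust
/// Hypothetical helper, not DataFusion's implementation.
/// Returns the first `n` characters of `s` when `n >= 0`, or everything
/// except the last `|n|` characters when `n < 0`. The result borrows from
/// `s`, so no bytes are copied.
fn left(s: &str, n: i64) -> &str {
    let end = if n >= 0 {
        // Byte offset at which character number n (0-based) starts.
        // If the string has n or fewer characters, keep all of it.
        s.char_indices()
            .nth(n as usize)
            .map(|(idx, _)| idx)
            .unwrap_or(s.len())
    } else {
        // Walk backwards |n| characters; the start of the last one visited
        // is where the dropped suffix begins. If the string is shorter than
        // |n| characters, nothing is kept.
        let skip = n.unsigned_abs() as usize;
        s.char_indices()
            .rev()
            .nth(skip - 1)
            .map(|(idx, _)| idx)
            .unwrap_or(0)
    };
    // `end` always falls on a character boundary, so slicing cannot panic.
    &s[..end]
}
```

For example, `left("hello", 2)` yields `"he"` and `left("hello", -2)` yields `"hel"`. Finding the boundary still takes time proportional to the number of characters scanned; only the additional memory is constant.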

Re: [PR] feat(spark): implement `StringView` for `SparkConcat` [datafusion]

2026-01-25 Thread via GitHub
aryan-212 commented on code in PR #19984: URL: https://github.com/apache/datafusion/pull/19984#discussion_r2725541812 ## datafusion/spark/src/function/string/concat.rs: ## @@ -89,10 +93,21 @@ impl ScalarUDFImpl for SparkConcat { ) } fn return_field_from_args(&

Re: [PR] Fix ClickBench EventDate handling by casting UInt16 days-since-epoch to DATE via `hits` view [datafusion]

2026-01-25 Thread via GitHub
alamb-ghbot commented on PR #19881: URL: https://github.com/apache/datafusion/pull/19881#issuecomment-3796728026 🤖: Benchmark completed Details ``` Comparing HEAD and eventdate-handling-18982 Benchmark clickbench_extended.json ---

Re: [PR] Fix ClickBench EventDate handling by casting UInt16 days-since-epoch to DATE via `hits` view [datafusion]

2026-01-25 Thread via GitHub
alamb-ghbot commented on PR #19881: URL: https://github.com/apache/datafusion/pull/19881#issuecomment-3796728109 🤖 `./gh_compare_branch.sh` [gh_compare_branch.sh](https://github.com/alamb/datafusion-benchmarking/blob/main/scripts/gh_compare_branch.sh) Running Linux aal-dev 6.14.0-1018-gc

Re: [PR] DataFusion 52 release post [datafusion-site]

2026-01-25 Thread via GitHub
alamb commented on code in PR #135: URL: https://github.com/apache/datafusion-site/pull/135#discussion_r2725567315 ## content/blog/2026-01-08-datafusion-52.0.0.md: ## @@ -0,0 +1,405 @@ +--- +layout: post +title: Apache DataFusion 52.0.0 Released +date: 2026-01-08 +author: pmc +c

Re: [PR] DataFusion 52 release post [datafusion-site]

2026-01-25 Thread via GitHub
zhuqi-lucas commented on code in PR #135: URL: https://github.com/apache/datafusion-site/pull/135#discussion_r2725569258 ## content/blog/2026-01-08-datafusion-52.0.0.md: ## @@ -0,0 +1,405 @@ +--- +layout: post +title: Apache DataFusion 52.0.0 Released +date: 2026-01-08 +author:

Re: [PR] Fix ClickBench EventDate handling by casting UInt16 days-since-epoch to DATE via `hits` view [datafusion]

2026-01-25 Thread via GitHub
alamb-ghbot commented on PR #19881: URL: https://github.com/apache/datafusion/pull/19881#issuecomment-3796745052 🤖: Benchmark completed Details ``` Comparing HEAD and eventdate-handling-18982 Benchmark clickbench_partitioned.json

[PR] perf: improve performance of `array_remove`, `array_remove_n` and `array_remove_all` functions [datafusion]

2026-01-25 Thread via GitHub
lyne7-sc opened a new pull request, #19996: URL: https://github.com/apache/datafusion/pull/19996 ## Which issue does this PR close? - Part of https://github.com/apache/datafusion-comet/issues/2986 ## Rationale for this change The current implementation of

Re: [I] [Incompatibility] Document array_join null handling differences [datafusion-comet]

2026-01-25 Thread via GitHub
andygrove commented on issue #3178: URL: https://github.com/apache/datafusion-comet/issues/3178#issuecomment-3796656830 > .take @vbhavh 👍

Re: [PR] fix: Add JDK to Docker image for release build [datafusion-comet]

2026-01-25 Thread via GitHub
andygrove commented on PR #3262: URL: https://github.com/apache/datafusion-comet/pull/3262#issuecomment-3796669193 @hsiang-c you can merge latest from main to fix the CI failures

Re: [PR] feat: optimise copying in `left` for Utf8 and LargeUtf8 [datafusion]

2026-01-25 Thread via GitHub
theirix commented on code in PR #19980: URL: https://github.com/apache/datafusion/pull/19980#discussion_r2725523328 ## datafusion/functions/src/unicode/left.rs: ## @@ -121,61 +125,175 @@ impl ScalarUDFImpl for LeftFunc { /// Returns first n characters in the string, or when n

Re: [I] Parallelize `infer_schema` [datafusion]

2026-01-25 Thread via GitHub
alamb commented on issue #19970: URL: https://github.com/apache/datafusion/issues/19970#issuecomment-3796673581 FWIW the actual clickbench runs (on benchmark.clickhouse.com) don't count the time to `CREATE EXTERNAL TABLE` So if the schema inference isn't part of the actual query,

Re: [PR] fix: compile issue after unsuccessful merge [datafusion-ballista]

2026-01-25 Thread via GitHub
milenkovicm merged PR #1402: URL: https://github.com/apache/datafusion-ballista/pull/1402

Re: [I] Potential invalid ClickBench results [datafusion]

2026-01-25 Thread via GitHub
alamb commented on issue #18982: URL: https://github.com/apache/datafusion/issues/18982#issuecomment-3796675121 not casting to string will also be much faster I suspect (and let pushdown happen more 🤔 )

Re: [PR] Fix ClickBench EventDate handling by casting UInt16 days-since-epoch to DATE via `hits` view [datafusion]

2026-01-25 Thread via GitHub
alamb commented on PR #19881: URL: https://github.com/apache/datafusion/pull/19881#issuecomment-3796675514 run benchmark clickbench_partitioned

Re: [PR] Fix ClickBench EventDate handling by casting UInt16 days-since-epoch to DATE via `hits` view [datafusion]

2026-01-25 Thread via GitHub
alamb-ghbot commented on PR #19881: URL: https://github.com/apache/datafusion/pull/19881#issuecomment-3796675504 🤖 `./gh_compare_branch.sh` [gh_compare_branch.sh](https://github.com/alamb/datafusion-benchmarking/blob/main/scripts/gh_compare_branch.sh) Running Linux aal-dev 6.14.0-1018-gc

Re: [PR] Fix ClickBench EventDate handling by casting UInt16 days-since-epoch to DATE via `hits` view [datafusion]

2026-01-25 Thread via GitHub
alamb commented on PR #19881: URL: https://github.com/apache/datafusion/pull/19881#issuecomment-3796675300 run benchmarks

Re: [I] Planning time for queries with many columns with union and order by is very slow [datafusion]

2026-01-25 Thread via GitHub
Omega359 commented on issue #17261: URL: https://github.com/apache/datafusion/issues/17261#issuecomment-3796721056 ``` Benchmarking logical_plan_optimize: Warming up for 3. s Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 19863.9s, or reduc

Re: [PR] chore: Add take/untake workflow for issue self-assignment [datafusion-comet]

2026-01-25 Thread via GitHub
codecov-commenter commented on PR #3270: URL: https://github.com/apache/datafusion-comet/pull/3270#issuecomment-3796720436 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/3270?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca

Re: [PR] Feat: add support for `elt` expression [datafusion-comet]

2026-01-25 Thread via GitHub
kazantsev-maksim commented on code in PR #3269: URL: https://github.com/apache/datafusion-comet/pull/3269#discussion_r2725561731 ## spark/src/main/scala/org/apache/comet/serde/strings.scala: ## @@ -289,6 +289,16 @@ object CometRegExpReplace extends CometExpressionSerde[RegExpRe

[PR] ci: Consolidate Spark SQL test jobs to reduce CI time [datafusion-comet]

2026-01-25 Thread via GitHub
andygrove opened a new pull request, #3271: URL: https://github.com/apache/datafusion-comet/pull/3271 ## Summary Merges three near-identical Spark SQL test jobs into a single consolidated job with a config matrix: - `spark-sql-auto-scan` (was 20 jobs) - `spark-sql-native-nat

[PR] Feat: add support for `elt` expression [datafusion-comet]

2026-01-25 Thread via GitHub
kazantsev-maksim opened a new pull request, #3269: URL: https://github.com/apache/datafusion-comet/pull/3269 ## Which issue does this PR close? N/A ## Rationale for this change ## What changes are included in this PR? ## How are these changes tested?

Re: [PR] Fix ClickBench EventDate handling by casting UInt16 days-since-epoch to DATE via `hits` view [datafusion]

2026-01-25 Thread via GitHub
alamb commented on code in PR #19881: URL: https://github.com/apache/datafusion/pull/19881#discussion_r2725539871 ## datafusion/sqllogictest/test_files/clickbench.slt: ## @@ -64,10 +82,10 @@ SELECT COUNT(DISTINCT "SearchPhrase") FROM hits; 1 -query II +query DD SELECT

Re: [I] [GSoC] GSoC 2026 - Mentoring Org Application Discussion [datafusion-comet]

2026-01-25 Thread via GitHub
andygrove commented on issue #3082: URL: https://github.com/apache/datafusion-comet/issues/3082#issuecomment-3796695773 This is a great idea, but I agree with @mbutrovich that we need to find willing mentors first. I also do not have time for this, unfortunately.

[I] Update ClickBench runner scripts to treat EventDate as a Date (not Int16)) [datafusion]

2026-01-25 Thread via GitHub
alamb opened a new issue, #19995: URL: https://github.com/apache/datafusion/issues/19995 ### Describe the bug - Part of https://github.com/apache/datafusion/issues/18982 @nuno-faria found that ClickBench is treating EventDate as an INT16 rather than a Date, resulting in questiona
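
To make the integer-versus-date distinction concrete, here is a small standalone sketch of the conversion itself. It assumes, following the title of PR #19881, that `EventDate` is stored as a UInt16 count of days since 1970-01-01. The function names are hypothetical, and the arithmetic is the well-known days-to-civil-date algorithm, not code taken from DataFusion or the ClickBench scripts.

```rust
/// Hypothetical helper: converts a UInt16 day count since 1970-01-01 into
/// a (year, month, day) triple in the proleptic Gregorian calendar.
fn date_from_epoch_days(days: u16) -> (u32, u32, u32) {
    // Shift the epoch to 0000-03-01 so that leap days fall at year end.
    let z = u32::from(days) + 719_468;
    let era = z / 146_097; // 400-year cycles
    let doe = z - era * 146_097; // day of era, 0..=146_096
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365; // year of era
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // day of March-based year
    let mp = (5 * doy + 2) / 153; // March-based month, 0..=11
    let day = doy - (153 * mp + 2) / 5 + 1;
    let month = if mp < 10 { mp + 3 } else { mp - 9 };
    // January and February belong to the following calendar year.
    let year = yoe + era * 400 + u32::from(month <= 2);
    (year, month, day)
}

/// Formats the day count as an ISO 8601 date string.
fn format_epoch_days(days: u16) -> String {
    let (y, m, d) = date_from_epoch_days(days);
    format!("{:04}-{:02}-{:02}", y, m, d)
}
```

With this sketch, a stored value of `15901` corresponds to `2013-07-15`, whereas comparing the raw integer against a date literal would not mean the same thing.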

Re: [PR] Fix ClickBench EventDate handling by casting UInt16 days-since-epoch to DATE via `hits` view [datafusion]

2026-01-25 Thread via GitHub
alamb commented on PR #19881: URL: https://github.com/apache/datafusion/pull/19881#issuecomment-3796697435 Filed a ticket to track updating clickbench scripts: - https://github.com/apache/datafusion/issues/19995

Re: [I] Update ClickBench runner scripts to treat EventDate as a Date (not Int16)) [datafusion]

2026-01-25 Thread via GitHub
alamb commented on issue #19995: URL: https://github.com/apache/datafusion/issues/19995#issuecomment-3796697170 FYI @waynexia as this may be related to your work - #19826

[PR] chore: Add take/untake workflow for issue self-assignment [datafusion-comet]

2026-01-25 Thread via GitHub
andygrove opened a new pull request, #3270: URL: https://github.com/apache/datafusion-comet/pull/3270 ## Summary - Adds a GitHub Actions workflow that allows contributors to self-assign issues by commenting `take` and unassign by commenting `untake` - Updates the contributor guide to d

Re: [PR] DataFusion 52 release post [datafusion-site]

2026-01-25 Thread via GitHub
alamb commented on code in PR #135: URL: https://github.com/apache/datafusion-site/pull/135#discussion_r2725552009 ## content/blog/2026-01-08-datafusion-52.0.0.md: ## @@ -0,0 +1,405 @@ +--- +layout: post +title: Apache DataFusion 52.0.0 Released +date: 2026-01-08 +author: pmc +c

Re: [PR] ci: Consolidate Spark SQL test jobs to reduce CI time [datafusion-comet]

2026-01-25 Thread via GitHub
codecov-commenter commented on PR #3271: URL: https://github.com/apache/datafusion-comet/pull/3271#issuecomment-3796739092 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/3271?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca

Re: [PR] feat: Native columnar to row conversion (Phase 1) [datafusion-comet]

2026-01-25 Thread via GitHub
andygrove merged PR #3221: URL: https://github.com/apache/datafusion-comet/pull/3221

Re: [PR] feat(spark): implement `StringView` for `SparkConcat` [datafusion]

2026-01-25 Thread via GitHub
aryan-212 commented on code in PR #19984: URL: https://github.com/apache/datafusion/pull/19984#discussion_r2725506023 ## datafusion/spark/src/function/string/concat.rs: ## @@ -89,10 +93,21 @@ impl ScalarUDFImpl for SparkConcat { ) } fn return_field_from_args(&

Re: [PR] feat: Native columnar to row conversion (Phase 2) [datafusion-comet]

2026-01-25 Thread via GitHub
andygrove commented on code in PR #3266: URL: https://github.com/apache/datafusion-comet/pull/3266#discussion_r2725614700 ## spark/src/main/scala/org/apache/spark/sql/comet/CometNativeColumnarToRowExec.scala: ## @@ -64,6 +74,105 @@ case class CometNativeColumnarToRowExec(child:

Re: [PR] Support different scales for decimal binary math functions [datafusion]

2026-01-25 Thread via GitHub
theirix commented on PR #19874: URL: https://github.com/apache/datafusion/pull/19874#issuecomment-3796809826 > I'm having trouble understanding the rationale here; `log`, `power` and `round` at most have one decimal input, and only `round` preserves the decimal type whereas the others will

Re: [PR] feat(spark): implement `StringView` for `SparkConcat` [datafusion]

2026-01-25 Thread via GitHub
aryan-212 commented on code in PR #19984: URL: https://github.com/apache/datafusion/pull/19984#discussion_r2725610650 ## datafusion/spark/src/function/string/concat.rs: ## @@ -89,10 +93,21 @@ impl ScalarUDFImpl for SparkConcat { ) } fn return_field_from_args(&

Re: [PR] feat: Native columnar to row conversion (Phase 2) [datafusion-comet]

2026-01-25 Thread via GitHub
andygrove commented on code in PR #3266: URL: https://github.com/apache/datafusion-comet/pull/3266#discussion_r2725615807 ## spark/src/main/scala/org/apache/spark/sql/comet/CometNativeColumnarToRowExec.scala: ## @@ -64,6 +74,105 @@ case class CometNativeColumnarToRowExec(child:

Re: [PR] feat: optimize CASE WHEN for divide-by-zero protection pattern [datafusion]

2026-01-25 Thread via GitHub
Dandandan commented on PR #19994: URL: https://github.com/apache/datafusion/pull/19994#issuecomment-3796824639 run benchmark tpcds

Re: [PR] feat: optimize CASE WHEN for divide-by-zero protection pattern [datafusion]

2026-01-25 Thread via GitHub
alamb-ghbot commented on PR #19994: URL: https://github.com/apache/datafusion/pull/19994#issuecomment-3796824732 🤖 `./gh_compare_branch.sh` [gh_compare_branch.sh](https://github.com/alamb/datafusion-benchmarking/blob/main/scripts/gh_compare_branch.sh) Running Linux aal-dev 6.14.0-1018-gc

Re: [PR] feat(spark): implement `StringView` for `SparkConcat` [datafusion]

2026-01-25 Thread via GitHub
aryan-212 commented on code in PR #19984: URL: https://github.com/apache/datafusion/pull/19984#discussion_r2725615170 ## datafusion/spark/src/function/string/concat.rs: ## @@ -53,9 +53,13 @@ impl Default for SparkConcat { impl SparkConcat { pub fn new() -> Self { +

Re: [PR] chore(deps): bump pbjson-types from 0.8.0 to 0.9.0 in the proto group [datafusion]

2026-01-25 Thread via GitHub
dependabot[bot] commented on PR #19989: URL: https://github.com/apache/datafusion/pull/19989#issuecomment-3796828521 This pull request was built based on a group rule. Closing it will not ignore any of these versions in future pull requests. To ignore these dependencies, configure [ig

Re: [PR] chore(deps): bump pbjson-types from 0.8.0 to 0.9.0 in the proto group [datafusion]

2026-01-25 Thread via GitHub
Jefffrey closed pull request #19989: chore(deps): bump pbjson-types from 0.8.0 to 0.9.0 in the proto group URL: https://github.com/apache/datafusion/pull/19989

Re: [PR] chore(deps): bump pbjson-types from 0.8.0 to 0.9.0 in the proto group [datafusion]

2026-01-25 Thread via GitHub
Jefffrey commented on PR #19989: URL: https://github.com/apache/datafusion/pull/19989#issuecomment-3796828486 Needs substrait version bump

Re: [PR] feat: Native columnar to row conversion (Phase 2) [datafusion-comet]

2026-01-25 Thread via GitHub
andygrove commented on PR #3266: URL: https://github.com/apache/datafusion-comet/pull/3266#issuecomment-3796828831 @wForget I've now added the fix for closing the `ColumnarBatch` after conversion (commit a5857078f6f). The issue was that after calling `converter.convert(batch)`, the b
